Blog Posts


Last weekend, I presented a paper on storage workload characterization and consolidation at VPACT '09. In the paper, we extensively characterized several well-known enterprise workloads, such as MS Exchange, OLTP, DVDStore, and decision support, in terms of parameters such as seek distance, IO size, read/write ratio, and outstanding IOs.

One of the main conclusions was that most workloads exhibit a random access pattern. Furthermore, the sequential cases, such as decision support, backup, or virus scanning, seem to be mainly secondary workloads running on the same data sets that the actual applications commonly access in a random manner. Detailed tracing also revealed that most workloads were quite bursty in terms of arrivals.

Based on these two observations, we tested the effect of consolidation on two different workloads: they were first run on separate RAID groups and then on a single RAID group consisting of the physical disks from both. The results were quite encouraging, as the workloads saw a big reduction in 90th percentile latency. This is mainly because a higher degree of burstiness leads to larger gains from statistical multiplexing, and combining the underlying physical disks helps a great deal in absorbing the peaks during bursty intervals.
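
To make the statistical multiplexing argument concrete, here is a minimal sketch in C. The traces are hypothetical per-interval IOPS demands invented for illustration (they are not data from the paper), and the function names are mine. When the bursts of the two workloads do not coincide, the peak of the combined demand is well below the sum of the individual peaks.

```c
#include <assert.h>
#include <stddef.h>

/* Largest per-interval demand in a trace of n samples (n must be > 0). */
int peak_demand(const int *trace, size_t n)
{
    assert(n > 0);
    int peak = trace[0];
    for (size_t i = 1; i < n; i++)
        if (trace[i] > peak)
            peak = trace[i];
    return peak;
}

/* Peak of the summed demand when both workloads share one disk group. */
int combined_peak(const int *a, const int *b, size_t n)
{
    assert(n > 0);
    int peak = a[0] + b[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] + b[i] > peak)
            peak = a[i] + b[i];
    return peak;
}

/* Hypothetical IOPS demand per interval; the bursts do not overlap. */
const int workload_a[] = { 100, 900, 100, 100, 100, 100 };
const int workload_b[] = { 100, 100, 100, 800, 100, 100 };
```

With these made-up traces, isolated groups must be sized for peaks of 900 and 800 IOPS (1,700 in total), while the consolidated group only ever sees a peak of 1,000. Seen the other way around, each burst gets to use the spindles of both groups, which is where the tail-latency improvement would come from. If bursts were perfectly correlated, the gain would disappear.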

This suggests that administrators should rethink their rule of thumb of placing workloads on different spindles for better performance isolation. Dedicated spindles do lead to better predictability in most cases, but they can also cause severe over-provisioning. Isolation based on software mechanisms, at both the OS and the array level, would be much more desirable and cost-effective.

There is also the issue of a semantic information gap: array vendors may not get workload-specific information in the request stream that they see. Hence the overall solution may involve cooperation between both endpoints of the SAN or Ethernet, and some sort of interface between OS and array vendors for stronger isolation.

Here is something to chew on: What sort of support or interface should array vendors expose to meet applications' performance isolation requirements in terms of throughput, latency, or perhaps burstiness? Are weights (or shares) good enough, or do we need other parameters such as latency specifications?
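
As a reference point for that question, this is roughly what a pure share-based interface can express. It is a sketch only: the function name and the idea of dividing a fixed IOPS budget are assumptions for illustration, not any vendor's API.

```c
#include <assert.h>
#include <stddef.h>

/* Split total_iops among n workloads in proportion to their shares.
 * Integer division may leave a small remainder unallocated. */
void allocate_by_shares(const int *shares, size_t n, int total_iops, int *out)
{
    long share_sum = 0;
    for (size_t i = 0; i < n; i++)
        share_sum += shares[i];
    assert(share_sum > 0);
    for (size_t i = 0; i < n; i++)
        out[i] = (int)((long)total_iops * shares[i] / share_sum);
}
```

Shares of 1:2:1 over a 1,000 IOPS budget yield 250, 500, and 250 IOPS. Note what the interface cannot say: nothing here bounds the latency of an individual request or describes how large a burst a workload may issue, which is exactly the gap the question above is probing.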

Tags: performance, vmware, io, storage, management, isolation, workloads, workload, characterization, resource

To VDI or not to VDI in Chris Roan's Blog

Posted by Chris Roan, Apr 23, 2009

Although I agree that VDI is becoming more popular, there is still a fundamental issue that will decide the success or failure of this offering: price.

The biggest challenge that companies like ours face today is how to deliver a virtual desktop service at a cost-effective price comparable to that of a traditional desktop or laptop.

Most financial managers who forecast and budget to the bottom dollar do not care about long-term savings in OpEx (operating expense); they are focused on the savings they can show today. They don't care about long-term labor savings, maintenance cost avoidance, and upgrade expenses. They need to show that they pay $1200 for a laptop today but that tomorrow the cost of a VDI desktop will be less than $600.

This is an interesting topic and I will be expanding on it in my blog in a few short weeks after I complete some additional fact finding for my own company's offering.

Tags: vmware, virtualization, vdi

FAST 09: Congestion aware NFS paper in Guls Byte

Posted by Ajay Gulati, Apr 2, 2009

This paper was presented by Alexandros from NetApp. The main idea is to adaptively schedule asynchronous requests so as to favor synchronous ones and thereby improve application performance. For example, when the server is congested, the clients delay asynchronous writes and read-ahead, and send only reads and metadata requests to the server. This obviously requires the client to consume more memory, since it must hold dirty pages for a longer interval. To handle the pressure on other resources, the authors propose a mechanism that looks at the utilization of all resources, assigns a cost to each, and compares those costs to make such decisions.

The main issue is deciding the cost of an operation based on multiple resources.
They use a previously known result which says that increasing the price exponentially with the utilization of a resource leads to a competitive ratio of (log k), where k is the ratio of the benefit obtained by an offline versus an online algorithm. Their pricing function takes a per-resource constant K_i and the utilization u_i to compute the price. The price is computed for each resource based on its utilization, and the highest value is taken to indicate the current bottleneck. The price calculated by the server is sent to clients as part of the FSSTAT call. Prices are computed every second, and FSSTAT is sent along with every tenth read/write request. This is a non-blocking call and doesn't impact clients' performance.
Finally, they showed using both micro-benchmarks and real applications that this can lead to about 20% better performance.
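
For readers who want to see the shape of such a pricing rule, here is one exponential form that is consistent with the description above. The normalization (price 0 when idle, 1 when saturated) is my assumption; check the paper for the exact function the authors use.

```latex
% Per-resource price: exponential in utilization u_i, with constant K_i > 1.
% Normalized (by assumption) so that P_i(0) = 0 and P_i(1) = 1.
P_i(u_i) = \frac{K_i^{\,u_i} - 1}{K_i - 1}, \qquad 0 \le u_i \le 1

% The advertised price reflects the current bottleneck resource.
P = \max_i \; P_i(u_i)
```

As an illustration with a hypothetical K_i = 100: at 50% utilization the price is (10 - 1)/99, or about 0.09, while at 90% utilization it is about 0.63. The price stays low until a resource is nearly saturated and then climbs steeply, which is what lets clients keep sending asynchronous traffic under light load and defer it only when the server is really congested.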

 

Here is something to chew on: Can we extend this mechanism to provide prioritization among different clients, in terms of proportionate shares or weights?

http://www.usenix.org/events/fast09/tech/full_papers/batsakis/batsakis.pdf

Tags: storage, fast, nfs, congestion, netapp

I've been giving some thought about how VMware can play a part in supporting researchers and instructors during the current economic strain. Everybody's budgets are being cut and less and less funding is available from both government and industry. So, what can industrial partners to the academic community do to help? Cold hard cash is always good, but here are a few other ideas to think about:

- Free software licenses - VMware makes most of their software freely available to support research projects and classroom activities that depend on virtualization tools. Even the really pricey stuff is free if the project qualifies as a virtualization or related effort. It would be great if other industrial partners did the same.

- Discounted/used hardware - once in a while, a not-so-old server, SAN, or even network gear needs to come out of a data center somewhere. Industrial partners need to remember that many research and teaching labs would benefit from something that is only a year or so old. Make a match! Post something on govirtual.org's Grants and Funding Opportunities discussion group if you know of available equipment or know of someone in need of such equipment.

- Pro-bono support - many researchers and faculty are using software that was purchased via grant/gift money, but don't always have the high-end support packages necessary to help with set-up, configuration and on-going troubleshooting. Even a couple of hours of on-site or telephone support free of charge can be quite useful to a research team. Industrial partners should consider adding this type of work to their community service agenda in 2009.

These are just some of my ponderings. Please blog or add to the Grants and Funding Opportunities discussion group if you have other ideas.

Meanwhile, we will continue to explore how VMware's academic program (VMAP) will increase its support of research and education funding. Look for more details in early 2009.

Have a safe and happy holiday season!

Tags: projects, funding, research, grants, classroom

The approaches that DynamoRIO and Pin take to instrumentation are fundamentally different. Pin's interface is not so different from tools that modify the original code, such as ATOM, Dyninst, Vulcan, or Detours. The only instrumentation option in Pin is to insert a callout or trampoline to instrumentation code at a particular point in the application's code stream. The application code stream itself cannot be modified directly.

 

Pin does use its code cache to improve over non-code-cache tools that modify original application code in two respects: code discovery and transparency. The code cache design allows for incremental dynamic discovery of code, making it simple to examine all code that is executed, while non-code-cache tools must statically (or at load time) determine which code to instrument based on the application binary and library files. The code cache also eliminates safety issues of modifying variable-length IA-32/AMD64 code: direct modification by inserting a 5-byte jump instruction can overwrite an entry point midway through those 5 bytes, as well as suffer from races with multiple threads and complications if those 5 bytes are later examined or modified by the application (essentially, the displaced 5 bytes are a "miniature code cache" that incurs all the same complications as a regular code cache).

 

DynamoRIO leverages its code cache to provide a much broader interface. DynamoRIO allows modification of the runtime code stream of the application: modifying, inserting, or removing individual instructions. This is in stark contrast to Pin's observe-and-callout model. DynamoRIO's interface is only possible with a code cache, and it takes full advantage of the cache's power. Not only does this general interface let DynamoRIO support uses beyond pure observation, such as optimization or translation, but it also gives full control over instrumentation to the tool writer. While Pin does automatically inline and optimize simple callouts, the client has little control over, or guarantee of, the final performance, and a minor change in the instrumentation routine can have an order-of-magnitude impact on performance if it prevents inlining.

 

For a real-world example, consider the PiPA memory profiler and cache simulator:

 

   

Q. Zhao, I. Cutcutache, and W.F. Wong, "PiPA: Pipelined Profiling and Analysis on Multi-core Systems". Proceedings of The 2008 International Symposium on Code Generation and Optimization (CGO 08), pp. 185-194. Boston, MA, U.S.A. Apr 2008.

 

  Talk slides: http://www.comp.nus.edu.sg/~ioana/files/PiPA.pps

   

 

Implemented as a DynamoRIO tool, PiPA improved the performance of the Pin dcache tool by a factor of 3.27x. Implemented in Pin, the speedup was only 2.6x, due to the inability to fully optimize the instrumentation.

 

 

As requested, I have gathered updated numbers for the bbcount comparison using the new Pin release 22117, which has improved inlining support. Several benchmarks such as perlbmk, crafty, and eon improve noticeably. The harmonic means are now 226% (a 5-percentage-point improvement from 231%), 233%, and 185%, respectively.

 

Tags: performance, dynamorio, tools

As VDI turns from hype into a de facto standard, there are some questions and open issues I keep thinking about.

Today's "server-based" computing technologies allow you to virtualize applications, operating systems, or whole servers. So far so good. But what about the issues we deal with every day? Might some of them still persist even in 10 years?

- User profile management

- Printing

- Network throughput and latency issues

- Keeping data and applications close together

So if we take the vision of today's competitors (and there are a bunch of them), in a few years everyone will be able to start their desktop from anywhere at any time, using any kind of application on any kind of data.

Nice and easy... or not? I always keep the few issues above in mind, and guess what? They still persist even in a supercloud. The data has to move to the application or vice versa, and the VDI has to move close to the user if you want it to feel like working locally, especially if you are a user in Asia suffering round-trip times of 200 to 500 ms.

In my opinion, today's VDI technologies aren't ready to solve these issues. But I'm very interested in where we are going tomorrow.

Tags: vdi

I've been using VMware in a typical classroom setting for two years now. I first started using it to overcome the problem of teaching a Linux class without having access to a server. Because of an HP grant that I was a part of, our college was awarded 22 Tablet PCs. The Tablets were first used in a Digital Electronics class, a Telecommunications class, and a Calculus 1 class. What a great opportunity to now use the Tablets with VMware.

At first the class was a bit hesitant, but when I told them they could take the Tablets home for the entire semester, they said, "Sign me up."

We installed VMware Server along with a condensed virtual appliance of Fedora. What a great learning experience. A few minor issues arose, such as installing IIS, but basically it went smoothly.

Now I have taken the procedure to my online class. So far it is working. Along with using Linux Labsim, I am having students from around the country install VMware, the virtual appliance of Fedora, and a 4-disk version of Red Hat.

So far, about eight students have tried the task. I must say, there are more issues when you can't see the other person's computer. But so far I have been able to guide them through the process.

Tags: performance, vmware, lecture, open_source, linux, software, distance_learning, hp

The encoding tweak that improved the bbcount tool's performance in my Part 2 post also improves the base DynamoRIO performance, to the point where the results on Core2 look similar to those on Pentium 4.  We now have the following base DynamoRIO performance:

 

 

 

DynamoRIO's average execution time for the SPECCPU2000 integer benchmarks is about 121% on both platforms.  We'll get this improvement out in a new DynamoRIO release later this week.

 

Tags: performance, dynamorio, tools

To continue the comparison between DynamoRIO and Pin, below I show the memory usage of the two on the same benchmark suite, with no tool:

 

 

 

The average additional working set (resident memory) beyond the native working set of the application is 14.6MB for Pin and 2.6MB for DynamoRIO.

 

Both Pin and DynamoRIO in these experiments have unlimited code caches, so that is not a factor.  A significant portion of Pin's usage probably comes from its per-indirect-branch hashtables and function cloning.

 

DynamoRIO by default reserves 128MB of address space up front.  This is to ensure it does not run out of memory when targeting applications like MS SQL Server or Exchange that like to gobble up all available address space.  However, on Linux, DynamoRIO's reserved address space also commits swap space (one area where Linux's features are not as rich as those of Windows, even with MAP_NORESERVE).  For measuring virtual memory usage, I disabled this reservation (which future versions of DynamoRIO may do by default, at least for 32-bit where reachability is not a concern) with the runtime option "-no_vm_reserve".  (Note that this undocumented option should not be used with the 9601 build on multithreaded Linux applications.)
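
For readers unfamiliar with the idea of reserving address space up front, here is a minimal Linux sketch using the standard mmap and mprotect calls. It is not DynamoRIO's code: the function names are made up for illustration, and whether such a reservation is charged against swap depends on the kernel's overcommit policy (the paragraph above reports that it was charged in practice).

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define RESERVE_SIZE ((size_t)128 * 1024 * 1024)  /* 128MB, as in the text */

/* Claim a contiguous range of addresses without making it accessible.
 * Returns NULL on failure. */
void *reserve_region(size_t size)
{
    assert(size > 0);
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Make a page-aligned piece of the reservation readable and writable.
 * Returns 0 on success, -1 on failure (as mprotect does). */
int commit_range(void *addr, size_t size)
{
    return mprotect(addr, size, PROT_READ | PROT_WRITE);
}

/* Give the whole reservation back to the OS. */
void release_region(void *base, size_t size)
{
    munmap(base, size);
}
```

A caller would reserve RESERVE_SIZE once at startup and then call commit_range on pieces of it as memory is needed. The benefit is that the range is guaranteed to still be free later, even if the application subsequently maps everything else it can.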

 

It does not look like Pin is also reserving up front, given that it uses little memory on tiny test programs.

 

DynamoRIO's primary data structures have been compressed to save as much space as possible without sacrificing performance.  DynamoRIO's -opt_memory runtime option further reduces data structure usage significantly by eliminating all per-basic-block data structures.  However, we have not yet integrated -opt_memory with a performance optimization that also happens to save memory.  That means that in this 9601 release, on applications with little code (which includes most of SPECCPU2000), -opt_memory's savings are minimal, since the option disables that performance optimization.

 

Tags: performance, dynamorio, tools

In Part 1, I compiled the Pin bbcount tool with -fPIC, since that is what the Pin Makefiles used when I built a sample tool that came with the distribution.  -fPIC is also shown in the sample build line in the Pin documentation.  As it turns out, -fPIC for 32-bit Pin tools thwarts inlining of bbcount's instrumentation routine.  (Re-building that sample I just mentioned with "TARGET=ia32" does end up omitting the -fPIC.  I would recommend that the Pin folks add this detail to their documentation as it makes a huge difference in performance.  This also leads to questions about inlining for 64-bit Pin tools, where -fPIC is apparently required.)

 

Re-running the experiments with the new non-PIC bbcount Pin tool we see much better performance.  I also re-ran the DynamoRIO bbcount experiments, this time with an optimization that will be in the next build of DynamoRIO that tweaks the register spilling to use an encoding that is more performant on the Core and Core 2 processors.  Here are the results:

 

 

 

I also ran the same benchmark suite on the Netburst microarchitecture: an Intel Xeon MP CPU 3.00GHz on a machine running RHEL4 with 8GB RAM.  On this machine I used the pin-2.5-20751-gcc.3.4.6-ia32_intel64-linux release of Pin since the machine's gcc version is 3.4.3.  Once again, gcc was used to build all of the benchmarks and tools.

 

Here are the results with no tool (just measuring the base infrastructure):

 

 

 

DynamoRIO's harmonic mean is 121% while Pin's is 160%.
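
The post does not spell out how these percentages are computed. Assuming each one is the harmonic mean of per-benchmark normalized execution times (time under the tool divided by native time), the calculation can be sketched as follows. The sample ratios are hypothetical and are not measurements from these experiments.

```c
#include <assert.h>
#include <stddef.h>

/* Harmonic mean of n positive values: n / (1/x_1 + ... + 1/x_n). */
double harmonic_mean(const double *values, size_t n)
{
    assert(n > 0);
    double reciprocal_sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(values[i] > 0.0);
        reciprocal_sum += 1.0 / values[i];
    }
    return (double)n / reciprocal_sum;
}

/* Hypothetical normalized times (instrumented / native), not real data. */
const double sample_ratios[] = { 1.0, 2.0, 4.0 };
```

For the three sample ratios the harmonic mean is 3 / 1.75, or roughly 1.71 (171%), whereas the arithmetic mean would be about 2.33. The harmonic mean is pulled less by a single badly behaved benchmark, which is worth keeping in mind when comparing it with per-benchmark numbers.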

 

Paralleling the Core 2 results, here are the results with the same basic block execution count tools (with no -fPIC on the Pin tool: using -fPIC resulted in a ninefold slowdown on average):

 

 

 

The harmonic means are 213% for Pin with inlining, 183% for DynamoRIO always saving the flags, and 156% for DynamoRIO saving only when necessary.

 

Tags: performance, dynamorio, tools

I just got back from the second annual VMAP workshop hosted at VMware's annual user conference, VMworld in Las Vegas. The user conference boasted over 14,000 attendees - which included our members, student posters and university representatives using virtualization to support complex and often diverse campus infrastructures. I think the VMAP workshop attendees would agree that the conference was "eye opening" to say the least! The countless number of vendors developing software solutions to leverage virtualization was staggering. Be on the lookout for VMAP member blogs talking about their experiences from the conference. We'll also send a link out to the poster abstracts and presentation materials in the next couple of weeks.

Meanwhile, a few thoughts of mine about this year's event...

The student poster presenters got first-hand experience explaining their scientific approach to complex problems to industry participants at VMworld's opening reception on Monday night. I enjoyed listening to our rising stars in academia trying to distill complicated problems into language that non-scientific, yet technical, observers can understand. Several industry attendees at the conference told me that they were pleased to see students able to present their ideas in a way that allowed them to appreciate how hard some of these problems are to solve.

At this year's full-day workshop, we had faculty from Georgia Tech, Columbia University, Northeastern, PolyTechnic/NYU, Cambridge University, George Mason, and University of Toronto. Carnegie Mellon, MIT and Olin College of Engineering were also represented by our student posters. Industry researchers from VMware and AMD were also present.

After lunch, which included a closer look at the student posters, we had three terrific speakers. We invited each speaker based on their varied use of virtualization at their respective schools. Ada Gavrilovska of Georgia Institute of Technology discussed her work on managed virtualized platforms. This work began using Xen and is now using VMware's ESX source code as a proof of concept to validate performance statistics of power management techniques. Discussion after her talk highlighted the question of cloud vs. grid and how large clusters should be leveraged for research and computation.

Jeffrey Dwoskin of Princeton University talked with the group about his use of virtualization to test his security architecture research. As Jeff explained, virtualization software removes the need to reproduce hardware environments to test the architecture he is developing.

Jason Nieh of Columbia University came from a completely different perspective: using and teaching virtualization in computer science. Jason painted an interesting picture of how Columbia has increased enrollment in the operating systems course over the past nine years and, as a result, has produced several interesting research projects and papers from students much earlier than is typical in their academic careers. He credits virtualization as a means to get more hands-on experience with, and appreciation for, systems. Students are not bogged down by crashing machines when doing kernel hacking, and they don't have to worry about the complexities of different operating systems. Jason also talked about the success of Columbia's first virtualization course, offered this past spring. A complete set of the materials from this course can be found on govirtual.org under courseware.

Overall, I'd have to say that I am quite pleased with how our second annual event turned out and I'll share some additional thoughts on discussions from the workshop in my next blog. Also, I want to extend a special thanks to Melissa Wood, VMAP Program Manager extraordinaire, for her tireless work pulling together the poster session and workshop. For those who didn't know, she also helped me coordinate the attendance and schedules of over 200 engineers from VMware for the VMworld conference. Hats off to you Mel!

Tags: courseware, vmware, source_code, conference

Since DynamoRIO has just been re-released (DynamoRIO Dynamic Instrumentation Tool Platform, DynamoRIO Dynamic Instrumentation Tool Platform for Linux) let's re-evaluate where it stands performance-wise versus another popular dynamic instrumentation system, Pin.

 

Pin's 2005 PLDI paper (Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation) gives performance numbers for DynamoRIO version 0.9.3 and Valgrind version 2.2.0 versus the latest Pin at the time.  The paper shows base performance as well as performance with a simple basic block execution count tool.  Let's model our experiments on those measurements.

 

I'm using DynamoRIO version 0.9.6 build 9601 and Pin pin-2.5-20751-gcc.4.0.0-ia32_intel64-linux.  My compiler is:

gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC)

 

I ran the SPECCPU2000 integer benchmarks, just like the paper (also because I have yet to get a copy of SPECCPU2006 set up...).  I used an Intel Core 2 Q9300 quad-core processor at 2.50GHz on a machine with 4GB RAM running Fedora 9.  The benchmarks and tools were all 32-bit.

 

Here are the results with no tool (just measuring the base infrastructure):

 

 

 

DynamoRIO is consistently faster than Pin, with an average slowdown of 34% versus Pin's 71%.

 

And here are the results with the basic block execution count tools (source code displayed below):

 

 

 

For DynamoRIO, two different versions of the tool are shown.  The first always saves the arithmetic flags, as was done in the Pin paper.  The second performs a simple analysis and only saves the flags when necessary.  Both outperform the Pin tool.

 

Pin Client

 

I followed the advice at http://rogue.colorado.edu/Pin/docs/20751/Pin/html/index.html#PERFORMANCE and used IPOINT_ANYWHERE and PIN_FAST_ANALYSIS_CALL.  Note that both of these clients ignore racy increments to the counter from multiple threads.

 

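#include "pin.H"
#include <iostream>

// Global count of executed basic blocks (increments are racy across threads).
int bbcount;

// Analysis routine: runs every time an instrumented basic block executes.
VOID PIN_FAST_ANALYSIS_CALL docount() { bbcount++; }

// Instrumentation routine: add a call to docount() to each basic block
// of a newly built trace.  IPOINT_ANYWHERE lets Pin pick the cheapest spot.
VOID Trace(TRACE trace, VOID *v) {
    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) {
        BBL_InsertCall(bbl, IPOINT_ANYWHERE, AFUNPTR(docount),
                       IARG_FAST_ANALYSIS_CALL, IARG_END);
    }
}

// Called once when the application exits.
VOID Fini(int, VOID * v) {
#ifdef SHOW_RESULTS
    std::cout << "Count is " << bbcount << std::endl;
#endif
}

int main(int argc, CHAR *argv[]) {
    PIN_InitSymbols();
    PIN_Init(argc, argv);
    TRACE_AddInstrumentFunction(Trace, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}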

 

DynamoRIO Client

 

Note that both of these clients ignore racy increments to the counter from multiple threads.  For DynamoRIO, adding a LOCK prefix to the inc instruction is easy to do; however, it has a significant performance impact (a three times slowdown in a quick test).  Using thread-private caches and aggregating the count at the end would be more performant.

 

#include "dr_api.h"

#define TESTALL(mask, var) (((mask) & (var)) == (mask))
#define TESTANY(mask, var) (((mask) & (var)) != 0)

static int global_count;

static dr_emit_flags_t
event_basic_block(void *drcontext, void *tag, instrlist_t *bb,
                  bool for_trace, bool translating)
{
    instr_t *instr, *first = instrlist_first(bb);
    uint flags;
    /* Our inc can go anywhere, so find a spot where flags are dead. */
    for (instr = first; instr != NULL; instr = instr_get_next(instr)) {
        flags = instr_get_arith_flags(instr);
        /* OP_inc doesn't write CF but not worth distinguishing */
        if (TESTALL(EFLAGS_WRITE_6, flags) && !TESTANY(EFLAGS_READ_6, flags))
            break;
    }
    if (instr == NULL)
        dr_save_arith_flags(drcontext, bb, first, SPILL_SLOT_1);
    instrlist_meta_preinsert
        (bb, (instr == NULL) ? first : instr,
         INSTR_CREATE_inc(drcontext, OPND_CREATE_ABSMEM
                          ((byte *)&global_count, OPSZ_4)));
    if (instr == NULL)
        dr_restore_arith_flags(drcontext, bb, first, SPILL_SLOT_1);
    return DR_EMIT_DEFAULT;
}

static void event_exit(void)
{
#ifdef SHOW_RESULTS
    dr_printf("Count is %d\n", global_count);
#endif
}

DR_EXPORT void dr_init(client_id_t id)
{
    dr_register_exit_event(event_exit);
    dr_register_bb_event(event_basic_block);
}
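
The two ways of handling the racy counter mentioned before the DynamoRIO listing can be sketched in plain C11, independent of either tool's API. The names below are illustrative, the 64-byte cache line is an assumption, and how a real client would obtain its thread index is left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_THREADS 64
#define CACHE_LINE  64  /* assumed cache-line size in bytes */

/* Option 1: a single shared counter with an atomic increment.  On x86,
 * compilers typically emit a LOCK-prefixed instruction for this. */
atomic_long shared_count;

void count_shared(void)
{
    atomic_fetch_add_explicit(&shared_count, 1, memory_order_relaxed);
}

/* Option 2: one slot per thread, padded so that slots do not share a
 * cache line.  The hot path is a plain, unsynchronized increment. */
struct counter_slot {
    long count;
    char pad[CACHE_LINE - sizeof(long)];
};
struct counter_slot slots[MAX_THREADS];

void count_private(int thread_index)
{
    assert(thread_index >= 0 && thread_index < MAX_THREADS);
    slots[thread_index].count++;
}

/* Sum the slots once, at exit, after all threads have stopped counting. */
long total_private(void)
{
    long total = 0;
    for (int i = 0; i < MAX_THREADS; i++)
        total += slots[i].count;
    return total;
}
```

Option 1 is correct but pays for synchronization on every basic block. Option 2 keeps the per-block cost at an ordinary increment and moves all the coordination into a single aggregation pass at the end.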

Tags: performance, dynamorio, tools

Security and virtualization often do not come hand-in-hand; merely running a virtualized environment does not automatically provide a guarantee of increased security over dedicated hardware.

 

As we have mentioned before, even the basic isolation properties of a VM framework are questionable. Relying on a piece of software to enforce isolation on the x86 platform is risky; it is entirely unclear whether a VMM is going to get that sort of job (a job that OS designers have been grappling with for years) right, especially in the absence of hardware primitives that would make things a lot easier (this argument is the essence of our recent VMSec paper).

 

This KernelTrap thread captures a vigorous discussion of this exact point.

 

Tags: x86, isolation, openbsd

VM frameworks (VM guests, hosts, hypervisors, VMMs, etc.) help support system security in a variety of ways. Projects involving aspects of virtual machines and security range from those that show how a VM or VM framework can provide or enhance security functionality intrinsically to those that use VMs as containers to form part of a larger security system.

 

The former type of project looks at what functionality can be added to the VM framework's code to implement things like access control, trusted computing (TPM support), isolation, malware reverse engineering, virus scanning, network content filters, information flow analysis, and host or network anomaly detection.

 

The latter type of project employs VMs to provide a convenient disposable container in which to examine the execution of a guest application or OS. Some examples of this use include opening potentially infected emails or web pages, testing out patches or other software fixes, and recording application state for replay.

 

Tags: applications_of

From a certain point of view, the amalgam of virtualization technology and information security techniques represents a rather strange blend. What does emulation or multiplexing of physical devices have to do with the enforcement of a wide variety of security properties?

The easiest answer (and the most traditional one) is that virtualization provides an effective means of isolating execution environments; virtualization seems like a natural way to provide isolation between execution containers. As we've seen previously, however, customizing the communication between such containers --- in many situations, they do need to communicate: complete isolation is the exception rather than the rule --- presents a challenge. Thus, even the "obvious" security application of virtualization is fraught with difficulty.

As organizations increase their adoption of virtualization environments, and with the current industry focus on information security, it is natural to wonder just how a virtualization framework might pull double duty by improving a security posture as well as easing management burden and infrastructure costs.

Besides isolation, virtualization frameworks seem to provide a natural place to implement a reference monitor: a formal, well-defined security construct. A reference monitor provides a low-complexity, trusted (and trustworthy) environment from which to observe the execution of another system and measure a certain set of security-related properties.

Unfortunately, it appears that little thought has been given to what the best way is to combine the twin roles of resource provider and reference monitor within a single virtualization framework. As a result, virtualization environments can find themselves attempting to measure security-relevant properties of a system in ways that are both creative and convoluted. In essence, the set of events that are interesting from a security viewpoint (and this depends on what type of "security" you're interested in measuring...from integrity of control flow or data items to information flow to authorization and access control) are not necessarily the set of events that the virtualization framework was built to intercept and observe with a minimal performance impact.

Karger and Safford's article in the upcoming issue of IEEE S&P magazine details the I/O complexities of most of the popular approaches to providing virtualization. I and my colleagues Bratus, Ramaswamy, and Smith have a paper at the upcoming VMSec workshop (held with ACM CCS 2008) identifying the problem of designing an efficient event trapping system of use for both security policy enforcement and virtualization.

While the suggestion that the design of current virtualization solutions is actually a hindrance to providing security solutions may not sit well with folks interested in touting a particular virtualization solution's security capabilities, I would argue that we have a unique opportunity to make sure that VM platforms are designed to do the things we're asking them to do. Now is also a good time to note that the stunning complexity of VMM I/O subsystems, the performance hacks therein, and the backdoor management interface all suggest that even the basic isolation story rests on somewhat shaky ground.

We find ourselves at a unique point in time: we can try to identify the right design for doing these two disparate tasks at once, or we can muddle through by abusing a framework meant for resource multiplexing rather than program supervision. In either case, we still must balance the tradeoff between the virtualization framework's I/O architecture and subsystems and the trustworthiness of the reference monitor. Ironically, as we depend on VM frameworks to implement more security functionality, these systems become less trustworthy even as they become more trusted.

Tags: i/o, event_trapping, security_properties, reference_monitor, resource_provider