I didn’t know this until today, but according to KB 1687, both soft and hard page faults cause a context switch into the virtualization layer, which in turn adds overhead. We were clued in to this by a VMware engineer looking over a case of horrid performance that we’ve been encountering.
From the KB article: Hard page faults involve disk I/O and impact performance. Soft page faults also impact performance, but may not result in heavy performance loss in a physical environment. VMware software does not cause the guest operating system running in the virtual machine to see additional page faults, but VMware software must virtualize the page faults that originate from within the virtual machine. Both soft and hard page faults in a virtual machine cause a context switch into the virtualization layer and some additional processing to virtualize memory management data structures. As on native hardware, hard page faults in a virtual machine also require disk I/O to the page file. For best performance, avoid page faults whenever possible. You can investigate if your Windows application is generating page faults by using the Performance Monitor console (perfmon), which shows you the cumulative number of page faults on the system. Generally, if the rate of paging is slow, then the application is generating hard page faults. You can investigate the paging rate by monitoring the “page faults per second” counter.
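If you want to watch those counters outside the perfmon GUI, here is a minimal sketch of my own (not from the KB), assuming Python with the pywin32 package on the Windows guest. It samples the same data perfmon shows: \Memory\Page Faults/sec counts all faults, soft and hard, while \Memory\Pages/sec only counts the hard faults that actually hit the disk.

    import time
    import win32pdh  # from the pywin32 package

    query = win32pdh.OpenQuery()
    all_faults = win32pdh.AddCounter(query, r"\Memory\Page Faults/sec")  # soft + hard
    hard_faults = win32pdh.AddCounter(query, r"\Memory\Pages/sec")       # hard (disk) only

    win32pdh.CollectQueryData(query)  # rate counters need a baseline sample first
    for _ in range(5):
        time.sleep(1)
        win32pdh.CollectQueryData(query)
        _, total = win32pdh.GetFormattedCounterValue(all_faults, win32pdh.PDH_FMT_DOUBLE)
        _, hard = win32pdh.GetFormattedCounterValue(hard_faults, win32pdh.PDH_FMT_DOUBLE)
        print(f"page faults/sec: {total:12,.0f}   hard (disk) pages/sec: {hard:10,.0f}")

A high Page Faults/sec with a low Pages/sec means the load is mostly soft faults, which on physical hardware would be cheap but, per the KB, still trap into the virtualization layer.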
So what do I mean by the virtualization layer, and why is it a bad thing? VMware ESX natively schedules the user mode code of a virtual machine directly onto the processor in Ring 3. A context switch, at least in our limited case, is when the guest OS of the virtual machine needs to go from running user mode code to running kernel mode code. The VM thinks it is executing that kernel code directly in Ring 0; however, VMware ESX 3.5 does not, by default, use the virtualized Ring 0 provided by AMD’s and Intel’s hardware virtualization. Instead, it must thunk execution out of user mode on the processor and into the vmkernel, and virtualize the Ring 0 execution of that code.
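To get a feel for what those user-to-kernel transitions cost, here is a rough microbenchmark sketch of my own (plain Python, nothing VMware-specific): each os.stat() call drops into kernel mode, so under binary translation every iteration also traps into the vmkernel. Run it natively and then inside the guest and compare the rates.

    import os
    import time

    # Each os.stat() call makes a system call, i.e. a user-mode-to-kernel-mode
    # transition. Under binary translation each of these also traps into the
    # vmkernel, so the per-call cost should be noticeably higher inside the VM.
    N = 200_000
    start = time.perf_counter()
    for _ in range(N):
        os.stat(".")
    elapsed = time.perf_counter() - start
    print(f"{N} syscalls in {elapsed:.3f} s -> {N / elapsed:,.0f} transitions/sec")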
So why is this bad? Because you are no longer running natively on the processor; you are running virtualized within the vmkernel, which carries more overhead and a higher performance penalty. The occasional trap is harmless, but when it happens tens of thousands of times per second, it can really slow things to a crawl.
[Screenshot: counter rates on a normal VM] Not too bad.
[Screenshot: counter rates on our problem VM] Umm… WOW!
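The screenshots show the system-wide picture. To pin down which application is responsible, you can also scan the per-process counters. Another hedged pywin32 sketch of my own (the counter and object names are standard perfmon ones; everything else here is illustration):

    import time
    import pywintypes
    import win32pdh

    query = win32pdh.OpenQuery()
    # Enumerate the current process instances and attach a "Page Faults/sec"
    # counter to each (duplicate process names collapse to one instance here,
    # which is fine for a rough sketch).
    _, instances = win32pdh.EnumObjectItems(None, None, "Process",
                                            win32pdh.PERF_DETAIL_WIZARD)
    counters = {}
    for name in set(instances):
        if name == "_Total":
            continue
        path = win32pdh.MakeCounterPath(
            (None, "Process", name, None, -1, "Page Faults/sec"))
        try:
            counters[name] = win32pdh.AddCounter(query, path)
        except pywintypes.error:
            pass  # the process may have exited since enumeration

    win32pdh.CollectQueryData(query)  # rate counters need two samples
    time.sleep(1)
    win32pdh.CollectQueryData(query)

    rates = {}
    for name, handle in counters.items():
        try:
            _, rates[name] = win32pdh.GetFormattedCounterValue(
                handle, win32pdh.PDH_FMT_DOUBLE)
        except pywintypes.error:
            pass  # counter vanished between samples
    for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:10]:
        print(f"{name:<30} {rate:12,.0f} faults/sec")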
1. VMware does not use VT by default (except for 64-bit guests), since in most cases it gives slightly worse performance on VMware: http://www.vmware.com/pdf/asplos235_adams.pdf
2. With newer AMD CPUs (Barcelona & Shanghai) you also get a virtualized MMU, and this is the technology that helps solve most of the context-switching problem.
3. According to this recent blog post, RVI needs to be forced in order to be fully utilized (see the .vmx sketch after this comment): http://www.yellow-bricks.com/2009/03/06/virtual…
4. Even though RVI isn't fully enabled by default, it still seems to help context-switching-heavy loads. See the screenshot at the bottom of my article here: http://vmfaq.com/entry/42/
Lars
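For anyone chasing Lars's point 3: the setting Duncan's post covers is a per-VM monitor mode override in the .vmx file. A sketch from memory, so treat the exact option names as assumptions and verify them against your ESX build:

    # Force hardware-assisted paging (AMD RVI) for this VM instead of the
    # "automatic" default. (Assumed option names; confirm for your ESX version.)
    monitor.virtual_mmu = "hardware"
    # Optionally force hardware-assisted CPU virtualization (AMD-V / Intel VT) too:
    monitor.virtual_exec = "hardware"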