Virtual GPUs, Monolithic Hypervisors, Slim Hypervisors, Oh my!

Because I have a lot of experience with Hyper-V, am now quickly gaining additional expertise on ESX/ESXi, and it is such a misunderstood subject, I find myself frequently explaining the basic principle of the virtual GPU (also known as the VDI holy grail). Since it comes up so often, I thought it would be useful to capture the basic spiel here so I can point people to it rather than repeat the same thing ad nauseam.

GPU support within a guest OS requires either virtualizing the GPU functions within the host OS or allowing the guest OS to own the physical device. In client virtualization ("hosted virtualization" in VMware terminology), this is reasonably straightforward, though not particularly performant. The GPU is virtualized by presenting a para-virtualized device: much as is done with advanced HBAs, a virtual GPU driver is installed in the guest OS, and the virtual machine monitor maps its driver calls to physical calls within the host OS API/driver stack.
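To make the call-mapping idea concrete, here is a toy sketch (all class and method names are hypothetical, not any vendor's API) of a para-virtualized GPU: the guest's driver never touches hardware, it hands every call to the VMM, which translates it into the host's driver stack.

```python
# Toy sketch of para-virtualized GPU call mapping. Names are
# illustrative only -- no real hypervisor or driver API is modeled.

class HostGpuApi:
    """Stands in for the host OS graphics API/driver stack."""
    def draw(self, primitive):
        return f"host rendered {primitive}"

class VirtualMachineMonitor:
    """Maps guest para-virtual calls onto physical host calls."""
    def __init__(self, host_api):
        self.host_api = host_api

    def dispatch(self, call, *args):
        # Translate the guest's virtual-device call into the
        # equivalent call in the host OS API/driver stack.
        return getattr(self.host_api, call)(*args)

class GuestVgpuDriver:
    """Installed inside the guest OS; looks like a GPU to the guest."""
    def __init__(self, vmm):
        self.vmm = vmm

    def draw(self, primitive):
        return self.vmm.dispatch("draw", primitive)

vgpu = GuestVgpuDriver(VirtualMachineMonitor(HostGpuApi()))
print(vgpu.draw("triangle"))  # → host rendered triangle
```

The indirection through the VMM is exactly why hosted virtualization is straightforward but not fast: every guest draw call pays a translation hop before it reaches real hardware.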

In hypervisor scenarios, there may or may not be a host OS to map to. In Xen, Windows Hyper-V, and other "slim hypervisor" designs, a "parent OS" partition provides hardware support for the guest OSes running on the hypervisor. In this case, an approach similar to client virtualization is feasible. Microsoft acquired Calista Technologies for this purpose, and Windows Server 2008 R2 SP1 Hyper-V allows GPU virtualization via RemoteFX.

VMware ESX/ESXi use a monolithic hypervisor (classic ESX also carries a Linux-based service console). Device drivers are compiled into the hypervisor itself, which is why VMware is so picky about hardware. There is no native support for GPU virtualization within ESX/ESXi. Perhaps, particularly with VDI, this is something VMware will be forced to tackle to stay competitive, but for now ESXi instead relies on DirectPath I/O.

DirectPath I/O is simply the direct mapping of a PCI/PCIe device to a guest OS through the hypervisor. It is enabled by AMD-Vi and Intel VT-d; both of these hardware virtualization technologies provide IOMMU remapping, which allows a guest OS to take direct control of a PCIe device.
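As a rough first check on a Linux box (a sketch, not a support matrix): the CPU side of the story shows up as the "vmx" (Intel VT-x) or "svm" (AMD-V) flag in /proc/cpuinfo. Note that the IOMMU half (Intel VT-d / AMD-Vi) lives in the chipset and firmware, so it is reported in the kernel log ("DMAR" / "AMD-Vi" lines), not in the CPU flags.

```python
# Parse /proc/cpuinfo-style text for CPU virtualization extensions.
# This only covers the CPU side; VT-d/AMD-Vi (the IOMMU) must be
# confirmed separately via firmware settings or the kernel log.

def cpu_virt_extensions(cpuinfo_text):
    """Return which CPU virtualization extension the flags advertise."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None  # no virtualization extensions advertised

sample = "model name : ExampleCPU\nflags : fpu vme vmx sse2\n"
print(cpu_virt_extensions(sample))  # → Intel VT-x

# On a real machine:
#   print(cpu_virt_extensions(open("/proc/cpuinfo").read()))
```

Even when both flags and IOMMU are present, the hypervisor still has to list the device as pass-through capable, so treat this as a necessary condition, not a sufficient one.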

The catch is that *both* the CPU and chipset *must* support it, as well as the hypervisor, and the relationship is 1:1. Major architectural changes to the overall PC I/O spec (in progress, though) are needed before devices can be shared. For now, once a guest takes control of a PCIe device, that device is offline to the host: the guest owns it. The host therefore needs some alternate video device, and any other guest requiring pass-through would need its own dedicated GPU. Virtual display adapters would, of course, continue to function fine.
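The 1:1 ownership rule can be modeled in a few lines (purely illustrative names, no real hypervisor API): once a guest claims a pass-through device, every other claimant is refused until it is released.

```python
# Tiny model of the 1:1 pass-through constraint: a device has at most
# one owner, and a second guest cannot claim it. Illustrative only.

class PassthroughPool:
    def __init__(self, devices):
        # None means the device is still owned by the host.
        self.owner = {dev: None for dev in devices}

    def claim(self, dev, guest):
        if self.owner[dev] is not None:
            raise RuntimeError(f"{dev} already owned by {self.owner[dev]}")
        self.owner[dev] = guest

pool = PassthroughPool(["GPU0"])
pool.claim("GPU0", "vm-a")      # vm-a now owns the device outright
try:
    pool.claim("GPU0", "vm-b")  # a second guest needs its own GPU
except RuntimeError as e:
    print(e)  # → GPU0 already owned by vm-a
```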

I worked at MSFT for 10 years and now work for VMware. VMware is in many ways far ahead of Hyper-V, and in many ways the monolithic hypervisor model is superior, but flexibility in hardware support is one area where the parent/child approach of Xen/Hyper-V has always had an advantage. It has rarely mattered much, but GPU support is one area where the difference is stark.

Net net… If you need a virtual GPU and don’t want to do hosted virtualization, you can try ESXi 4.1 with DirectPath I/O pass-through of the GPU, provided your hardware has the capability and is supported. You can also give Hyper-V 2008 R2 SP1 a go, as it is also a free product. Obviously, as per my entries, I will be trying the former. I know the latter works, as I have already seen it in action 😉

