Passing a GPU through to a Virtualized Machine

We live in a virtualized world…and sometimes you want better GPU performance in your virtual machine (VM). Although it seems like this would be a cakewalk, there are a few steps that must be taken before it is possible. The goal of this guide is to put together a stable VM with a GPU passed through. To get started, you will need a PC with IOMMU support. I will focus on Intel hardware with Nvidia GPUs, because that is what I have available to me. Intel’s implementation of IOMMU is called VT-d (Virtualization Technology for Directed I/O). Note that VT-d is a separate feature from VT-x, and VT-x alone is not enough for this purpose.

This guide uses the latest stable release of Debian (9, Stretch) and the associated packages from the official repositories. I will be using the simplest setup, which consists of the hypervisor called Kernel-based Virtual Machine (KVM) along with QEMU (Quick Emulator). I am using the default Linux kernel for this release: version 4.9.

For simplicity, specific package names and commands will be bolded. Text to be entered into configuration files will be blue.

First I install virt-manager, which brings with it around 120 associated dependencies and suggestions, including qemu-kvm. You will also need to install ovmf, which provides UEFI support for the guest. Contrary to some other guides, your host does NOT need to be booted via UEFI (legacy BIOS is fine).

The first thing we need to do is enable VT-d and make sure it works. We need to edit /etc/default/grub. Look for the line that starts with GRUB_CMDLINE_LINUX_DEFAULT= and append intel_iommu=on inside the quotes at the end of that line. Also add rd.driver.pre=vfio-pci, which is meant to ensure that vfio-pci is loaded before any other kernel drivers. In other words, we want vfio-pci to snag the card before the regular kernel driver does (this used to be called stubbing, after the older pci-stub driver).

vi /etc/default/grub

Original: GRUB_CMDLINE_LINUX_DEFAULT="quiet"
New: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on rd.driver.pre=vfio-pci"
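
If you would rather not edit the file by hand, the same change can be scripted. What follows is only a minimal sketch under a few assumptions: the helper name add_grub_param is made up for this example, and it expects the GRUB_CMDLINE_LINUX_DEFAULT line to use straight double quotes on a single line. Try it on a copy of the file before pointing it at the real one.

```shell
# add_grub_param FILE PARAM
# Hypothetical helper: appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# unless PARAM is already listed there.
add_grub_param() {
    file=$1
    param=$2
    # Current value between the double quotes, e.g. "quiet".
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    # Rewrite the line with PARAM added just before the closing quote.
    tmp=$(mktemp)
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file" > "$tmp"
    cat "$tmp" > "$file"            # keep the original file's permissions
    rm -f "$tmp"
}

# Example, run as root on a copy first:
#   cp /etc/default/grub /tmp/grub.test
#   add_grub_param /tmp/grub.test intel_iommu=on
#   add_grub_param /tmp/grub.test rd.driver.pre=vfio-pci
```

Because the helper checks for the parameter first, running it twice does not duplicate the entry.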

This command updates grub configuration:

update-grub
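
The new parameters only take effect on the next boot. Once you have rebooted, a quick check like the one below confirms that the running kernel actually received them. This is a small sketch of my own rather than a required step; the function name has_kernel_param is invented for illustration, and it assumes the kernel command line is readable at /proc/cmdline.

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Hypothetical helper: succeeds if PARAM appears as a whole word on the
# kernel command line (read from /proc/cmdline unless another file is given).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then look for an exact match.
    tr -s '[:space:]' '\n' < "$cmdline_file" | grep -Fxq -- "$param"
}

# Example:
#   has_kernel_param intel_iommu=on && echo "intel_iommu=on is active"
```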

vfio-pci supersedes pci-stub (ignore any mentions of pci-stub from other guides). By loading vfio-pci first, you should be able to grab any PCI devices before they are claimed by other kernel drivers, such as the AMD/NVIDIA drivers. vfio-pci also puts your snagged hardware into a low power state when it is not being used, something that pci-stub never did.

You can verify that IOMMU is working by running the following command:

find /sys/kernel/iommu_groups/ -type l

If there is a list of devices, then it is working. If there is no output, then IOMMU needs troubleshooting. The list also shows which IOMMU group each device belongs to. Devices can only be passed through a whole group at a time, so you want the GPU you are passing through to have its own group. Often, the GPU will also have an audio device (HDMI audio) that is passed through with it.

Next, we need to find the PCI address and the vendor and device IDs of the card.

lspci -vnn (or lspci -nnk)

You should see output something like this along with many other devices on your system:

03:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1b02] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Device [10de:11df]
Flags: fast devsel, IRQ 11
Memory at f8000000 (32-bit, non-prefetchable) [disabled] [size=16M]
Memory at a0000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at b0000000 (64-bit, prefetchable) [disabled] [size=32M]
I/O ports at d000 [disabled] [size=128]
Expansion ROM at f9000000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: nvidia

03:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10ef] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:11df]
Flags: bus master, fast devsel, latency 0, IRQ 10
Memory at f9080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel

From this we can see that the NVIDIA Titan Xp card can be identified by its vendor:device IDs:
10de:1b02
10de:10ef

and is located at PCI addresses 0000:03:00.0 and 0000:03:00.1
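
If you do not want to pick the IDs out of the listing by eye, they can be filtered out of the lspci output. The snippet below is just a sketch: pci_ids is a name I made up, and it assumes the bracketed four-digit pairs printed by lspci -nn are the vendor:device IDs you are after. Feed it the short (non-verbose) listing, since the verbose one also prints subsystem IDs in the same format.

```shell
# pci_ids
# Hypothetical helper: reads `lspci -nn` lines on stdin and prints every
# [vendor:device] pair it finds, one per line, without the brackets.
pci_ids() {
    grep -Eo '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
}

# Example, for the card at 03:00 used in this guide:
#   lspci -nn -s 03:00 | pci_ids
```

Class codes such as [0300] are skipped because they contain no colon.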

An even better way to find the IDs and make sure the IOMMU grouping is sufficient is by using this bash script:

for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do echo "IOMMU group $(basename "$iommu_group")"; for device in $(ls -1 "$iommu_group"/devices/); do echo -n $'\t'; lspci -nns "$device"; done; done
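
Once you know the address of the card, you can also ask the narrower question: what else shares its group? Here is a rough sketch, assuming the usual sysfs layout of /sys/kernel/iommu_groups/<group>/devices/<address>; the function name group_members is my own.

```shell
# group_members PCI_ADDRESS [IOMMU_GROUPS_DIR]
# Hypothetical helper: prints every device that shares an IOMMU group with
# PCI_ADDRESS, one per line. Prints nothing if the address is in no group.
group_members() {
    addr=$1
    root=${2:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/"$addr"; do
        [ -e "$dev" ] || continue     # glob did not match anything
        ls -1 "$(dirname "$dev")"     # list the whole group
    done
}

# Example:
#   group_members 0000:03:00.0
```

Ideally the output lists only the GPU and its HDMI audio function. Anything else in the list would have to be passed through along with the card.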

To claim the GPU with vfio-pci, create a new configuration file:

vi /etc/modprobe.d/vfio.conf

and add this one line to it with the PCI IDs:

options vfio-pci ids=10de:1b02,10de:10ef
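
The IDs must be separated by commas with no spaces in between, because modprobe splits options on whitespace. If you script this step, a tiny helper can build the line for you. This is only an illustrative sketch, and vfio_options_line is a hypothetical name.

```shell
# vfio_options_line ID...
# Hypothetical helper: prints the modprobe options line for the given
# vendor:device IDs, joined with commas and no spaces.
vfio_options_line() {
    ids=$(printf '%s,' "$@")      # "10de:1b02,10de:10ef,"
    printf 'options vfio-pci ids=%s\n' "${ids%,}"   # drop the trailing comma
}

# Example, run as root:
#   vfio_options_line 10de:1b02 10de:10ef > /etc/modprobe.d/vfio.conf
```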

Fun fact: module options used to live in a single file (/etc/modprobe.conf) in older versions of Linux, but it was more organized to turn this into a directory (modprobe.d) with individual configuration files in it. You can name these configs anything you like, as long as the name ends in .conf. You’ll also notice that these files can be used for blacklisting drivers, such as nouveau, the open-source driver for NVIDIA cards.

In addition to the file saved in /etc/modprobe.d, you also need to add the vfio modules to initrd.

vi /etc/initramfs-tools/modules

Add the following lines:

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
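
If you are automating the setup, it helps to add these entries in a way that can be repeated safely. Below is a minimal sketch, assuming one module name per line as in /etc/initramfs-tools/modules and a file that ends with a newline; ensure_modules is a made-up helper name.

```shell
# ensure_modules FILE MODULE...
# Hypothetical helper: appends each MODULE to FILE unless a line with
# exactly that name is already there.
ensure_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # -x matches whole lines only, so "vfio" does not match "vfio_pci".
        grep -qxF -- "$mod" "$file" || printf '%s\n' "$mod" >> "$file"
    done
}

# Example, run as root:
#   ensure_modules /etc/initramfs-tools/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd
```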

Older guides say you should also include kvm and kvm_intel, but I have never found them to be necessary. Next, update the current initial ramdisk (initrd) image using the following command (note that some distributions, like openSUSE, use dracut for this, and Arch uses mkinitcpio):

update-initramfs -u

Reboot your machine and run lspci again:

lspci -vnn

You should see your GPU which you are passing to the guest listed, and this time, it will say:

“Kernel driver in use: vfio-pci”

You should also see the correct kernel driver being used for your host’s GPU (e.g. nvidia).
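
The same information is available from sysfs, which is handy if you want to check it from a script instead of reading lspci output. This is a sketch under the assumption that a bound device has a driver symlink under /sys/bus/pci/devices/<address>/; the function name driver_in_use is hypothetical.

```shell
# driver_in_use PCI_ADDRESS [PCI_DEVICES_DIR]
# Hypothetical helper: prints the name of the kernel driver bound to the
# device, or "none" if no driver is bound.
driver_in_use() {
    addr=$1
    pci_dir=${2:-/sys/bus/pci/devices}
    link="$pci_dir/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver's directory; its name is the driver.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Example, checking both functions of the card:
#   driver_in_use 0000:03:00.0
#   driver_in_use 0000:03:00.1
```

Both functions of the passed-through card should report vfio-pci.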

One last step applies if you are using an Nvidia GeForce card in the guest (Quadro should be OK). You must hide the fact that it is running in a VM from the Nvidia driver in the guest, or you will get error code 43 (it appears in Windows Device Manager, for example).

Use this command after you’ve created the VM using virt-manager, where the last part is the name of the VM:

virsh edit win10

Then under the features section, add this:

<kvm>
  <hidden state='on'/>
</kvm>
<hyperv>
  <relaxed state='on'/>
  <vapic state='on'/>
  <spinlocks state='on' retries='8191'/>
  <vendor_id state='on' value='1234567890ab'/>
</hyperv>
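
After saving, it is worth confirming that libvirt kept the change. The check below is a rough sketch, not a real XML parser: it simply greps the domain XML for the hidden element, and has_hidden_kvm is a name I invented for it.

```shell
# has_hidden_kvm
# Hypothetical helper: reads libvirt domain XML on stdin and succeeds if it
# contains <hidden state='on'/> (single or double quotes accepted).
has_hidden_kvm() {
    grep -Eq "<hidden[[:space:]]+state=['\"]on['\"][[:space:]]*/>"
}

# Example, using the VM name from this guide:
#   virsh dumpxml win10 | has_hidden_kvm && echo "KVM signature is hidden"
```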
