Virtual Machines with libvirt/KVM/QEMU

= Introduction =

For simple experimentation needs, VirtualBox is more polished and easier to use. However, it is worth learning libvirt instead, because it is more powerful and you can use it later on in production. The Virtual Machine Manager GUI is easy enough for novice usage too. If you learn some tricks beforehand, you will have a much easier ride. That is the main motivation for writing this page.

The information and instructions below are for Ubuntu 18.04 and 20.04, but most of them probably apply to other Linux systems too.

= Installing the libvirt/KVM/QEMU Software =

In order to install libvirt/KVM/QEMU, install the following packages:

qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils

Older distributions used package libvirt-bin.

The GUI tool will probably be helpful too, but if you are installing on a headless server, you can run the GUI on your workstation instead. The package is:

virt-manager

If you are using Cockpit, install this package to see the VMs on the web interface:

cockpit-machines

After installing libvirt, log off and on again, so that your user account becomes a member of the libvirt group. I once had trouble connecting to the daemon socket with the virt-manager GUI, even though the user was effectively a member of 'libvirt'. I had to reboot the entire system, and then it worked.
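You can check whether the group membership is already active in your current session with a few standard commands. A minimal sketch; 'in_group' is a made-up helper name, but 'id' and the 'libvirt' group are real:

```shell
# Check whether a user's current group list includes the given group.
# A freshly-added group does not show up until you log off and on again.
in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

user="${USER:-$(id -un)}"

if in_group "$user" libvirt; then
  echo "The libvirt group is active in this session."
else
  echo "Not a member of libvirt in this session yet - log off and on again."
fi
```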

= Creating a Virtual Machine =

Choosing a Virtual Machine Type

 * pc is a standard PC i440FX + PIIX, from year 1996, alias of pc-i440fx-2.11 (the exact version may vary). It works best with older versions of Windows. Windows 10 is fine too.


 * q35 is a standard PC Q35 + ICH9, from year 2009, alias of pc-q35-2.11 (the exact version may vary). This machine type is newer than pc-i440fx, adds the PCI Express Bus and AHCI Bus Emulation (for SATA interfaces), and some modern motherboard and BIOS features. It works best with Linux. There is no support for legacy Windows XP/2000 guests.

The Virtual Machine Manager, as of version 2.2.1, does not offer a clear way to choose the machine type: it is chosen automatically based on the selected OS type. A "generic" OS yields pc, and OS "Fedora" yields q35.
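If you want a particular machine type regardless of what the GUI picks, you can set it directly in the domain XML (for example with "virsh edit"). A minimal sketch of the relevant element; the machine attribute also accepts versioned names like pc-q35-2.11:

```xml
<os>
  <type arch='x86_64' machine='q35'>hvm</type>
</os>
```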

Choosing a Virtual Disk Format
At first glance, the qcow2 format looks cool. It ticks all the boxes:
 * It is easy, because the whole virtual disk is just 1 file.
 * It has built-in compression and copy-on-write.
 * It supports snapshots.

However, I could not find a good overview about qcow2 regarding operational issues and performance, which is in itself worrying. When you look closer, it turns out that there are important issues:


 * It is one huge file, so backing it up is very slow. If you want an incremental backup, you have to scan the whole file searching for changes. If there were a bunch of separate files, you could for example use the "last modified date" to skip some on the next incremental backup.


 * Given qcow2's block structure, and the fact that the image usually grows dynamically, it must fragment rather badly, which hurts performance on rotational disks (HDDs). In order to find the fragmentation ratio, use command "qemu-img check image.qcow2".


 * The optional compression is performed just once. And it is single threaded, so it takes a long time. If the data changes, it is not recompressed. It also must heavily contribute to fragmentation, because the compressed data size changes randomly. If you want to recompress, you need to take the image offline and copy it with "qemu-img convert".


 * Deleting a snapshot does not make the qcow2 file shrink immediately. I guess it must rely on sparse file support, which also has its peculiarities. There is no online compacting operation. You have to stop the guest OS and copy the whole qcow2 image file with "qemu-img convert". The copy is then smaller, and I hope also defragmented, at least a little.


 * In order to save disk space on the host (on the .qcow2 file) without reorganising the internal blocks, the qcow2 file format relies on the guest issuing TRIM commands and on the host's sparse file support, which is not always available. This also further increases fragmentation.


 * There is no safe concurrent access to qcow2 image files. So you cannot safely take a snapshot, let the guest OS run further, and back up just the snapshot at the same time the guest OS continues to run.


 * You cannot fork snapshots. Reverting to a snapshot drops any other snapshots created after it, so you can only go backwards. That means that you cannot try two different configuration approaches on your guest OS simultaneously (switching between them).

Therefore, at the end of the day:
 * Snapshots can only be used for some experimentation, and not in production.
 * Compression is not practical in many situations.
 * If fragmentation becomes a performance problem, your only practical solution may be to switch to an SSD.
 * Safe, online backups are problematic.
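The offline compact/recompress step mentioned above can be wrapped in a small helper. A sketch; the function name is made up, but the qemu-img syntax is the standard one, and the guest OS must be shut down before copying its image:

```shell
# Compact and recompress an offline qcow2 image.
compact_qcow2() {
  # -O qcow2 selects the output format, -c compresses the written blocks.
  # The copy comes out compacted, recompressed and hopefully less fragmented.
  qemu-img convert -O qcow2 -c "$1" "$2"
}

# Usage example (afterwards, swap the files and restart the VM):
#   compact_qcow2 image.qcow2 image-compacted.qcow2
```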

Choosing a Virtual Graphics Card

 * VirtIO graphics offer paravirtualised 3D acceleration.
 * For Linux, VirtIO graphics are considered stable, but you do not usually need 3D acceleration in a VM. It may be safer to reduce virtualisation complexity and go for QXL instead.
 * It is not clear whether VirtIO graphics are stable under Microsoft Windows. The digitally-signed drivers from RedHat do not seem to include a driver for the VirtIO graphics card, so it is probably best to use QXL instead.


 * A "Video QXL graphics card" offers paravirtualised 2D acceleration only. For Windows 8 or later, which uses the newer "Windows Display Driver Model" (WDDM), use the 'qxl-dod' graphics driver.

Choosing a Virtual CPU
Choosing the right CPU type is hard, but that is only important if you need advanced features like smooth online backups from live snapshots, or live migration. See libvirt API compareHypervisorCPU for the gory details about CPU features. However, if you need such features, you probably would not be reading this page anyway.

You normally choose option "Copy host CPU configuration", which is easy and gives good performance, and not think too much about it. Keep in mind that a snapshot of a live VM may not be able to resume live on another host with a different CPU. Nevertheless, the OS inside the VM should be able to cold boot with the new CPU type. Doing a cold boot may not be too bad if you install the QEMU guest agent and configure it to save a consistent state to persistent storage when you create a snapshot.
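For reference, virt-manager's "Copy host CPU configuration" corresponds to a CPU mode in the domain XML. A sketch; both mode values are real libvirt settings:

```xml
<!-- Closest CPU model to the host that libvirt can still describe: -->
<cpu mode='host-model' check='partial'/>

<!-- Alternative: pass the host CPU through unchanged (best performance,
     but snapshots and migration are then tied to this exact CPU): -->
<!-- <cpu mode='host-passthrough'/> -->
```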

However, I did not further investigate how to get a consistent disk state when taking a snapshot. First of all, I do not trust that it would be completely reliable. Secondly, operating systems must be restarted every now and then anyway, due to constant updates. And choosing the right CPU type so that an online snapshot works on another PC is difficult. So I just schedule a server shutdown during the night (or once a week), back its .qcow2 file up, and restart it automatically. This way, all backups are clean. But you may have higher uptime requirements.

Other Virtual Hardware

 * RNG (Random Number Generator):
 * This is not required under Windows, but add one just in case something needs a good random source for encryption purposes.
 * The first Debian boot can take a very long time without an RNG, if the OS does not find a good random source for encryption purposes and resorts to gathering entropy from somewhere else, like mouse movements.


 * VirtIO network card: Manually set a MAC address to avoid the risk of duplicates on your LAN. Changing the MAC address later could have an impact on Windows activation. Locally-administered MAC addresses (equivalent to private IP addresses like 10.x.x.x) are:
 * x2-xx-xx-xx-xx-xx
 * x6-xx-xx-xx-xx-xx
 * xA-xx-xx-xx-xx-xx
 * xE-xx-xx-xx-xx-xx
 An easy-to-remember prefix is DE:AD:BE:EF:xx:xx.
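If you prefer to generate a random address, a minimal sketch (52:54:00 is the prefix conventionally used by QEMU/KVM; the '2' in the second hex digit marks the address as locally administered):

```shell
# Generate a random locally-administered MAC address with the QEMU prefix.
# The low three bytes are random; only they need to be unique on your LAN.
printf -v mac '52:54:00:%02x:%02x:%02x' \
  $(( RANDOM % 256 )) $(( RANDOM % 256 )) $(( RANDOM % 256 ))

echo "$mac"
```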


 * VirtIO disks: Manually set the "Cache mode" because it often has a suboptimal default:
 * "none" is your best bet, because it provides performance characteristics that are easy to understand: when the virtual disk is busy, it has the expected disk performance impact on the whole host and any other VMs. The downside is that any extra RAM on the host will not be used for caching purposes, which is a bit of a waste.
 * "write-through" and "write-back" tend to pollute the host's page cache and will probably cause unexpected and possibly severe performance problems on the whole host and any other VMs, see The Linux Filesystem Cache is Braindead for more information.
 * "directsync" is slow.
 * "unsafe" is generally not a good idea.
 * Alternatively, you could try to limit the overall VM memory usage, including the page cache, with a command like virsh memtune vm-name --config --hard-limit "$(( 4096 + 1024 ))MiB" (replace vm-name with your VM's name; here, 4096 MiB of VM RAM plus 1024 MiB of slack). That is however somewhat risky, because if QEMU decides to temporarily allocate more memory, the VM could fail or get killed.
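In the domain XML, the cache mode is an attribute of the disk's <driver> element. A sketch; the image path and target device are just examples:

```xml
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/libvirt/images/example.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>
```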

Virtual Hardware for Windows

 * Avoid SATA disk interfaces, as they may prevent saving the VM state.


 * Create 2 virtual CD-ROM drives. They must be IDE, because the Windows setup program has a built-in driver to access standard IDE drives. However, if you chose the q35 machine type, IDE is not an option. I haven't checked yet whether the Windows setup program has built-in drivers for virtual SATA or SCSI devices. In any case, it is best to remove non-VirtIO devices after the initial installation.
 * One CD-ROM drive is for the Windows installation DVD (mounted as ISO image).
 * The other drive is for the VirtIO device drivers CD-ROM (mounted as ISO image). In order to get this CD-ROM, search for "Windows VirtIO Drivers" on the Internet. The Fedora Project provides digitally-signed drivers. You will have to point the Windows setup program to the VirtIO disk device driver on this CD-ROM during installation. Otherwise, the Windows setup program will not be able to access the VirtIO virtual disk. After installation, the Windows Device Manager can recursively search for device drivers, but the Windows setup program apparently cannot. Look for the driver under a subdirectory like /viostor/w10/amd64/.


 * There is a relatively new issue that causes high CPU usage on the host while a Windows 10 guest is running. For me, enabling the hpet timer did the trick, reducing CPU usage from 7 % to 2 % on my computer:
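In the domain XML, the hpet timer lives under the <clock> element. A sketch, assuming the usual settings for a Windows guest; keep your existing <clock> attributes and other <timer> entries as they are:

```xml
<clock offset='localtime'>
  <timer name='hpet' present='yes'/>
</clock>
```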



= TRIM Support =

If you are using a sparse .qcow2 disk image file, and the guest OS issues TRIM/UNMAP disk commands, you can save disk space on the host.

However, TRIM support can potentially cause fragmentation, so it needs to be enabled explicitly. Fragmentation causes a performance drop, especially on rotational disks (HDDs).

You can enable TRIM with the "Virtual Machine Manager". Open a VirtIO disk and look for option "Discard mode".
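The "Discard mode" option ends up as the discard attribute on the disk's <driver> element in the domain XML. A sketch (the cache attribute is unrelated and just shown for context):

```xml
<driver name='qemu' type='qcow2' cache='none' discard='unmap'/>
```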

Note that the wrong machine type can prevent TRIM from working later on in the guest OS. Modern generic machine types like "pc" and "q35" should work. I had trouble once, and machine type "pc", which was an alias of pc-i440fx-4.2, worked for me, but pc-i440fx-bionic did not, even though it had the same description.

On a Linux guest, you need Kernel version 5.0 or later in order for the VirtIO disk driver (virtio-blk) to support TRIM, see Ubuntu bug #1523246 for more information.

On a Windows guest, make sure you install (or upgrade to) an up-to-date VirtIO driver version. I could not use TRIM on a test Windows 10 VM I had, but then I upgraded the complete VirtIO driver package to version "Aug 10 2020", and TRIM started working afterwards.

Life with TRIM gets more complicated, because some filesystems and tools do not support sparse files. You may want to investigate things like "fallocate --dig-holes" and rsync options --sparse and --inplace. You can turn an existing image into a sparse file with virt-sparsify.
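To get a feel for sparse files, which TRIM relies upon to save host disk space, you can experiment with a few standard tools. A sketch; the file name is arbitrary:

```shell
# 'truncate' creates a file with a large apparent size but no allocated blocks.
truncate -s 100M sparse-test.img

# 'du' can report both sizes, which differ for a sparse file.
apparent="$(du --apparent-size --block-size=1 sparse-test.img | cut -f1)"
allocated="$(du --block-size=1 sparse-test.img | cut -f1)"
echo "Apparent size: $apparent bytes, actually allocated: $allocated bytes"

# 'fallocate --dig-holes' re-punches holes wherever blocks contain only zeros,
# useful after copying an image with a tool that lost the sparseness.
fallocate --dig-holes sparse-test.img

rm sparse-test.img
```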

Because of the extra complication, I would not bother with TRIM unless you really need to save disk space or optimise your VM backup speed.

Apart from the guest driver requirements described above, TRIM support should work out of the box on the guest.

Verifying TRIM Support on a Linux Guest
In order to verify that trim support is enabled, run this command on the guest:

lsblk --output NAME,MOUNTPOINT,DISC-MAX,FSTYPE

The output looks like this:

NAME   MOUNTPOINT DISC-MAX FSTYPE
vda                     0B
├─vda1 /boot/efi        0B vfat
├─vda2                  0B
└─vda5 /                0B ext4

We are interested mainly in the root mountpoint ('/').

If the DISC-MAX value is 0B, then TRIM support is not enabled. Any other value means that TRIM support is enabled.

Alternatively, you could get the DISC-MAX value like this:

cat /sys/block/vda/queue/discard_max_bytes

You can check whether the OS is configured to issue TRIM commands every now and then like this:

systemctl status fstrim.timer

You can manually trigger a TRIM like this:

sudo fstrim --all --verbose

Verifying TRIM Support on a Windows Guest
The Internet is full of advice about using this command:

fsutil behavior query DisableDeleteNotify

But that obviously only tells you whether TRIM support has been disabled at OS level. From Windows 7 through to Windows 10, it says "not disabled" even on a traditional HDD with no TRIM support. Sometimes you wonder what kind of experts roam the Internet. Here is one guy that realised something is not right with that popular command.

You would normally use CrystalDiskInfo and check whether the "Features" field for a drive includes "TRIM". But CrystalDiskInfo does not work with VirtIO virtual disks.

The best way to test TRIM support is to manually trigger one on an elevated PowerShell like this:

Optimize-Volume -DriveLetter C -ReTrim -Verbose

If you get the following error message, you know that TRIM is not supported:

Optimize-Volume : The volume optimization operation requested is not supported by the hardware backing the volume.

= After Installing the Guest Operating System =

Some Common Steps

 * Remove the CD-ROM drives used for installation, especially if they are SATA, as SATA disk interfaces may prevent saving the VM state. If you do not remove the CD-ROM drives, at least remove their virtual media.


 * Adjust boot options, so that the VM can only boot from the chosen disk.


 * Set the VM to autostart when the host boots, if desired.


 * You may want to disable the sleep or suspend mode or function in the power settings. It normally does not make sense for virtual machines anyway. If you press the sleep button by mistake while the VM has the focus, it is not immediately obvious how to wake the VM up. The trick is to send an ACPI shutdown event, which is equivalent to pressing the power button inside the VM.

SPICE Guest Tools
In order for copy-and-paste between the host and guest desktops to work over the default SPICE remote desktop connection, you need to install the SPICE guest tools.

For a Windows guest, the spice-guest-tools are at https://www.spice-space.org/.

On a Linux guest, install package spice-vdagent.

High DPI Issues
On Windows 10 version 1909, I had a scaling problem with a high-DPI host monitor when using the Remmina remote desktop: the text on the guest was too small to be legible. Such high-DPI issues are also mentioned elsewhere on the Internet for the official Microsoft Remote Desktop client.

The Windows desktop DPI scaling (for example, 125 %) works well with the SPICE client, which is the one that the Virtual Machine Manager uses by default. I haven't tested with the alternative VNC yet.

Check the Device Drivers
Open the Windows Device Manager and look for any warnings that a virtual device has no associated device driver yet.

Make sure that the guest's display adapter is not using the "Microsoft Basic Display Adapter" driver. If it is, change it to the right driver, which comes up as "Red Hat QXL controller" and will probably be faster.

Optionally Turn Off Disk Features
If you are backing up your VM at regular intervals, you may not need features like System Restore or Volume Shadow Copies, which impact disk performance and need additional disk space. You can also reduce the recycling bin size.

You may also not need the file indexing that Windows Search does.

= Virtual Machine Manager Issues =

The Linux Desktop Does Not Automatically Resize
This section probably applies only to Linux virtual machines using the classical X Window System, and not the newer Wayland. And it probably applies to the default SPICE remote desktop connection, and not to the alternative VNC. I have seen this problem with Ubuntu 18.04 and 20.04.

The first thing you will notice is that the Linux desktop on the virtual machine does not automatically resize to fill your remote-control window. This is the kind of annoying, obvious oversight that is worth a good rant about open source etc.

In order for "automatic resolution switching" to work, open the SPICE client built into the "Virtual Machine Manager" as usual, and then go to menu "View", "Scale Display", and enable "Auto resize VM with window".

That alone will probably not work, so you may be tempted to let your rant page grow further. The following workaround works for me:


 * Find out what the virtual screen is called by running the following command on the guest: xrandr --query
   It will probably be called something like Virtual-0 or Virtual-1.
 * Create a desktop shortcut on the guest to run a command like this: xrandr --output Virtual-0 --auto
 * Every time you resize your remote-control window on the host, you will need to double-click on that desktop shortcut in the guest's desktop. It is not fully automatic, but it is better than nothing.

If all fails, I have written a script that resizes the guest desktop to a fixed resolution.

Only One Remote Control Window
The Virtual Machine Manager uses SPICE by default for remote desktop purposes. SPICE only supports one connection, and it looks like a second one just kicks the previous one without warning.

The first SPICE connection then shows a frozen guest desktop, and you wonder whether your virtual machine has frozen too. But don't worry, it is only the brains of the libvirt maintainers that have frozen as well.

= Helper Scripts =

You may find some of my VM helper scripts useful:

https://github.com/rdiez/Tools/tree/master/VirtualMachineManager