Recently I've been working on making it possible for us to use crosvm's
implementation of virtio-gpu (which is necessary for multi-VM Wayland).
The approach I was originally planning on was porting crosvm's
vhost-user-gpu frontend to cloud-hypervisor. That would allow us to run
the crosvm implementation of the device unmodified, with just a small
amount of glue code in cloud-hypervisor.
But then I discovered some things that made me decide to investigate
other approaches:
- crosvm does not implement the standard vhost-user-gpu protocol.
It implements it its own special way. Perhaps ironically, the
crosvm-specific way seems to be closer to how vhost-user works for
other devices (like network and block), which should actually make
it easier to port the frontend to cloud-hypervisor. But it also
changes the potential to upstream that port to cloud-hypervisor
from "a hard sell" to "not going to happen". So if I did that,
it would commit us to carrying a cloud-hypervisor patch indefinitely.
- There's an interesting new protocol called vfio-user, that would be
really helpful to us in this situation. Whereas vhost-user requires
the VMM to still have some basic per-device knowledge (the glue code
I was planning to port), vfio-user operates at the PCI level, so the
VMM only needs to know that the device is PCI. So if we could
somehow provide a virtio-gpu device to cloud-hypervisor over
vfio-user, cloud-hypervisor wouldn't need any GPU-specific code at
all. Everything should just work without any changes to
cloud-hypervisor, as it already implements a vfio-user client.
- The next release of QEMU, 7.1.0, will include support for exporting
any virtual device QEMU can provide over vfio-user.
So with all this in mind, there are three ways we could try to proceed:
1. Port the crosvm-specific vhost-user-gpu frontend to cloud-hypervisor.
2. Make crosvm speak the standard version of vhost-user-gpu, then use
QEMU to act as a bridge between the crosvm GPU device, and
cloud-hypervisor, translating vhost-user to vfio-user, but not doing
anything else. (So we're not using QEMU to run a VM, just to
translate between these two protocols and handle the PCI stuff.)
3. Implement a vfio-user server in crosvm, so crosvm device backends
can be used directly with cloud-hypervisor.
3 is the clear best option, because it doesn't require adding QEMU into
the system, and it would be entirely upstreamable — I spoke to a crosvm
developer about it in their Matrix channel and they said they'd be
interested in patches for it. But it's also quite complicated to
implement, and beyond my ability, at least if I want results any time
soon. 2 would probably be even more complicated as it would require
coordinating between crosvm and QEMU to get them to agree on how the
protocol should work.
So if we want this working soon, 1 is the only feasible option, at least
if it's me doing the work. But it means committing to carrying a
cloud-hypervisor patch until somebody comes along to implement 3. It
gives me bad vibes because it goes against the upstream-first approach I
try to take with Spectrum development, and once timely package updates
are something we have to take more seriously, the patch no longer
applying would block any sort of automatic updates.
So I'm posting this to solicit thoughts on what to do here. The ideal
scenario is that we are able to find somebody else (with more VMM
implementation experience than me) who is able to do 3, and then I would
be totally comfortable with doing 1 as a stopgap until that can happen.
Otherwise, I can either keep trying to chip away at doing it myself,
however long that takes, or we'd have to just accept the consequences of
having the patch indefinitely, and hope that Google or somebody else
also finds themself wishing crosvm had a vfio-user server and implements
it themself.
Thoughts welcome. :)