Last week, I'd just figured out how to do a normal vhost-user setup with
a QEMU VM connected to DPDK. This week, I wanted to try to move DPDK
into another VM using the experimental virtio-vhost-user driver, taking
the host system out of the networking equation altogether.
In theory this should have been a very simple change, but I couldn't get
it to work. DPDK claimed to be forwarding packets to the ethernet
device I'd attached to the backend VM (the one running DPDK), but
networking in the frontend VM (what you might think of as the
application VM) didn't work at all. It tried and failed to do DHCP, and
so couldn't progress beyond that.
A breakthrough came when I thought to look at the logs of my local DHCP
server. I saw that it was actually receiving requests from the VM, and
assigning it an IP address. Once I realised this, I hypothesised that
outgoing traffic was working, but not incoming.
Finally having something to look for, I had a look through the DPDK
virtio-vhost-user driver, and my suspicion was confirmed in an
unexpected way. It looks like incoming traffic (from the perspective of
the virtio-vhost-user frontend) is not actually implemented at all!
But with outbound traffic working, I'm confident that I understand
virtio-vhost-user well enough to be able to leave this here for
now. From Spectrum's side, I can now be pretty sure that everything
should be workable, so we can just wait a bit for virtio-vhost-user to
get a bit further along and then revisit it. And since the frontend
has no idea it's talking to virtio-vhost-user instead of normal
vhost-user, we can use normal (host-based) vhost-user for now, and drop
virtio-vhost-user in down the line.
A couple of outstanding questions I still don't know the answer to:
- How will routing work if I have multiple frontend VMs with multiple
virtio-vhost-user connections all wanting to use the same network
device? Will I want to use something like Open vSwitch for that?
- DPDK by default uses a busy loop to check for data to process, for
efficiency. This is obviously not appropriate for a
workstation-focused operating system. There is an interrupt-based
mode, but I don't know how to use it yet.
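
To make the second question concrete, here is a toy sketch of the difference between busy polling and a blocking, wake-on-data wait. It is not DPDK code and doesn't use any DPDK API: a std `mpsc` channel stands in for a receive queue, and the function names are made up for illustration.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Busy polling: repeatedly ask "is there a packet?" without ever sleeping.
/// Returns how many packets were received and how many polls found nothing.
/// Every empty poll is CPU time spent doing no useful work.
pub fn busy_poll(rx: &Receiver<u32>, want: usize) -> (usize, u64) {
    let mut got = 0;
    let mut empty_polls = 0u64;
    while got < want {
        match rx.try_recv() {
            Ok(_packet) => got += 1,
            Err(TryRecvError::Empty) => {
                empty_polls += 1;
                std::hint::spin_loop(); // still spinning, just politely
            }
            Err(TryRecvError::Disconnected) => break,
        }
    }
    (got, empty_polls)
}

/// Blocking wait: the thread sleeps until the sender wakes it up, which is
/// the rough analogue of an interrupt-driven mode. Returns packets received.
pub fn blocking_wait(rx: &Receiver<u32>, want: usize) -> usize {
    let mut got = 0;
    while got < want {
        match rx.recv() {
            Ok(_packet) => got += 1,
            Err(_) => break, // sender went away
        }
    }
    got
}
```

With a slow sender, `busy_poll` will typically report a large number of empty polls for a handful of packets, while `blocking_wait` receives the same packets and leaves the core idle in between. That idle time is what matters on a workstation.
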
Since I consider the concept proven, though, I'm going to punt on these
for now. The longer I leave these questions, the more likely it is that
a kernel driver for virtio-vhost-user will emerge and we can use that
instead. That's not to say I want to leave inter-guest networking
hanging forever, but I have other inter-guest networking bits I can
switch focus to for now, and once those are done I can revisit the
virtio-vhost-user backend situation.
I started integrating the vhost-user-net code from Cloud Hypervisor into
crosvm. I'm at the point where I can get all the copied Cloud
Hypervisor code to compile in crosvm, which is pretty good! I have not
yet written the code to actually start one of these devices, though,
so I haven't been able to test it.
It's been interesting to look at Cloud Hypervisor because it's a
codebase that is heavily based on crosvm (even more so than Firecracker
is), but that has also evolved and diverged from it. It's especially
interesting to see stuff where parallel evolution occurred between the
crosvm and Cloud Hypervisor codebases, or when Cloud Hypervisor changed
how some crosvm code worked, and then later changed it back again.
The codebases were still similar enough that I could have the
cloud-hypervisor device integrated into the crosvm codebase in a day,
although there's lots of code duplication that will have to be dealt
with. I copied over a bunch of supporting code rather than trying to
integrate it into the crosvm equivalents, so that the code would first
run in an environment as similar as possible to the one it was
designed for. I expect that when I test the device in crosvm, it'll
probably work fairly quickly, if not on the first try. The more complicated
part will be a bit of a change to how crosvm does guest memory that
isn't strictly necessary but is important for security.
crosvm allocates all guest memory in a single memfd. This means that,
to share guest memory with another process, like when using vhost-user,
the only option is to share all of guest memory. This would sort of
defeat the purpose of hardware isolation in Spectrum! But from what I
could tell -- I'm not 100% on this -- the guest memory abstraction in
cloud-hypervisor is more advanced, and I think it might support multiple
memfds backing guest memory for this sort of thing. I'll have to adapt
crosvm to that model to be able to use vhost-user securely.
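
The problem can be illustrated with a toy model. This is not crosvm or Cloud Hypervisor code; the `Region` type, the region names, and `exposed_by_sharing` are all hypothetical. The only thing it assumes is that sharing happens per file descriptor, so handing a backend the fd behind one region exposes every region that lives in the same fd.

```rust
use std::collections::BTreeSet;

/// A range of guest memory, plus which (hypothetical) memfd backs it.
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub name: &'static str,
    pub guest_addr: u64,
    pub len: u64,
    /// Index of the memfd this region is allocated from.
    pub backing: usize,
}

/// Given a memory layout and the regions a backend actually needs, return
/// the names of every region the backend ends up able to see once the
/// necessary fds have been shared with it.
pub fn exposed_by_sharing(layout: &[Region], wanted: &[&str]) -> Vec<&'static str> {
    // The fds we are forced to hand over.
    let shared_fds: BTreeSet<usize> = layout
        .iter()
        .filter(|r| wanted.contains(&r.name))
        .map(|r| r.backing)
        .collect();
    // Everything backed by one of those fds is now visible to the backend.
    layout
        .iter()
        .filter(|r| shared_fds.contains(&r.backing))
        .map(|r| r.name)
        .collect()
}
```

If both an "everything else" region and a "network buffers" region use backing 0, asking to share only the network buffers still returns both names. Give the network buffers their own backing and only that region is returned, which is the property I'm after.
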
The new "Bibliography" page is up! Lots of links to relevant resources
about concepts important to Spectrum. :)
It's a bit of a relief to have returned from the uncertain world of DPDK
to the familiar territory of crosvm. I'm confident that the next bit of
work here (vhost-user in crosvm) won't be much of a big deal.
Hopefully, we'll have at least interim networking working to a
reasonable degree fairly soon. After that, I plan to look at file sharing, possibly with
vhost-user-fs (virtio-fs over vhost-user), which I noticed
cloud-hypervisor implements today. That should be pretty similar to the
networking stuff, although I don't think any virtio-fs virtio-vhost-user
code exists at the moment.