Last week, I'd just figured out how to do a normal vhost-user setup with a QEMU VM connected to DPDK. This week, I wanted to try to move DPDK into another VM using the experimental virtio-vhost-user driver, taking the host system out of the networking equation altogether.
In theory this should have been a very simple change, but I couldn't get it to work. DPDK claimed to be forwarding packets to the ethernet device I'd attached to the backend VM (the one running DPDK), but networking in the frontend VM (what you might think of as the application VM) didn't work at all. It tried and failed to do DHCP, and so couldn't progress beyond that.
A breakthrough came when I thought to look at the logs of my local DHCP server. I saw that it was actually receiving requests from the VM, and assigning it an IP address. Once I realised this, I hypothesised that outgoing traffic was working, but not incoming.
Finally having something to look for, I had a look through the DPDK virtio-vhost-user driver, and my suspicion was confirmed in an unexpected way. It looks like incoming traffic (from the perspective of the virtio-vhost-user frontend) is not actually implemented at all!
But with outbound traffic working, I'm confident I understand virtio-vhost-user well enough to leave this here for now. From Spectrum's side, I can now be pretty sure that everything should be workable, so we can just wait for virtio-vhost-user to get a bit further along and then revisit it. And since the frontend has no idea it's talking to virtio-vhost-user instead of normal vhost-user, we can use normal (host-based) vhost-user for now, and drop virtio-vhost-user in down the line.
A couple of outstanding questions I still don't know the answer to about DPDK are:
- How will routing work if I have multiple frontend VMs with multiple virtio-vhost-user connections all wanting to use the same network device? Will I want to use something like Open vSwitch for that?
- DPDK by default uses a busyloop to check for data to process, for efficiency. This is obviously not appropriate for a workstation-focused operating system. There is an interrupt-based mode, but I don't know how to use it yet.
Since I consider the concept proven, though, I'm going to punt on these for now. The longer I leave these questions, the more likely it is that a kernel driver for virtio-vhost-user will emerge and we can use that instead. That's not to say I want to leave inter-guest networking hanging forever, but I have other inter-guest networking bits I can switch focus to for now, and once those are done I can revisit the virtio-vhost-user backend situation.
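To make the busyloop versus interrupt trade-off from the second question concrete, here is a minimal sketch in Python. It does not use DPDK's API at all: a Unix socket pair stands in for the network device, and the function names (`busy_poll`, `wait_for_event`) are made up for illustration. The only point is the difference in how each loop waits.

```python
import select
import socket
import threading


def busy_poll(sock):
    """Spin on a non-blocking socket until data arrives.

    Returns (data, number of polls that found nothing).
    """
    sock.setblocking(False)
    empty_polls = 0
    while True:
        try:
            return sock.recv(2048), empty_polls
        except BlockingIOError:
            empty_polls += 1  # nothing yet: keep burning CPU and retry


def wait_for_event(sock):
    """Sleep in the kernel until the socket is readable, then read once.

    Returns (data, number of wakeups).
    """
    select.select([sock], [], [])  # blocks without using CPU
    return sock.recv(2048), 1


# A socket pair stands in for the device; a timer stands in for a frame
# arriving 50 ms from now.
rx, tx = socket.socketpair()

timer = threading.Timer(0.05, tx.send, args=(b"frame",))
timer.start()
polled_data, empty_polls = busy_poll(rx)
timer.join()

timer = threading.Timer(0.05, tx.send, args=(b"frame",))
timer.start()
event_data, wakeups = wait_for_event(rx)
timer.join()

print(f"busy loop: {empty_polls} empty polls before data arrived")
print(f"event wait: {wakeups} wakeup")

rx.close()
tx.close()
```

The busy loop reacts as soon as data is there but keeps a core fully occupied while idle, which is the behaviour that is a poor fit for a workstation. The event-driven version does nothing until the kernel wakes it, at the cost of some wakeup latency.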
I started integrating the vhost-user-net code from Cloud Hypervisor into crosvm. I'm at the point where I can get all the copied Cloud Hypervisor code to compile in crosvm, which is pretty good! I have not yet written the code to actually start one of these devices, though, so I haven't been able to test it.
It's been interesting to look at Cloud Hypervisor because it's a codebase that is heavily based on crosvm (even more so than Firecracker is), but that has also evolved and diverged from it. It's especially interesting to see stuff where parallel evolution occurred between the crosvm and Cloud Hypervisor codebases, or when Cloud Hypervisor changed how some crosvm code worked, and then later changed it back again.
The codebases were still similar enough that I could have the cloud-hypervisor device integrated into the crosvm codebase in a day, although there's lots of code duplication that will have to be dealt with. I copied over a bunch of supporting code rather than trying to integrate it into the crosvm equivalents, so that the code would run for the first time in an environment as similar as possible to the one it was designed for. I expect that when I test the device in crosvm, it'll work fairly quickly, if not first try. The more complicated part will be a change to how crosvm does guest memory that isn't strictly necessary but is important for security.
crosvm allocates all guest memory in a single memfd. This means that, to share guest memory with another process, like when using vhost-user, the only option is to share all of guest memory. This would sort of defeat the purpose of hardware isolation in Spectrum! But from what I could tell -- I'm not 100% on this -- the guest memory abstraction in cloud-hypervisor is more advanced, and I think it might support multiple memfds backing guest memory for this sort of thing. I'll have to adapt crosvm to that model to be able to use vhost-user securely.
The new "Bibliography" page is up! Lots of links to relevant resources about concepts important to Spectrum. :)
It's a bit of a relief to have returned from the uncertain world of DPDK to the familiar territory of crosvm. I'm confident that the next bit of work here (vhost-user in crosvm) won't be that big a deal. Hopefully, we'll have interim networking working to a reasonable degree fairly soon. After that, I plan to look at file sharing, possibly with vhost-user-fs (virtio-fs over vhost-user), which I noticed cloud-hypervisor implements today. That should be pretty similar to the networking stuff, although I don't think any virtio-fs virtio-vhost-user code exists at the moment.