What a week. Progress has felt a bit slow, but the work has been consistently interesting, and there have been some exciting developments in the ecosystem.
Last week I posted the patches required to get all of Spectrum working with a more up-to-date Nixpkgs. Cole has since reviewed the patch set, so I applied it and merged the nixpkgs-update branch into master. Spectrum is now using a reasonably up-to-date Nixpkgs for the first time in a long time.
I also said:

> Once that's done, it's time for another chromiumOSPackages upgrade, but that should be pretty easy this time because we're only one version behind.
Alas, it's never that simple.
Firstly, the chromiumOSPackages update script broke, because the information about the currently released Chromium OS build seems to have gone missing. Google is apparently serving build number 13982, but their published build metadata includes builds 13981 and 13983 and skips 13982. This means it's not possible for me to know what Git revisions are used in the currently released Chromium OS. Assuming this is just a one-time thing, I hacked the update script to just look at the previous build, but we should keep an eye on this. If it ever happens again we should probably implement some sort of mitigation in the update script.
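As a rough illustration of the kind of mitigation meant here, the fallback could be written as "use the newest build with published metadata that is no newer than the build being served". This is only a sketch: the function name `pick_build` and the idea of passing the build numbers in as plain integers are assumptions made for the example, and it is not how the actual update script is written.

```rust
/// Hypothetical helper: choose which build's metadata to use.
///
/// `served` is the build number currently being served, and `published`
/// is the list of build numbers that have metadata available. Returns the
/// served build if its metadata exists, otherwise the newest older build,
/// or `None` if nothing suitable has been published.
fn pick_build(served: u32, published: &[u32]) -> Option<u32> {
    published
        .iter()
        .copied()
        // Never pick a build newer than the one actually released.
        .filter(|&build| build <= served)
        .max()
}
```

With the numbers from this week, `pick_build(13982, &[13981, 13983])` falls back to 13981, which matches what the hacked script does by hand.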
Once I had new versions of the Chromium OS packages it was time to get them to build, which was straightforward enough for everything except crosvm. For Spectrum, we have a patch for crosvm to make it support VIRTIO_NET_F_MAC, which is a mechanism by which the host system can indicate to the guest kernel what it should set as the MAC address of a virtual network device. After the update, this patch no longer applied, because all of a sudden crosvm has two different virtual network device implementations.
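For readers unfamiliar with the feature, here is a minimal sketch of what VIRTIO_NET_F_MAC means on the guest side. In the Virtio specification it is feature bit 5, and when it has been negotiated the first six bytes of the network device's configuration space hold the MAC address. The function name `negotiated_mac` and the representation of config space as a byte slice are assumptions for the example; this is not crosvm's code or the Spectrum patch.

```rust
/// Feature bit number for VIRTIO_NET_F_MAC in the Virtio specification.
const VIRTIO_NET_F_MAC: u64 = 5;

/// Hypothetical helper: return the MAC address the device offers, if the
/// feature was offered by the device and accepted by the driver.
///
/// `config` stands in for the device's configuration space, whose first
/// six bytes are the MAC address when VIRTIO_NET_F_MAC is negotiated.
fn negotiated_mac(device_features: u64, driver_features: u64, config: &[u8]) -> Option<[u8; 6]> {
    // A feature only counts if both sides agreed to it.
    let negotiated = device_features & driver_features;
    if negotiated & (1u64 << VIRTIO_NET_F_MAC) == 0 {
        return None;
    }

    // Copy out the MAC, refusing config spaces that are too short.
    let mut mac = [0u8; 6];
    mac.copy_from_slice(config.get(..6)?);
    Some(mac)
}
```

If the device does not offer the bit, the guest has to choose a MAC address by some other means, which is why the host-side support matters for Spectrum.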
This turns out to be because crosvm has implemented vhost-user, a protocol to allow virtual devices to be implemented outside the VMM program! This is great news, but it's also surprising. virtiofs was designed to be implemented with vhost-user, but when crosvm implemented it a while ago, they became the only implementer to do so in-VMM. It's great to see them moving in the vhost-user direction, because it makes it much easier to mix and match virtual device implementations and VMMs. Most excitingly, I saw a reference to vhost-user-wl, meaning a standalone implementation of Virtio Wayland. This would allow us to use Virtio Wayland with other, non-crosvm VMMs, which is great because I think cloud-hypervisor is probably going to end up being a better fit for Spectrum, but Virtio Wayland was crosvm's killer app. I'd even thought about trying to port crosvm's Virtio Wayland implementation to vhost-user myself, so it's great to know that when the time comes, that'll already have been done for me.
While I could have fixed my crosvm patch for all this new virtual network device code, the introduction of vhost-user support means we should be able to drop the patch altogether. cloud-hypervisor provides a vhost-user-net implementation that already supports VIRTIO_NET_F_MAC, so if we can just get crosvm to talk to cloud-hypervisor's vhost-user-net implementation we shouldn't have to carry any patch for this any more.
https://cros-updates-serving.appspot.com/
https://chromium.googlesource.com/chromiumos/manifest-versions
https://spectrum-os.org/git/nixpkgs/tree/pkgs/os-specific/linux/chromium-os/...
https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html
https://qemu.readthedocs.io/en/latest/interop/vhost-user.html
https://virtio-fs.gitlab.io/index.html#overview
https://github.com/cloud-hypervisor/cloud-hypervisor
https://github.com/cloud-hypervisor/cloud-hypervisor/tree/db2159b5638eb4ccf5...
cloud-hypervisor / rust-vmm
---------------------------
So I started looking at cloud-hypervisor to try to hook this all up, but it looks like cloud-hypervisor still has some issues where it doesn't quite follow the vhost-user specification. As I was trying to debug this, I noticed some undefined behaviour (UB) in rust-vmm (the shared utility code project for crosvm and its derivatives like cloud-hypervisor and Firecracker).
Being a good citizen, I started working on a fix for this, and then I encountered some more issues with functions that should have been marked as unsafe but weren't. So that's going to need to be fixed too.
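To show what "should have been marked as unsafe" means in general, here is a small illustrative sketch. It is not the rust-vmm code in question, and the names `read_u32` and `first_u32` are made up for the example. A function that dereferences a raw pointer it was handed cannot check that the pointer is valid, so it has to be an `unsafe fn` that documents its requirements, and safe callers wrap it only where they can guarantee those requirements hold.

```rust
/// Read a `u32` through a raw pointer.
///
/// If this were declared as a plain `fn`, safe code could call it with a
/// dangling or null pointer and trigger undefined behaviour, so it must be
/// marked `unsafe`.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and point to an initialised
/// `u32` that is valid for reads.
unsafe fn read_u32(ptr: *const u32) -> u32 {
    // SAFETY: the caller promises the requirements listed above.
    unsafe { ptr.read() }
}

/// Safe wrapper: the pointer comes from a reference, so the requirements
/// of `read_u32` are guaranteed by the type system.
fn first_u32(words: &[u32]) -> Option<u32> {
    let first = words.first()?;
    // SAFETY: `first` is a valid reference, hence non-null, aligned and
    // pointing to an initialised `u32`.
    Some(unsafe { read_u32(first as *const u32) })
}
```

Marking the function `unsafe` does not make it any safer by itself, but it forces every call site to spell out why the call is sound.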
But the rust-vmm code is otherwise very high quality and easy to work with, and the maintainers are very responsive, so it shouldn't take long to get these issues fixed. The affected code is fortunately all to do with communication between vhost-user backends and the VMM, so it's very unlikely that it's anything that could be exploited by a guest.
These sorts of problems are exactly the sort of thing that Rust is supposed to prevent, so it was disappointing to discover these issues. But on the other hand, Rust made them stand out like sore thumbs to me, so even though these issues managed to sneak in, there's definitely still a huge benefit to using Rust for these sorts of programs.
In the next week, I'm hoping to get all the rust-vmm issues I've discovered fixed, and maybe get cloud-hypervisor's backends to speak proper vhost-user. I'm getting my second vaccination next week as well though, so we'll see. I might just end up being sick.