After I got an isolated Wayland compositor working last week, I wasn't
really sure what to do next -- this was a big piece of work that I'd
been very focused on for a while. The funding milestone I'm closest to
is to do with implementing hardware isolation, which the Wayland work
was a part of, so I decided to keep going with that, and explore other
types of isolation. More on that in a bit.
Posted my patch for virtio_wl display socket support in
libwayland-server. This is what allows it to run in a VM, and
receive connections from clients in other VMs. The patch description is
very extensive, so I recommend reading it for more detail if you're
interested.
It introduces a libvirtio_wl, which should also be useful for porting
other programs that we might want to communicate with across a VM
boundary, if they are written with normal Unix sockets in mind
(including transferring file descriptors). This is the evolution of
code I had previously put in wlroots, moved to Wayland for convenience.
If it ever acquires another user (or maybe even if it doesn't) it might
make sense to make it its own package, since virtio_wl is useful even if
Wayland isn't involved.
I pushed all my crosvm changes to get the isolated compositor working to
the work-in-progress "interguest" branch. Remember, I only got it
working last week right before I needed to start writing the TWiS email,
so I hadn't even done that yet! I also posted some patches to the list
to fix a bug in my previous crosvm deadlock fix, and to improve some
related documentation. As usual, these were kindly reviewed by Cole.
Next, I turned my attention to other forms of hardware isolation.
Wayland was a bit special, because despite crosvm including a virtual
"Wayland device", it's not really hardware, and so it required an
approach to isolation that will be quite different to other crosvm
virtual devices. My hope is that other virtual devices should all be
substantially similar to each other.
The basic idea for actual hardware isolation is that rather than having
drivers in the host kernel for USB, network devices, etc., the hardware
will be exposed to dedicated VMs as virtual PCI devices. This should
substantially reduce host kernel attack surface. crosvm virtual devices
will be run in these device VMs, and communicate over virtio with
application VMs as normal. This will require implementing in crosvm a
virtio proxy device that allows the crosvm instance running an application
VM to forward virtio communication to the virtual device running in
userspace in the device VM.
(The reason devices aren't attached to application VMs directly but run
in separate device VMs is that hardware is probably not going to be very
happy if multiple kernels are trying to talk to it at the same time.
Additionally, this indirection means that application VMs only have to
use the one virtio driver for that device category, rather than any of
the hundreds of drivers for different hardware in that category. If one
of those drivers had a vulnerability, this should help to contain it to
the device VM.)
So I started writing this virtio proxy. The basic idea is to copy
virtio buffers from application VM guest memory into memory that can be
shared with the userspace virtual device in the device VM. I can't find
any prior art on this (which is not unusual -- not many systems isolate
drivers in this way), so this has required a lot of looking back at the
virtio paper and spec to make sure I understand what to do here.
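
To make the copying step concrete, here is a minimal sketch of walking one
split-virtqueue descriptor chain and copying its buffers out of guest
memory. It is not the code in the interguest branch. The flag values come
from the virtio spec; Desc, Segment and copy_chain are hypothetical names,
guest memory is modelled as a plain byte slice with guest addresses used
as offsets into it, and the shared memory is modelled as a Vec<u8>.

```rust
/// Descriptor flag values from the virtio spec (split virtqueues).
pub const VIRTQ_DESC_F_NEXT: u16 = 1; // chain continues at `next`
pub const VIRTQ_DESC_F_WRITE: u16 = 2; // buffer is device-writable

/// One descriptor table entry (hypothetical in-memory form).
#[derive(Clone, Copy)]
pub struct Desc {
    pub addr: u64, // guest address; here simply an offset into `guest_mem`
    pub len: u32,
    pub flags: u16,
    pub next: u16,
}

/// Where one buffer of the chain ended up in the shared memory.
#[derive(Debug, PartialEq)]
pub struct Segment {
    pub offset: usize,
    pub len: usize,
    pub writable: bool,
}

/// Copies the chain starting at `head` into `shared`.
/// Device-readable buffers are copied; device-writable buffers only get
/// zeroed space reserved, to be copied back to the guest later.
pub fn copy_chain(
    table: &[Desc],
    head: u16,
    guest_mem: &[u8],
    shared: &mut Vec<u8>,
) -> Result<Vec<Segment>, &'static str> {
    let mut segments = Vec::new();
    let mut index = head;
    // A valid chain cannot be longer than the table, so this bounds loops.
    for _ in 0..table.len() {
        let desc = *table
            .get(index as usize)
            .ok_or("descriptor index out of range")?;
        // Assumes a 64-bit host, so the u64 address fits in usize.
        let start = desc.addr as usize;
        let end = start
            .checked_add(desc.len as usize)
            .ok_or("buffer address overflows")?;
        let src = guest_mem
            .get(start..end)
            .ok_or("buffer outside guest memory")?;
        let offset = shared.len();
        let writable = desc.flags & VIRTQ_DESC_F_WRITE != 0;
        if writable {
            shared.resize(offset + src.len(), 0);
        } else {
            shared.extend_from_slice(src);
        }
        segments.push(Segment { offset, len: src.len(), writable });
        if desc.flags & VIRTQ_DESC_F_NEXT == 0 {
            return Ok(segments);
        }
        index = desc.next;
    }
    Err("descriptor chain contains a loop")
}
```

A real proxy would also have to copy device-writable buffers back into
guest memory once the device has filled them, and update the used ring;
neither is shown here.
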
As I write this, the next problem to solve is integrating some sort of
memory allocator that can manage buffer allocations in the shared memory
that the virtual device looks at. This is a new area for me that I'd
appreciate advice on if anybody can give it -- think of it like this: I
have a memfd, mmapped into my process, and I would like to dynamically
allocate and release buffers of varying sizes in that region. I'm sure
there's a library I'll be able to plug in for this.
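
To illustrate the kind of thing I mean, here is a minimal first-fit sketch
that hands out offsets within a fixed-size region. It is only bookkeeping:
it never touches the memory, so the memfd and the mmap are not modelled.
RegionAllocator, alloc, release and the 16-byte alignment are all
assumptions made for the example, not an existing library's API.

```rust
/// Every allocation is rounded up to this many bytes (assumed value).
const ALIGN: usize = 16;

fn round_up(n: usize) -> Option<usize> {
    n.checked_add(ALIGN - 1).map(|v| v & !(ALIGN - 1))
}

/// First-fit allocator over the offsets 0..size of some region.
pub struct RegionAllocator {
    /// Free ranges as (offset, len), sorted by offset and never adjacent.
    holes: Vec<(usize, usize)>,
}

impl RegionAllocator {
    /// `size` is rounded down to a multiple of ALIGN.
    pub fn new(size: usize) -> Self {
        let size = size & !(ALIGN - 1);
        let holes = if size == 0 { Vec::new() } else { vec![(0, size)] };
        RegionAllocator { holes }
    }

    /// Returns the offset of a buffer of at least `len` bytes, or None if
    /// no single free range is big enough.
    pub fn alloc(&mut self, len: usize) -> Option<usize> {
        if len == 0 {
            return None;
        }
        let len = round_up(len)?;
        let idx = self.holes.iter().position(|&(_, hole)| hole >= len)?;
        let (off, hole) = self.holes[idx];
        if hole == len {
            self.holes.remove(idx);
        } else {
            // Carve the buffer off the front of the hole.
            self.holes[idx] = (off + len, hole - len);
        }
        Some(off)
    }

    /// Gives a buffer back. `off` and `len` must be exactly what was
    /// returned from and passed to `alloc`; this is not checked.
    pub fn release(&mut self, off: usize, len: usize) {
        let len = round_up(len).expect("len came from a successful alloc");
        let idx = self.holes.partition_point(|&(o, _)| o < off);
        self.holes.insert(idx, (off, len));
        // Merge with the following hole if the two touch.
        if idx + 1 < self.holes.len()
            && self.holes[idx].0 + self.holes[idx].1 == self.holes[idx + 1].0
        {
            self.holes[idx].1 += self.holes[idx + 1].1;
            self.holes.remove(idx + 1);
        }
        // Merge with the preceding hole if the two touch.
        if idx > 0
            && self.holes[idx - 1].0 + self.holes[idx - 1].1 == self.holes[idx].0
        {
            self.holes[idx - 1].1 += self.holes[idx].1;
            self.holes.remove(idx);
        }
    }
}
```

With a 4096-byte region, alloc(100) returns offset 0 and a following
alloc(200) returns offset 112, because 100 is rounded up to 112. First fit
is simple but can fragment, which is part of why an existing library
would be preferable to something hand-rolled like this.
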
As usual, big thank you to Cole for reviewing patches, and for finding
room for improvement even in languages/areas he isn't familiar with.
It feels nice to have done some thinking about the project at a slightly
higher level than I have been recently, and to know where I am on the
way to the next milestone. Having taken a lot of time away from the
milestone list this year to work on fundamentals, it's good to feel like
I'm getting back on track.