This Week in Spectrum, 2021-W10

14 Mar 2021

      Surprise!

I sent the last This Week in Spectrum in August.  Since then, it's
been pretty quiet.  What happened?

Towards the end of last year, I was aggressively pushing to achieve a
funding milestone, but was starting to struggle to keep myself going.
After that last TWiS, I decided to stop posting them for a while to
fully concentrate on the push.  I continued like that for two more
months, before getting to the point where I was so burned out I just
couldn't keep going, with the milestone still not done, the work I had
done so far on it unpublished (and not in a good state to be
published), and time running out.

Fortunately, I was able to talk to NLnet, and they agreed to extend my
funding deadlines and renegotiate the milestones.  With that sorted
out, in late November I began a hiatus from Spectrum to allow myself
to recover from the burnout.  I'm very grateful to my GitHub Sponsors,
thanks to whom I was able to take this time to recover.

I tried to come back at the end of January, but after a single day of
Spectrum work I was feeling much the same as I had been in November.

This week, though, something changed!  I've been working on Spectrum
again for the last week, and I've been feeling great about it!

Now, there were three months of work between the last TWiS in August
and the start of my hiatus.  That's a lot of work I haven't written
about.  But I think the only way this is really going to work is if I
leave that largely unwritten about for now, because TWiS tends to be
thousands of words long even when it's covering a single week of
work.  So I'm just going to talk about what I've done in the last
week, explaining context from previous work as necessary.

With all that in mind, here we go!

"Qubes-lite with KVM and Wayland"[1]
------------------------------------

First up is something somebody else did, which is always nice!  Thomas
has been doing his own great work replacing his Qubes system with a
system built on NixOS, using KVM and Wayland.

Thomas got in touch on the Spectrum discuss mailing list in January[2]
asking some questions about how some things were implemented, and we
went back and forth a bit then (although not as much as I'd have
liked, due to my burnout :/).  Since then, he's come up with some
awesome stuff, like an OCaml implementation of a virtio-wayland proxy
(equivalent to Sommelier in Chromium OS).

In the conversation prompted by Thomas's article, Thomas raised a
further interesting idea[3] -- what if we had a Sommelier-like Wayland
proxy that ran in the same security domain as the compositor?  That
way, we could filter Wayland traffic to allow only the protocol
extensions we want, and we could implement our own permissions where
the compositor doesn't.  This code would be
responsible for securely handling Wayland messages from an untrusted
guest, which would be beneficial because it would be a lot less code
to audit, and because it could be written in a memory-safe language.
This would avoid the need to write or find a secure Wayland
compositor, and would in fact allow us to use virtually any compositor
without having to worry nearly as much about its security credentials.
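
To make the filtering idea concrete, here is a minimal sketch of an
allowlist over Wayland global interface names.  It is only an
illustration, not Thomas's design or anything that exists in
Spectrum: the function names are hypothetical, the choice of allowed
interfaces is an arbitrary example, and a real proxy would apply the
check to wl_registry.global events in the Wayland wire protocol
rather than to lines of text.

```shell
#!/bin/sh
# Hypothetical allowlist of Wayland globals a guest may see.  The
# interface names are real Wayland interfaces, but which ones to
# allow is an assumption made for this example.
allowed_interface() {
    case $1 in
        wl_compositor|wl_shm|wl_seat|xdg_wm_base) return 0 ;;
        *) return 1 ;;
    esac
}

# Read advertised interface names, one per line, and pass on only the
# allowed ones.  Everything else is silently dropped, so the guest
# never learns that the compositor offers it.
filter_globals() {
    while IFS= read -r interface; do
        if allowed_interface "$interface"; then
            printf '%s\n' "$interface"
        fi
    done
}

# Pretend the compositor advertised these four globals.
filtered=$(printf '%s\n' \
    wl_compositor wl_shm zwlr_screencopy_manager_v1 xdg_wm_base |
    filter_globals)
printf '%s\n' "$filtered"
```

Because the decision is made on the trusted side, a guest cannot opt
back in to an extension (screen capture, in this example) just by
asking for it.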

I'm very excited by that idea, and I've been asking myself for the
past few days why I didn't think of it before!  I've thought several
times "it's quite nice that Sommelier gives us a known subset of
Wayland, but it's a shame we can't rely on it since it runs inside an
application guest".  But for some reason it just never occurred to me
to follow that idea one step further, to "what if we had a similar
program that ran in the compositor's security domain".

[1]: https://roscidus.com/blog/blog/2021/03/07/qubes-lite-with-kvm-and-wayland/
[2]: https://spectrum-os.org/lists/archives/spectrum-discuss/CAG4opy_hz_ESEpY0TqJ...
[3]: https://spectrum-os.org/lists/archives/spectrum-discuss/CAG4opy_JztAH3tD+ZUq...

ucspi-vsock
-----------

ucspi-vsock[4] is a program I started writing in September.  VSOCK[5]
is a special Linux socket type that allows for communication between
VMs, and I'm using it to allow services running in VMs (a USBIP
daemon, for example) to notify the host (and by extension other VMs)
when the service in the VM becomes ready to accept connections.
UCSPI[6] is a standard command interface for making it easy to do
socket communication using standard IO primitives, without having to
teach programs how to do socket connections for every kind of socket.
There are several UCSPI implementations for common socket types like
Internet and Unix domain sockets, but until now there was not one for
VSOCK.

So, for example, in a VM we can do:

    vsockclient 2 $port sh -c 'echo >&7'

Which will connect to the host over VSOCK (in VSOCK the host is always
address 2), write a newline, and close the connection.  UCSPI lets us
implement this with standard shell tools.  There's an equivalent
"vsockserver" program for the other end, that will run a command
whenever a connection is received.
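
The '>&7' above relies on the UCSPI client convention: the client
runs the given command with file descriptor 6 reading from the
connection and file descriptor 7 writing to it.  The sketch below
shows just that convention, without needing a VM.  "fake_client" is a
hypothetical stand-in, not part of ucspi-vsock, and it points fd 7 at
a temporary file instead of a VSOCK connection.

```shell
#!/bin/sh
# Stand-in for a UCSPI client such as vsockclient: run a command with
# fd 6 open for reading and fd 7 open for writing.  Here fd 7 is a
# plain file rather than a connection, purely for illustration.
fake_client() {
    out=$1
    shift
    "$@" 6</dev/null 7>"$out"
}

capture=$(mktemp)

# The same command as in the vsockclient example above.
fake_client "$capture" sh -c 'echo >&7'

# Count what the command wrote to "the connection".
bytes=$(wc -c <"$capture" | tr -d ' ')
echo "bytes written: $bytes"
rm -f "$capture"
```

The command itself knows nothing about sockets.  It only writes to an
inherited file descriptor, which is what makes plain shell tools
usable here.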

Anyway, this week I went back to ucspi-vsock to fix some
bugs[7][8][9][10].  As usual (although I'm happy to realise this is
still the case after all these months!), Cole was kind enough to
review my patches.

[4]: https://spectrum-os.org/git/ucspi-vsock/
[5]: https://man7.org/linux/man-pages/man7/vsock.7.html
[6]: https://cr.yp.to/proto/ucspi.txt
[7]: https://spectrum-os.org/lists/archives/spectrum-devel/20210309154048.14474-1...
[8]: https://spectrum-os.org/lists/archives/spectrum-devel/20210309171816.8589-1-...
[9]: https://spectrum-os.org/lists/archives/spectrum-devel/20210310204516.20041-1...
[10]: https://spectrum-os.org/lists/archives/spectrum-devel/20210310204555.20725-1...

Interguest networking
---------------------

The bulk of my work this week has been in Spectrum's Nixpkgs, which is
where (for now, at least) all the VM definitions and stuff live.
Mostly, I've been trying to clean up months of frantic work into
something I can actually publish as a series of patches, so it's out
there instead of just on my computer!

My focus at the moment is on doing this with the interguest networking
implementation I wrote last year.  The plan for this remains the same
as last year: have a VM that manages all network hardware access and
acts as a router for other VMs that need to talk to the outside world.
Eventually, this will hopefully use virtio-vhost-user[11], which will
allow us to avoid going through a networking stack on the host at all
(or even having one), but until virtio-vhost-user is further along,
we'll use bridge devices on the host to connect client VMs to the
router.  A very basic version of the latter is implemented and working
for me locally -- I can run an application VM, and have it connect to
another VM which runs the drivers for my ethernet port using VFIO (PCI
passthrough).
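
As a rough illustration of the bridge approach, here is a sketch of
how a host bridge might join a router VM and an application VM.  This
is not the code from my tree: the device names are hypothetical, and
attaching each VM through a TAP device is an assumption made for the
example.  By default the script only prints the iproute2 commands it
would run.

```shell
#!/bin/sh
# Print the commands by default.  Set DRY_RUN=0 and run as root to
# actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical names: one bridge, plus one TAP device per VM.
bridge=br-spectrum
router_tap=tap-router
app_tap=tap-app

setup_bridge() {
    # Create the bridge and bring it up.
    run ip link add name "$bridge" type bridge
    run ip link set "$bridge" up

    # Create a TAP device for each VM and enslave it to the bridge,
    # so frames from the application VM reach the router VM.
    for tap in "$router_tap" "$app_tap"; do
        run ip tuntap add dev "$tap" mode tap
        run ip link set "$tap" master "$bridge"
        run ip link set "$tap" up
    done
}

plan=$(setup_bridge)
printf '%s\n' "$plan"
```

In a setup like this the host only forwards Ethernet frames between
the two TAP devices.  Routing decisions are left to the router VM.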

This code was a big ball of mud, and I'd also subtly broken it when I
moved on from interguest networking last year to try to get a proof of
concept going for another form of interguest communication.  So I
spent this week trying to page all the context I'd lost in the last
few months back into my brain, making lots of partial git commits,
writing commit messages, and fixing it up so that it works again.

I think what I have is about ready to publish -- hopefully you'll see
some patches next week.  But what exists at the moment is still very
limited -- the biggest limitation is that the router VM doesn't
support hotplugging, so a VM that wants to connect to it actually has
to be started first, before the router VM.  Not good for something
that should be run as a system service and likely started on boot!
To overcome this limitation, I'll probably have to add support in
crosvm for adding network devices at runtime using the control
socket.  I think this should be easy enough, and I haven't looked at
what's happened in crosvm + rust-vmm since I've been away, so it's
even possible somebody else will have done this already.

[11]: https://wiki.qemu.org/Features/VirtioVhostUser

So, the todo list for next week is getting the initial interguest
networking PoC patchset posted.  That'll come with a nice little demo
other people will be able to try out if they want to.  And then, I'll
work on improving it further to the point where it's actually
practical.  I don't want to spend /too/ much time on this, since
ultimately I do want to be using a virtio-vhost-user stack, which is
quite different, but it's important to have an implementation of this
so that further development that doesn't care about the implementation
details has something to test against.

My biggest worry at the moment is burning out again.  Working in
moderation isn't easy for me (I tend to get very sucked into things),
but it's important for the project that I find a way to keep my work
sustainable, so we don't lose time like this again.

I'm still a little hesitant to say "this is it, I'm back for real".
But the signs are looking good.

Thank you for sticking with me through this.

Alyssa Ross