This is going to be a very short one, and it's coming a day late,
because I have been busy and didn't want to stop yesterday. 2020-W22's
TWiS was sent a day early, so the delay is balanced out. :)
I'm finally making progress with the Wayland stuff I've been staring at
for the past two weeks! Now that I'm unblocked, things feel like
they're moving very quickly.
I've modified a Wayland compositor and libwayland-server to use a
virtio_wl socket. I've also implemented the compositor side of the
accept(2) fallback. Just now, I finished writing a proxy program that
implements the host side of that. Next up is testing it.
After that, I think we might be there? But it's also possible I have my
head so deep in it that there's another bit I've forgotten.
Another week that ended up being a detour from the Wayland work I wanted
to do, unfortunately. I think this will be the end of it, though.
Fixed an error in my crosvm deadlock fix from last week. Due to
an oversight (obvious as soon as I tried to test the affected devices,
and from looking at the code in retrospect), this actually broke the
devices that were *not* affected by the deadlock, which I had neglected
to properly test. Fortunately, it was a very small and easy fix.
I also had an interesting (and ongoing) conversation with Cole about
the deadlock fix in the mailing list thread. It's one of those
situations where getting to talk about the code with another person and
having to explain things is going to result in much better code. So
we'll end up with a much tidier fix than what I came up with on my own.
Thanks to Cole for his feedback and this opportunity. It's great to
know that somebody else is passing an eye over my code.
My current area of attention is modifying Wayland to use a virtio_wl
socket as the display socket. Preparing to start implementing that this
week, it occurred to me that working on it using my current setup was
going to be extremely time-consuming. Up until now, I've been doing Nix
VM rebuilds when I want to test a change to, for example, wlroots. This
means rebuilding, in that case, wlroots and all of its dependents
(i.e. Wayfire). In the case of wlroots, it's not so bad because
rebuilding two packages doesn't take all that long, but it wasn't going
to work for Wayland.
So I decided that the best way forward would be to create a VM running a
traditional Linux distribution, in which I could hack on Wayland without
having to restart the VM or rebuild all of its dependents. To that end,
I spent much of this week setting up an Alpine VM that can be run under
both crosvm and QEMU. (Configuring networking with crosvm is hard for a
one-off like this, but QEMU has a built-in userspace networking stack
that makes it easy, so when I want to, e.g., install a package, I can
boot with QEMU to do that, and then go back to crosvm for testing
virtio_wl things.) This resulted in a Nixpkgs pull request to
package Alpine's apk package manager, which was used in building the VM.
As of this afternoon, I have this VM all set up and ready to go. I have
a script that runs it under crosvm with a virtio_wl socket available to
try to connect Wayland to, and another virtio_wl socket connected to the
host Wayland compositor for testing. It also sets up some limited
networking that allows me to SSH into the VM, but does not allow the VM
to access the internet. A separate script runs the VM under QEMU with
full networking. The VM has Wayland, wlroots, and Sway (the choice of
wlroots-based compositor is irrelevant to this work, and Sway is easier
to build) installed from source, and because I can SSH to it I can edit
files in the VM easily from my host Emacs using TRAMP.
All of this will allow me to iterate on the Wayland work without having
to wait upwards of an hour to test each change. This was actually the
first time I'd ever booted a VM image in crosvm that I hadn't generated
specifically for that purpose myself with Nix. I improved my
understanding of crosvm in a few ways in the process (although not
anything concrete enough I can think of to write down here).
Next week, I'll continue with the Wayland work. It shouldn't be all
that much work in the end, but getting set up to actually be able to do
it has been annoyingly complicated.
I just discovered this project and I'm really excited about it! I've long waited for an OS that combines Qubes-like compartmentalization with the reproducibility of Nix/Guix.
I'm trying to understand what this project aims to improve over Qubes, other than the integration of Nix (which I do think is really important!). I read the motivation page, but I'm not yet convinced by most
of the points mentioned that relate to Qubes. Going over each usability issue mentioned in the motivation doc:
- "Hardware compatibility is extremely limited": I don't believe this is really the case for the minimum Qubes 4 requirements: most modern computers people buy support these. Is there anything I'm missing?
- "People are reluctant to use Xen on their computer for power management etc. reasons." Can you elaborate on these issues?
- I know that Qubes considered using KVM and decided against it for security reasons. My understanding is that the downside of this decision is the limited hardware support, which is one of the things that Spectrum views as an opportunity for improvement. Can you elaborate on this decision?
- "VMs are heavy": How will Spectrum improve on this without sacrificing security?
- "GUI applications are buggy, command line tools are mostly undocumented": I assume that the reason for this is the lack of resources the Qubes project has. However, I don't see how this will be
better in the case of Spectrum, which is a new project with one developer.
More generally, I'm wondering whether this project's goals couldn't be better achieved by trying to work with the Qubes developers to integrate Nix. It may very well be that they would reject it for
some reason, but then the logical next step would be to fork Qubes.
Have you reached out to the Qubes developers?
Thanks in advance!
An interesting week. Didn't go quite in the direction I expected it to.
Cole kindly provided some documentation for spectrum-vm for the
developer manual. This was especially nice, because it marked the first
time I was able to remove an item from the Contribution Ideas page
because somebody had done the contribution! Thanks Cole!
My main goal for this week was to start modifying libwayland-server to
use a virtio_wl socket as the display socket, as described last week.
So, in pursuit of this goal, I started the week by trying to write a
test program that would start a VM running a compositor, create a
socket, attach one end to that VM, and then start another VM running a
Wayland client, and attach the other end of the socket to that VM.
Then, in theory, all I had to do was the libwayland-server
modifications, and then the compositor and client would be able to talk
to each other.
Unfortunately, I didn't get that far. The program froze after starting
the first VM. I assumed I must have done something wrong, so I rewrote
it, and had the same problem. I did it again, and same problem.
Some further debugging revealed that the command to add the socket to
the first VM was hanging. And also, crosvm was hanging. Running the
command outside this program worked fine, though. It returned
immediately, and crosvm kept running with the new socket added.
What I eventually realised was that the crosvm hang only happened if the
command was sent to it early enough. After some intense staring at the
crosvm code, I realised that there was a deadlock resulting from trying
to wait for a response from a virtual device before the device had been
initialized. Despite this issue manifesting with adding Wayland
sockets, which is a feature unique to crosvm, a lot of the code for that
is adapted from the code for resizing virtual block devices, which is an
upstream feature. And that had the same issue, so this is actually an
upstream issue.
The fix was not easy, and required some fairly substantial refactoring.
It's described in detail in the patch message. I think it's probably
a pretty interesting read.
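
As a rough illustration of the general shape of this kind of hang (not
crosvm's actual code, and not the fix from the patch), here is a minimal
Python sketch. The Device class, its queues, and send_command are all
hypothetical stand-ins. A request sent before the device has started
servicing its queue never gets a reply, so the sender would wait forever
if it had no timeout.

```python
import queue
import threading


class Device:
    """Hypothetical stand-in for a virtual device with a control channel."""

    def __init__(self):
        self.requests = queue.Queue()
        self.responses = queue.Queue()

    def activate(self):
        # Only after activation does a worker start answering requests.
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            request = self.requests.get()
            self.responses.put("ok: " + request)


def send_command(device, request, timeout):
    """Send a request and wait for the reply; None means we gave up."""
    device.requests.put(request)
    try:
        return device.responses.get(timeout=timeout)
    except queue.Empty:
        return None


# Sent before activation: nothing is listening yet. Without the timeout,
# this call would block forever.
early = Device()
early_reply = send_command(early, "add socket", timeout=0.2)

# Sent after activation: the worker answers straight away.
ready = Device()
ready.activate()
ready_reply = send_command(ready, "add socket", timeout=2.0)

print(early_reply, ready_reply)
```

The timeout is only there so the sketch terminates. It is not a
suggestion for how the real problem should be solved.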
I also merged in the latest upstream crosvm changes.
I spent a few more hours trying to get libvda (an optional dependency of
crosvm; required for virtio-video) to build, and I got it to work,
thanks to some help from Puck. There's still a hack in there I'll tidy
up before I push it, but it's cool to have this packaged for future
experimentation with this crosvm feature.
With the crosvm issue out of the way, which ended up taking most of the
week to identify and fix, hopefully next week I'll be able to actually
get going on those libwayland-server changes.
Apologies for the incoherent title: I'd like to better understand the design choices of this project and discuss how it relates to other projects in this space.
First I'd like to say that I think that using crosvm is a really great decision. Google has a lot of manpower working on ChromeOS and Android, and building on their work is something that should
pay off, especially for a project such as Spectrum that tackles such a huge undertaking (building a secure OS).
Here are a few questions to kick off the discussion:
- Have you considered using a microkernel-based host like seL4, similar to what Genode does (at least as I understand it)?
- Have you considered gVisor for lightweight compartmentalization?
- Have you considered reusing stuff from the Whonix project?
I wasn't quite back up to normal speed after being sick last week, I
think, but we're getting somewhere and I did some cool stuff!
Upgraded Mailman. Most significantly, the Hyperkitty UI has some nice
improvements, which I think will make it easier to navigate. :)
I integrated and committed last week's wlroots patch (to make it
use virtio_wl-compatible memfds).
Spent quite a lot of the start of the week working on packaging more
Chromium OS components to build some optional crosvm features. They
just added support for virtio-video, which sounds pretty cool. I
managed to get libchrome working again, which I previously had to remove
because I couldn't get it to build! I haven't committed any of this
yet, because I haven't got to the point where it's useful for building
optional crosvm features, and I don't want to add packages that aren't
useful for anything, but I did make quite a lot of progress. After a
couple of days of this, though, I put it aside to focus on other things,
since it would be useful to have, but it's not a priority right now.
I posted a patch to make listing a directory shared with --shared-dir
not crash crosvm when sandboxing was enabled. I'd been reluctant to fix
this, because --shared-dir (which exposes a host file system to a crosvm
guest) is not going to be useful in Spectrum, where shared directories
should come from another VM. But it's useful enough in development that
not fixing it was more frustrating than just fixing it.
I spent a lot of time this week thinking about Wayland connections. My
original plan for inter-VM communication was to make it possible to run
crosvm devices as standalone programs inside VMs. I still think this is
the right track for most virtio devices, but for Wayland it's starting
to feel less and less like it makes sense, because the Wayland device is
basically a wrapper around a socket, and both ends will be hooked up to
sockets. So what I now plan is that a VM running a Wayland client
application will connect to a host socket, with the other end connected
to a VM running the compositor. The client applications will be run
under Sommelier, and so will be able to run unmodified. The compositor
will be modified to use a virtio_wl socket.
A slight complication here is that a Wayland compositor usually accepts
on a socket to receive client connections, but you can't accept on a
virtio_wl context. So instead, what we can do is emulate accept by
attaching a new virtio_wl socket, and then sending the name of that
socket over the main socket. This can be wrapped inside the VM to look
very much like an accept. Unfortunately, crosvm virtio_wl sockets could
only be set at startup time. So today I fixed that. Adding support
for adding virtio_wl sockets at runtime was pleasantly straightforward.
I feel very energized now that I managed to bang that out in a day. :D
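
The emulated accept described above can be sketched roughly as follows.
This is only an illustration under stated assumptions. Ordinary Unix
sockets stand in for virtio_wl sockets, and the names faux_accept,
host_proxy_once, "wayland-0", and "client-0" are made up for the
example. The main socket is a datagram socket pair so that each
announced name arrives as one message.

```python
import os
import socket
import tempfile
import threading


def faux_accept(main_sock):
    """Compositor side: receive a socket name and connect to that socket."""
    name = main_sock.recv(4096).decode()
    conn = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    conn.connect(name)
    return conn


def host_proxy_once(listener, main_sock, directory):
    """Host side: turn one real accept() into a faux-accept."""
    client, _ = listener.accept()
    name = os.path.join(directory, "client-0")  # hypothetical naming scheme
    per_client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    per_client.bind(name)
    per_client.listen(1)
    main_sock.send(name.encode())  # announce the new socket by name
    compositor_end, _ = per_client.accept()
    # Relay a single message from the client to the compositor, then stop.
    compositor_end.sendall(client.recv(1024))
    for sock in (compositor_end, per_client, client):
        sock.close()


directory = tempfile.mkdtemp()
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(os.path.join(directory, "wayland-0"))
listener.listen(1)
host_main, compositor_main = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

proxy = threading.Thread(
    target=host_proxy_once, args=(listener, host_main, directory)
)
proxy.start()

# A client connects to the host socket as it would to any display socket.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(os.path.join(directory, "wayland-0"))
client.sendall(b"hello compositor")

# The compositor never calls accept(); it is told which socket to open.
conn = faux_accept(compositor_main)
received = conn.recv(1024)
proxy.join()
print(received)
```

In the real setup, the "create a fresh socket" step would be crosvm
attaching a new virtio_wl socket to the compositor VM at runtime, and
the relay would run in both directions for the lifetime of the client.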
This is one of those weeks where I can feel, as I write this, that it might
not make much sense to anybody else. But the important thing with these
weekly updates is that I write _something_ we can refer back to in
future, and communicate that things are happening. If you'd like to
better understand what I'm talking about here, I'm happy to try walking
through it on IRC.
Next week, I'm going to work on getting libwayland-server to speak this
weird virtio_wl faux-accept.