Hi,
I've been running Qubes for a few years now and I'd like to give
Spectrum a try, as I've been having some hardware and performance
problems with Qubes. Is there some up-to-date guide I can follow? I
found https://alyssa.is/using-virtio-wl/#demo and was able to see the
weston terminal. I also tried updating to the latest commit and was
able to get a nested wayfire window with:
nix-build . -A spectrumPackages && ./result-3/bin/spectrum-vm
(I'm fairly new to Nix, so not sure if this is the right way to do things)
I managed to change the keyboard layout, mount a tmpfs for home, and
increase the memory enough to start firefox, but I haven't managed to
get much further. Things I tried so far:
- I tried replacing wayfire with weston-terminal, to avoid the nested
session. But sommelier segfaults when I do that.
- I tried adding `--shared-dir /tmp/ff:ff:type=9p` to share a host
directory. Then `mount -t 9p -o trans=virtio,version=9p2000.L ff /tmp`
in the VM seemed to work, but `ls /tmp` crashed the VM.
- I tried using `-d /dev/mapper/disk` to share an LVM partition, but
`mount -t ext4 /dev/vdb /tmp` refused to mount it.
- I tried enabling networking with `--host_ip 10.0.0.1`, etc, but it
said it couldn't create a tap device. I guess it needs more
privileges.
Ideally, I'd like to run a VM with each of my old Qubes filesystems,
to get back to where I was with my Qubes setup, before investigating
new spectrum stuff (e.g. one app per VM). Do you have any advice on
this? I see these lists are a bit quiet - I hope someone is still
working on this because it sounds great :-)
Thanks!
--
talex5 (GitHub/Twitter) http://roscidus.com/blog/
GPG: 5DD5 8D70 899C 454A 966D 6A51 7513 3C8F 94F6 E0CC
The Spectrum IRC channel is now #spectrum on irc.libera.chat.
It can also be joined through the Matrix bridge as
#spectrum:libera.chat. I understand there are some teething problems
with the Matrix bridge, but hopefully they should be resolved soon.
Apologies for the inconvenience.
Tomorrow (2021-05-19) at 7:30 UTC, I'm going to be part of a panel
discussion at the NGI Forum. NGI is the EU initiative that funds my
work on Spectrum.
The topic of the panel is "Internet of Trust", and it'll be streamed
here:
https://2021.ngiforum.eu/
There's a bit more information about the panel in the agenda here:
https://prod5.assets-cdn.io/event/6431/assets/8379007230-30f8d8b6fb.pdf
I've been thinking a lot about this for a while, thanks to conversations
with Thomas and Puck, CCed here. I think it's time to put it into words
properly and start working towards making it happen.
One of the benefits that Wayland is supposed to have over X11 is
security. A Wayland application isn't supposed to be able to record the
screen without user permission, for example. But in most compositors,
it can, with no restrictions. Existing Wayland compositors are
monolithic, and each one would have to implement its own access
controls. (Mutter already does this to some extent, at least for screen
sharing, I believe.) The popular Wayland compositors are largely
focused on being feature-complete reimplementations of their X11
equivalents, and so taking advantage of the security features and access
controls the Wayland protocol makes possible hasn't been a priority for
them. Additionally, every popular Wayland compositor is written in a
memory-unsafe language, and this combined with the complexity of the
Wayland protocol, with all the extensions involved, presents a serious
concern to applications of Wayland that involve untrusted clients.
To solve these problems, I propose a proxy program that sits between
Wayland clients and the compositor, in the same privilege domain as the
compositor. The proxy would decode and re-encode every Wayland request
(client->compositor message), and would discard any request it didn't
understand. This would mitigate the problem of a large, privileged
program written in a memory-unsafe language being exposed to untrusted
inputs. Additionally, the proxy would support a plugin interface,
through which the user of the proxy (or their distributor) could
configure custom behaviour. This could be used to prompt the user for
confirmation before allowing a screen capture request, or even to
implement a similar thing for e.g. clipboard access, for which there is
no support in the Wayland protocol. It could even be used to modify
surfaces, to implement things like Qubes-style unspoofable coloured
window borders.
This approach would allow permissions systems and other custom Wayland
behaviour to be implemented in a compositor-independent manner.
Distributions which support several compositors could implement
customisations in a single place, and users of compositors which lack
security features and the assurances memory-safety can provide against
untrusted input would gain access to those things.
I'd like to hear feedback here, but I think early in the life of this
idea we should also reach out to the broader Wayland community. I think
there's a lot of potential for this idea beyond Spectrum, and it would
be great if it could be something developed with input from a big
breadth of Wayland users. If we can do that, it might be sensible for
it to live at freedesktop.org? I'm not sure how that works.
Let me know what you all think. :)
On May 22, 2021, at 8:05 AM, Alyssa Ross <hi(a)alyssa.is> wrote:
>
> One of the benefits that Wayland is supposed to have over X11 is
> security. A Wayland application isn't supposed to be able to record the
> screen without user permission, for example. But in most compositors,
> it can, with no restrictions.
<snip>
>
> To solve these problems, I propose a proxy program that sits between
> Wayland clients and the compositor, in the same privilege domain as the
> compositor.
<snip>
> If we can do that, it might be sensible for
> it to live at freedesktop.org? I'm not sure how that works.
I am curious, if you have time, to hear more on why the approach of a proxy vs picking a compositor and implementing security there.
If the problem is that the Wayland community so far has not considered security a priority, it seems that a security proxy may suffer from those same forces. Basically, will it be easier to attract developers or gain widespread adoption of a proxy as opposed to getting buy-in to do security directly in a compositor? You mention writing in a memory safe language and having a compositor neutral solution as technical advantages.
Do you think a proxy is a good choice primarily because it can achieve a better technical result, or is the choice of a new component more a matter of difficulty getting community buy-in from a popular compositor and doing security there? How would you weigh the upsides of a new project against the difficulties of getting a new thing off the ground and adopted?
(This is really just curiosity on my part and my $0.02 from the outside. You may have already had a lot of discussions about that, or even already tried talking to compositor folk and not gotten traction. Seems worth some explicit consideration.)
Introduction
------------
Virtio-vhost-user[1] is a promising virtualisation technology that allows
virtual devices that are exposed to VMs to themselves be implemented
in VMs.
Let's break down its name a bit to understand how it works:
* Virtio[2] is a standard driver interface for virtualisation.
Interfaces are available for all sorts of types of virtual devices,
e.g. virtio-net, virtio-blk, and virtio-scsi. Typically, virtio
devices are implemented by a virtual machine monitor (VMM).
* Vhost[3] is a kernelspace implementation of virtio virtual devices,
created for their performance benefit. Instead of implementing the
virtual devices itself, the VMM talks to the kernel implementation
of them using a special ioctl protocol.
* Vhost-user[4] allows another process to implement the vhost
protocol, instead of the kernel, by using a UNIX socket instead of
ioctls on a special character device. This doesn't provide the raw
performance of vhost, but it serves a different purpose -- it
allows virtual devices to be implemented by external programs, in a
standardised way so they're portable between VMMs.
Virtio-vhost-user allows the program implementing the virtual device
to run in a VM of its own, by having the VMM for that VM create the
vhost-user socket, and transferring messages over it to its guest
using virtio. This is exciting for Spectrum, because it would mean
that the host system doesn't have to interact with physical hardware
directly beyond the PCI level, and can instead pass it through to a
VM, which is responsible for implementing the virtual device backed by
that physical hardware, which can be exposed to other VMs.
Last year I spent a while looking into virtio-vhost-user[5][6][7].
It's a long way from being ready to use, and it seems to be maturing
very slowly. It might be useful to us eventually for driver
isolation, or something else might come along. My conclusion from my
research was that we should decide later, once the ecosystem has had a
chance to develop. But I wanted something to come out of the research
I did anyway, and so I've prepared a demonstration.
[1]: https://wiki.qemu.org/Features/VirtioVhostUser
[2]: https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html
[3]: https://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html
[4]: https://qemu.readthedocs.io/en/latest/interop/vhost-user.html
[5]: https://spectrum-os.org/lists/archives/spectrum-devel/87pn8rezqn.fsf@alyssa…
[6]: https://spectrum-os.org/lists/archives/spectrum-devel/87blk1pwph.fsf@alyssa…
[7]: https://spectrum-os.org/lists/archives/spectrum-devel/87wo2glkg0.fsf@alyssa…
What the demo does
------------------
Using any sort of non-host-based virtual device implementation is
going to have to start with taking the virtual device implementation
out of the VMM running an application VM, and vhost-user is the clear
solution to this. Vhost-user isn't supported by crosvm -- its focus
is on doing all the virtualisation required by Chromium OS, so there's
no need for it to allow other programs to provide virtual device
implementations. So another part of the research I did was to try to
port the vhost-user implementation from cloud-hypervisor to crosvm,
which I was able to do successfully[8]. This has implications beyond
vhost-user, and even beyond crosvm, because it demonstrates that it's
practical to port features between rust-vmm[9] VMMs, which means we
don't have to worry about finding one that provides every feature we
need (which is just as well, because there isn't one).
So here I demonstrate not just a "standard" virtio-vhost-user setup
(to the extent that such a thing can exist at this early stage), but
also that my patched crosvm with vhost-user support is capable of
interoperating with the experimental virtio-vhost-user implementation
for QEMU, because both are speaking the standardised vhost-user
protocol.
The demo sets up two VMs. One is run with my patched crosvm, and
expects a virtual ethernet device to be provided by a vhost-user
socket. When it boots, it brings up its network interface, tries to
run a DHCP client, and then exits. The other VM is run with Nikos
Dragazis and Stefan Hajnoczi's experimental virtio-vhost-user
implementation in QEMU[10]. It gets a standard virtual ethernet
device (backed by a TAP device on the host), and the virtio-vhost-user
device hooked up to the socket that crosvm will be connecting to.
Inside the VM, a userspace networking stack (DPDK, again modified to
support virtio-vhost-user by Nikos Dragazis[11]) implements the device
side of virtio-vhost-user, and forwards packets sent by crosvm's guest
to the virtual ethernet device backed by the host TAP.
+-----------------------------------------------------------------------------+
|                                                                             |
|              +------------------------+      +------------------------+     |
|              |                        |      |                        |     |
|              |   +----------------+   |      |   +----------------+   |     |
|              |   |                |   |      |   |                |   |     |
| +-----+      |   |   +--------+   |   |      |   |                |   |     |
| | TAP +------+---+---+  DPDK  +---+---+------+---+                |   |     |
| +-----+      |   |   +--------+   |   |      |   |                |   |     |
|              |   |                |   |      |   |                |   |     |
|              |   |     Linux      |   |      |   |     Linux      |   |     |
|              |   +----------------+   |      |   +----------------+   |     |
|              |                        |      |                        |     |
|              |          QEMU          |      |         crosvm         |     |
|              +------------------------+      +------------------------+     |
|                                                                             |
|                                    Linux                                    |
+-----------------------------------------------------------------------------+
A complicating factor is that the virtio-vhost-user implementation for
DPDK only supports outgoing traffic[12]. So packets coming from
crosvm will be relayed to the TAP, but not the other way around. This
means that we can't just use ping inside the crosvm VM to verify that
the connection is working. Instead, we have to tcpdump on the host
and verify that the packets the DHCP client inside the crosvm VM is
sending are arriving on the TAP.
For this to be useful for our intended purpose of isolating drivers
for physical devices, we'd pass through the device here rather than
using a TAP. It would otherwise work exactly the same, but it's more
difficult to test it's working correctly. (I have tested it though --
for the first version of this I got working last year, I verified it
worked by checking the logs of my local network's DHCP server.)
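As a rough sketch of one step that passing through a physical device
would involve: binding a device to vfio-pci the way the demo's init
script does for the virtio devices requires its vendor and device IDs.
The helper below reads them from sysfs. The function name pci_ids and
its optional sysfs-root argument are made up for illustration and are
not part of the demo.

```shell
# Print "<vendor> <device>" for a PCI device, in the hex format that
# /sys/bus/pci/drivers/vfio-pci/new_id expects (no "0x" prefix).
#   $1: PCI address, e.g. 0000:00:04.0
#   $2: sysfs root (hypothetical parameter; defaults to /sys)
pci_ids() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    vendor=$(cat "$dev/vendor") || return 1
    device=$(cat "$dev/device") || return 1
    # sysfs reports the IDs as e.g. 0x1af4; strip the prefix.
    echo "${vendor#0x} ${device#0x}"
}
```

The output has the same shape as the "1af4 1000" pair the init script
writes to new_id, so it could be used as
`pci_ids ADDRESS > /sys/bus/pci/drivers/vfio-pci/new_id` once the
device has been unbound from its default driver.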
[8]: https://spectrum-os.org/lists/archives/spectrum-devel/20210512170812.192540…
[9]: https://github.com/rust-vmm
[10]: https://github.com/ndragazis/qemu/tree/virtio-vhost-user
[11]: https://github.com/ndragazis/dpdk-next-virtio/tree/virtio-vhost-user
[12]: https://github.com/ndragazis/dpdk-next-virtio/blob/2d60e63/drivers/virtio_v…
Running the demo
----------------
First, create a TAP device for QEMU to use:
    # ip tuntap add qemutap mode tap
    # ip link set qemutap up
Start tcpdump, so we can see if packets arrive on the TAP:
    # tcpdump -i qemutap
Start the QEMU VM:
    $ $(nix-build -A qemuVm /path/to/demo.nix)
When you see "Press enter to exit", DPDK is ready to receive a
virtio-vhost-user connection.
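If you'd rather script this step than watch the console, a crude
approximation is to wait for QEMU's vhost-user socket to exist. This
is only a sketch: the socket appearing shows that QEMU is listening,
not that DPDK has finished starting, so it is a weaker check than the
"Press enter to exit" message. The function name is hypothetical; the
socket path in the usage line is the one the qemuVm script passes to
-chardev.

```shell
# Wait for a UNIX socket to appear at a path.
#   $1: path to the socket
#   $2: maximum number of one-second retries (default 30)
# Returns 0 if the socket exists, non-zero otherwise.
wait_for_socket() {
    path=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # -S is true only for sockets, not regular files.
        [ -S "$path" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    [ -S "$path" ]
}
```

For example: `wait_for_socket "$XDG_RUNTIME_DIR/vhost-user0.sock" 60`.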
Start the crosvm VM:
    $ $(nix-build -A crosvmVm /path/to/demo.nix)
Once that VM boots, you should see some "BOOTP/DHCP" lines in the
tcpdump output. This demonstrates that traffic from the crosvm guest
has been relayed over virtio-vhost-user to DPDK, and then to the TAP
on the host over virtio-net.
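To make those lines easier to spot, the capture can be narrowed to
DHCP traffic. The following is a minimal sketch rather than part of
the demo: the function names are made up, and it assumes tcpdump's
usual behaviour of printing "BOOTP/DHCP" for packets on UDP ports 67
and 68.

```shell
# Capture a single DHCP packet on a TAP device and print it.
# Must be run as root, like the plain tcpdump invocation.
#   $1: interface name (defaults to the demo's qemutap)
capture_dhcp() {
    # -l: line-buffered output, -n: no name resolution,
    # -c 1: exit after the first matching packet.
    tcpdump -l -n -c 1 -i "${1:-qemutap}" 'udp port 67 or udp port 68'
}

# Succeed if tcpdump output on stdin contains a BOOTP/DHCP packet.
saw_dhcp() {
    grep -q 'BOOTP/DHCP'
}
```

Run before starting the crosvm VM, `capture_dhcp | saw_dhcp && echo
relayed` would block until the first DHCP packet from the guest
reaches the TAP.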
You'll want to press enter to shut down the QEMU VM now, because DPDK
pegs a CPU core (for reasons[*] unrelated to virtio-vhost-user that
are out of scope here).
Then you can remove the TAP device:
    # ip link delete qemutap
Nix expression for the demo
---------------------------
# SPDX-License-Identifier: MIT OR Apache-2.0
# SPDX-FileCopyrightText: 2021 Alyssa Ross <hi(a)alyssa.is>
let
  pinned = builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/tarball/b14062b75c4e8ef4dd4110282f7105be87…";
    sha256 = "1hzs0w6pcwwbzl2gkqyk46yrzizzm03mph4kggws02a6vlwphsib";
  };
in

{ pkgs ? import pinned {} }: with pkgs;

rec {
  linux = pkgs.linux.override {
    structuredExtraConfig = with lib.kernel; {
      "9P_FS" = yes;
      NET_9P = yes;
      NET_9P_VIRTIO = yes;
      PACKET = yes;
      VFIO = yes;
      VFIO_NOIOMMU = yes;
      VFIO_PCI = yes;
      VIRTIO_NET = yes;
      VIRTIO_PCI = yes;
    };
  };

  dpdk = stdenv.mkDerivation {
    name = "dpdk-virtio-vhost-user";

    src = fetchFromGitHub {
      owner = "ndragazis";
      repo = "dpdk-next-virtio";
      rev = "0a46582dc1d02c0dc5069347ffff1a64239385f2";
      sha256 = "169cxdps9k764jj420q44262x3291h2jcqsbrh7038hqjczjkgif";
    };

    buildInputs = [ numactl ];

    configurePhase = ''
      runHook preConfigure
      make $makeFlags defconfig
      runHook postConfigure
    '';

    enableParallelBuilding = true;

    RTE_KERNELDIR = "${linux.dev}/lib/modules/${linux.modDirVersion}/build";

    NIX_CFLAGS_COMPILE = [
      "-Wno-error=implicit-fallthrough"
      "-Wno-error=incompatible-pointer-types"
    ];

    makeFlags = [
      "RTE_OUTPUT=$(out)/lib"
      "kerneldir=$(out)/lib/modules/${linux.modDirVersion}/build"
      "prefix=$(out)"
    ];

    inherit (pkgs.dpdk) meta;
  };

  # DPDK is huge! We just need one program from it.
  testpmd = runCommandNoCC "testpmd" {} ''
    mkdir -p $out/bin
    cp ${dpdk}/bin/testpmd $out/bin
  '';

  # qemu has changed build system since the virtio-vhost-user branch
  # was last updated, so it's simpler to just make a new derivation
  # and inherit the bits that are the same than to override the
  # existing one.
  qemu = stdenv.mkDerivation {
    name = "qemu-virtio-vhost-user";

    src = fetchFromGitHub {
      owner = "ndragazis";
      repo = "qemu";
      rev = "f9ab08c0c8cfc58036ed95b895f9780397448071";
      sha256 = "0p6v4i7gj70d6x7s28x3i3x9z8vlswcbbqdwfbhlx87bbnxjrn3b";
      fetchSubmodules = true;
    };

    enableParallelBuilding = true;

    nativeBuildInputs =
      lib.subtractLists [ ninja meson ] qemu_kvm.nativeBuildInputs;

    postPatch = ''
      sed -i '/$(INSTALL_DIR) "$(DESTDIR)$(qemu_localstatedir)/d' Makefile

      # The virtio-vhost-user implementation tries to allocate a huge
      # PCI bar, that's bigger than some CPUs can support! If you see
      # a kernel panic in vp_reset(), lower this further.
      substituteInPlace hw/virtio/virtio-vhost-user-pci.c \
        --replace '1ULL << 36' '1ULL << 34'
    '';

    inherit (qemu_kvm) buildInputs configureFlags meta;
  };

  qemuInitramfs = makeInitrd {
    contents = [
      {
        symlink = "/init";
        object = writeScript "init" ''
          #!${busybox}/bin/sh -eux
          export PATH=${busybox}/bin

          mkdir -p /nix/store /run /var
          mount -t sysfs none /sys
          mount -t proc none /proc
          mount -t tmpfs none /run
          mount -t devtmpfs none /dev
          mkdir /dev/hugepages
          mount -t hugetlbfs none /dev/hugepages
          ln -s /run /var

          # Unbind the virtio-net (host TAP) and virtio-vhost-user devices
          # from their default drivers, since we'll be passing them
          # through to DPDK.
          echo 0000:00:04.0 > /sys/bus/pci/devices/0000:00:04.0/driver/unbind
          echo 0000:00:05.0 > /sys/bus/pci/devices/0000:00:05.0/driver/unbind

          # Tell the vfio-pci driver it can support virtio-net and
          # virtio-vhost-user devices. Since our devices are not
          # bound to any driver at the moment, doing this will bind
          # them to vfio-pci automatically.
          echo 1af4 1000 > /sys/bus/pci/drivers/vfio-pci/new_id
          echo 1af4 1017 > /sys/bus/pci/drivers/vfio-pci/new_id

          echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

          ${testpmd}/bin/testpmd \
            -l 0-1 \
            -w 0000:00:05.0 \
            --vdev net_vhost0,iface=0000:00:05.0,virtio-transport=1 \
            -w 0000:00:04.0

          poweroff -f
        '';
      }
    ];
  };

  qemuVm = writeShellScript "qemu-vm" ''
    exec ${qemu}/bin/qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 1G \
      -M q35,kernel-irqchip=split \
      -initrd ${qemuInitramfs}/initrd \
      -netdev tap,id=net0,ifname=qemutap,script=no,downscript=no \
      -device virtio-net-pci,netdev=net0,addr=04 \
      -chardev socket,id=chardev0,path="$XDG_RUNTIME_DIR/vhost-user0.sock",server,nowait \
      -device virtio-vhost-user-pci,addr=05,chardev=chardev0 \
      -kernel ${linux}/${stdenv.hostPlatform.linux-kernel.target} \
      -append "console=ttyS0 vfio.enable_unsafe_noiommu_mode=1" \
      -nographic
  '';

  # Can't use overrideAttrs because of cargoSha256.
  crosvm = rustPlatform.buildRustPackage rec {
    name = "crosvm-virtio-vhost-user";

    src = fetchFromGitiles {
      url = "https://chromium.googlesource.com/chromiumos/platform/crosvm";
      rev = "8a7e4e902a4950b060ea23b40c0dfce7bfa1b2cb";
      sha256 = "1lm6psp0xakb66nhgmmh94valc4wzbb967chk80msk8bcvsfpdn4";
    };

    unpackPhase =
      let origSrc = pkgs.crosvm.passthru.src; in
      builtins.replaceStrings [ "${origSrc}" origSrc.name ] [ "$src" src.name ]
        pkgs.crosvm.unpackPhase;

    cargoPatches = [
      (fetchpatch {
        url = "https://spectrum-os.org/lists/archives/spectrum-devel/20210512170812.192540…";
        sha256 = "0yzqrpgq35s9wxvbf9s3dgs5cpyxgdc5hr14hsdjr0gd18a6camg";
      })
    ];

    patches = pkgs.crosvm.patches ++ [
      (fetchpatch {
        url = "https://spectrum-os.org/lists/archives/spectrum-devel/20210512170812.192540…";
        sha256 = "0g2rvqqa4lvq7bjq0s1ynsjx7lmrxql7lsdv8wyzb7d2z9j6mj13";
      })
      (fetchpatch {
        url = "https://spectrum-os.org/lists/archives/spectrum-devel/20210512170812.192540…";
        sha256 = "051sz87i8kzc5sbygk2bpiqp4g32y9fxswg2yax1nd3lg4rxh43r";
      })
      (fetchpatch {
        url = "https://spectrum-os.org/lists/archives/spectrum-devel/20210512170812.192540…";
        sha256 = "1jpas65masn2xg9jxha16vi0y7scarzhl221y9wxh4chi4aa4m3f";
      })
    ];

    cargoSha256 = "07yizbhs64jrb05fq5g7sx812xbz2989bsficacq5l19ziax5164";

    passthru = pkgs.crosvm.passthru // { inherit src; };

    inherit (pkgs.crosvm) sourceRoot postPatch nativeBuildInputs buildInputs
      preBuild postInstall CROSVM_CARGO_TEST_KERNEL_BINARY meta;
  };

  crosvmInitramfs = makeInitrd {
    contents = [
      {
        symlink = "/init";
        object = writeScript "init" ''
          #!${busybox}/bin/sh -eux
          export PATH=${busybox}/bin

          mount -t sysfs none /sys
          mount -t proc none /proc

          ip link set eth0 up
          udhcpc -n || :

          reboot -f
        '';
      }
    ];
  };
  crosvmVm = writeShellScript "crosvm-vm" ''
    # In our patched crosvm, supplying --mac without --host_ip or
    # --netmask will put it into vhost-user mode.
    exec ${crosvm}/bin/crosvm run \
      --mac 0A:B3:EC:FF:FF:FF \
      -i ${crosvmInitramfs}/initrd \
      ${linux}/${stdenv.hostPlatform.linux-kernel.target}
  '';
}
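
If testpmd can't find its devices, one thing worth checking is whether
the new_id writes in the QEMU guest's init script really bound both
devices to vfio-pci. The helper below is a sketch of such a check that
could be added to the init script just before testpmd is started. The
function name pci_driver and its optional sysfs-root argument are
hypothetical; it relies only on the standard sysfs layout, in which a
bound device has a "driver" symlink pointing at its driver's
directory.

```shell
# Print the name of the driver a PCI device is bound to, or "none".
#   $1: PCI address, e.g. 0000:00:05.0
#   $2: sysfs root (hypothetical parameter; defaults to /sys)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

With the addresses used in the demo, `pci_driver 0000:00:04.0` and
`pci_driver 0000:00:05.0` would both be expected to report vfio-pci
after the new_id writes.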