From: Alyssa Ross
To: discuss@spectrum-os.org, devel@spectrum-os.org
CC: edef, Philipp Steinpaß
Subject: This (and Last) Week in Spectrum, 2020-W34 & 2020-W35
Date: Mon, 24 Aug 2020 22:03:16 +0000
Message-ID: <87zh6jk8gr.fsf@alyssa.is>
List-Id: General high-level discussion about Spectrum

Last week I wasn't feeling well, so there was no This Week in Spectrum.

crosvm
------

Where we left off, I had been attempting to port vhost-user-net support
from cloud-hypervisor to crosvm. I'd been trying to port the first
incarnation of the code in cloud-hypervisor to the contemporary version
of crosvm from when it was added, thinking that that would be easier
because the two codebases were closer together back then. But I ran
into the problem that this earliest incarnation of the vhost-user-net
code from cloud-hypervisor didn't actually work (at least with the
backend I was attempting to test it with). I'd been attempting to
figure out exactly which changes were required to make it work, but
hadn't been successful with that yet, and I thought I'd probably need
to start the port over, from the latest cloud-hypervisor and crosvm
code.

The next day, I decided to give my previous strategy one more try,
though, and an hour or two later, I found the required cloud-hypervisor
change, applied it to crosvm, and it worked! So I now have a crosvm
tree capable of vhost-user-net[1]. This means that it's looking good
for my plans for inter-guest networking, and network hardware isolation.

With that in place, I decided to start thinking about other kinds of
hardware isolation and inter-VM communication, and that's what I did
for most of the last two weeks. Let's go through them:

Files will be shared between VMs using virtio-fs. This has the unique
feature of (soon) being able to bypass guest page caches, and have only
a single shared cache between VMs. This brings a performance
improvement, but as I understand it, should also reduce memory
consumption because each VM won't have to maintain its own copy of a
disk-backed page.
Of course, this feature (DAX) is also a big side channel, so it won't
be appropriate for all use cases. But I think for some things people
want to do with Spectrum, this will be very important.

The problem with this is that, because it uses the page cache of the
host kernel, the host has to know about the filesystem that's being
shared -- there's no running virtiofsd in a VM if we want DAX. But I'd
really like it if a (non-boot) block device could be used as a
filesystem without the host having to actually talk to the device. I
was stuck here, but edef pointed out to me that we could use the
kernel's 9P support to attach the block device to a VM, and then mount
the filesystem in the host over 9P, either over a network connection or
(ideally) vsock. It looks like the kernel should be able to handle 9P
over vsock, but I haven't tested it yet.

We can use existing virtiofsd and 9P software (there are promising Rust
implementations of each), and harden them against potential
vulnerabilities like directory traversals using kernel features like
RESOLVE_BENEATH and RESOLVE_NO_XDEV.

For the boot device, maybe there's no reason not to just mount it using
the host kernel, or maybe there's something to be gained by just
reading a small bootstrap payload into memory from the start of the
disk once, and then making all future communication go via a VM. I'm
not really sure yet. But the important thing is we'll have mechanisms
for all this in place. Maybe we'll decide that non-boot devices should
just go over inter-VM 9P, but in any case, we'll still need all these
pieces.

GPU isolation should be possible by forwarding the GPU to a VM, but
there are a few problems here. The first is that it would mean
rendered surfaces have to be copied via shared memory to the VM with
the GPU, before being sent to the GPU. Additionally, sharing the GPU
between VMs for rendering at all would require significantly more work.
The result of this is that graphics performance using an isolated GPU
will probably be poor, at least for now. The final problem is that
passthrough of integrated GPUs seems to be very difficult to get right.
I will probably need to acquire some hardware that I've seen a report
of this working on, so I can figure out what I've been doing wrong on
the two computers I've tried it on so far. I suspect that I will get
GPU isolation working, but I'm not sure how reliable or performant it
will be.

For generic USB devices, I expect to be able to take an approach
similar to Qubes[2], having a VM to handle interactions with the
hardware USB controller, and exposing individual USB devices over
USB/IP to other VMs. It would be nice if I could use vsock for this
too.

[1]: https://spectrum-os.org/git/crosvm/?h=vhost-user-net
[2]: https://www.qubes-os.org/doc/usb-devices/

spectrum-os.org
---------------

Philipp registered a Matrix room and bridged it to the #spectrum IRC
channel. I'm told that this should make it easier for Matrix users to
join the room, since some bug in Matrix's IRC bridge prevents people
from joining from Matrix the usual way.

Philipp also sent a patch[3] to improve the instructions for Matrix
users joining the channel on the website. Thanks Philipp!

[3]: https://spectrum-os.org/lists/archives/spectrum-devel/87wo247zu7.fsf@alyssa.is/T/#t

QEMU
----

I sent the previously requested patch[4] to resolve ambiguities in the
vhost-user spec. No response yet, though. I'll probably resend it
some time soon.

[4]: https://lore.kernel.org/qemu-devel/20200813094847.4288-1-hi@alyssa.is/

I'm finding it hard to keep going at the moment. The stuff I'm doing
now is probably the hardest part of implementing Spectrum, and it's
frustrating to realise that not everything I want to do is going to be
possible. So much of the KVM ecosystem assumes that things will be
host<->guest, and there's not always an easy solution.
But, whatever we end up with, it's going to be a lot better than what
I'm using today, and what lots of other people are using today. I
think I'm going to be able to deliver a good experience with a fairly
high degree of protection against malicious hardware. But it's not
going to be perfect.

I'm pushing quite hard to make it over the line with my hardware
isolation funding milestone. I'm so close, and I'm about to need the
money. But once I've hit that, I think I'm going to need a break.
This stuff is gruelling.