From: Alyssa Ross
To: devel@spectrum-os.org
Subject: [PATCH nixpkgs 16/16] spectrumPackages.spectrum-testhost: init
Date: Sun, 11 Apr 2021 11:57:40 +0000
Message-Id: <20210411115740.29615-17-hi@alyssa.is>
In-Reply-To: <20210411115740.29615-1-hi@alyssa.is>
References: <20210411115740.29615-1-hi@alyssa.is>
List-Id: Patches and low-level development discussion

This produces a shell script that sets up a host system for running
the VMs under sys-vms, and then starts an s6-rc service manager that
can run the VMs.  This mirrors how I imagine things working on the
eventual Spectrum host system.

With s6-rc, we can express dependencies between services, so when
vm-app is started, vm-net will automatically be started too if it
isn't already running.

One thing I haven't implemented yet is readiness notification.
Ideally, the cloud-hypervisor instance running the router VM would be
able to tell s6 once its control socket was listening, and s6 wouldn't
consider it to be up until that point.  But it can't do that at the
moment, so s6 considers it to be up immediately, and starts vm-app
right away.  This means that vm-app will usually fail once due to
vm-net's socket not existing, and then immediately be restarted and
work the second time.  I think that's fine for now.

The serial console for vm-app will be connected to the terminal.  To
interact with vm-net instead, the serial console for vm-app can be
disconnected by uncommenting the two commented-out lines in its
definition, and the serial console for vm-net can be enabled by
commenting out the redirfd that disables its stdin, and uncommenting
the --serial line.

We're using cloud-hypervisor instead of crosvm for vm-net because
cloud-hypervisor supports adding virtual ethernet devices at runtime
and crosvm doesn't, and it'll be important to be able to add
connections to new VMs as applications are started on a running
system.

The TODO about removing the device from cloud-hypervisor is going to
stay a TODO for now, because the solution is quite complicated: if we
tell cloud-hypervisor to remove the device from the VM using its API,
it'll still be trying to read from it for a short time after
returning, so we'll still get read errors in cloud-hypervisor after we
delete the TAP.  I think the only good way to handle this will be to
use a non-persistent TAP device, so that it automatically gets cleaned
up by the kernel when cloud-hypervisor is done with it.  But to do
that, cloud-hypervisor will have to be able to add TAPs from file
descriptors at runtime, which will probably be quite difficult to fit
into its HTTP API -- at least it's over a Unix socket.
---
 pkgs/os-specific/linux/spectrum/default.nix   |   2 +
 .../linux/spectrum/testhost/default.nix       | 205 ++++++++++++++++++
 2 files changed, 207 insertions(+)
 create mode 100644 pkgs/os-specific/linux/spectrum/testhost/default.nix

diff --git a/pkgs/os-specific/linux/spectrum/default.nix b/pkgs/os-specific/linux/spectrum/default.nix
index 7e07ee60f43..c4cccab3787 100644
--- a/pkgs/os-specific/linux/spectrum/default.nix
+++ b/pkgs/os-specific/linux/spectrum/default.nix
@@ -8,6 +8,8 @@ let

   spectrum-vm = callPackage ./spectrum-vm { linux = linux_vm; };

+  spectrum-testhost = callPackage ./testhost { };
+
   linux_vm = callPackage ./linux/vm.nix { linux = linux_cros; };

   makeRootfs = callPackage ./rootfs { };
diff --git a/pkgs/os-specific/linux/spectrum/testhost/default.nix b/pkgs/os-specific/linux/spectrum/testhost/default.nix
new file mode 100644
index 00000000000..7e1a973e8c6
--- /dev/null
+++ b/pkgs/os-specific/linux/spectrum/testhost/default.nix
@@ -0,0 +1,205 @@
+{ lib, runCommandNoCC, writeScript, writeScriptBin, writeShellScript, writeText
+, coreutils, cloud-hypervisor, crosvm, curl, execline, gnutar, gnused, iproute
+, iptables, jq, kmod, mktuntap, rsync, s6, s6-rc, sys-vms, utillinux
+}:
+
+let
+  inherit (lib) concatStrings escapeShellArg makeBinPath mapAttrsToList
+    optionalString;
+
+  compose2 = f: g: a: b: f (g a b);
+
+  concatMapAttrs = compose2 concatStrings mapAttrsToList;
+
+  makeServicesDir = { services }:
+    runCommandNoCC "services" {} ''
+      mkdir $out
+      ${concatMapAttrs (name: attrs: ''
+        mkdir $out/${name}
+        ${concatMapAttrs (key: value: ''
+          cp -r ${value} $out/${name}/${key}
+        '') attrs}
+      '') services}
+    '';
+
+  s6RcCompile = { fdhuser ? null }: source:
+    runCommandNoCC "s6-rc-compile" {} ''
+      ${s6-rc}/bin/s6-rc-compile \
+        ${optionalString (fdhuser != null) "-h ${escapeShellArg fdhuser}"} \
+        dest ${source}
+      tar -C dest -cf $out .
+    '';
+
+  compiledRcServicesDir = s6RcCompile {} (makeServicesDir {
+    services = {
+      vm-app = {
+        run = writeScript "app-run" ''
+          #! ${execline}/bin/execlineb -S0
+          # fdclose 0
+
+          # Checking the return value of the bridge creation is
+          # important, because if it fails due to the bridge already
+          # existing that means something else could already be using
+          # this bridge.
+          if { ip link add name br0 type bridge }
+          if { ip link set br0 up }
+
+          # Calculate the MACs for our TAP and the router's TAP.
+          backtick -in router_nic_dec {
+            expr ${toString sys-vms.app.vmID} * 2 + 64 * 256 * 256
+          }
+          backtick -in client_nic_dec {
+            expr ${toString sys-vms.app.vmID} * 2 + 64 * 256 * 256 + 1
+          }
+          multisubstitute {
+            importas -iu router_nic_dec router_nic_dec
+            importas -iu client_nic_dec client_nic_dec
+          }
+          backtick -i router_mac {
+            pipeline { printf %x $router_nic_dec }
+            sed s/^\\(..\\)\\(..\\)\\(..\\)$/0A:B3:EC:\\1:\\2:\\3/
+          }
+          backtick -i client_mac {
+            pipeline { printf %x $client_nic_dec }
+            sed s/^\\(..\\)\\(..\\)\\(..\\)$/0A:B3:EC:\\1:\\2:\\3/
+          }
+          multisubstitute {
+            importas -iu router_mac router_mac
+            importas -iu client_mac client_mac
+          }
+
+          # Create the net VM end, and attach it to the net VM.
+          #
+          # Use a hardcoded name for now because if we use a dynamic
+          # one iproute2 has no way of telling us the name that was
+          # chosen:
+          # https://lore.kernel.org/netdev/20210406134240.wwumpnrzfjbttnmd@eve.qyliss.net/
+          define other_tap_name vmtapnet
+          # Try to delete the device in case the VM was powered off
+          # (as the finish script wouldn't have been run in that
+          # case.)  Since we check the return value of ip tuntap add,
+          # in the case of a race condition between deleting the
+          # device and creating it again, we'll just fail and try
+          # again.
+          foreground { ip link delete $other_tap_name }
+          if { ip tuntap add name $other_tap_name mode tap }
+          if { ip link set $other_tap_name master br0 }
+          if { ip link set $other_tap_name up }
+          if {
+            pipeline {
+              jq -n "$ARGS.named"
+                --arg tap $other_tap_name
+                --arg mac $router_mac
+            }
+            curl -iX PUT
+              -H "Accept: application/json"
+              -H "Content-Type: application/json"
+              --data-binary @-
+              --unix-socket ../vm-net/env/cloud-hypervisor.sock
+              http://localhost/api/v1/vm.add-net
+          }
+
+          mktuntap -pvBi vmtap%d 6
+          importas -iu tap_name TUNTAP_NAME
+          if { ip link set $tap_name master br0 }
+          if { ip link set $tap_name up }
+          if { iptables -t nat -A POSTROUTING -o $tap_name -j MASQUERADE }
+
+          ${crosvm}/bin/crosvm run -p init=/sbin/init -p notifyport=''${port}
+            # --serial type=file,path=/tmp/app.log
+            --cid 4
+            --tap-fd 6,mac=''${client_mac}
+            --root ${sys-vms.app.rootfs.squashfs} ${sys-vms.app.linux}/bzImage
+        '';
+        finish = writeScript "app-finish" ''
+          #! ${execline}/bin/execlineb -S0
+          # TODO: remove from vm-net
+          foreground { ip link delete vmtapnet }
+          ip link delete br0
+        '';
+        type = writeText "app-type" ''
+          longrun
+        '';
+        dependencies = writeText "app-dependencies" ''
+          vm-net
+        '';
+      };
+
+      vm-net = {
+        run = writeScript "net-run" ''
+          #! ${execline}/bin/execlineb -S0
+          # This is only necessary for when running s6 from a tty.
+          # (i.e. when debugging or running the demo).
+          redirfd -w 0 /dev/null
+
+          define PCI_LOCATION 0000:00:19.0
+          define PCI_PATH /sys/bus/pci/devices/''${PCI_LOCATION}
+
+          # Unbind the network device from the driver it's already
+          # attached to, if any.
+          foreground {
+            redirfd -w 1 ''${PCI_PATH}/driver/unbind
+            printf "%s" $PCI_LOCATION
+          }
+
+          # (Re)bind the device to the VFIO PCI driver.
+          if { modprobe vfio-pci }
+          backtick -in device_id {
+            if { dd bs=2 skip=1 count=2 status=none if=''${PCI_PATH}/vendor }
+            if { printf " " }
+            dd bs=2 skip=1 count=2 status=none if=''${PCI_PATH}/device
+          }
+          importas -iu device_id device_id
+          foreground {
+            redirfd -w 1 /sys/bus/pci/drivers/vfio-pci/new_id
+            printf "%s" $device_id
+          }
+
+          foreground { mkdir env }
+
+          ${cloud-hypervisor}/bin/cloud-hypervisor
+            --api-socket env/cloud-hypervisor.sock
+            --console off
+            # --serial tty
+            --cmdline "console=ttyS0 panic=30 root=/dev/vda"
+            --device path=''${PCI_PATH}
+            --disk path=${sys-vms.net.rootfs.squashfs},readonly=on
+            --kernel ${sys-vms.net.linux.dev}/vmlinux
+        '';
+        type = writeText "net-type" ''
+          longrun
+        '';
+      };
+    };
+  });
+
+  servicesDir = makeServicesDir {
+    services = {
+      ".s6-svscan" = {
+        finish = writeShellScript ".s6-svscan-finish" "";
+      };
+    };
+  };
+in
+
+writeScriptBin "spectrum-testhost" ''
+  #! ${execline}/bin/execlineb -S0
+  export PATH ${makeBinPath [
+    coreutils curl execline gnused gnutar iproute iptables jq kmod mktuntap rsync
+    s6 s6-rc
+  ]}
+
+  if { redirfd -w 1 /proc/sys/net/ipv4/ip_forward echo 1 }
+
+  importas -iu runtime_dir XDG_RUNTIME_DIR
+  backtick -in TOP { mktemp -dp $runtime_dir spectrum.XXXXXXXXXX }
+  importas -iu top TOP
+  if { echo $top }
+  if { rsync -r --chmod=Du+w ${servicesDir}/ ''${top}/service }
+  background {
+    if { mkdir -p ''${top}/s6-rc/compiled }
+    if { tar -C ''${top}/s6-rc/compiled -xf ${compiledRcServicesDir} }
+    s6-rc-init -c ''${top}/s6-rc/compiled -l ''${top}/s6-rc/live ''${top}/service
+  }
+  s6-svscan ''${top}/service
+''
-- 
2.30.0
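
The MAC calculation in the vm-app run script (the expr, printf %x and sed steps) can be checked outside execline. What follows is only a rough sketch in Python of the same arithmetic, not part of the patch: the function name mac_for, its parameters, and the VM ID values used in the usage note are hypothetical, since the real value of sys-vms.app.vmID is not shown here.

```python
# Illustrative sketch only; mac_for and its parameters are hypothetical
# names, not something the patch defines.

OUI_PREFIX = "0A:B3:EC"      # fixed prefix inserted by the sed expression
NIC_BASE = 64 * 256 * 256    # 0x400000, the constant used in the expr calls


def mac_for(vm_id: int, client: bool = False) -> str:
    """Return the router-side MAC (default) or the client-side MAC."""
    # Router NIC: vm_id * 2 + 64 * 256 * 256; the client NIC adds 1.
    nic = vm_id * 2 + NIC_BASE + (1 if client else 0)
    digits = "%x" % nic      # lowercase hex without padding, like printf %x
    if len(digits) != 6:
        # The sed pattern only matches exactly six characters.  The script
        # would pass anything else through unchanged; this sketch raises.
        raise ValueError("NIC number %d is not six hex digits" % nic)
    return ":".join([OUI_PREFIX, digits[0:2], digits[2:4], digits[4:6]])
```

With an assumed VM ID of 1, mac_for(1) gives 0A:B3:EC:40:00:02 for the router's TAP and mac_for(1, client=True) gives 0A:B3:EC:40:00:03 for the TAP handed to crosvm. The hex digits stay lowercase, as they would from printf %x, while the prefix is uppercase.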