patches and low-level development discussion
 help / color / mirror / code / Atom feed
From: "Cole Helbling" <cole.e.helbling@outlook.com>
To: "Alyssa Ross" <hi@alyssa.is>, <devel@spectrum-os.org>
Subject: Re: [PATCH crosvm 2/2] crosvm: fix deadlock on early VmRequest
Date: Mon, 15 Jun 2020 18:08:48 -0700	[thread overview]
Message-ID: <CH2PR14MB357930F1707BB3BA2C869CE7B39D0@CH2PR14MB3579.namprd14.prod.outlook.com> (raw)
In-Reply-To: <20200614114344.22642-3-hi@alyssa.is>

On Sun Jun 14, 2020 at 11:43 AM, Alyssa Ross wrote:
> If a DiskCommand was received on the crosvm socket before the
> virtio-block device was activated, the Token::VmRequest case in the
> main event loop would forward the request to the block device socket,
> and then wait syncronously for a response. That response would never
> come because the device hadn't been activated, and it would never be
> activated because the event loop would never continue, and therefore
> never be able to respond to the event that causes the device to be
> activated. crosvm would therefore just hang forever, waiting for a
> response that would never come.
>
> This patch fixes this deadlock by keeping track of whether devices
> that send a response in this way have been activated yet. If they
> have already been activated, messages are sent and responses are
> received as normal. If they have not been activated, messages are
> instead put into a per-device queue. Once the device is activated,
> queued messages are processed all at once, and then the device is
> marked as ready, and the queue is dropped. Future messages are
> processed immediately as they come in, with no further queueing.

Sounds like quite the "chicken or the egg"-type problem.

> A device indicates that it is ready by sending a message on its
> socket. The main crosvm event loop can then poll the socket, to be
> notified when the device is ready. This poll event will only trigger
> once -- once it has been received, it is removed from the poll
> context.
>
> Currently, the only device type that responds to external control
> messages AND needs to be activated by an event is the block device.
> The balloon device does not respond to messages, and the xhci
> controller device is activated up front. The code is nevertheless
> structured so that it should be very easy to drop another kind of
> device in to the queuing system, should that be required.

+1 for modularity.

> ---
> devices/src/virtio/block.rs | 5 +
> src/linux.rs | 201 ++++++++++++++++++++++++++++++------
> vm_control/src/lib.rs | 57 +++++++---
> 3 files changed, 213 insertions(+), 50 deletions(-)
>
> diff --git a/src/linux.rs b/src/linux.rs
> index ec2067cf..ccd7d88c 100644
> --- a/src/linux.rs
> +++ b/src/linux.rs
>
> ----->8-----
>
> @@ -1667,6 +1668,45 @@ fn file_to_i64<P: AsRef<Path>>(path: P) ->
> io::Result<i64> {
> .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "empty file"))
> }
>  
> +/// Returns a boolean indicating whether the VM should be exited.
> +fn do_vm_request(
> + request: VmRequest,
> + device_socket: Option<&UnixSeqpacket>,
> + control_socket: &VmControlResponseSocket,
> + run_mode_arc: &Arc<VcpuRunMode>,
> + vcpu_handles: &mut Vec<JoinHandle<()>>,
> + io_bus: &mut Bus,
> +) -> MsgResult<bool> {
> + let mut run_mode_opt = None;
> + let response = request.execute(&mut run_mode_opt, device_socket);
> + control_socket.send(&response)?;
> + if let Some(run_mode) = run_mode_opt {
> + info!("control socket changed run mode to {}", run_mode);
> + match run_mode {
> + VmRunMode::Exiting => Ok(true),
> + VmRunMode::Running => {
> + if let VmRunMode::Suspending = *run_mode_arc.mtx.lock() {
> + io_bus.notify_resume();
> + }
> + run_mode_arc.set_and_notify(VmRunMode::Running);
> + for handle in vcpu_handles {
> + let _ = handle.kill(SIGRTMIN() + 0);

I know this is essentially just moved (and probably isn't even something
you wrote), but do you know why this is `+ 0`? Does this somehow coerce
to the desired type or something? Maybe I'm overlooking something
obvious here.

> ----->8-----

Either way, the above is just a nit. Good work on tracking this
particular issue down!

Reviewed-by: Cole Helbling <cole.e.helbling@outlook.com>

  reply	other threads:[~2020-06-16  1:29 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-14 11:43 [PATCH crosvm 0/2] Fix " Alyssa Ross
2020-06-14 11:43 ` [PATCH crosvm 1/2] msg_socket: introduce UnixSeqpacketExt Alyssa Ross
2020-06-16  0:17   ` Cole Helbling
2020-06-16  9:32     ` Alyssa Ross
2020-06-22 22:06       ` Cole Helbling
2020-06-23  2:32         ` Alyssa Ross
2020-06-25  1:54       ` impaqt
2020-07-09 13:24         ` Alyssa Ross
2020-06-14 11:43 ` [PATCH crosvm 2/2] crosvm: fix deadlock on early VmRequest Alyssa Ross
2020-06-16  1:08   ` Cole Helbling [this message]
2020-06-16  9:39     ` Alyssa Ross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CH2PR14MB357930F1707BB3BA2C869CE7B39D0@CH2PR14MB3579.namprd14.prod.outlook.com \
    --to=cole.e.helbling@outlook.com \
    --cc=devel@spectrum-os.org \
    --cc=hi@alyssa.is \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://spectrum-os.org/git/crosvm
	https://spectrum-os.org/git/doc
	https://spectrum-os.org/git/mktuntap
	https://spectrum-os.org/git/nixpkgs
	https://spectrum-os.org/git/spectrum
	https://spectrum-os.org/git/ucspi-vsock
	https://spectrum-os.org/git/www

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).