From 0eb31ff1b9ea3c91c5d71ec4f4061f1dbd3c0d71 Mon Sep 17 00:00:00 2001 Message-Id: <0eb31ff1b9ea3c91c5d71ec4f4061f1dbd3c0d71.1335172957.git.minovotn@redhat.com> In-Reply-To: <8e20c74819db1d275690c2a426b9b9c9311ce679.1335172957.git.minovotn@redhat.com> References: <8e20c74819db1d275690c2a426b9b9c9311ce679.1335172957.git.minovotn@redhat.com> From: "Michael S. Tsirkin" Date: Sun, 22 Apr 2012 15:42:48 +0200 Subject: [PATCH 2/2] virtio: add missing mb() on notification RH-Author: Michael S. Tsirkin Message-id: <20120422154247.GA12494@redhat.com> Patchwork-id: 39370 O-Subject: [PATCH RHEL6.3] virtio: add missing mb() on notification Bugzilla: 804578 RH-Acked-by: Xiao Wang RH-Acked-by: Amos Kong RH-Acked-by: Paolo Bonzini During normal operation, virtio host first writes a used index and then checks whether it should interrupt the guest by reading guest avail flag/used event index values. Guest does the reverse: writes the index/flag, then checks the used ring. The ordering is important: if host avail flag read bypasses the used index write, we could in effect get this timing: host avail flag/used event index read guest enable interrupts: this performs avail flag/used event index write guest check used ring: ring is empty host used index write This timing results in a lost interrupt: guest will never be notified about the used ring update. This has actually been observed in the field, when using qemu-kvm such that the guest vcpu and qemu io run on different host cpus, but only seems to trigger on some specific hardware, and only with userspace virtio: vhost has the necessary smp_mb() in place to prevent the reordering, so the same workload stalls forever waiting for an interrupt with vhost=off but works fine with vhost=on. Insert an smp_mb() barrier operation in userspace virtio to ensure the correct ordering. Applying this patch fixed the race condition we have observed. Tested on x86_64. I checked the code generated by the new macro for i386 and ppc but didn't run virtio. Signed-off-by: Michael S. Tsirkin Upstream status: posted Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=804578 Tested: on x86_64 Note: upstream uses smp_XX style barrier macros, and has code for architectures besides x86_64. We don't have any of this on RHEL6 - this is a minimal patch that just ports the bugfix. Please-review: Amit Shah Please-review: Juan Quintela Please-review: Jason Wang --- hw/virtio.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) Signed-off-by: Michal Novotny --- hw/virtio.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/hw/virtio.c b/hw/virtio.c index c7f39f2..2db9b68 100644 --- a/hw/virtio.c +++ b/hw/virtio.c @@ -29,6 +29,7 @@ */ #define wmb() __asm__ __volatile__("": : :"memory") +#define mb() __asm__ __volatile__("mfence":::"memory") typedef struct VRingDesc { @@ -691,6 +692,8 @@ static bool vring_notify(VirtIODevice *vdev, VirtQueue *vq) { uint16_t old, new; bool v; + /* We need to expose used array entries before checking used event. */ + mb(); /* Always notify when queue is empty (when feature acknowledge) */ if (((vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) && !vq->inuse && vring_avail_idx(vq) == vq->last_avail_idx)) { -- 1.7.7.6