[PATCH v3 0/6] vhost-user: Add multiqueue support
This series implements multiqueue support for vhost-user mode, allowing
passt to utilize multiple queue pairs for improved network throughput
when used with multi-CPU guest VMs. While this version uses a single
thread for packet processing, it enables the guest kernel to distribute
network traffic across multiple queues and vCPUs.

The implementation advertises support for up to 16 queue pairs
(32 virtqueues) by setting the VIRTIO_NET_F_MQ and
VHOST_USER_PROTOCOL_F_MQ feature flags. Packets are routed to the
appropriate RX queue based on which TX queue they originated from,
following the virtio specification's automatic receive steering
requirements.

This series adds:
- Multiqueue capability advertisement (VIRTIO_NET_F_MQ and
  VHOST_USER_PROTOCOL_F_MQ features)
- Per-queue-pair packet pools to support concurrent queue operations
- A queue pair parameter throughout the network stack, propagated
  through all protocol handlers (TCP, UDP, ICMP, ARP, DHCP, DHCPv6, NDP)
- Flow-aware queue routing that tracks the originating TX queue for
  each flow and routes return packets to the corresponding RX queue
- Test coverage with a VHOST_USER_MQ environment variable to validate
  multiqueue functionality across all protocols (TCP, UDP, ICMP) and
  services (DHCP, NDP)

Current behavior:
- TX queue selection is controlled by the guest kernel's networking
  stack
- RX packets are routed to queues based on their associated flows, with
  the queue assignment updated on each packet from TX to maintain
  affinity
- Host-initiated flows (e.g. from socket-side connections) currently
  default to queue pair 0

The changes are transparent to single-queue operation: passt/pasta
modes and single-queue vhost-user configurations continue to work
unchanged, always using queue pair 0.

v3:
- Removed the --max-qpairs configuration option: multiqueue support is
  now always enabled, up to 16 queue pairs, without requiring explicit
  configuration
- Replaced "tap: Add queue pair parameter throughout the packet
  processing path" with "tap: Convert packet pools to per-queue-pair
  arrays for multiqueue": simplified the implementation by converting
  the global pools to arrays rather than passing pool parameters
  throughout
- Changed the qpair parameter type from int to unsigned int throughout
  the codebase
- Simplified the test infrastructure: the queues parameter is always
  set on the netdev, mq=true is added to virtio-net only when
  VHOST_USER_MQ > 1
- Updated the QEMU usage hints to always show a multiqueue-capable
  command line

v2:
- New patch: "tap: Remove pool parameter from tap4_handler() and
  tap6_handler()" to clean up unused parameters before adding the queue
  pair parameter
- Changed to one packet pool per queue pair instead of shared pools
  across all queue pairs
- Split "multiqueue: Add queue-aware flow management..." into two
  patches:
  - "tap: Add queue pair parameter throughout the packet processing
    path"
  - "flow: Add queue pair tracking to flow management"
- Updated the test infrastructure patch with a refined implementation

Laurent Vivier (6):
  tap: Remove pool parameter from tap4_handler() and tap6_handler()
  vhost-user: Enable multiqueue
  test: Add multiqueue support to vhost-user test infrastructure
  vhost-user: Add queue pair parameter throughout the network stack
  tap: Convert packet pools to per-queue-pair arrays for multiqueue
  flow: Add queue pair tracking to flow management

 arp.c          |  15 +++--
 arp.h          |   6 +-
 dhcp.c         |   5 +-
 dhcp.h         |   2 +-
 dhcpv6.c       |  12 ++--
 dhcpv6.h       |   2 +-
 flow.c         |  33 +++++++++
 flow.h         |  17 +++++
 fwd.c          |  18 ++---
 fwd.h          |   5 +-
 icmp.c         |  25 ++++---
 icmp.h         |   4 +-
 ndp.c          |  35 ++++++----
 ndp.h          |   7 +-
 netlink.c      |   2 +-
 tap.c          | 177 ++++++++++++++++++++++++++++---------------------
 tap.h          |  20 +++---
 tcp.c          |  47 +++++++------
 tcp.h          |   7 +-
 tcp_vu.c       |   8 ++-
 test/lib/setup |  21 +++---
 test/run       |  23 +++++++
 udp.c          |  47 +++++++------
 udp.h          |   6 +-
 udp_flow.c     |   8 ++-
 udp_flow.h     |   2 +-
 udp_internal.h |   4 +-
 udp_vu.c       |   4 +-
 vhost_user.c   |  10 +--
 virtio.h       |   2 +-
 vu_common.c    |  15 +++--
 vu_common.h    |   3 +-
 32 files changed, 374 insertions(+), 218 deletions(-)

-- 
2.51.1
These handlers only ever operate on their respective global pools
(pool_tap4 and pool_tap6). The pool parameter was always passed the
same value, making it an unnecessary indirection.
Access the global pools directly instead, simplifying the function
signatures.
Signed-off-by: Laurent Vivier
Advertise multi-queue support in vhost-user by setting VIRTIO_NET_F_MQ
and VHOST_USER_PROTOCOL_F_MQ feature flags, and increase
VHOST_USER_MAX_VQS from 2 to 32, supporting up to 16 queue pairs.
Currently, only the first RX queue (queue 0) is used for receiving
packets. The guest kernel selects which TX queue to use for
transmission. Full multi-RX queue load balancing will be implemented in
future work.
Update the QEMU usage hint to show the required parameters for enabling
multiqueue: queues parameter on the netdev, and mq=true on the
virtio-net device.
Signed-off-by: Laurent Vivier
With the recent addition of multiqueue support to passt's vhost-user
implementation, we need test coverage to validate the functionality. The
test infrastructure previously only tested single queue configurations.
Add a VHOST_USER_MQ environment variable to control the number of queue
pairs. The queues parameter on the netdev is always set to this value
(defaulting to 1 for single queue). When set to values greater than 1,
the setup scripts add mq=true to the virtio-net device to enable
multiqueue support.
The test suite now runs an additional set of tests with 8 queue pairs to
exercise the multiqueue paths across all protocols (TCP, UDP, ICMP) and
services (DHCP, NDP). Note that the guest kernel will only enable as many
queues as there are vCPUs.
Signed-off-by: Laurent Vivier
Add a queue pair parameter to vu_send_single() and propagate this parameter
through the entire network stack call chain. The queue pair parameter specifies
which queue pair to use for sending packets in vhost-user mode.
All callers currently pass queue pair #0 to preserve existing
behavior. This is a preparatory step for enabling multi-queue and
per-queue worker threads in vhost-user mode.
No functional change.
Signed-off-by: Laurent Vivier
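For illustration, a minimal sketch of the call shape this introduces
(the prototype matches the vu_common.h hunk quoted later in the thread;
the calling function and its name are purely hypothetical):

/* New prototype: callers name the queue pair explicitly */
int vu_send_single(const struct ctx *c, unsigned int qpair,
		   const void *buf, size_t size);

/* Hypothetical call site: every caller passes queue pair 0 for now */
static void example_send_reply(const struct ctx *c, const void *buf,
			       size_t len)
{
	vu_send_single(c, 0, buf, len);
}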
Convert the global pool_tap4 and pool_tap6 packet pools from single
pools to arrays of pools, one for each queue pair. This change is
necessary to support multiqueue operation in vhost-user mode, where
multiple queue pairs may be processing packets concurrently.
The pool storage structures (pool_tap4_storage and pool_tap6_storage)
are now arrays of VHOST_USER_MAX_VQS/2 elements, with corresponding
pointer arrays (pool_tap4 and pool_tap6) for accessing them.
Update tap_flush_pools() and tap_handler() to take a qpair parameter
that selects which pool to operate on. Add bounds checking assertions
to ensure qpair is within valid range.
In passt and pasta modes, all operations use queue pair 0 (hardcoded
in tap_passt_input and tap_pasta_input). In vhost-user mode, the queue
pair is derived from the virtqueue index (index / 2, as TX/RX queues
come in pairs).
All pools within the array share the same buffer pointer:
- In vhost-user mode: Points to the vhost-user memory structure, which
is safe as packet data remains in guest memory and pools only track
iovecs
- In passt/pasta mode: Points to pkt_buf, which is safe as only queue
pair 0 is used
Signed-off-by: Laurent Vivier
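As a rough sketch of the shape described above (illustrative only: the
array and helper names follow this commit message, while the real
declarations in tap.c use passt's packet pool macros):

/* One ingress pool per queue pair, VHOST_USER_MAX_VQS / 2 pairs */
static struct pool *pool_tap4[VHOST_USER_MAX_VQS / 2];
static struct pool *pool_tap6[VHOST_USER_MAX_VQS / 2];

void tap_flush_pools(unsigned int qpair)
{
	ASSERT(qpair < VHOST_USER_MAX_VQS / 2);	/* bounds check */
	pool_flush(pool_tap4[qpair]);
	pool_flush(pool_tap6[qpair]);
}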
For multiqueue support, we need to ensure packets are routed to the
correct RX queue based on which TX queue they originated from. This
requires tracking the queue pair association for each flow.
Add a qpair field to struct flow_common to store the queue pair number
for each flow (FLOW_QPAIR_INVALID if not assigned). The field uses 5
bits, allowing support for up to 31 queue pairs (index 31 is reserved
for FLOW_QPAIR_INVALID), which we verify is sufficient for
VHOST_USER_MAX_VQS via static assertion.
Introduce flow_qp() to retrieve the queue pair for a flow (returning 0
for NULL flows or flows without a valid assignment), and flow_setqp()
to assign queue pairs. Update all protocol handlers (TCP, UDP, ICMP)
and their tap handlers to accept a qpair parameter and assign it to
flows using FLOW_SETQP().
The implementation updates the queue pair assignment on every packet
received from a TX queue. This follows the virtio specification's requirement
for automatic receive steering: "After the driver transmitted a packet
of a flow on transmitqX, the device SHOULD cause incoming packets for
that flow to be steered to receiveqX." By tracking the most recent TX
queue for each flow, we ensure return traffic is directed to the
corresponding RX queue, maintaining flow affinity across queue pairs.
The vhost-user code now uses FLOW_QP() to select the appropriate RX
queue when sending packets, ensuring they're routed based on the
originating TX queue rather than always using queue 0.
Note that flows initiated from the host side (via sockets, for example
udp_flow_from_sock()) currently default to queue pair 0, as they don't
have an associated incoming queue to derive the assignment from.
Signed-off-by: Laurent Vivier
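A minimal sketch of the bookkeeping described above (assumed shapes
only: the 5-bit width, the sentinel value and the helper names follow
this commit message, not the actual flow.h definitions):

#include <assert.h>

#define FLOW_QPAIR_INVALID	31	/* all-ones in the 5-bit field */

static_assert(VHOST_USER_MAX_VQS / 2 <= FLOW_QPAIR_INVALID,
	      "flow qpair field too small for VHOST_USER_MAX_VQS");

struct flow_common {
	/* ...other flow state... */
	unsigned int qpair:5;	/* originating TX queue pair */
};

/* Queue pair to send on for @f: 0 for NULL or unassigned flows */
static inline unsigned int flow_qp(const struct flow_common *f)
{
	return (!f || f->qpair == FLOW_QPAIR_INVALID) ? 0 : f->qpair;
}

/* Record the TX queue pair the latest guest packet for @f arrived on */
static inline void flow_setqp(struct flow_common *f, unsigned int qpair)
{
	f->qpair = qpair;
}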
On Wed, Dec 03, 2025 at 07:54:29PM +0100, Laurent Vivier wrote:
These handlers only ever operate on their respective global pools (pool_tap4 and pool_tap6). The pool parameter was always passed the same value, making it an unnecessary indirection.
Access the global pools directly instead, simplifying the function signatures.
Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
---
 tap.c | 46 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 25 deletions(-)
diff --git a/tap.c b/tap.c index 44b06448757a..2cda8c9772b8 100644 --- a/tap.c +++ b/tap.c @@ -696,23 +696,21 @@ static bool tap4_is_fragment(const struct iphdr *iph, /** * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor * @c: Execution context - * @in: Ingress packet pool, packets with Ethernet headers * @now: Current timestamp * * Return: count of packets consumed by handlers */ -static int tap4_handler(struct ctx *c, const struct pool *in, - const struct timespec *now) +static int tap4_handler(struct ctx *c, const struct timespec *now) { unsigned int i, j, seq_count; struct tap4_l4_t *seq;
- if (!c->ifi4 || !in->count) - return in->count; + if (!c->ifi4 || !pool_tap4->count) + return pool_tap4->count;
i = 0; resume: - for (seq_count = 0, seq = NULL; i < in->count; i++) { + for (seq_count = 0, seq = NULL; i < pool_tap4->count; i++) { size_t l3len, hlen, l4len; struct ethhdr eh_storage; struct iphdr iph_storage; @@ -722,7 +720,7 @@ resume: struct iov_tail data; struct iphdr *iph;
- if (!packet_get(in, i, &data)) + if (!packet_get(pool_tap4, i, &data)) continue;
eh = IOV_PEEK_HEADER(&data, eh_storage); @@ -789,7 +787,7 @@ resume: if (iph->protocol == IPPROTO_UDP) { struct iov_tail eh_data;
- packet_get(in, i, &eh_data); + packet_get(pool_tap4, i, &eh_data); if (dhcp(c, &eh_data)) continue; } @@ -820,7 +818,7 @@ resume: goto append;
if (seq_count == TAP_SEQS) - break; /* Resume after flushing if i < in->count */ + break; /* Resume after flushing if i < pool_tap4->count */
for (seq = tap4_l4 + seq_count - 1; seq >= tap4_l4; seq--) { if (L4_MATCH(iph, uh, seq)) { @@ -866,32 +864,30 @@ append: } }
- if (i < in->count) + if (i < pool_tap4->count) goto resume;
- return in->count; + return pool_tap4->count; }
/** * tap6_handler() - IPv6 packet handler for tap file descriptor * @c: Execution context - * @in: Ingress packet pool, packets with Ethernet headers * @now: Current timestamp * * Return: count of packets consumed by handlers */ -static int tap6_handler(struct ctx *c, const struct pool *in, - const struct timespec *now) +static int tap6_handler(struct ctx *c, const struct timespec *now) { unsigned int i, j, seq_count = 0; struct tap6_l4_t *seq;
- if (!c->ifi6 || !in->count) - return in->count; + if (!c->ifi6 || !pool_tap6->count) + return pool_tap6->count;
i = 0; resume: - for (seq_count = 0, seq = NULL; i < in->count; i++) { + for (seq_count = 0, seq = NULL; i < pool_tap6->count; i++) { size_t l4len, plen, check; struct in6_addr *saddr, *daddr; struct ipv6hdr ip6h_storage; @@ -903,7 +899,7 @@ resume: struct ipv6hdr *ip6h; uint8_t proto;
- if (!packet_get(in, i, &data)) + if (!packet_get(pool_tap6, i, &data)) return -1;
eh = IOV_REMOVE_HEADER(&data, eh_storage); @@ -1011,7 +1007,7 @@ resume: goto append;
if (seq_count == TAP_SEQS) - break; /* Resume after flushing if i < in->count */ + break; /* Resume after flushing if i < pool_tap6->count */
for (seq = tap6_l4 + seq_count - 1; seq >= tap6_l4; seq--) { if (L4_MATCH(ip6h, proto, uh, seq)) { @@ -1058,10 +1054,10 @@ append: } }
- if (i < in->count) + if (i < pool_tap6->count) goto resume;
- return in->count; + return pool_tap6->count; }
/** @@ -1080,8 +1076,8 @@ void tap_flush_pools(void) */ void tap_handler(struct ctx *c, const struct timespec *now) { - tap4_handler(c, pool_tap4, now); - tap6_handler(c, pool_tap6, now); + tap4_handler(c, now); + tap6_handler(c, now); }
/** @@ -1115,14 +1111,14 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data, case ETH_P_ARP: case ETH_P_IP: if (!pool_can_fit(pool_tap4, data)) { - tap4_handler(c, pool_tap4, now); + tap4_handler(c, now); pool_flush(pool_tap4); } packet_add(pool_tap4, data); break; case ETH_P_IPV6: if (!pool_can_fit(pool_tap6, data)) { - tap6_handler(c, pool_tap6, now); + tap6_handler(c, now); pool_flush(pool_tap6); } packet_add(pool_tap6, data); -- 2.51.1
-- 
David Gibson (he or they)      | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
                               | around.
http://www.ozlabs.org/~dgibson
On Wed, Dec 03, 2025 at 07:54:30PM +0100, Laurent Vivier wrote:
Advertise multi-queue support in vhost-user by setting VIRTIO_NET_F_MQ and VHOST_USER_PROTOCOL_F_MQ feature flags, and increase VHOST_USER_MAX_VQS from 2 to 32, supporting up to 16 queue pairs.
Currently, only the first RX queue (queue 0) is used for receiving packets. The guest kernel selects which TX queue to use for transmission. Full multi-RX queue load balancing will be implemented in future work.
Update the QEMU usage hint to show the required parameters for enabling multiqueue: queues parameter on the netdev, and mq=true on the virtio-net device.
Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
---
 tap.c        |  7 +++++--
 vhost_user.c | 10 ++++++----
 virtio.h     |  2 +-
 3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/tap.c b/tap.c index 2cda8c9772b8..591b49491aa3 100644 --- a/tap.c +++ b/tap.c @@ -1314,8 +1314,11 @@ static void tap_backend_show_hints(struct ctx *c) break; case MODE_VU: info("You can start qemu with:"); - info(" kvm ... -chardev socket,id=chr0,path=%s -netdev vhost-user,id=netdev0,chardev=chr0 -device virtio-net,netdev=netdev0 -object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE -numa node,memdev=memfd0\n", - c->sock_path); + info(" kvm ... -chardev socket,id=chr0,path=%s " + "-netdev vhost-user,id=netdev0,chardev=chr0,queues=$QUEUES " + "-device virtio-net,netdev=netdev0,mq=true " + "-object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE " + "-numa node,memdev=memfd0\n", c->sock_path); break; } } diff --git a/vhost_user.c b/vhost_user.c index aa7c869d9e56..845fdb551c84 100644 --- a/vhost_user.c +++ b/vhost_user.c @@ -323,6 +323,7 @@ static bool vu_get_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VIRTIO_F_VERSION_1 | 1ULL << VIRTIO_NET_F_MRG_RXBUF | + 1ULL << VIRTIO_NET_F_MQ | 1ULL << VHOST_F_LOG_ALL | 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
@@ -767,7 +768,8 @@ static void vu_check_queue_msg_file(struct vhost_user_msg *vmsg) int idx = vmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
if (idx >= VHOST_USER_MAX_VQS) - die("Invalid vhost-user queue index: %u", idx); + die("Invalid vhost-user queue index: %u (maximum %u)", idx, + VHOST_USER_MAX_VQS);
if (nofd) { vmsg_close_fds(vmsg); @@ -896,7 +898,8 @@ static bool vu_get_protocol_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | 1ULL << VHOST_USER_PROTOCOL_F_DEVICE_STATE | - 1ULL << VHOST_USER_PROTOCOL_F_RARP; + 1ULL << VHOST_USER_PROTOCOL_F_RARP | + 1ULL << VHOST_USER_PROTOCOL_F_MQ;
(void)vdev; vmsg_set_reply_u64(vmsg, features); @@ -935,10 +938,9 @@ static bool vu_get_queue_num_exec(struct vu_dev *vdev, { (void)vdev;
- /* NOLINTNEXTLINE(misc-redundant-expression) */ vmsg_set_reply_u64(vmsg, VHOST_USER_MAX_VQS / 2);
- debug("VHOST_USER_MAX_VQS %u", VHOST_USER_MAX_VQS / 2); + debug("queue num %u", VHOST_USER_MAX_VQS / 2);
return true; } diff --git a/virtio.h b/virtio.h index 12caaa0b6def..176c935cecc7 100644 --- a/virtio.h +++ b/virtio.h @@ -88,7 +88,7 @@ struct vu_dev_region { uint64_t mmap_addr; };
-#define VHOST_USER_MAX_VQS 2 +#define VHOST_USER_MAX_VQS 32
/* * Set a reasonable maximum number of ram slots, which will be supported by -- 2.51.1
-- 
David Gibson (he or they)      | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
                               | around.
http://www.ozlabs.org/~dgibson
On Wed, Dec 03, 2025 at 07:54:31PM +0100, Laurent Vivier wrote:
With the recent addition of multiqueue support to passt's vhost-user implementation, we need test coverage to validate the functionality. The test infrastructure previously only tested single queue configurations.
Add a VHOST_USER_MQ environment variable to control the number of queue pairs. The queues parameter on the netdev is always set to this value (defaulting to 1 for single queue). When set to values greater than 1, the setup scripts add mq=true to the virtio-net device for enabling multiqueue support.
The test suite now runs an additional set of tests with 8 queue pairs to exercise the multiqueue paths across all protocols (TCP, UDP, ICMP) and services (DHCP, NDP). Note that the guest kernel will only enable as many queues as there are vCPUs.
Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
---
 test/lib/setup | 21 +++++++++++++--------
 test/run       | 23 +++++++++++++++++++++++
 2 files changed, 36 insertions(+), 8 deletions(-)
diff --git a/test/lib/setup b/test/lib/setup index 5994598744a3..3872a02b109b 100755 --- a/test/lib/setup +++ b/test/lib/setup @@ -18,6 +18,8 @@ VCPUS="$( [ $(nproc) -ge 8 ] && echo 6 || echo $(( $(nproc) / 2 + 1 )) )" MEM_KIB="$(sed -n 's/MemTotal:[ ]*\([0-9]*\) kB/\1/p' /proc/meminfo)" QEMU_ARCH="$(uname -m)" [ "${QEMU_ARCH}" = "i686" ] && QEMU_ARCH=i386 +VHOST_USER=0 +VHOST_USER_MQ=1
# setup_build() - Set up pane layout for build tests setup_build() { @@ -46,6 +48,7 @@ setup_passt() { [ ${DEBUG} -eq 1 ] && __opts="${__opts} -d" [ ${TRACE} -eq 1 ] && __opts="${__opts} --trace" [ ${VHOST_USER} -eq 1 ] && __opts="${__opts} --vhost-user" + [ ${VHOST_USER_MQ} -gt 1 ] && __virtio_opts="${__virtio_opts},mq=true"
context_run passt "make clean" context_run passt "make valgrind" @@ -59,8 +62,8 @@ setup_passt() { __vmem="$(((${__vmem} + 500) / 1000))G" __qemu_netdev=" \ -chardev socket,id=c,path=${STATESETUP}/passt.socket \ - -netdev vhost-user,id=v,chardev=c \ - -device virtio-net,netdev=v \ + -netdev vhost-user,id=v,chardev=c,queues=${VHOST_USER_MQ} \ + -device virtio-net,netdev=v${__virtio_opts} \ -object memory-backend-memfd,id=m,share=on,size=${__vmem} \ -numa node,memdev=m" else @@ -156,6 +159,7 @@ setup_passt_in_ns() { [ ${DEBUG} -eq 1 ] && __opts="${__opts} -d" [ ${TRACE} -eq 1 ] && __opts="${__opts} --trace" [ ${VHOST_USER} -eq 1 ] && __opts="${__opts} --vhost-user" + [ ${VHOST_USER_MQ} -gt 1 ] && __virtio_opts="${__virtio_opts},mq=true"
if [ ${VALGRIND} -eq 1 ]; then context_run passt "make clean" @@ -173,8 +177,8 @@ setup_passt_in_ns() { __vmem="$(((${__vmem} + 500) / 1000))G" __qemu_netdev=" \ -chardev socket,id=c,path=${STATESETUP}/passt.socket \ - -netdev vhost-user,id=v,chardev=c \ - -device virtio-net,netdev=v \ + -netdev vhost-user,id=v,chardev=c,queues=${VHOST_USER_MQ} \ + -device virtio-net,netdev=v${__virtio_opts} \ -object memory-backend-memfd,id=m,share=on,size=${__vmem} \ -numa node,memdev=m" else @@ -251,6 +255,7 @@ setup_two_guests() { [ ${DEBUG} -eq 1 ] && __opts="${__opts} -d" [ ${TRACE} -eq 1 ] && __opts="${__opts} --trace" [ ${VHOST_USER} -eq 1 ] && __opts="${__opts} --vhost-user" + [ ${VHOST_USER_MQ} -gt 1 ] && __virtio_opts="${__virtio_opts},mq=true"
context_run_bg passt_2 "./passt -s ${STATESETUP}/passt_2.socket -P ${STATESETUP}/passt_2.pid -f ${__opts} --hostname hostname2 --fqdn fqdn2 -t 10004 -u 10004" wait_for [ -f "${STATESETUP}/passt_2.pid" ] @@ -260,14 +265,14 @@ setup_two_guests() { __vmem="$(((${__vmem} + 500) / 1000))G" __qemu_netdev1=" \ -chardev socket,id=c,path=${STATESETUP}/passt_1.socket \ - -netdev vhost-user,id=v,chardev=c \ - -device virtio-net,netdev=v \ + -netdev vhost-user,id=v,chardev=c,queues=${VHOST_USER_MQ} \ + -device virtio-net,netdev=v${__virtio_opts} \ -object memory-backend-memfd,id=m,share=on,size=${__vmem} \ -numa node,memdev=m" __qemu_netdev2=" \ -chardev socket,id=c,path=${STATESETUP}/passt_2.socket \ - -netdev vhost-user,id=v,chardev=c \ - -device virtio-net,netdev=v \ + -netdev vhost-user,id=v,chardev=c,queues=${VHOST_USER_MQ} \ + -device virtio-net,netdev=v${__virtio_opts} \ -object memory-backend-memfd,id=m,share=on,size=${__vmem} \ -numa node,memdev=m" else diff --git a/test/run b/test/run index f858e5586847..652cc12b1234 100755 --- a/test/run +++ b/test/run @@ -190,6 +190,29 @@ run() { test passt_vu_in_ns/shutdown teardown passt_in_ns
+ VHOST_USER=1 + VHOST_USER_MQ=8 + setup passt_in_ns + test passt_vu/ndp + test passt_vu_in_ns/dhcp + test passt_vu_in_ns/icmp + test passt_vu_in_ns/tcp + test passt_vu_in_ns/udp + test passt_vu_in_ns/shutdown + teardown passt_in_ns + + setup two_guests + test two_guests_vu/basic + teardown two_guests + + setup passt_in_ns + test passt_vu/ndp + test passt_vu_in_ns/dhcp + test perf/passt_vu_tcp + test perf/passt_vu_udp + test passt_vu_in_ns/shutdown + teardown passt_in_ns + # TODO: Make those faster by at least pre-installing gcc and make on # non-x86 images, then re-enable. skip_distro() { -- 2.51.1
-- 
David Gibson (he or they)      | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
                               | around.
http://www.ozlabs.org/~dgibson
On Wed, 3 Dec 2025 19:54:30 +0100
Laurent Vivier
Advertise multi-queue support in vhost-user by setting VIRTIO_NET_F_MQ and VHOST_USER_PROTOCOL_F_MQ feature flags, and increase VHOST_USER_MAX_VQS from 2 to 32, supporting up to 16 queue pairs.
Currently, only the first RX queue (queue 0) is used for receiving packets. The guest kernel selects which TX queue to use for transmission. Full multi-RX queue load balancing will be implemented in future work.
Update the QEMU usage hint to show the required parameters for enabling multiqueue: queues parameter on the netdev, and mq=true on the virtio-net device.
Signed-off-by: Laurent Vivier
--- tap.c | 7 +++++-- vhost_user.c | 10 ++++++---- virtio.h | 2 +- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/tap.c b/tap.c index 2cda8c9772b8..591b49491aa3 100644 --- a/tap.c +++ b/tap.c @@ -1314,8 +1314,11 @@ static void tap_backend_show_hints(struct ctx *c) break; case MODE_VU: info("You can start qemu with:"); - info(" kvm ... -chardev socket,id=chr0,path=%s -netdev vhost-user,id=netdev0,chardev=chr0 -device virtio-net,netdev=netdev0 -object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE -numa node,memdev=memfd0\n", - c->sock_path); + info(" kvm ... -chardev socket,id=chr0,path=%s " + "-netdev vhost-user,id=netdev0,chardev=chr0,queues=$QUEUES " + "-device virtio-net,netdev=netdev0,mq=true " + "-object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE " + "-numa node,memdev=memfd0\n", c->sock_path); break; } } diff --git a/vhost_user.c b/vhost_user.c index aa7c869d9e56..845fdb551c84 100644 --- a/vhost_user.c +++ b/vhost_user.c @@ -323,6 +323,7 @@ static bool vu_get_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VIRTIO_F_VERSION_1 | 1ULL << VIRTIO_NET_F_MRG_RXBUF | + 1ULL << VIRTIO_NET_F_MQ | 1ULL << VHOST_F_LOG_ALL | 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
@@ -767,7 +768,8 @@ static void vu_check_queue_msg_file(struct vhost_user_msg *vmsg) int idx = vmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
if (idx >= VHOST_USER_MAX_VQS) - die("Invalid vhost-user queue index: %u", idx); + die("Invalid vhost-user queue index: %u (maximum %u)", idx, + VHOST_USER_MAX_VQS);
if (nofd) { vmsg_close_fds(vmsg); @@ -896,7 +898,8 @@ static bool vu_get_protocol_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | 1ULL << VHOST_USER_PROTOCOL_F_DEVICE_STATE | - 1ULL << VHOST_USER_PROTOCOL_F_RARP; + 1ULL << VHOST_USER_PROTOCOL_F_RARP | + 1ULL << VHOST_USER_PROTOCOL_F_MQ;
(void)vdev; vmsg_set_reply_u64(vmsg, features); @@ -935,10 +938,9 @@ static bool vu_get_queue_num_exec(struct vu_dev *vdev, { (void)vdev;
- /* NOLINTNEXTLINE(misc-redundant-expression) */ vmsg_set_reply_u64(vmsg, VHOST_USER_MAX_VQS / 2);
- debug("VHOST_USER_MAX_VQS %u", VHOST_USER_MAX_VQS / 2); + debug("queue num %u", VHOST_USER_MAX_VQS / 2);
Nit, if you respin: this "queue num %u" message doesn't carry any context at all. Actually, why is it needed? It's defined at build time anyway. If it's needed maybe "Using up to %u vhost-user queue pairs"?
return true; } diff --git a/virtio.h b/virtio.h index 12caaa0b6def..176c935cecc7 100644 --- a/virtio.h +++ b/virtio.h @@ -88,7 +88,7 @@ struct vu_dev_region { uint64_t mmap_addr; };
-#define VHOST_USER_MAX_VQS 2 +#define VHOST_USER_MAX_VQS 32
/* * Set a reasonable maximum number of ram slots, which will be supported by
-- Stefano
On Wed, 3 Dec 2025 19:54:32 +0100
Laurent Vivier
diff --git a/vu_common.c b/vu_common.c
index b13b7c308fd8..80d9a30f6f71 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
 
 		data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0);
 		if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf))
-			tap_add_packet(vdev->context, &data, now);
+			tap_add_packet(vdev->context, 0, &data, now);
 
 		count++;
 	}
-	tap_handler(vdev->context, now);
+	tap_handler(vdev->context, 0, now);
 
 	if (count) {
 		int i;
@@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref,
 }
 
 /**
- * vu_send_single() - Send a buffer to the front-end using the RX virtqueue
+ * vu_send_single() - Send a buffer to the front-end using a specified virtqueue
 * @c:		execution context
+ * @qpair:	Queue pair on which to send the buffer
 * @buf:	address of the buffer
 * @size:	size of the buffer
 *
 * Return: number of bytes sent, -1 if there is an error
 */
-int vu_send_single(const struct ctx *c, const void *buf, size_t size)
+int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size)
 {
 	struct vu_dev *vdev = c->vdev;
-	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
 	struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE];
 	struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
+	struct vu_virtq *vq;
 	size_t total;
 	int elem_cnt;
 	int i;
 
+	vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I
thought you swapped qpair and 1.

Then I started thinking that somebody is likely to mix up (probably not
you) indices of RX and TX queues at some point. So... what about some
macros, say (let's see if I got it right this time):

#define VHOST_SEND_QUEUE(pair)	((pair) * 2)
#define VHOST_RECV_QUEUE(pair)	(pair)

and:

#define VHOST_QUEUE_PAIR(q)	((q) % 2) ? (q) : (q) / 2)

...are they correct? A short description or "Theory of operation"
section somewhere with a recap of how queue indices are used would be
nice to have.

And maybe also something explaining that 0 that's now appearing in
argument lists:

#define VHOST_NO_QUEUE 0

?
+ trace("vu_send_single size %zu", size);
 	if (!vu_queue_enabled(vq) || !vu_queue_started(vq)) {

diff --git a/vu_common.h b/vu_common.h
index f538f237790b..9ceb8034a9a5 100644
--- a/vu_common.h
+++ b/vu_common.h
@@ -56,6 +56,7 @@ void vu_flush(const struct vu_dev *vdev, struct vu_virtq *vq,
 	      struct vu_virtq_element *elem, int elem_cnt);
 void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref,
 	        const struct timespec *now);
-int vu_send_single(const struct ctx *c, const void *buf, size_t size);
+int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf,
+		   size_t size);
#endif /* VU_COMMON_H */
I'm still reviewing the rest, currently at 5/6.

-- 
Stefano
On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:30 +0100 Laurent Vivier
wrote: Advertise multi-queue support in vhost-user by setting VIRTIO_NET_F_MQ and VHOST_USER_PROTOCOL_F_MQ feature flags, and increase VHOST_USER_MAX_VQS from 2 to 32, supporting up to 16 queue pairs.
Currently, only the first RX queue (queue 0) is used for receiving packets. The guest kernel selects which TX queue to use for transmission. Full multi-RX queue load balancing will be implemented in future work.
Update the QEMU usage hint to show the required parameters for enabling multiqueue: queues parameter on the netdev, and mq=true on the virtio-net device.
Signed-off-by: Laurent Vivier
--- tap.c | 7 +++++-- vhost_user.c | 10 ++++++---- virtio.h | 2 +- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/tap.c b/tap.c index 2cda8c9772b8..591b49491aa3 100644 --- a/tap.c +++ b/tap.c @@ -1314,8 +1314,11 @@ static void tap_backend_show_hints(struct ctx *c) break; case MODE_VU: info("You can start qemu with:"); - info(" kvm ... -chardev socket,id=chr0,path=%s -netdev vhost-user,id=netdev0,chardev=chr0 -device virtio-net,netdev=netdev0 -object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE -numa node,memdev=memfd0\n", - c->sock_path); + info(" kvm ... -chardev socket,id=chr0,path=%s " + "-netdev vhost-user,id=netdev0,chardev=chr0,queues=$QUEUES " + "-device virtio-net,netdev=netdev0,mq=true " + "-object memory-backend-memfd,id=memfd0,share=on,size=$RAMSIZE " + "-numa node,memdev=memfd0\n", c->sock_path); break; } } diff --git a/vhost_user.c b/vhost_user.c index aa7c869d9e56..845fdb551c84 100644 --- a/vhost_user.c +++ b/vhost_user.c @@ -323,6 +323,7 @@ static bool vu_get_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VIRTIO_F_VERSION_1 | 1ULL << VIRTIO_NET_F_MRG_RXBUF | + 1ULL << VIRTIO_NET_F_MQ | 1ULL << VHOST_F_LOG_ALL | 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
@@ -767,7 +768,8 @@ static void vu_check_queue_msg_file(struct vhost_user_msg *vmsg) int idx = vmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
if (idx >= VHOST_USER_MAX_VQS) - die("Invalid vhost-user queue index: %u", idx); + die("Invalid vhost-user queue index: %u (maximum %u)", idx, + VHOST_USER_MAX_VQS);
if (nofd) { vmsg_close_fds(vmsg); @@ -896,7 +898,8 @@ static bool vu_get_protocol_features_exec(struct vu_dev *vdev, uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | 1ULL << VHOST_USER_PROTOCOL_F_DEVICE_STATE | - 1ULL << VHOST_USER_PROTOCOL_F_RARP; + 1ULL << VHOST_USER_PROTOCOL_F_RARP | + 1ULL << VHOST_USER_PROTOCOL_F_MQ;
(void)vdev; vmsg_set_reply_u64(vmsg, features); @@ -935,10 +938,9 @@ static bool vu_get_queue_num_exec(struct vu_dev *vdev, { (void)vdev;
- /* NOLINTNEXTLINE(misc-redundant-expression) */ vmsg_set_reply_u64(vmsg, VHOST_USER_MAX_VQS / 2);
- debug("VHOST_USER_MAX_VQS %u", VHOST_USER_MAX_VQS / 2); + debug("queue num %u", VHOST_USER_MAX_VQS / 2);
Nit, if you respin: this "queue num %u" message doesn't carry any context at all. Actually, why is it needed? It's defined at build time anyway.
If it's needed maybe "Using up to %u vhost-user queue pairs"?
I agree, and I was planning to update it but missed the change before
posting.

Thanks,
Laurent
On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:32 +0100 Laurent Vivier
wrote: diff --git a/vu_common.c b/vu_common.c index b13b7c308fd8..80d9a30f6f71 100644 --- a/vu_common.c +++ b/vu_common.c @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) - tap_add_packet(vdev->context, &data, now); + tap_add_packet(vdev->context, 0, &data, now);
count++; } - tap_handler(vdev->context, now); + tap_handler(vdev->context, 0, now);
if (count) { int i; @@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, }
/** - * vu_send_single() - Send a buffer to the front-end using the RX virtqueue + * vu_send_single() - Send a buffer to the front-end using a specified virtqueue * @c: execution context + * @qpair: Queue pair on which to send the buffer * @buf: address of the buffer * @size: size of the buffer * * Return: number of bytes sent, -1 if there is an error */ -int vu_send_single(const struct ctx *c, const void *buf, size_t size) +int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size) { struct vu_dev *vdev = c->vdev; - struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE]; struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; struct iovec in_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq; size_t total; int elem_cnt; int i;
+ vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I thought you swapped qpair and 1.
Then I started thinking that somebody is likely to mix up (probably not you) indices of RX and TX queues at some point. So... what about some macros, say (let's see if I got it right this time):
#define VHOST_SEND_QUEUE(pair) ((pair) * 2)
#define VHOST_RECV_QUEUE(pair) (pair)
I will. David had the same comment.

TX and RX are from the point of view of the guest, it's not obvious when
we read passt code. I would prefer, as David proposed, to use FROMGUEST
and TOGUEST:

#define VHOST_FROM_GUEST(qpair) ((qpair) * 2 + 1)
#define VHOST_TO_GUEST(qpair) ((qpair) * 2)
and:
#define VHOST_QUEUE_PAIR(q) ((q) % 2) ? (q) : (q) / 2)
I don't understand the purpose of this one.
...are they correct? A short description or "Theory of operation" section somewhere with a recap of how queue indices are used would be nice to have.
And maybe also something explaining that 0 that's now appearing in argument lists:
#define VHOST_NO_QUEUE 0
It's not really NO_QUEUE, it's the default queue pair, queue pair 0.

Thanks,
Laurent
On Thu, 11 Dec 2025 09:48:42 +0100
Laurent Vivier
On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:32 +0100 Laurent Vivier
wrote: diff --git a/vu_common.c b/vu_common.c index b13b7c308fd8..80d9a30f6f71 100644 --- a/vu_common.c +++ b/vu_common.c @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) - tap_add_packet(vdev->context, &data, now); + tap_add_packet(vdev->context, 0, &data, now);
count++; } - tap_handler(vdev->context, now); + tap_handler(vdev->context, 0, now);
if (count) { int i; @@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, }
/** - * vu_send_single() - Send a buffer to the front-end using the RX virtqueue + * vu_send_single() - Send a buffer to the front-end using a specified virtqueue * @c: execution context + * @qpair: Queue pair on which to send the buffer * @buf: address of the buffer * @size: size of the buffer * * Return: number of bytes sent, -1 if there is an error */ -int vu_send_single(const struct ctx *c, const void *buf, size_t size) +int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size) { struct vu_dev *vdev = c->vdev; - struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE]; struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; struct iovec in_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq; size_t total; int elem_cnt; int i;
+ vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I thought you swapped qpair and 1.
Then I started thinking that somebody is likely to mix up (probably not you) indices of RX and TX queues at some point. So... what about some macros, say (let's see if I got it right this time):
#define VHOST_SEND_QUEUE(pair) ((pair) * 2)
#define VHOST_RECV_QUEUE(pair) (pair)
I will. David had the same comment.
Uh, wait, I must have missed it. Do you have a Message-ID? I'm afraid I must have missed some emails here but I don't see them in archives either...
TX and RX are from the point of view of guest, it's not obvious when we read passt code.
Right, yes, for me neither, I always get confused. That's why I thought we could make the RX vhost-user queue become "SEND" in passt's code, but:
I would prefer as David proposed to use, i.e. FROMGUEST and TOGUEST:
#define VHOST_FROM_GUEST(qpair) ((qpair) * 2 + 1)
#define VHOST_TO_GUEST(qpair) ((qpair) * 2)
...this is even clearer. It misses the QUEUE though. Does VHOST_QUEUE_{FROM,TO}_GUEST fit where you use it? Otherwise I guess VQ together with FROM / TO should be clear enough.
and:
#define VHOST_QUEUE_PAIR(q) ((q) % 2) ? (q) : (q) / 2)
I don't undestand the purpose of this one.
To get the pair number from a queue number. You're doing something like
that (I guess?) in 5/6, vu_handle_tx():

+	tap_flush_pools(index / 2);
+	tap_add_packet(vdev->context, index / 2, &data, now);
+	tap_handler(vdev->context, index / 2, now);

but now that I see your definition for VHOST_FROM_GUEST() above, and
that the purpose wasn't clear to you, I guess it should be:

#define VHOST_PAIR_FROM_QUEUE(q) (((q) % 2) ? ((q) - 1 / 2) : ((q) / 2))

...or maybe it's not needed? I'm not sure.
...are they correct? A short description or "Theory of operation" section somewhere with a recap of how queue indices are used would be nice to have.
And maybe also something explaining that 0 that's now appearing in argument lists:
#define VHOST_NO_QUEUE 0
It's not really NO_QUEUE, it's default queue pair, the queue pair 0
Hmm but for non-vhost-user usages then it's not a queue, right? Well,
whatever, as long as we have a definition for it... or maybe we could
have VHOST_QUEUE_DEFAULT and NO_VHOST_QUEUE or VHOST_NO_QUEUE all being
0?

-- 
Stefano
On 12/11/25 13:16, Stefano Brivio wrote:
On Thu, 11 Dec 2025 09:48:42 +0100 Laurent Vivier
wrote: On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:32 +0100 Laurent Vivier
wrote: diff --git a/vu_common.c b/vu_common.c index b13b7c308fd8..80d9a30f6f71 100644 --- a/vu_common.c +++ b/vu_common.c @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) - tap_add_packet(vdev->context, &data, now); + tap_add_packet(vdev->context, 0, &data, now);
count++; } - tap_handler(vdev->context, now); + tap_handler(vdev->context, 0, now);
if (count) { int i; @@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, }
/** - * vu_send_single() - Send a buffer to the front-end using the RX virtqueue + * vu_send_single() - Send a buffer to the front-end using a specified virtqueue * @c: execution context + * @qpair: Queue pair on which to send the buffer * @buf: address of the buffer * @size: size of the buffer * * Return: number of bytes sent, -1 if there is an error */ -int vu_send_single(const struct ctx *c, const void *buf, size_t size) +int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size) { struct vu_dev *vdev = c->vdev; - struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE]; struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; struct iovec in_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq; size_t total; int elem_cnt; int i;
+ vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I thought you swapped qpair and 1.
Then I started thinking that somebody is likely to mix up (probably not you) indices of RX and TX queues at some point. So... what about some macros, say (let's see if I got it right this time):
#define VHOST_SEND_QUEUE(pair) ((pair) * 2)
#define VHOST_RECV_QUEUE(pair) (pair)
I will. David had the same comment.
Uh, wait, I must have missed it. Do you have a Message-ID? I'm afraid I must have missed some emails here but I don't see them in archives either...
Message-ID: aRF1_Qj6uxf1ndiA@zatzit
TX and RX are from the point of view of guest, it's not obvious when we read passt code.
Right, yes, for me neither, I always get confused. That's why I thought we could make the RX vhost-user queue become "SEND" in passt's code, but:
I would prefer as David proposed to use, i.e. FROMGUEST and TOGUEST:
#define VHOST_FROM_GUEST(qpair) ((qpair) * 2 + 1)
#define VHOST_TO_GUEST(qpair) ((qpair) * 2)
...this is even clearer. It misses the QUEUE though. Does VHOST_QUEUE_{FROM,TO}_GUEST fit where you use it? Otherwise I guess VQ together with FROM / TO should be clear enough.
and:
#define VHOST_QUEUE_PAIR(q) ((q) % 2) ? (q) : (q) / 2)
I don't understand the purpose of this one.
To get the pair number from a queue number. You're doing something like that (I guess?) in 5/6, vu_handle_tx():
+ tap_flush_pools(index / 2);
+ tap_add_packet(vdev->context, index / 2, &data, now);
+ tap_handler(vdev->context, index / 2, now);
but now that I see your definition for VHOST_FROM_GUEST() above, and that the purpose wasn't clear to you, I guess it should be:
#define VHOST_PAIR_FROM_QUEUE(q) (((q) % 2) ? ((q) - 1 / 2) : ((q) / 2))
Why not simply:

#define VHOST_PAIR_FROM_QUEUE(q) (q / 2)

QUEUES 0,1 -> QP 0
QUEUES 2,3 -> QP 1
...or maybe it's not needed? I'm not sure.
...are they correct? A short description or "Theory of operation" section somewhere with a recap of how queue indices are used would be nice to have.
And maybe also something explaining that 0 that's now appearing in argument lists:
#define VHOST_NO_QUEUE 0
It's not really NO_QUEUE, it's default queue pair, the queue pair 0
Hmm but for non-vhost-user usages then it's not a queue, right?

For non-vhost-user usage we can say there is only one queue.

Well, whatever, as long as we have a definition for it... or maybe we
could have VHOST_QUEUE_DEFAULT and NO_VHOST_QUEUE or VHOST_NO_QUEUE all
being 0?
Perhaps we could instead use a generic naming:

QPAIR_DEFAULT
QUEUE_FROM_GUEST(qpair)
QUEUE_TO_GUEST(qpair)

Thanks,
Laurent
On Thu, 11 Dec 2025 14:26:01 +0100
Laurent Vivier
On 12/11/25 13:16, Stefano Brivio wrote:
On Thu, 11 Dec 2025 09:48:42 +0100 Laurent Vivier
wrote: On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:32 +0100 Laurent Vivier
wrote: diff --git a/vu_common.c b/vu_common.c index b13b7c308fd8..80d9a30f6f71 100644 --- a/vu_common.c +++ b/vu_common.c @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) - tap_add_packet(vdev->context, &data, now); + tap_add_packet(vdev->context, 0, &data, now);
count++; } - tap_handler(vdev->context, now); + tap_handler(vdev->context, 0, now);
if (count) { int i; @@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, }
/** - * vu_send_single() - Send a buffer to the front-end using the RX virtqueue + * vu_send_single() - Send a buffer to the front-end using a specified virtqueue * @c: execution context + * @qpair: Queue pair on which to send the buffer * @buf: address of the buffer * @size: size of the buffer * * Return: number of bytes sent, -1 if there is an error */ -int vu_send_single(const struct ctx *c, const void *buf, size_t size) +int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size) { struct vu_dev *vdev = c->vdev; - struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE]; struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; struct iovec in_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq; size_t total; int elem_cnt; int i;
+ vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I thought you swapped qpair and 1.
Then I started thinking that somebody is likely to mix up (probably not you) indices of RX and TX queues at some point. So... what about some macros, say (let's see if I got it right this time):
#define VHOST_SEND_QUEUE(pair) ((pair) * 2)
#define VHOST_RECV_QUEUE(pair) (pair)
I will. David had the same comment.
Uh, wait, I must have missed it. Do you have a Message-ID? I'm afraid I must have missed some emails here but I don't see them in archives either...
Message-ID: aRF1_Qj6uxf1ndiA@zatzit
Ah, yes, I read that, but I didn't relate it to this topic as it was just about the direction / naming. I see now.
TX and RX are from the point of view of guest, it's not obvious when we read passt code.
Right, yes, for me neither, I always get confused. That's why I thought we could make the RX vhost-user queue become "SEND" in passt's code, but:
I would prefer as David proposed to use, i.e. FROMGUEST and TOGUEST:
#define VHOST_FROM_GUEST(qpair) ((qpair) * 2 + 1)
#define VHOST_TO_GUEST(qpair) ((qpair) * 2)
...this is even clearer. It misses the QUEUE though. Does VHOST_QUEUE_{FROM,TO}_GUEST fit where you use it? Otherwise I guess VQ together with FROM / TO should be clear enough.
and:
#define VHOST_QUEUE_PAIR(q) ((q) % 2) ? (q) : (q) / 2)
I don't undestand the purpose of this one.
To get the pair number from a queue number. You're doing something like that (I guess?) in 5/6, vu_handle_tx():
+ tap_flush_pools(index / 2);
+ tap_add_packet(vdev->context, index / 2, &data, now);
+ tap_handler(vdev->context, index / 2, now);
but now that I see your definition for VHOST_FROM_GUEST() above, and that the purpose wasn't clear to you, I guess it should be:
#define VHOST_PAIR_FROM_QUEUE(q) (((q) % 2) ? ((q) - 1 / 2) : ((q) / 2))
Why not simply:
#define VHOST_PAIR_FROM_QUEUE(q) (q / 2)
QUEUES 0,1 -> QP 0
QUEUES 2,3 -> QP 1
Ah, right, of course. Don't forget the parentheses around 'q'.
...or maybe it's not needed? I'm not sure.
...are they correct? A short description or "Theory of operation" section somewhere with a recap of how queue indices are used would be nice to have.
And maybe also something explaining that 0 that's now appearing in argument lists:
#define VHOST_NO_QUEUE 0
It's not really NO_QUEUE, it's default queue pair, the queue pair 0
Hmm but for non-vhost-user usages then it's not a queue, right? Well, For non vhost usage we can say there is only one queue.
whatever, as long as we have a definition for it... or maybe we could have VHOST_QUEUE_DEFAULT and NO_VHOST_QUEUE or VHOST_NO_QUEUE all being 0?
Perhaps we could instead use a generic naming:
QPAIR_DEFAULT
QUEUE_FROM_GUEST(qpair)
QUEUE_TO_GUEST(qpair)
...but for non-vhost-user we would have QUEUE_FROM_GUEST(QPAIR_DEFAULT)
evaluating to 1, which isn't correct I guess?

In general it looks reasonable to me, I would just like to make sure we
avoid passing around a '0' in the non-vhost-user case, which would look
rather obscure.

-- 
Stefano
On 12/11/25 16:27, Stefano Brivio wrote:
On Thu, 11 Dec 2025 14:26:01 +0100 Laurent Vivier
wrote: On 12/11/25 13:16, Stefano Brivio wrote:
On Thu, 11 Dec 2025 09:48:42 +0100 Laurent Vivier
wrote: On 12/11/25 08:01, Stefano Brivio wrote:
On Wed, 3 Dec 2025 19:54:32 +0100 Laurent Vivier
wrote: diff --git a/vu_common.c b/vu_common.c index b13b7c308fd8..80d9a30f6f71 100644 --- a/vu_common.c +++ b/vu_common.c @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
data = IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) - tap_add_packet(vdev->context, &data, now); + tap_add_packet(vdev->context, 0, &data, now);
count++; } - tap_handler(vdev->context, now); + tap_handler(vdev->context, 0, now);
if (count) { int i; @@ -235,23 +235,26 @@ void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, }
/** - * vu_send_single() - Send a buffer to the front-end using the RX virtqueue + * vu_send_single() - Send a buffer to the front-end using a specified virtqueue * @c: execution context + * @qpair: Queue pair on which to send the buffer * @buf: address of the buffer * @size: size of the buffer * * Return: number of bytes sent, -1 if there is an error */ -int vu_send_single(const struct ctx *c, const void *buf, size_t size) +int vu_send_single(const struct ctx *c, unsigned int qpair, const void *buf, size_t size) { struct vu_dev *vdev = c->vdev; - struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE]; struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; struct iovec in_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq; size_t total; int elem_cnt; int i;
+ vq = &vdev->vq[qpair << 1];
<< 1 instead of * 2 is a bit surprising here, for a few seconds I thought you swapped qpair and 1.
Then I started thinking that somebody is likely to mix up (probably not you) indices of RX and TX queues at some point. So... what about some macros, say (let's see if I got it right this time):
#define VHOST_SEND_QUEUE(pair) ((pair) * 2)
#define VHOST_RECV_QUEUE(pair) (pair)
I will. David had the same comment.
Uh, wait, I must have missed it. Do you have a Message-ID? I'm afraid I must have missed some emails here but I don't see them in archives either...
Message-ID: aRF1_Qj6uxf1ndiA@zatzit
Ah, yes, I read that, but I didn't relate it to this topic as it was just about the direction / naming. I see now.
TX and RX are from the point of view of guest, it's not obvious when we read passt code.
Right, yes, for me neither, I always get confused. That's why I thought we could make the RX vhost-user queue become "SEND" in passt's code, but:
I would prefer as David proposed to use, i.e. FROMGUEST and TOGUEST:
#define VHOST_FROM_GUEST(qpair) ((qpair) * 2 + 1)
#define VHOST_TO_GUEST(qpair) ((qpair) * 2)
...this is even clearer. It misses the QUEUE though. Does VHOST_QUEUE_{FROM,TO}_GUEST fit where you use it? Otherwise I guess VQ together with FROM / TO should be clear enough.
and:
#define VHOST_QUEUE_PAIR(q) ((q) % 2) ? (q) : (q) / 2)
I don't understand the purpose of this one.
To get the pair number from a queue number. You're doing something like that (I guess?) in 5/6, vu_handle_tx():
+ tap_flush_pools(index / 2);
+ tap_add_packet(vdev->context, index / 2, &data, now);
+ tap_handler(vdev->context, index / 2, now);
but now that I see your definition for VHOST_FROM_GUEST() above, and that the purpose wasn't clear to you, I guess it should be:
#define VHOST_PAIR_FROM_QUEUE(q) (((q) % 2) ? ((q) - 1 / 2) : ((q) / 2))
Why not simply:
#define VHOST_PAIR_FROM_QUEUE(q) (q / 2)
QUEUES 0,1 -> QP 0
QUEUES 2,3 -> QP 1
Ah, right, of course. Don't forget the parentheses around 'q'.
Noted :)
...or maybe it's not needed? I'm not sure.
...are they correct? A short description or "Theory of operation" section somewhere with a recap of how queue indices are used would be nice to have.
And maybe also something explaining that 0 that's now appearing in argument lists:
#define VHOST_NO_QUEUE 0
It's not really NO_QUEUE, it's default queue pair, the queue pair 0
Hmm but for non-vhost-user usages then it's not a queue, right? Well, For non vhost usage we can say there is only one queue.
whatever, as long as we have a definition for it... or maybe we could have VHOST_QUEUE_DEFAULT and NO_VHOST_QUEUE or VHOST_NO_QUEUE all being 0?
Perhaps we could instead use a generic naming:
QPAIR_DEFAULT
QUEUE_FROM_GUEST(qpair)
QUEUE_TO_GUEST(qpair)
...but for non vhost-user we would have QUEUE_FROM_GUEST(QPAIR_DEFAULT) evaluating to 1 which isn't correct I guess?
Yes, _but_ the non-vhost-user code can map it to whatever it wants. For
instance, for non-vhost-user, there will be only qpair #0, and
QUEUE_FROM_GUEST(QPAIR_DEFAULT) can be mapped to "read from the tap
socket" and QUEUE_TO_GUEST(QPAIR_DEFAULT) can be mapped to "write to
the tap socket". In fact, in the threading part, one thread will manage
one queue pair, and each action is mapped to the expected queue.
In general it looks reasonable to me, I would just like to make sure we avoid passing around a '0' in the non-vhost-user case which would look rather obscure.
I totally agree and I'm reviewing my code to avoid that.

Thanks,
Laurent
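For reference, the helpers this sub-thread seems to be converging on
would look roughly like this (naming still open above, and how the
non-vhost-user paths avoid a bare '0' is still being worked out):

/* Default queue pair: passt/pasta, and host-initiated flows */
#define QPAIR_DEFAULT		0

/* Virtqueue indices for a queue pair, seen from passt: the guest's TX
 * queue is what passt reads, its RX queue is what passt writes to */
#define QUEUE_FROM_GUEST(qpair)	((qpair) * 2 + 1)
#define QUEUE_TO_GUEST(qpair)	((qpair) * 2)

/* Queue pair a given virtqueue index belongs to */
#define QPAIR_FROM_QUEUE(q)	((q) / 2)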