This incorporates all feedback on v5, and the commits are slightly reordered so that the comment in the last commit makes more sense.
Signed-off-by: Volker Diels-Grabsch
On Tue, Sep 16, 2025 at 09:21:12PM +0200, Volker Diels-Grabsch wrote:
Signed-off-by: Volker Diels-Grabsch
Reviewed-by: David Gibson
---
 tap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tap.c b/tap.c
index 7ba6399..cf862ef 100644
--- a/tap.c
+++ b/tap.c
@@ -1096,7 +1096,11 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
 		return;
 
 	if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
+		char bufmac[ETH_ADDRSTRLEN];
+
 		memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
+		debug("New guest MAC address observed: %s",
+		      eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
 		proto_update_l2_buf(c->guest_mac, NULL);
 	}
-- 
2.47.3
-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
Signed-off-by: Volker Diels-Grabsch
On Tue, Sep 16, 2025 at 09:21:13PM +0200, Volker Diels-Grabsch wrote:
Signed-off-by: Volker Diels-Grabsch
Reviewed-by: David Gibson
---
 tap.c  | 2 +-
 util.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tap.c b/tap.c
index cf862ef..50e1a88 100644
--- a/tap.c
+++ b/tap.c
@@ -1511,7 +1511,7 @@ void tap_backend_init(struct ctx *c)
 		 * sends us packets. Use the broadcast address so that our
 		 * first packets will reach it.
 		 */
-		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
+		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
 		break;
 	}
 
diff --git a/util.h b/util.h
index 2a8c38f..22eaac5 100644
--- a/util.h
+++ b/util.h
@@ -97,6 +97,8 @@ void abort_with_msg(const char *fmt, ...)
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
 
+#define MAC_BROADCAST							\
+	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
-- 
2.47.3
Signed-off-by: Volker Diels-Grabsch
On Tue, Sep 16, 2025 at 09:21:14PM +0200, Volker Diels-Grabsch wrote:
Signed-off-by: Volker Diels-Grabsch
Reviewed-by: David Gibson
---
 conf.c  | 3 +++
 passt.1 | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/conf.c b/conf.c
index f47f48e..02e903b 100644
--- a/conf.c
+++ b/conf.c
@@ -2067,6 +2067,9 @@ void conf(struct ctx *c, int argc, char **argv)
 
 	isolate_user(uid, gid, !netns_only, userns, c->mode);
 
+	if (c->no_icmp)
+		c->no_ndp = 1;
+
 	if (c->pasta_conf_ns)
 		c->no_ra = 1;
 
diff --git a/passt.1 b/passt.1
index cef98b2..dd00b08 100644
--- a/passt.1
+++ b/passt.1
@@ -319,8 +319,8 @@ silently dropped.
 
 .TP
 .BR \-\-no-icmp
-Disable the ICMP/ICMPv6 echo handler. ICMP and ICMPv6 echo requests coming from
-guest or target namespace will be silently dropped.
+Disable the ICMP/ICMPv6 protocol handler. ICMP and ICMPv6 requests coming from
+guest or target namespace will be silently dropped. Implies \fB--no-ndp\fR.
 
 .TP
 .BR \-\-no-dhcp
-- 
2.47.3
When restarting passt while QEMU keeps running with a configured
"reconnect-ms" setting, port forwarding stops working until the guest
sends some outgoing network traffic.

Reason: Although QEMU successfully reconnects to the unix domain socket
of the new passt process, that process no longer knows the guest's MAC
address and uses the broadcast MAC address instead. However, the guest
ignores packets addressed this way, at least if it runs Linux. Only
after the guest sends a network packet on its own initiative does passt
learn the MAC address and become able to establish forwarded
connections.

This change fixes this issue by sending an ARP and an NDP request to
resolve the guest's MAC address via its IPv4 and IPv6 address, which
we do know, right after the unix domain socket (re)connection.

The only cases where the IP address would be "wrong" are a changed
configuration or the very first start right after QEMU started. In
those cases, we simply wouldn't get an ARP/NDP response, and couldn't
do anything until we receive the guest's DHCP request, just as before.
In other words, in the worst case the ARP/NDP requests are harmless.

Signed-off-by: Volker Diels-Grabsch
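
For illustration only, and not part of the patch: the probe described above is an ordinary broadcast ARP request whose target IP is the guest's configured IPv4 address. A minimal, self-contained sketch of such a frame could look as follows. The struct layout and the names `struct arp_probe` and `arp_probe_fill()` are hypothetical; passt itself uses `struct ethhdr`, `struct arphdr` and `struct arpmsg`, as the diff in this thread shows.

```c
#include <arpa/inet.h>	/* htons() */
#include <assert.h>	/* static_assert */
#include <stdint.h>
#include <string.h>

#define MAC_LEN	6
#define IP4_LEN	4

/* Hypothetical flat layout: Ethernet header, ARP header, ARP payload */
struct arp_probe {
	uint8_t  eth_dst[MAC_LEN];
	uint8_t  eth_src[MAC_LEN];
	uint16_t eth_proto;		/* 0x0806: ARP */
	uint16_t hrd;			/* 1: Ethernet */
	uint16_t pro;			/* 0x0800: IPv4 */
	uint8_t  hln;			/* hardware address length */
	uint8_t  pln;			/* protocol address length */
	uint16_t op;			/* 1: request */
	uint8_t  sha[MAC_LEN];		/* sender hardware address */
	uint8_t  sip[IP4_LEN];		/* sender IPv4 address */
	uint8_t  tha[MAC_LEN];		/* target hardware address */
	uint8_t  tip[IP4_LEN];		/* target IPv4 address */
} __attribute__((__packed__));

/* An ARP request for IPv4 over Ethernet is 14 + 8 + 20 bytes long */
static_assert(sizeof(struct arp_probe) == 42, "unexpected frame size");

/* Fill in a broadcast ARP request asking who has guest_ip */
static void arp_probe_fill(struct arp_probe *p, const uint8_t *our_mac,
			   const uint8_t *our_ip, const uint8_t *guest_ip)
{
	/* Ethernet header: we don't know the guest MAC yet, so broadcast */
	memset(p->eth_dst, 0xff, MAC_LEN);
	memcpy(p->eth_src, our_mac, MAC_LEN);
	p->eth_proto = htons(0x0806);

	/* ARP header */
	p->hrd = htons(1);
	p->pro = htons(0x0800);
	p->hln = MAC_LEN;
	p->pln = IP4_LEN;
	p->op = htons(1);

	/* ARP payload: target hardware address is broadcast, as in the patch */
	memcpy(p->sha, our_mac, MAC_LEN);
	memcpy(p->sip, our_ip, IP4_LEN);
	memset(p->tha, 0xff, MAC_LEN);
	memcpy(p->tip, guest_ip, IP4_LEN);
}
```

If the guest really owns that address, it answers with a unicast ARP reply, and the source MAC of that reply is what the receive path can then record. If the address is wrong, nothing answers and nothing changes.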
On Tue, Sep 16, 2025 at 09:21:15PM +0200, Volker Diels-Grabsch wrote:
Signed-off-by: Volker Diels-Grabsch
Reviewed-by: David Gibson
---
 arp.c   | 34 ++++++++++++++++++++++++++++++++++
 arp.h   |  1 +
 ndp.c   | 20 ++++++++++++++++++++
 ndp.h   |  1 +
 passt.1 |  4 ++--
 tap.c   |  5 +++++
 6 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arp.c b/arp.c
index 44677ad..ad088b1 100644
--- a/arp.c
+++ b/arp.c
@@ -112,3 +112,37 @@ int arp(const struct ctx *c, struct iov_tail *data)
 
 	return 1;
 }
+
+/**
+ * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void arp_send_init_req(const struct ctx *c)
+{
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) req;
+
+	/* Ethernet header */
+	req.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
+	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
+
+	/* ARP header */
+	req.ah.ar_op = htons(ARPOP_REQUEST);
+	req.ah.ar_hrd = htons(ARPHRD_ETHER);
+	req.ah.ar_pro = htons(ETH_P_IP);
+	req.ah.ar_hln = ETH_ALEN;
+	req.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha));
+	memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip));
+	memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
+	memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip));
+
+	debug("Sending initial ARP request for guest MAC address");
+	tap_send_single(c, &req, sizeof(req));
+}
diff --git a/arp.h b/arp.h
index 86bcbf8..d5ad0e1 100644
--- a/arp.h
+++ b/arp.h
@@ -21,5 +21,6 @@ struct arpmsg {
 } __attribute__((__packed__));
 
 int arp(const struct ctx *c, struct iov_tail *data);
+void arp_send_init_req(const struct ctx *c);
 
 #endif /* ARP_H */
diff --git a/ndp.c b/ndp.c
index eb090cd..588b48f 100644
--- a/ndp.c
+++ b/ndp.c
@@ -438,3 +438,23 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
 first:
 	next_ra = now->tv_sec + interval;
 }
+
+/**
+ * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void ndp_send_init_req(const struct ctx *c)
+{
+	struct ndp_ns ns = {
+		.ih = {
+			.icmp6_type = NS,
+			.icmp6_code = 0,
+			.icmp6_router = 0, /* Reserved */
+			.icmp6_solicited = 0, /* Reserved */
+			.icmp6_override = 0, /* Reserved */
+		},
+		.target_addr = c->ip6.addr
+	};
+	debug("Sending initial NDP NS request for guest MAC address");
+	ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
+}
diff --git a/ndp.h b/ndp.h
index b1dd5e8..781ea86 100644
--- a/ndp.h
+++ b/ndp.h
@@ -11,5 +11,6 @@ struct icmp6hdr;
 int ndp(const struct ctx *c, const struct in6_addr *saddr,
 	struct iov_tail *data);
 void ndp_timer(const struct ctx *c, const struct timespec *now);
+void ndp_send_init_req(const struct ctx *c);
 
 #endif /* NDP_H */
diff --git a/passt.1 b/passt.1
index dd00b08..af5726a 100644
--- a/passt.1
+++ b/passt.1
@@ -330,8 +330,8 @@ selected IPv4 default route.
 
 .TP
 .BR \-\-no-ndp
-Disable NDP responses. NDP messages coming from guest or target namespace will
-be ignored.
+Disable Neighbor Discovery. NDP messages coming from guest or target
+namespace will be ignored. No initial NDP message will be sent.
 
 .TP
 .BR \-\-no-dhcpv6
diff --git a/tap.c b/tap.c
index 50e1a88..0f8ee25 100644
--- a/tap.c
+++ b/tap.c
@@ -1359,6 +1359,11 @@ static void tap_start_connection(const struct ctx *c)
 	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
+
+	if (c->ifi4)
+		arp_send_init_req(c);
+	if (c->ifi6 && !c->no_ndp)
+		ndp_send_init_req(c);
 }
 
 /**
-- 
2.47.3
The new wording clarifies that (1) we use the broadcast MAC address
only until we know the actual MAC address of the guest, and (2) our
first packets will not necessarily "reach" the guest, in the sense of
being processed rather than dropped. This is why we actively send an
initial ARP and/or NDP message, to get the guest MAC address as soon
as possible.
Signed-off-by: Volker Diels-Grabsch
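
For illustration only, and not part of the patch: the lifecycle described above (start from the broadcast address, replace it as soon as a frame from the guest shows a different source MAC) can be sketched in a few self-contained lines. `MAC_BROADCAST` is defined as in the util.h hunk of this series, while `struct guest_state`, `guest_mac_reset()` and `guest_mac_observe()` are hypothetical names standing in for the relevant parts of passt's `struct ctx` and `tap_add_packet()`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6	/* normally provided by system headers */
#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

/* Hypothetical stand-in for the guest_mac field of passt's struct ctx */
struct guest_state {
	uint8_t mac[ETH_ALEN];
};

/* On (re)start, the guest MAC is unknown: fall back to broadcast */
static void guest_mac_reset(struct guest_state *g)
{
	assert(g);
	memcpy(g->mac, MAC_BROADCAST, ETH_ALEN);
}

/* Record the source MAC of a frame received from the guest.
 *
 * Return: 1 if the stored address changed, 0 if it was already known
 */
static int guest_mac_observe(struct guest_state *g, const uint8_t *src)
{
	assert(g && src);

	if (!memcmp(g->mac, src, ETH_ALEN))
		return 0;

	memcpy(g->mac, src, ETH_ALEN);
	return 1;
}
```

Any frame from the guest updates the stored address, including the reply to the initial ARP or NDP request, which is what shortens the window during which the broadcast address is in use.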
On Tue, Sep 16, 2025 at 09:21:16PM +0200, Volker Diels-Grabsch wrote:
Signed-off-by: Volker Diels-Grabsch
Reviewed-by: David Gibson
---
 tap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tap.c b/tap.c
index 0f8ee25..399eeaa 100644
--- a/tap.c
+++ b/tap.c
@@ -1512,9 +1512,9 @@ void tap_backend_init(struct ctx *c)
 	case MODE_PASST:
 		tap_sock_unix_init(c);
 
-		/* In passt mode, we don't know the guest's MAC address until it
-		 * sends us packets. Use the broadcast address so that our
-		 * first packets will reach it.
+		/* In passt mode, we don't know the guest's MAC address until
+		 * it sends us packets. Until then, use the broadcast address
+		 * so that our first packets will have a chance to reach it.
 		 */
 		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
 		break;
-- 
2.47.3
On Tue, 16 Sep 2025 21:21:11 +0200, Volker Diels-Grabsch wrote:
This incorporates all feedback on v5, and the commits are slightly reordered so that the comment in the last commit makes more sense.
Applied, thanks for sticking to this! I hope this will finally make the
QEMU disconnect/reconnect behaviour robust enough.

A couple of notes for future changes:

- you can add this kind of message as cover letter instead, just
  git format-patch --cover-letter and git will format things as needed,
  and concatenate In-Reply-To: and References: email headers

- reporting the version number in every subject line might make
  reviewers' life marginally easier and it just takes a
  --subject-prefix="PATCH v6" (in this case) or even something like -v6
  argument to git format-patch

- carry Reviewed-by: tags if you don't... change the change. For
  example, here, 4/5 was already reviewed by David. I added the tag back
  (and will always do anyway)

...in any case, those are very minor details that don't really cause me
any trouble as a maintainer.

-- 
Stefano