On Wed, 10 Sep 2025 13:35:42 +0200
Volker Diels-Grabsch
When restarting passt while QEMU keeps running with a configured "reconnect-ms" setting, the port forwardings will stop working until the guest sends some outgoing network traffic.
Reason: Although QEMU reconnects successfully to the unix domain socket of the new passt process, that one no longer knows the guest's MAC address and uses instead the broadcast MAC address. However, this is ignored by the guest, at least if the guest runs Linux. Only after the guest sends some network package on its own initiative, passt will know the MAC address and will be able to establish forwarded connections.
This change fixes this issue by sending an ARP and an NDP request to resolve the guest's MAC address via its IPv4 and IPv6 address, which we do know, right after the unix domain socket (re)connection.
The only case where the IP is "wrong" would be if the configuration changed, or on the very first start right after qemu started. But in those cases, we just wouldn't get an ARP/NDP response, and can't do anything until we receive the guest's DHCP request - just as before. In other words, in the worst case the ARP/NDP requests would be harmless.
Signed-off-by: Volker Diels-Grabsch
--- arp.c | 35 +++++++++++++++++++++++++++++++++++ arp.h | 1 + ndp.c | 21 +++++++++++++++++++++ ndp.h | 1 + passt.1 | 9 +++++---- tap.c | 15 +++++++++++---- util.h | 2 ++ 7 files changed, 76 insertions(+), 8 deletions(-) diff --git a/arp.c b/arp.c index 44677ad..b263419 100644 --- a/arp.c +++ b/arp.c @@ -112,3 +112,38 @@ int arp(const struct ctx *c, struct iov_tail *data)
return 1; } + +/** + * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address + * @c: Execution context + */ +void arp_send_init_req(const struct ctx *c) +{ + struct { + struct ethhdr eh; + struct arphdr ah; + struct arpmsg am; + } __attribute__((__packed__)) req; + + /* Ethernet header */ + req.eh.h_proto = htons(ETH_P_ARP); + memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest)); + memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source)); + + /* ARP header */ + req.ah.ar_op = htons(ARPOP_REQUEST); + req.ah.ar_hrd = htons(ARPHRD_ETHER); + req.ah.ar_pro = htons(ETH_P_IP); + req.ah.ar_hln = ETH_ALEN; + req.ah.ar_pln = 4; + + /* ARP message */ + memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha)); + memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip)); + memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha)); + memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip)); + + debug("Sending initial ARP request to retrieve" + " guest MAC address after reconnect");
I should have mentioned this along with an earlier comment of mine I guess: user-visible strings are an exception to the 80-column limit, rationale: https://docs.kernel.org/process/coding-style.html#breaking-long-lines-and-st... ...by the way, note that this doesn't necessarily happen after a reconnection, because if the user specifies a given address for the guest (-a 192.0.2.1), the ARP request here makes sense even on the first connection. So maybe you could just say something like: debug("Sending initial ARP request for guest MAC address"); ?
+ tap_send_single(c, &req, sizeof(req)); +} diff --git a/arp.h b/arp.h index 86bcbf8..d5ad0e1 100644 --- a/arp.h +++ b/arp.h @@ -21,5 +21,6 @@ struct arpmsg { } __attribute__((__packed__));
int arp(const struct ctx *c, struct iov_tail *data); +void arp_send_init_req(const struct ctx *c);
#endif /* ARP_H */ diff --git a/ndp.c b/ndp.c index eb090cd..a47fe42 100644 --- a/ndp.c +++ b/ndp.c @@ -438,3 +438,24 @@ void ndp_timer(const struct ctx *c, const struct timespec *now) first: next_ra = now->tv_sec + interval; } + +/** + * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address + * @c: Execution context + */ +void ndp_send_init_req(const struct ctx *c) +{ + struct ndp_ns ns = { + .ih = { + .icmp6_type = NS, + .icmp6_code = 0, + .icmp6_router = 0, /* Reserved */ + .icmp6_solicited = 0, /* Reserved */ + .icmp6_override = 0, /* Reserved */ + }, + .target_addr = c->ip6.addr + }; + debug("Sending initial NDP NS request to retrieve" + " guest MAC address after reconnect");
Same as above.
+ ndp_send(c, &c->ip6.addr, &ns, sizeof(ns)); +} diff --git a/ndp.h b/ndp.h index b1dd5e8..781ea86 100644 --- a/ndp.h +++ b/ndp.h @@ -11,5 +11,6 @@ struct icmp6hdr; int ndp(const struct ctx *c, const struct in6_addr *saddr, struct iov_tail *data); void ndp_timer(const struct ctx *c, const struct timespec *now); +void ndp_send_init_req(const struct ctx *c);
#endif /* NDP_H */ diff --git a/passt.1 b/passt.1 index cef98b2..7000377 100644 --- a/passt.1 +++ b/passt.1 @@ -319,8 +319,9 @@ silently dropped.
.TP .BR \-\-no-icmp -Disable the ICMP/ICMPv6 echo handler. ICMP and ICMPv6 echo requests coming from -guest or target namespace will be silently dropped. +Disable the ICMP/ICMPv6 handler, and hence also NDP.
Maybe we could make the code directly reflect this and simplify checks? That is, in conf() (conf.c), you could set c->no_ndp if c->no_icmp (after all options have been parsed).
ICMP and ICMPv6 requests +coming from guest or target namespace will be silently dropped. No initial NDP +message will be sent.
.TP .BR \-\-no-dhcp @@ -330,8 +331,8 @@ selected IPv4 default route.
.TP .BR \-\-no-ndp -Disable NDP responses. NDP messages coming from guest or target namespace will -be ignored. +Disable Neighbor Discovery. NDP messages coming from guest or target +namespace will be ignored. No initial NDP message will be sent.
.TP .BR \-\-no-dhcpv6 diff --git a/tap.c b/tap.c index 7ba6399..25c32f9 100644 --- a/tap.c +++ b/tap.c @@ -1096,7 +1096,9 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data, return;
if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) { + char bufmac[ETH_ADDRSTRLEN];
Customary, for readability: one newline here (between declaration and code).
memcpy(c->guest_mac, eh->h_source, ETH_ALEN); + info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
This is one line you can break easily instead. But I don't think it should be info(), otherwise the guest can happily spam system logs on the host. I'm undecided between suggesting debug() or trace(), because on one hand it's a rather relevant information (maybe say "New guest MAC address"?), on the other hand nothing prevents a guest from using a different MAC address every other frame, and make debugging with --debug impossible if this happens. We don't have (yet) a generic rate limiting mechanism for prints. Maybe we could go with trace() for the moment, and the day we have it, switch this to a rate-limited debug() (or even info(), at that point).
proto_update_l2_buf(c->guest_mac, NULL); }
@@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c) ev.events = EPOLLIN | EPOLLRDHUP; ev.data.u64 = ref.u64; epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); + + if (c->ifi4) + arp_send_init_req(c); + if (c->ifi6 && !c->no_ndp && !c->no_icmp)
...here you could just check for c->no_ndp, if you set it on c->no_icmp.
+ ndp_send_init_req(c); }
/** @@ -1503,11 +1510,11 @@ void tap_backend_init(struct ctx *c) case MODE_PASST: tap_sock_unix_init(c);
- /* In passt mode, we don't know the guest's MAC address until it - * sends us packets. Use the broadcast address so that our - * first packets will reach it. + /* In passt mode, we don't know the guest's MAC address until + * it sends us packets. Until then, use the broadcast address + * so that our first packets will have a chance to reach it. */ - memset(&c->guest_mac, 0xff, sizeof(c->guest_mac)); + memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac)); break; }
diff --git a/util.h b/util.h index 2a8c38f..22eaac5 100644 --- a/util.h +++ b/util.h @@ -97,6 +97,8 @@ void abort_with_msg(const char *fmt, ...) #define FD_PROTO(x, proto) \ (IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
+#define MAC_BROADCAST \ + ((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }) #define MAC_ZERO ((uint8_t [ETH_ALEN]){ 0 }) #define MAC_IS_ZERO(addr) (!memcmp((addr), MAC_ZERO, ETH_ALEN))
The rest all looks good to me. -- Stefano