[PATCH v3 1/8] inany: Let length of sockaddr_inany be implicit from the family
sockaddr_inany can contain either an IPv4 or IPv6 socket address, so the
relevant length for bind() or connect() can vary. In pif_sockaddr() we
return that length, and in sock_l4_sa() we take it as an extra parameter.
However, sockaddr_inany always contains exactly a sockaddr_in or a
sockaddr_in6 each with a fixed size. Therefore we can derive the relevant
length from the family, and don't need to pass it around separately.
Make a tiny helper to get the relevant address length, and update all
interfaces to use that approach instead.
In the process, fix a buglet in tcp_flow_repair_bind(): we passed
sizeof(union sockaddr_inany) to bind() instead of the specific length for
the address family. Since the sizeof() is always longer than the specific
length, this is probably fine, but not theoretically correct.
Signed-off-by: David Gibson
Apologies for the very substantial delay. I have to admit I
procrastinated around 3/8 and 4/8 quite a bit as they look scary (but
much needed, of course).
Nits only, here:
On Wed, 29 Oct 2025 17:26:21 +1100
David Gibson
sockaddr_inany can contain either an IPv4 or IPv6 socket address, so the relevant length for bind() or connect() can vary. In pif_sockaddr() we return that length, and in sock_l4_sa() we take it as an extra parameter.
However, sockaddr_inany always contains exactly a sockaddr_in or a sockaddr_in6 each with a fixed size. Therefore we can derive the relevant length from the family, and don't need to pass it around separately.
Make a tiny helper to get the relevant address length, and update all interfaces to use that approach instead.
In the process, fix a buglet in tcp_flow_repair_bind(): we passed sizeof(union sockaddr_inany) to bind() instead of the specific length for the address family. Since the sizeof() is always longer than the specific length, this is probably fine, but not theoretically correct.
(Whoops.)
Signed-off-by: David Gibson
--- flow.c | 17 +++++++---------- icmp.c | 3 ++- inany.h | 18 ++++++++++++++++++ pif.c | 14 ++++---------- pif.h | 2 +- tcp.c | 20 +++++++++----------- tcp_splice.c | 5 ++--- udp.c | 8 +++----- util.c | 9 ++++----- util.h | 5 +++-- 10 files changed, 53 insertions(+), 48 deletions(-) diff --git a/flow.c b/flow.c index feefda3c..9926f408 100644 --- a/flow.c +++ b/flow.c @@ -170,8 +170,7 @@ struct flowside_sock_args { int fd; int err; enum epoll_type type; - const struct sockaddr *sa; - socklen_t sl;
This should be dropped from the struct documentation.
+ const union sockaddr_inany *sa; const char *path; uint32_t data;
Note: you'll get a trivial conflict here with "util: Move epoll registration out of sock_l4_sa()" if you rebase.
}; @@ -187,7 +186,7 @@ static int flowside_sock_splice(void *arg)
ns_enter(a->c);
- a->fd = sock_l4_sa(a->c, a->type, a->sa, a->sl, NULL, + a->fd = sock_l4_sa(a->c, a->type, a->sa, NULL, a->sa->sa_family == AF_INET6, a->data); a->err = errno;
@@ -209,11 +208,10 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, { const char *ifname = NULL; union sockaddr_inany sa; - socklen_t sl;
ASSERT(pif_is_socket(pif));
- pif_sockaddr(c, &sa, &sl, pif, &tgt->oaddr, tgt->oport); + pif_sockaddr(c, &sa, pif, &tgt->oaddr, tgt->oport);
switch (pif) { case PIF_HOST: @@ -224,13 +222,13 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, else if (sa.sa_family == AF_INET6) ifname = c->ip6.ifname_out;
- return sock_l4_sa(c, type, &sa, sl, ifname, + return sock_l4_sa(c, type, &sa, ifname, sa.sa_family == AF_INET6, data);
case PIF_SPLICE: { struct flowside_sock_args args = { .c = c, .type = type, - .sa = &sa.sa, .sl = sl, .data = data, + .sa = &sa, .data = data, }; NS_CALL(flowside_sock_splice, &args); errno = args.err; @@ -259,10 +257,9 @@ int flowside_connect(const struct ctx *c, int s, uint8_t pif, const struct flowside *tgt) { union sockaddr_inany sa; - socklen_t sl;
- pif_sockaddr(c, &sa, &sl, pif, &tgt->eaddr, tgt->eport); - return connect(s, &sa.sa, sl); + pif_sockaddr(c, &sa, pif, &tgt->eaddr, tgt->eport); + return connect(s, &sa.sa, socklen_inany(&sa)); }
/** flow_log_ - Log flow-related message diff --git a/icmp.c b/icmp.c index 6dffafb0..62b3bed8 100644 --- a/icmp.c +++ b/icmp.c @@ -301,8 +301,9 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, ASSERT(flow_proto[pingf->f.type] == proto); pingf->ts = now->tv_sec;
- pif_sockaddr(c, &sa, &msh.msg_namelen, PIF_HOST, &tgt->eaddr, 0); + pif_sockaddr(c, &sa, PIF_HOST, &tgt->eaddr, 0); msh.msg_name = &sa; + msh.msg_namelen = socklen_inany(&sa); msh.msg_iov = iov; msh.msg_iovlen = cnt; msh.msg_control = NULL; diff --git a/inany.h b/inany.h index 7ca5cbd3..e3cae2a8 100644 --- a/inany.h +++ b/inany.h @@ -67,6 +67,24 @@ union sockaddr_inany { struct sockaddr_in6 sa6; };
+/** socklen_inany() - Return relevant size of an sockaddr_inany
'of a' ...but the fact that it returns that sounds kind of pleonastic. What about: /** socklen_inany() - Get relevant address length for sockaddr_inany address
+ * @sa: sockaddr_iany socket address
sockaddr_inany
+ * + * Returns the correct socket address length to pass to bind() or connect() with + * @sa, based on whether it is an IPv4 or IPv6 address.
Same here, I guess we obviously want it to be correct, and we have a somewhat standard syntax for return values in documentation, what about: * Return: socket address length for bind() or connect(), from IP family in @sa ?
+ */ +static inline socklen_t socklen_inany(const union sockaddr_inany *sa) +{ + switch (sa->sa_family) { + case AF_INET: + return sizeof(sa->sa4);
Wouldn't it be more obvious to return sizeof(struct sockaddr_in), and sizeof(struct sockaddr_in6) below, instead? I actually had to check that sa->sa4 matched that (I didn't remember if that was a union). Not a strong preference from my side, both are obviously correct anyway.
+ case AF_INET6: + return sizeof(sa->sa6); + default: + ASSERT(0); + } +}
-- Stefano
On Thu, Nov 13, 2025 at 07:33:04AM +0100, Stefano Brivio wrote:
Apologies for the very substantial delay. I have to admit I procrastinated around 3/8 and 4/8 quite a bit as they look scary (but much needed, of course).
Nits only, here:
On Wed, 29 Oct 2025 17:26:21 +1100 David Gibson
wrote: sockaddr_inany can contain either an IPv4 or IPv6 socket address, so the relevant length for bind() or connect() can vary. In pif_sockaddr() we return that length, and in sock_l4_sa() we take it as an extra parameter.
However, sockaddr_inany always contains exactly a sockaddr_in or a sockaddr_in6 each with a fixed size. Therefore we can derive the relevant length from the family, and don't need to pass it around separately.
Make a tiny helper to get the relevant address length, and update all interfaces to use that approach instead.
In the process, fix a buglet in tcp_flow_repair_bind(): we passed sizeof(union sockaddr_inany) to bind() instead of the specific length for the address family. Since the sizeof() is always longer than the specific length, this is probably fine, but not theoretically correct.
(Whoops.)
Signed-off-by: David Gibson
--- flow.c | 17 +++++++---------- icmp.c | 3 ++- inany.h | 18 ++++++++++++++++++ pif.c | 14 ++++---------- pif.h | 2 +- tcp.c | 20 +++++++++----------- tcp_splice.c | 5 ++--- udp.c | 8 +++----- util.c | 9 ++++----- util.h | 5 +++-- 10 files changed, 53 insertions(+), 48 deletions(-) diff --git a/flow.c b/flow.c index feefda3c..9926f408 100644 --- a/flow.c +++ b/flow.c @@ -170,8 +170,7 @@ struct flowside_sock_args { int fd; int err; enum epoll_type type; - const struct sockaddr *sa; - socklen_t sl;
This should be dropped from the struct documentation.
Good point, done.
+ const union sockaddr_inany *sa; const char *path; uint32_t data;
Note: you'll get a trivial conflict here with "util: Move epoll registration out of sock_l4_sa()" if you rebase.
Actually, there are a heap of conflicts with that patch. I'll definitely respin for that reason if no other.
}; @@ -187,7 +186,7 @@ static int flowside_sock_splice(void *arg)
ns_enter(a->c);
- a->fd = sock_l4_sa(a->c, a->type, a->sa, a->sl, NULL, + a->fd = sock_l4_sa(a->c, a->type, a->sa, NULL, a->sa->sa_family == AF_INET6, a->data); a->err = errno;
@@ -209,11 +208,10 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, { const char *ifname = NULL; union sockaddr_inany sa; - socklen_t sl;
ASSERT(pif_is_socket(pif));
- pif_sockaddr(c, &sa, &sl, pif, &tgt->oaddr, tgt->oport); + pif_sockaddr(c, &sa, pif, &tgt->oaddr, tgt->oport);
switch (pif) { case PIF_HOST: @@ -224,13 +222,13 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, else if (sa.sa_family == AF_INET6) ifname = c->ip6.ifname_out;
- return sock_l4_sa(c, type, &sa, sl, ifname, + return sock_l4_sa(c, type, &sa, ifname, sa.sa_family == AF_INET6, data);
case PIF_SPLICE: { struct flowside_sock_args args = { .c = c, .type = type, - .sa = &sa.sa, .sl = sl, .data = data, + .sa = &sa, .data = data, }; NS_CALL(flowside_sock_splice, &args); errno = args.err; @@ -259,10 +257,9 @@ int flowside_connect(const struct ctx *c, int s, uint8_t pif, const struct flowside *tgt) { union sockaddr_inany sa; - socklen_t sl;
- pif_sockaddr(c, &sa, &sl, pif, &tgt->eaddr, tgt->eport); - return connect(s, &sa.sa, sl); + pif_sockaddr(c, &sa, pif, &tgt->eaddr, tgt->eport); + return connect(s, &sa.sa, socklen_inany(&sa)); }
/** flow_log_ - Log flow-related message diff --git a/icmp.c b/icmp.c index 6dffafb0..62b3bed8 100644 --- a/icmp.c +++ b/icmp.c @@ -301,8 +301,9 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, ASSERT(flow_proto[pingf->f.type] == proto); pingf->ts = now->tv_sec;
- pif_sockaddr(c, &sa, &msh.msg_namelen, PIF_HOST, &tgt->eaddr, 0); + pif_sockaddr(c, &sa, PIF_HOST, &tgt->eaddr, 0); msh.msg_name = &sa; + msh.msg_namelen = socklen_inany(&sa); msh.msg_iov = iov; msh.msg_iovlen = cnt; msh.msg_control = NULL; diff --git a/inany.h b/inany.h index 7ca5cbd3..e3cae2a8 100644 --- a/inany.h +++ b/inany.h @@ -67,6 +67,24 @@ union sockaddr_inany { struct sockaddr_in6 sa6; };
+/** socklen_inany() - Return relevant size of an sockaddr_inany
'of a'
...but the fact that it returns that sounds kind of pleonastic. What about:
/** socklen_inany() - Get relevant address length for sockaddr_inany address
Good idea, done.
+ * @sa: sockaddr_iany socket address
sockaddr_inany
Fixed.
+ * + * Returns the correct socket address length to pass to bind() or connect() with + * @sa, based on whether it is an IPv4 or IPv6 address.
Same here, I guess we obviously want it to be correct, and we have a somewhat standard syntax for return values in documentation, what about:
* Return: socket address length for bind() or connect(), from IP family in @sa
?
Again, good idea, done.
+ */ +static inline socklen_t socklen_inany(const union sockaddr_inany *sa) +{ + switch (sa->sa_family) { + case AF_INET: + return sizeof(sa->sa4);
Wouldn't it be more obvious to return sizeof(struct sockaddr_in), and sizeof(struct sockaddr_in6) below, instead? I actually had to check that sa->sa4 matched that (I didn't remember if that was a union).
Eh, maybe?
Not a strong preference from my side, both are obviously correct anyway.
I usually prefer to use sizeof() on variables not types when possible - it means things tend to either work or break more obviously if the types of the variables get changed. Leaving this one for now.
+ case AF_INET6: + return sizeof(sa->sa6); + default: + ASSERT(0); + } +}
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
participants (2)
-
David Gibson
-
Stefano Brivio