[PATCH v3 00/10] Optionally copy all routes and addresses for pasta, allow gateway-less routes
This series, along with pseudo-related fixes, enables: - optional copy of all routes from selected interface in outer namespace, to fix the issue reported by Callum at: https://github.com/containers/podman/issues/18539 - optional copy of all addresses, mostly for consistency. It doesn't, however, enable assignment of multiple addresses in the sense requested at: https://bugs.passt.top/show_bug.cgi?id=47 because the addresses still need to be present on the host, and the "outer" address isn't selected depending on the address used inside the container - operation without a gateway address. This is related to: https://bugs.passt.top/show_bug.cgi?id=49 but Wireguard endpoints established outside the container can't be used yet as outbound interface (without the workaround reported there) for a number of reasons I'm still investigating. In any case, the correct route is now configured in the container, even without a default gateway on the corresponding host route, so we're a bit closer to support that configuration out of the box. v3: - in 5/10 and 9/10: mark the new --no-copy-routes and --no-copy-addrs options as deprecated to address David's concern. They are hopefully not useful and we can drop those, but they're nice to have around at the moment in case to debug issues that might be related to this series v2: - in 3/10, repeat the netlink request once for each RTM_NEWROUTE we're going to send as part of the request: routes might depend on each other, and this is a somewhat rudimentary, but seemingly robust approach to insert all the routes we can insert, without explicitly calculating dependencies - Cc: Andrea, reporter for the issue fixed in 4/10 Stefano Brivio (10): netlink: Fix comment about response buffer size for nl_req() pasta: Improve error handling on failure to join network namespace netlink: Add functionality to copy routes from outer namespace conf: --config-net option is for pasta mode only conf, pasta: With --config-net, copy all routes by default Revert "conf: Adjust netmask on mismatch between IPv4 address/netmask and gateway" conf: Don't exit if sourced default route has no gateway netlink: Add functionality to copy addresses from outer namespace conf, pasta: With --config-net, copy all addresses by default passt.h: Fix description of pasta_ifi in struct ctx conf.c | 85 +++++++++++++++++++++------------- netlink.c | 135 +++++++++++++++++++++++++++++++++++++++++------------- netlink.h | 13 ++++-- passt.1 | 35 +++++++++++++- passt.h | 8 +++- pasta.c | 26 +++++++---- 6 files changed, 221 insertions(+), 81 deletions(-) -- 2.39.2
Fixes: fde8004ab0b4 ("netlink: Use 8 KiB * netlink message header size as response buffer")
Signed-off-by: Stefano Brivio
In pasta_wait_for_ns(), open() failing with ENOENT is expected: we're
busy-looping until the network namespace appears. But any other
failure is not something we're going to recover from: return right
away if we don't get either success or ENOENT.
Now that pasta_wait_for_ns() can actually fail, handle that in
pasta_start_ns() by reporting the issue and exiting.
Looping on EPERM, when pasta doesn't actually have the permissions to
join a given namespace, isn't exactly a productive thing to do.
Signed-off-by: Stefano Brivio
On Mon, 22 May 2023 19:45:59 +0200
Stefano Brivio
In pasta_wait_for_ns(), open() failing with ENOENT is expected: we're busy-looping until the network namespace appears. But any other failure is not something we're going to recover from: return right away if we don't get either success or ENOENT.
Now that pasta_wait_for_ns() can actually fail, handle that in pasta_start_ns() by reporting the issue and exiting.
Looping on EPERM, when pasta doesn't actually have the permissions to join a given namespace, isn't exactly a productive thing to do.
Signed-off-by: Stefano Brivio
Reviewed-by: David Gibson --- pasta.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/pasta.c b/pasta.c index b30ce70..2a6fb60 100644 --- a/pasta.c +++ b/pasta.c @@ -95,8 +95,11 @@ static int pasta_wait_for_ns(void *arg) char ns[PATH_MAX];
snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid); - do - while ((c->pasta_netns_fd = open(ns, flags)) < 0); + while ((c->pasta_netns_fd = open(ns, flags)) < 0) { + if (errno != ENOENT) + return 0; + } + while (setns(c->pasta_netns_fd, CLONE_NEWNET) && !close(c->pasta_netns_fd));
Oops, what did I do here... :( On a failed setns(), we need (in most cases) to close and reopen the file. The fix and intention are quite obvious so I'm just fixing this up now as I'm applying it. -- Stefano
Instead of just fetching the default gateway and configuring a single
equivalent route in the target namespace, on 'pasta --config-net', it
might be desirable in some cases to copy the whole set of routes
corresponding to a given output interface.
For instance, in:
https://github.com/containers/podman/issues/18539
IPv4 Default Route Does Not Propagate to Pasta Containers on Hetzner VPSes
configuring the default gateway won't work without a gateway-less
route (specifying the output interface only), because the default
gateway is, somewhat dubiously, not on the same subnet as the
container.
This is a similar case to the one covered by commit 7656a6f88882
("conf: Adjust netmask on mismatch between IPv4 address/netmask and
gateway"), and I'm not exactly proud of that workaround.
We also have:
https://bugs.passt.top/show_bug.cgi?id=49
pasta does not work with tap-style interface
for which, eventually, we should be able to configure a gateway-less
route in the target namespace.
Introduce different operation modes for nl_route(), including a new
NL_DUP one, not exposed yet, which simply parrots back to the kernel
the route dump for a given interface from the outer namespace, fixing
up flags and interface indices on the way, and requesting to add the
same routes in the target namespace, on the interface we manage.
For n routes we want to duplicate, send n identical netlink requests
including the full dump: routes might depend on each other and the
kernel processes RTM_NEWROUTE messages sequentially, not atomically,
and repeating the full dump naturally resolves dependencies without
the need to actually calculate them.
I'm not kidding, it actually works pretty well.
Link: https://github.com/containers/podman/issues/18539
Link: https://bugs.passt.top/show_bug.cgi?id=49
Signed-off-by: Stefano Brivio
Reported-by: Andrea Arcangeli
Use the newly-introduced NL_DUP mode for nl_route() to copy all the
routes associated to the template interface in the outer namespace,
unless --no-copy-routes (also implied by -g) is given.
This option is introduced as deprecated right away: it's not expected
to be of any use, but it's helpful to keep it around for a while to
debug any suspected issue with this change.
Otherwise, we can't use default gateways which are not, address-wise,
on the same subnet as the container, as reported by Callum.
Reported-by: Callum Parsey
On Mon, May 22, 2023 at 07:46:02PM +0200, Stefano Brivio wrote:
Use the newly-introduced NL_DUP mode for nl_route() to copy all the routes associated to the template interface in the outer namespace, unless --no-copy-routes (also implied by -g) is given.
This option is introduced as deprecated right away: it's not expected to be of any use, but it's helpful to keep it around for a while to debug any suspected issue with this change.
Otherwise, we can't use default gateways which are not, address-wise, on the same subnet as the container, as reported by Callum.
Reported-by: Callum Parsey
Link: https://github.com/containers/podman/issues/18539 Signed-off-by: Stefano Brivio
Reviewed-by: David Gibson
--- conf.c | 16 ++++++++++++++++ passt.1 | 15 +++++++++++++++ passt.h | 4 +++- pasta.c | 6 ++++-- 4 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/conf.c b/conf.c index 3ee6ae0..0a01983 100644 --- a/conf.c +++ b/conf.c @@ -923,6 +923,8 @@ pasta_opts: info( " --no-netns-quit Don't quit if filesystem-bound target"); info( " network namespace is deleted"); info( " --config-net Configure tap interface in namespace"); + info( " --no-copy-routes DEPRECATED:"); + info( " Don't copy all routes to namespace"); info( " --ns-mac-addr ADDR Set MAC address on tap interface");
exit(EXIT_FAILURE); @@ -1198,6 +1200,7 @@ void conf(struct ctx *c, int argc, char **argv) {"outbound-if4", required_argument, NULL, 15 }, {"outbound-if6", required_argument, NULL, 16 }, {"config-net", no_argument, NULL, 17 }, + {"no-copy-routes", no_argument, NULL, 18 }, { 0 }, }; struct get_bound_ports_ns_arg ns_ports_arg = { .c = c }; @@ -1362,6 +1365,13 @@ void conf(struct ctx *c, int argc, char **argv)
c->pasta_conf_ns = 1; break; + case 18: + if (c->mode != MODE_PASTA) + die("--no-copy-routes is for pasta mode only"); + + warn("--no-copy-routes will be dropped soon"); + c->no_copy_routes = 1; + break; case 'd': if (c->debug) die("Multiple --debug options given"); @@ -1510,6 +1520,9 @@ void conf(struct ctx *c, int argc, char **argv) } break; case 'g': + if (c->mode == MODE_PASTA) + c->no_copy_routes = 1; + if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.gw) && inet_pton(AF_INET6, optarg, &c->ip6.gw) && !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.gw) && @@ -1644,6 +1657,9 @@ void conf(struct ctx *c, int argc, char **argv) if (*c->sock_path && c->fd_tap >= 0) die("Options --socket and --fd are mutually exclusive");
+ if (c->mode == MODE_PASTA && c->no_copy_routes && !c->pasta_conf_ns) + die("Option --no-copy-routes needs --config-net"); + if (!ifi4 && *c->ip4.ifname_out) ifi4 = if_nametoindex(c->ip4.ifname_out);
diff --git a/passt.1 b/passt.1 index 19e01d5..ee2803a 100644 --- a/passt.1 +++ b/passt.1 @@ -546,6 +546,21 @@ NAME are given as target), do not exit once the network namespace is deleted. Configure networking in the namespace: set up addresses and routes as configured or sourced from the host, and bring up the tap interface.
+.TP +.BR \-\-no-copy-routes " " (DEPRECATED) +With \-\-config-net, do not copy all the routes associated to the interface we +derive addresses and routes from: set up only the default gateway. Implied by +-g, \-\-gateway. + +Default is to copy all the routing entries from the interface in the outer +namespace to the target namespace, translating the output interface attribute to +the outbound interface in the namespace. + +Note that this configuration option is \fBdeprecated\fR and will be removed in a +future version. It is not expected to be of any use, and it simply reflects a +legacy behaviour. If you have any use for this, refer to \fBREPORTING BUGS\fR +below. + .TP .BR \-\-ns-mac-addr " " \fIaddr Configure MAC address \fIaddr\fR on the tap interface in the namespace. diff --git a/passt.h b/passt.h index 73fe808..d314596 100644 --- a/passt.h +++ b/passt.h @@ -181,7 +181,8 @@ struct ip6_ctx { * @ip6: IPv6 configuration * @pasta_ifn: Name of namespace interface for pasta * @pasta_ifn: Index of namespace interface for pasta - * @pasta_conf_ns: Configure namespace interface after creating it + * @pasta_conf_ns: Configure namespace after creating it + * @no_copy_routes: Don't copy all routes when configuring target namespace * @no_tcp: Disable TCP operation * @tcp: Context for TCP protocol handler * @no_tcp: Disable UDP operation @@ -240,6 +241,7 @@ struct ctx { char pasta_ifn[IF_NAMESIZE]; unsigned int pasta_ifi; int pasta_conf_ns; + int no_copy_routes;
int no_tcp; struct tcp_ctx tcp; diff --git a/pasta.c b/pasta.c index 01109f5..b546c93 100644 --- a/pasta.c +++ b/pasta.c @@ -273,12 +273,14 @@ void pasta_ns_conf(struct ctx *c) nl_link(1, 1 /* lo */, MAC_ZERO, 1, 0);
if (c->pasta_conf_ns) { + enum nl_op op_routes = c->no_copy_routes ? NL_SET : NL_DUP; + nl_link(1, c->pasta_ifi, c->mac_guest, 1, c->mtu);
if (c->ifi4) { nl_addr(1, c->pasta_ifi, AF_INET, &c->ip4.addr, &c->ip4.prefix_len, NULL); - nl_route(NL_SET, c->ifi4, c->pasta_ifi, AF_INET, + nl_route(op_routes, c->ifi4, c->pasta_ifi, AF_INET, &c->ip4.gw); }
@@ -286,7 +288,7 @@ void pasta_ns_conf(struct ctx *c) int prefix_len = 64; nl_addr(1, c->pasta_ifi, AF_INET6, &c->ip6.addr, &prefix_len, NULL); - nl_route(NL_SET, c->ifi6, c->pasta_ifi, AF_INET6, + nl_route(op_routes, c->ifi6, c->pasta_ifi, AF_INET6, &c->ip6.gw); } } else {
-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
This reverts commit 7656a6f8888237b9e23d63666e921528b6aaf950: now, by
default, we copy all the routes associated to the outbound interface
into the routing table of the container, so there's no need for this
horrible workaround anymore.
Signed-off-by: Stefano Brivio
If we use a template interface without a gateway on the default
route, we can still offer almost complete functionality, except that,
of course, we can't map the gateway address to the outer namespace or
host, and that we have no obvious server address or identifier for
use in DHCP's siaddr and option 54 (Server identifier, mandatory).
Continue, if we have a default route but no default gateway, and
imply --no-map-gw and --no-dhcp in that case. NDP responder and
DHCPv6 should be able to work as usual because we require a
link-local address to be present, and we'll fall back to that.
Together with the previous commits implementing an actual copy of
routes from the outer namespace, this should finally fix the
operation of 'pasta --config-net' for cases where we have a default
route on the host, but no default gateway, as it's the case for
tap-style routes, including typical Wireguard endpoints.
Reported-by: me@yawnt.com
Link: https://bugs.passt.top/show_bug.cgi?id=49
Signed-off-by: Stefano Brivio
Similarly to what we've just done with routes, support NL_DUP for
addresses (currently not exposed): nl_addr() can optionally copy
mulitple addresses to the target namespace, by fixing up data from
the dump with appropriate flags and interface index, and repeating
it back to the kernel on the socket opened in the target namespace.
Link-local addresses are not copied: the family is set to AF_UNSPEC,
which means the kernel will ignore them. Same for addresses from a
mismatching address (pre-4.19 kernels without support for
NETLINK_GET_STRICT_CHK).
Ignore IFA_LABEL attributes by changing their type to IFA_UNSPEC,
because in general they will report mismatching names, and we don't
really need to use labels as we already know the interface index.
Signed-off-by: Stefano Brivio
Use the newly-introduced NL_DUP mode for nl_addr() to copy all the
addresses associated to the template interface in the outer
namespace, unless --no-copy-addrs (also implied by -a) is given.
This option is introduced as deprecated right away: it's not expected
to be of any use, but it's helpful to keep it around for a while to
debug any suspected issue with this change.
This is done mostly for consistency with routes. It might partially
cover the issue at:
https://bugs.passt.top/show_bug.cgi?id=47
Support multiple addresses per address family
for some use cases, but not the originally intended one: we'll still
use a single outbound address (unless the routing table specifies
different preferred source addresses depending on the destination),
regardless of the address used in the target namespace.
Link: https://bugs.passt.top/show_bug.cgi?id=47
Signed-off-by: Stefano Brivio
On Mon, May 22, 2023 at 07:46:06PM +0200, Stefano Brivio wrote:
Use the newly-introduced NL_DUP mode for nl_addr() to copy all the addresses associated to the template interface in the outer namespace, unless --no-copy-addrs (also implied by -a) is given.
This option is introduced as deprecated right away: it's not expected to be of any use, but it's helpful to keep it around for a while to debug any suspected issue with this change.
This is done mostly for consistency with routes. It might partially cover the issue at: https://bugs.passt.top/show_bug.cgi?id=47 Support multiple addresses per address family
for some use cases, but not the originally intended one: we'll still use a single outbound address (unless the routing table specifies different preferred source addresses depending on the destination), regardless of the address used in the target namespace.
Link: https://bugs.passt.top/show_bug.cgi?id=47 Signed-off-by: Stefano Brivio
Reviewed-by: David Gibson
--- conf.c | 18 ++++++++++++++++-- passt.1 | 14 ++++++++++++++ passt.h | 2 ++ pasta.c | 5 +++-- 4 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/conf.c b/conf.c index cb1e09a..d8414fe 100644 --- a/conf.c +++ b/conf.c @@ -902,6 +902,8 @@ pasta_opts: info( " --config-net Configure tap interface in namespace"); info( " --no-copy-routes DEPRECATED:"); info( " Don't copy all routes to namespace"); + info( " --no-copy-addrs DEPRECATED:"); + info( " Don't copy all addresses to namespace"); info( " --ns-mac-addr ADDR Set MAC address on tap interface");
exit(EXIT_FAILURE); @@ -1178,6 +1180,7 @@ void conf(struct ctx *c, int argc, char **argv) {"outbound-if6", required_argument, NULL, 16 }, {"config-net", no_argument, NULL, 17 }, {"no-copy-routes", no_argument, NULL, 18 }, + {"no-copy-addrs", no_argument, NULL, 19 }, { 0 }, }; struct get_bound_ports_ns_arg ns_ports_arg = { .c = c }; @@ -1349,6 +1352,13 @@ void conf(struct ctx *c, int argc, char **argv) warn("--no-copy-routes will be dropped soon"); c->no_copy_routes = 1; break; + case 19: + if (c->mode != MODE_PASTA) + die("--no-copy-addrs is for pasta mode only"); + + warn("--no-copy-addrs will be dropped soon"); + c->no_copy_addrs = 1; + break; case 'd': if (c->debug) die("Multiple --debug options given"); @@ -1634,8 +1644,12 @@ void conf(struct ctx *c, int argc, char **argv) if (*c->sock_path && c->fd_tap >= 0) die("Options --socket and --fd are mutually exclusive");
- if (c->mode == MODE_PASTA && c->no_copy_routes && !c->pasta_conf_ns) - die("Option --no-copy-routes needs --config-net"); + if (c->mode == MODE_PASTA && !c->pasta_conf_ns) { + if (c->no_copy_routes) + die("Option --no-copy-routes needs --config-net"); + if (c->no_copy_addrs) + die("Option --no-copy-addrs needs --config-net"); + }
if (!ifi4 && *c->ip4.ifname_out) ifi4 = if_nametoindex(c->ip4.ifname_out); diff --git a/passt.1 b/passt.1 index 96ce96e..1ad4276 100644 --- a/passt.1 +++ b/passt.1 @@ -563,6 +563,20 @@ future version. It is not expected to be of any use, and it simply reflects a legacy behaviour. If you have any use for this, refer to \fBREPORTING BUGS\fR below.
+.TP +.BR \-\-no-copy-addrs " " (DEPRECATED) +With \-\-config-net, do not copy all the addresses associated to the interface +we derive addresses and routes from: set up a single one. Implied by \-a, +\-\-address. + +Default is to copy all the addresses, except for link-local ones, from the +interface from the outer namespace to the target namespace. + +Note that this configuration option is \fBdeprecated\fR and will be removed in a +future version. It is not expected to be of any use, and it simply reflects a +legacy behaviour. If you have any use for this, refer to \fBREPORTING BUGS\fR +below. + .TP .BR \-\-ns-mac-addr " " \fIaddr Configure MAC address \fIaddr\fR on the tap interface in the namespace. diff --git a/passt.h b/passt.h index d314596..b51a1e5 100644 --- a/passt.h +++ b/passt.h @@ -183,6 +183,7 @@ struct ip6_ctx { * @pasta_ifn: Index of namespace interface for pasta * @pasta_conf_ns: Configure namespace after creating it * @no_copy_routes: Don't copy all routes when configuring target namespace + * @no_copy_addrs: Don't copy all addresses when configuring namespace * @no_tcp: Disable TCP operation * @tcp: Context for TCP protocol handler * @no_tcp: Disable UDP operation @@ -242,6 +243,7 @@ struct ctx { unsigned int pasta_ifi; int pasta_conf_ns; int no_copy_routes; + int no_copy_addrs;
int no_tcp; struct tcp_ctx tcp; diff --git a/pasta.c b/pasta.c index 99ef3fc..4054e9a 100644 --- a/pasta.c +++ b/pasta.c @@ -274,11 +274,12 @@ void pasta_ns_conf(struct ctx *c)
if (c->pasta_conf_ns) { enum nl_op op_routes = c->no_copy_routes ? NL_SET : NL_DUP; + enum nl_op op_addrs = c->no_copy_addrs ? NL_SET : NL_DUP;
nl_link(1, c->pasta_ifi, c->mac_guest, 1, c->mtu);
if (c->ifi4) { - nl_addr(NL_SET, c->ifi4, c->pasta_ifi, AF_INET, + nl_addr(op_addrs, c->ifi4, c->pasta_ifi, AF_INET, &c->ip4.addr, &c->ip4.prefix_len, NULL); nl_route(op_routes, c->ifi4, c->pasta_ifi, AF_INET, &c->ip4.gw); @@ -286,7 +287,7 @@ void pasta_ns_conf(struct ctx *c)
if (c->ifi6) { int prefix_len = 64; - nl_addr(NL_SET, c->ifi6, c->pasta_ifi, AF_INET6, + nl_addr(op_addrs, c->ifi6, c->pasta_ifi, AF_INET6, &c->ip6.addr, &prefix_len, NULL); nl_route(op_routes, c->ifi6, c->pasta_ifi, AF_INET6, &c->ip6.gw);
-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Signed-off-by: Stefano Brivio
participants (2)
-
David Gibson
-
Stefano Brivio