[PATCH v5 00/18] RFC: Dynamic configuration update implementation
Here's the next draft of dynamic configuration updates. This now can successfully update rules, though I've not tested it very extensively. Patches 1..8/18 are preliminary reworks that make sense even without pesto - feel free to apply if you're happy with them. I don't think the rest should be applied yet; we need to at least harden it so passt can't be blocked indefinitely by a client which sends a partial update then waits. Based on my earlier series reworking static checking invocation. TODO: - Don't allow a client which sends a partial configuration then blocks also block passt - Allow pesto to clear existing configuration, not just add - Allow pesto selectively delete existing rules, not just add Changes in v5: * If multiple clients connect at once, they're now blocked until the first one finishes, instead of later ones being discarded Changes in v4: * Merged with remainder of forward rule parsing rework series * Fix some bugs in rule checking pointed out by Laurent * Significantly cleaned up option parsing code * Changed from replacing all existing rules to adding new rules (clear and remove still TBD) * Somewhat simplified protocol (pif names and rules sent in a single pass) * pesto is now allocation free * Fixed commit message and style nits pointed out by Stefano Changes in v3: * Removed already applied ASSERT() rename * Renamed serialisation functions * Incorporated Stefano's extensions, reworked and fixed * Several additional cleanups / preliminary reworks Changes in v2: * Removed already applied cleanups * Reworked assert() patch to handle -DNDEBUG properly * Numerous extra patches: * Factored out serialisation helpers and use them for migration as well * Reworked to allow ip.[ch] and inany.[ch] to be shared with pesto * Reworks to share some forwarding rule datatypes with pesto * Implemented sending pif names and current ruleset to pesto David Gibson (18): conf, fwd: Stricter rule checking in fwd_rule_add() fwd_rule: Move ephemeral port probing to fwd_rule.c fwd, conf: Move rule parsing code to fwd_rule.[ch] fwd_rule: Move conflict checking back within fwd_rule_add() fwd: Generalise fwd_rules_info() pif: Limit pif names to 128 bytes fwd_rule: Fix some format specifiers tap, repair: Use SOCK_NONBLOCK and SOCK_CLOEXEC on Unix sockets pesto: Introduce stub configuration tool pesto, log: Share log.h (but not log.c) with pesto tool pesto, conf: Have pesto connect to passt and check versions pesto: Expose list of pifs to pesto and optionally display ip: Prepare ip.[ch] for sharing with pesto tool inany: Prepare inany.[ch] for sharing with pesto tool pesto: Read current ruleset from passt/pasta and optionally display it pesto: Parse and add new rules from command line pesto, conf: Send updated rules from pesto back to passt/pasta conf, fwd: Allow switching to new rules received from pesto .gitignore | 2 + Makefile | 54 ++-- common.h | 122 +++++++++ conf.c | 686 ++++++++++++++++++++++----------------------------- conf.h | 2 + epoll_type.h | 4 + flow.c | 4 +- fwd.c | 169 ++++--------- fwd.h | 41 +-- fwd_rule.c | 603 ++++++++++++++++++++++++++++++++++++++++++-- fwd_rule.h | 66 ++++- inany.c | 19 +- inany.h | 17 +- ip.c | 56 +---- ip.h | 4 +- lineread.c | 2 +- log.h | 59 ++++- passt.1 | 5 + passt.c | 8 + passt.h | 8 + pesto.1 | 46 ++++ pesto.c | 470 +++++++++++++++++++++++++++++++++++ pesto.h | 55 +++++ pif.c | 2 +- pif.h | 8 +- repair.c | 9 +- serialise.c | 7 + serialise.h | 1 + siphash.h | 13 + tap.c | 64 ++++- util.c | 2 +- util.h | 110 +-------- 32 files changed, 1921 insertions(+), 797 deletions(-) create mode 100644 common.h create mode 100644 pesto.1 create mode 100644 pesto.c create mode 100644 pesto.h -- 2.53.0
All current pif names are quite short, and we expect them to remain short
when/if we allow arbitrary pifs. However, because of the structure of
the current code we don't enforce any limit on the length.
This will become more important with dynamic configuration updates, so
start enforcing a length limit. Specifically we allow pif names to be up
to 128 bytes (PIF_NAME_SIZE), including the terminating \0. This is
more or less arbitrary, but seems like it should be comfortably enough for
all the cases we have in mind.
Signed-off-by: David Gibson
sock_unix(), which creates a listening Unix socket, doesn't set the
SOCK_NONBLOCK flag. Generally, this doesn't matter because we only
accept() once we've received an epoll event awaiting a connection. However
we will need non-blocking accept() for the upcoming control/configuration
socket. Always add SOCK_NONBLOCK, which is more robust and in keeping with
the normal non-blocking style of passt.
In tap.c, always set SOCK_NONBLOCK and SOCK_CLOEXEC on the accept()ed
sockets as well, which we weren't doing in all cases before. According to
accept(2), in Linux accepted sockets do *not* inherit these flags from the
listening socket. Also check for failures of accept, discarding EAGAIN
silently (a spurious epoll event) and warning for other errors.
In repair.c, similarly always add CLOEXEC. Use NONBLOCK for discard
sockets, but *not* for the final repair socket, since we want blocking
transactions during migration.
Signed-off-by: David Gibson
The new warnings in fwd_rule_add() have some not quite technically correct
format specifiers. For some weird reason these don't trip warnings on
passt itself, but do when we re-use this code in the upcoming configuration
client. Fix them.
Signed-off-by: David Gibson
In pesto we're going to want several levels of error/warning messages, much
like passt itself. Particularly as we start to share mode code between
passt and pesto, we want to use a similar interface to emit those. However
we don't want to use the same implementation - logging to a file or syslog
doesn't make sense for the command line tool.
To accomplish this loosely share log.h, but not log.c between pesto and
passt. In fact, an #ifdef means even most of log.h isn't actually shared,
but we do provide similar warn(), die() etc. macros.
This includes the *_perror() variants, which need strerror(). However,
we want to avoid allocations for pesto as we do for passt, and strerror()
allocates in some libc versions. Therefore, also move our workaround for
this to be shared with pesto.
Signed-off-by: Stefano Brivio
fwd_rules_info() is used to print a full table of forwarding rules for
debugging or the like. Currently it has one caller, and uses info() to
dump the messages. However for the upcoming configuration client, we're
going to want to dump the rules in some similar, but not quite identical
ways. For example, at different severity levels, or to stdout instead of
stderr / system log / logfile.
So, generalise fwd_rules_info() to fwd_rules_dump() which takes a printing
function as a parameter. Because we want this to work with "functions"
like info, which is actually a macro, we have to convert fwd_rules_dump()
to a macro as well. We also allow the prefix and suffix for each rule /
line to be provided as a parameter.
Signed-off-by: David Gibson
Start implementing pesto in earnest. Create a control/configuration
socket in passt. Have pesto connect to it and retrieve a server greeting
Perform some basic version checking.
Signed-off-by: David Gibson
Extend the dynamic update protocol to expose the pif indices and names
from a running passt/pasta to the pesto tool. pesto records that data
and, if requested with a new --show flag, prints it out.
Signed-off-by: David Gibson
Build a new "pesto" binary, which will become the tool to update a running
passt/pasta's configuration. For now, we just build a stub binary which
sets up a basic environment, parses trivial command line options but does
nothing else.
Signed-off-by: David Gibson
Most things in ip.[ch] related purely to IP addresses and headers with
no dependency on other passt/pasta internals. A number of these will be
useful to re-use in pesto. The exception is ipv6_l4hdr() which uses
iov_tail.
The only caller of this is in tap.c, so move the function there. Along
with moving the constant byteswapping functions to common.h, that lets
ip.[ch] to be linked into pesto as well as passt/pasta.
Signed-off-by: David Gibson
inany contains a number of helpful functions for dealing with addresses
which might be IPv4 or IPv6. We're going to want to use that in pesto.
For the most part inany doesn't depend on other passt/pasta internals,
however it does depend on siphash.h, which pesto doesn't need.
Move the single dependent function, inany_siphash_feed() to siphash.h,
renaming to match. Use that include inany.[ch] into pesto as well as
passt/pasta. While we're there reformat pesto.c's header comment to match
the convention used in most other files.
Signed-off-by: David Gibson
Implement serialisation of our current forwarding rules in conf.c,
deserialising it to display in the pesto client. Doing this requires
adding ip.c, inany.c, bitmap.c, lineread.c and fwd_rule.c to the pesto
build. With previous preparations that now requires only a trivial change
to lineread.c.
Signed-off-by: David Gibson
Extend pesto to send the updated rule configuration back to passt/pasta.
Extend passt/pasta to read the new configuration and store the new rules in
a "pending" table. We don't yet attempt to activate them.
Signed-off-by: Stefano Brivio
This adds parsing of options using fwd_rule_parse(), validates them and
adds them to the existing rules. It doesn't yet send those rules back to
passt or pasta.
Signed-off-by: Stefano Brivio
We can now receive updates to the forwarding rules from the pesto client
and store them in a "pending" copy of the forwarding tables. Implement
switching to using the new rules.
The logic is in a new fwd_listen_switch(). For now this closes all
listening sockets related to the old tables, swaps the active and pending
tables, then listens based on the new tables. In future we look to improve
this so that we don't temporarily stop listening on ports that both the
old and new tables specify.
Signed-off-by: David Gibson
On 2026-04-21 02:25, David Gibson wrote:
Implement serialisation of our current forwarding rules in conf.c, deserialising it to display in the pesto client. Doing this requires adding ip.c, inany.c, bitmap.c, lineread.c and fwd_rule.c to the pesto build. With previous preparations that now requires only a trivial change [...]
+ + +/** + * fwd_rule_read() - Read serialised rule from an fd + * @fd: fd to serialise to + * @rule: Buffer to store rule into + * + * Return: 0 on success, -1 on error (with errno set) + */ +int fwd_rule_read(int fd, struct fwd_rule *rule) +{ + if (read_all_buf(fd, rule, sizeof(*rule))) + return -1; + + /* Byteswap for host */ + rule->first = ntohs(rule->first); + rule->last = ntohs(rule->last); + rule->to = htons(rule->to); Or ntohs() ?
/jon
+ + return 0; +} + +/** + * fwd_rule_write() - Serialise rule to an fd + * @fd: fd to serialise to + * @rule: Rule to send + * + * Return: 0 on success, -1 on error (with errno set) + */ +int fwd_rule_write(int fd, const struct fwd_rule *rule) +{ + struct fwd_rule tmp = *rule; + + /* Byteswap for transport */ + tmp.first = htons(tmp.first); + tmp.last = htons(tmp.last); + tmp.to = htons(tmp.to); + + return write_all_buf(fd, &tmp, sizeof(tmp)); +} diff --git a/fwd_rule.h b/fwd_rule.h index f51f1b4b..330d49eb 100644 --- a/fwd_rule.h +++ b/fwd_rule.h @@ -29,6 +29,8 @@ #define FWD_CAP_UDP BIT(3) #define FWD_CAP_SCAN BIT(4) #define FWD_CAP_IFNAME BIT(5) +#define FWD_CAP_ALL (FWD_CAP_IPV4 | FWD_CAP_IPV6 | FWD_CAP_TCP | \ + FWD_CAP_UDP | FWD_CAP_SCAN | FWD_CAP_IFNAME)
/** * struct fwd_rule - Forwarding rule governing a range of ports @@ -99,6 +101,8 @@ void fwd_probe_ephemeral(void); const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule); const char *fwd_rule_fmt(const struct fwd_rule *rule, char *dst, size_t size); void fwd_rule_parse(char optname, const char *optarg, struct fwd_table *fwd); +int fwd_rule_read(int fd, struct fwd_rule *rule); +int fwd_rule_write(int fd, const struct fwd_rule *rule);
/** * fwd_rules_dump() - Dump forwarding rules diff --git a/lineread.c b/lineread.c index b9ceae10..a4269a66 100644 --- a/lineread.c +++ b/lineread.c @@ -19,8 +19,8 @@ #include
#include +#include "common.h" #include "lineread.h" -#include "util.h"
/** * lineread_init() - Prepare for line by line file reading without allocation diff --git a/pesto.c b/pesto.c index 3e34bbac..35a4d559 100644 --- a/pesto.c +++ b/pesto.c @@ -34,6 +34,7 @@ #include "common.h" #include "seccomp_pesto.h" #include "serialise.h" +#include "fwd_rule.h" #include "pesto.h" #include "log.h"
@@ -66,6 +67,7 @@ static void usage(const char *name, FILE *f, int status) struct pif_configuration { uint8_t pif; char name[PIF_NAME_SIZE]; + struct fwd_table fwd; };
struct configuration { @@ -123,6 +125,7 @@ static bool read_pif_conf(int fd, struct configuration *conf) struct pif_configuration *pc; struct pesto_pif_info info; uint8_t pif; + unsigned i;
if (read_u8(fd, &pif) < 0) die("Error reading from control socket"); @@ -149,8 +152,17 @@ static bool read_pif_conf(int fd, struct configuration *conf) static_assert(sizeof(info.name) == sizeof(pc->name), "Mismatching pif name lengths"); memcpy(pc->name, info.name, sizeof(pc->name)); - - debug("PIF %"PRIu8": %s", pc->pif, pc->name); + pc->fwd.caps = ntohl(info.caps); + pc->fwd.count = ntohl(info.count); + + debug("PIF %"PRIu8": %s, %"PRIu32" rules, capabilities 0x%"PRIx32 + ":%s%s%s%s%s%s", pc->pif, pc->name, pc->fwd.count, pc->fwd.caps, + pc->fwd.caps & FWD_CAP_IPV4 ? " IPv4" : "", + pc->fwd.caps & FWD_CAP_IPV6 ? " IPv6" : "", + pc->fwd.caps & FWD_CAP_TCP ? " TCP" : "", + pc->fwd.caps & FWD_CAP_UDP ? " UDP" : "", + pc->fwd.caps & FWD_CAP_SCAN ? " scan" : "", + pc->fwd.caps & FWD_CAP_IFNAME ? " ifname" : "");
/* O(n^2), but n is bounded by MAX_PIFS */ if (pif_conf_by_num(conf, pc->pif)) @@ -160,6 +172,18 @@ static bool read_pif_conf(int fd, struct configuration *conf) if (pif_conf_by_name(conf, pc->name)) die("Received duplicate interface name");
+ /* NOTE: We read the fwd rules directly into fwd.rules, rather than + * using fwd_rule_add(). This means we can read and display rules even + * if something has gone wrong (in pesto or passt) and we get rules that + * fwd_rule_add() would reject. It does have the side effect that we + * never assign socket space for the fwd rules, but we don't need that + * within pesto. + */ + for (i = 0; i < pc->fwd.count; i++) { + if (fwd_rule_read(fd, &pc->fwd.rules[i]) < 0) + die("Error reading from control socket"); + } + conf->npifs++; return true; } @@ -175,7 +199,8 @@ static void show_conf(const struct configuration *conf) for (i = 0; i < conf->npifs; i++) { const struct pif_configuration *pc = &conf->pif[i]; printf(" %s\n", pc->name); - printf(" TBD\n"); + fwd_rules_dump(printf, pc->fwd.rules, pc->fwd.count, + " ", "\n"); } }
@@ -288,6 +313,12 @@ int main(int argc, char **argv) ntohl(hello.pif_name_size), PIF_NAME_SIZE); }
+ if (ntohl(hello.ifnamsiz) != IFNAMSIZ) { + die("Server has unexpected IFNAMSIZ (%" + PRIu32" not %"PRIu32"\n", + ntohl(hello.ifnamsiz), IFNAMSIZ); + } + while (read_pif_conf(s, &conf)) ;
diff --git a/pesto.h b/pesto.h index ac4c2b58..8f6bbf65 100644 --- a/pesto.h +++ b/pesto.h @@ -26,11 +26,13 @@ * @magic: PESTO_SERVER_MAGIC * @version: Version number * @pif_name_size: Server's value for PIF_NAME_SIZE + * @ifnamsiz: Server's value for IFNAMSIZ */ struct pesto_hello { char magic[8]; uint32_t version; uint32_t pif_name_size; + uint32_t ifnamsiz; } __attribute__ ((__packed__));
static_assert(sizeof(PESTO_SERVER_MAGIC) @@ -41,9 +43,13 @@ static_assert(sizeof(PESTO_SERVER_MAGIC) * struct pesto_pif_info - Message with basic metadata about a pif * @resv_: Alignment gap (must be 0) * @name: Name (\0 terminated) + * @caps: Forwarding capabilities for this pif + * @count: Number of forwarding rules for this pif */ struct pesto_pif_info { char name[PIF_NAME_SIZE]; + uint32_t caps; + uint32_t count; } __attribute__ ((__packed__));
#endif /* PESTO_H */
On 2026-04-21 02:25, David Gibson wrote:
Extend pesto to send the updated rule configuration back to passt/pasta. Extend passt/pasta to read the new configuration and store the new rules in a "pending" table. We don't yet attempt to activate them.
Signed-off-by: Stefano Brivio
Message-ID: <20260322141843.4095972-3-sbrivio@redhat.com> [dwg: Based on an early draft from Stefano]\ Signed-off-by: David Gibson
[...]
+/** + * conf_recv_rules() - Receive forwarding rules from configuration client + * @c: Execution context + * @fd: Socket to the client + * + * Return: 0 on success, -1 on failure + */ +static int conf_recv_rules(const struct ctx *c, int fd) +{ + while (1) { + struct fwd_table *fwd; + struct fwd_rule r; + uint32_t count; + uint8_t pif; + unsigned i; + + if (read_u8(fd, &pif)) + return -1; + + if (pif == PIF_NONE) + break; + + if (pif >= ARRAY_SIZE(c->fwd_pending) || + !(fwd = c->fwd_pending[pif])) { + err("Received rules for non-existent table"); + return -1; + } + + if (read_u32(fd, &count)) + return -1; + + if (count > MAX_FWD_RULES) { + err("Received %"PRIu32" rules (maximum %u)", + count, MAX_FWD_RULES); + return -1; + } + + for (i = 0; i < count; i++) { + fwd_rule_read(fd, &r);
Since we don't check the return value I think we risk passing an only partially initialized fwd_rule to fwd_rule_add() if the read fails. Maybe: if (fwd_rule_read(fd, &r)) return -1; /jon
+ if (fwd_rule_add(fwd, &r) < 0) + return -1; + } + } + + return 0; +} + /** * conf_close() - Close configuration / control socket and clean up * @c: Execution context @@ -2072,21 +2119,33 @@ fail: void conf_handler(struct ctx *c, uint32_t events) { if (events & EPOLLIN) { - char discard[BUFSIZ]; - ssize_t n; - - do { - n = read(c->fd_control, discard, sizeof(discard)); - if (n > 0) - debug("Discarded %zd bytes of config data", n); - } while (n > 0); - if (n == 0) { - debug("Configuration client EOF"); - goto close; + unsigned pif; + + /* Clear pending tables */ + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + struct fwd_table *fwd = c->fwd_pending[pif]; + + if (!fwd) + continue; + fwd->count = 0; + fwd->sock_count = 0; } - if (errno != EAGAIN && errno != EWOULDBLOCK) { - err_perror("Error reading config data"); + + /* FIXME: this could block indefinitely if the client doesn't + * write as much as it should + */ + if (conf_recv_rules(c, c->fd_control) < 0) goto close; + + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + struct fwd_table *fwd = c->fwd_pending[pif]; + + if (!fwd) + continue; + + info("New forwarding rules for %s:", pif_name(pif)); + fwd_rules_dump(info, fwd->rules, fwd->count, + " ", ""); } }
diff --git a/fwd.c b/fwd.c index 8849cfcd..d93d2e5d 100644 --- a/fwd.c +++ b/fwd.c @@ -247,6 +247,9 @@ void fwd_neigh_table_init(const struct ctx *c) static struct fwd_table fwd_in; static struct fwd_table fwd_out;
+static struct fwd_table fwd_in_pending; +static struct fwd_table fwd_out_pending; + /** * fwd_rule_init() - Initialise forwarding tables * @c: Execution context @@ -269,10 +272,15 @@ void fwd_rule_init(struct ctx *c) caps |= FWD_CAP_IFNAME;
fwd_in.caps = fwd_out.caps = caps; + fwd_in_pending.caps = fwd_out_pending.caps = caps;
c->fwd[PIF_HOST] = &fwd_in; - if (c->mode == MODE_PASTA) + c->fwd_pending[PIF_HOST] = &fwd_in_pending; + + if (c->mode == MODE_PASTA) { c->fwd[PIF_SPLICE] = &fwd_out; + c->fwd_pending[PIF_SPLICE] = &fwd_out_pending; + } }
/** diff --git a/passt.h b/passt.h index b3f049de..1726965d 100644 --- a/passt.h +++ b/passt.h @@ -188,6 +188,7 @@ struct ip6_ctx { * @pasta_ifi: Index of namespace interface for pasta * @pasta_conf_ns: Configure namespace after creating it * @fwd: Forwarding tables + * @fwd_pending: Pending forward tables * @no_tcp: Disable TCP operation * @tcp: Context for TCP protocol handler * @no_udp: Disable UDP operation @@ -270,6 +271,7 @@ struct ctx { int pasta_conf_ns;
struct fwd_table *fwd[PIF_NUM_TYPES]; + struct fwd_table *fwd_pending[PIF_NUM_TYPES];
int no_tcp; struct tcp_ctx tcp; diff --git a/pesto.c b/pesto.c index ebac6bd6..c6c595a4 100644 --- a/pesto.c +++ b/pesto.c @@ -225,6 +225,39 @@ static bool read_pif_conf(int fd, struct configuration *conf) return true; }
+/** + * send_conf() - Send updated configuration to passt/pasta + * @fd: Control socket + * @conf: Updated configuration + */ +static void send_conf(int fd, const struct configuration *conf) +{ + unsigned i; + + for (i = 0; i < conf->npifs; i++) { + const struct pif_configuration *pc = &conf->pif[i]; + unsigned j; + + if (write_u8(fd, pc->pif) < 0) + goto fail; + + if (write_u32(fd, pc->fwd.count) < 0) + goto fail; + + for (j = 0; j < pc->fwd.count; j++) { + if (fwd_rule_write(fd, &pc->fwd.rules[j]) < 0) + goto fail; + } + } + + if (write_u8(fd, PIF_NONE) < 0) + goto fail; + return; + +fail: + die_perror("Error writing to control socket"); +} + /** * show_conf() - Show current configuration obtained from passt/pasta * @conf: Configuration description @@ -427,6 +460,8 @@ int main(int argc, char **argv) show_conf(&conf); }
+ send_conf(s, &conf); + noupdate: if (shutdown(s, SHUT_RDWR) < 0 || close(s) < 0) die_perror("Error shutting down control socket");
participants (2)
-
David Gibson
-
Jon Maloy