[PATCH v3 00/25] RFC: Read-only dynamic update implementation
Here's a new draft of dynamic updates. This now can successfully update rules, though I've not tested it very extensively. Essentially this is just barely enough to work, it still could do with rather a lot of polish. Patches 1..12/22 are preliminary reworks that make moderate sense even without pesto - feel free to apply if you're happy with them. Changes in v3: * Removed already applied ASSERT() rename * Renamed serialisation functions * Incorporated Stefano's extensions, reworked and fixed * Several additional cleanups / preliminary reworks Changes in v2: * Removed already applied cleanups * Reworked assert() patch to handle -DNDEBUG properly * Numerous extra patches: * Factored out serialisation helpers and use them for migration as well * Reworked to allow ip.[ch] and inany.[ch] to be shared with pesto * Reworks to share some forwarding rule datatypes with pesto * Implemented sending pif names and current ruleset to pesto David Gibson (22): conf: runas can be const vhost_user: Fix assorted minor cppcheck warnings serialise: Split functions user for serialisation from util.c serialise: Add helpers for serialising unsigned integers fwd: Move selecting correct scan bitmap into fwd_sync_one() fwd: Look up rule index in fwd_sync_one() fwd: Store forwarding tables indexed by (origin) pif fwd: Allow FWD_DUAL_STACK_ANY flag to be passed directly to fwd_rule_add() fwd, conf: Expose ephemeral ports as bitmap rather than function conf: Don't bother complaining about overlapping excluded ranges conf: Move check for mapping port 0 to caller conf: Move check for disabled interfaces earlier pesto: Introduce stub configuration interface and tool pesto: Add command line option parsing and debug messages pesto: Expose list of pifs to pesto ip: Prepare ip.[ch] for sharing with pesto tool inany: Prepare inany.[ch] for sharing with pesto tool fwd: Split forwading rule specification from its implementation state ip: Define a bound for the string returned by ipproto_name() fwd_rule: Move forwarding rule text formatting to common code pesto: Read current ruleset from passt/pasta and display it conf, fwd: Allow switching to new rules received from pesto Stefano Brivio (3): conf: Move port parsing functions to own file, ports.c conf, fwd, ports, util: Move things around for pesto pesto, conf: Parse, send and receive new rules .gitignore | 2 + Makefile | 42 +-- common.h | 178 +++++++++++++ conf.c | 740 ++++++++++++++++++++++++--------------------------- conf.h | 2 + epoll_type.h | 4 + flow.c | 37 ++- fwd.c | 347 +++++++----------------- fwd.h | 82 +----- fwd_rule.c | 204 ++++++++++++++ fwd_rule.h | 112 ++++++++ inany.c | 16 +- inany.h | 16 +- iov.c | 1 + ip.c | 74 ++---- ip.h | 4 +- lineread.c | 1 - log.c | 1 + log.h | 33 +++ migrate.c | 1 + passt.1 | 5 + passt.c | 9 + passt.h | 14 +- pcap.c | 1 + pesto.1 | 46 ++++ pesto.c | 412 ++++++++++++++++++++++++++++ pesto.h | 34 +++ pif.h | 2 + ports.c | 444 +++++++++++++++++++++++++++++++ ports.h | 48 ++++ serialise.c | 147 ++++++++++ serialise.h | 24 ++ siphash.h | 13 + tap.c | 52 ++++ tcp.c | 1 + util.c | 154 +---------- util.h | 87 ------ vhost_user.c | 16 +- virtio.c | 1 + virtio.h | 2 +- vu_common.c | 2 +- 41 files changed, 2334 insertions(+), 1077 deletions(-) create mode 100644 common.h create mode 100644 fwd_rule.c create mode 100644 fwd_rule.h create mode 100644 pesto.1 create mode 100644 pesto.c create mode 100644 pesto.h create mode 100644 ports.c create mode 100644 ports.h create mode 100644 serialise.c create mode 100644 serialise.h -- 2.53.0
Add helpers to serialise/deserialise unsigned integers, handling endian
conversion so that the stream format is consistent regardless of host
endiannness. Currently we only use this on one place: sending the number
of flows during migration. We're going to have more use for this as we
add dynamic configuration updates, so these will become more useful.
For now we only need a 32-bit version, however define the functions with
a macro so we can easily add other integer widths when we need them.
Signed-off-by: David Gibson
This fixes a batch of minor incorrect format specifier and "could be const"
warnings in the vhost_user code. For some very strange reason, cppcheck
doesn't catch these errors right now, but does after code rearrangements
we're making for the dynamic forwarding update stuff.
Signed-off-by: David Gibson
Currently fwd_listen_sync_() determines which of the two port scanned
bitmaps is the correct one for a rule before passing it into
fwd_sync_one(). However, fwd_sync_one() is already looking at the rule
internals so is better placed to do this.
Signed-off-by: David Gibson
Currently we have the slightly silly setup that fwd_listen_sync_() looks
up the pointer to a rule based on its index, then fwd_sync_one() turns it
back into an index. Avoid this by passing the index directly into
fwd_sync_one().
Signed-off-by: David Gibson
fwd_rule_add() takes a flags parameter, but it only allows the FWD_WEAK
and FWD_SCAN flags to be specified there. It doesn't allow the
FWD_DUAL_STACK_ANY flag to be set, instead expecting a [*] address to be
indicated by passing NULL as @addr.
However, for upcoming dynamic rule updates, it's more convenient to be able
to explicitly pass FWD_DUAL_STACK_ANY along with an address of ::. Allow
that mode of calling.
Signed-off-by: David Gibson
conf_ports_range_except() generates errors if asked to map an address of
a family that's not available. This is quite late to perform the check,
most callsites pass a NULL address, making the test moot, the one remaining
can check the address earlier. Move the check there, simplifying the
lower level function.
Signed-off-by: David Gibson
Currently we store the inbound (PIF_HOST) and outbound (PIF_SPLICE)
forwarding tables in separate fields of struct ctx. In a number of places
this requires somewhat awkward if or switch constructs to select the
right table for updates. Conceptually simplify that by using an index of
forwarding tables by pif, which as a bonus keeps track generically which
pifs have implemented forwarding tables so far.
For now this doesn't simplify a lot textually, because many places that
need this also have other special cases to apply by pif. It does simplify
a few crucial places though, and we expect it will become more useful as
the flexibility of the forwarding table is improved.
Signed-off-by: David Gibson
In conf_ports() we die with an error if a port specification includes
overlapping excluded ranges. This is contrary to our usual convention of
handling conflicting options (last option wins, rather than error). Plus,
these options don't even conflict, they're just redundant.
This behaviour potentially makes life harder for scripts or other tools
invoking pasta - if they might have the same port excluded for multiple
different reasons, they have to explicitly deduplicate the list, rather
than just list everything on the command line. So, don't give this error,
let a port be excluded as many times as you like.
Signed-off-by: David Gibson
Add a control socket to passt/pasta for updating configuration at runtime.
Add a tool (pesto) for communicating with the control socket.
For now this is a stub implementation that forms the connection and sends
some version information, but nothing more.
Example:
./pasta --debug --config-net -c /tmp/pasta.conf -t none
./pesto /tmp/pasta.conf
Signed-off-by: Stefano Brivio
It turns out the only callers of fwd_port_is_ephemeral() use it to build a
bitmap of ephemeral ports. So, replace it with fwd_port_map_ephemeral(),
which directly builds that bitmap. As a bonus this allows a slightly
cheaper implementation of building the map, since inside fwd.c we know that
the ephemeral ports form a single range.
Signed-off-by: David Gibson
Add basic command line option parsing using getopt_long() to pesto.
Implement --help, --version, --quiet and --verbose options, along with a
debug() macro to print debugging messages when given the verbose option.
Signed-off-by: David Gibson
Extend the dynamic update protocol to expose the pif indices and names
from a running passt/pasta to the pesto tool. pesto records that data
and (for now) prints it out.
Signed-off-by: David Gibson
conf_ports_range_except() checks that it hasn't been asked to map port 0,
which generally won't work. However, there's only one callsite that
doesn't statically pass a non-zero value here. Change the check to an
assert() and move the actual error message to that callsite. This will
allow further cleanups later.
Signed-off-by: David Gibson
Most things in ip.[ch] related purely to IP addresses and headers with
no dependency on other passt/pasta internals. A number of these will be
useful to re-use in pesto. The exception is ipv6_l4hdr() which uses
iov_tail.
The only caller of this is in tap.c, so move the function there. Along
with moving the constant byteswapping functions to common.h, that lets
ip.[ch] to be linked into pesto as well as passt/pasta.
Signed-off-by: David Gibson
ipproto_name() returns a static string of theoretically unbounded length.
That's going to be inconvenient in future, so introduce IPPROTO_STRLEN
giving an explicit bound on the length. Use static_assert() and some
macros to ensure nothing we return can exceed this.
Signed-off-by: David Gibson
Most of the fields in struct fwd_rule give the parameters of the forwarding
rule itself. The @socks field gives the list of listening sockets we use
to implement it. In order to share code with the configuraiton update tool
pesto, we want to split these. Move the rule specification itself into
fwd_rule.h, which can be shared with pesto.
Signed-off-by: David Gibson
inany contains a number of helpful functions for dealing with addresses
which might be IPv4 or IPv6. We're going to want to use that in pesto.
For the most part inany doesn't depend on other passt/pasta internals,
however it does depend on siphash.h, which pesto doesn't need.
Move the single dependent function, inany_siphash_feed() to siphash.h,
renaming to match. Use that include inany.[ch] into pesto as well as
passt/pasta. While we're there reformat pesto.c's header comment to match
the convention used in most other files.
Signed-off-by: David Gibson
Move the logic for formatting forwarding rules into strings from
fwd_rules_print() into fwd_rule.c where it can be shared with pesto.
We also make the function explicitly construct a string, rather than
directly printing with info(), for greater flexibility.
Signed-off-by: David Gibson
Implement serialisation of our current forwarding rules in conf.c,
deserialising it to display in the pesto client.
Signed-off-by: David Gibson
From: Stefano Brivio
From: Stefano Brivio
From: Stefano Brivio
We can now receive updates to the forwarding rules from the pesto client
and store them in a "pending" copy of the forwarding tables. Implement
switching to using the new rules.
The logic is in a new fwd_listen_switch(). For now this closes all
listening sockets related to the old tables, swaps the active and pending
tables, then listens based on the new tables. In future we look to improve
this so that we don't temporarily stop listening on ports that both the
old and new tables specify.
Signed-off-by: David Gibson
On Mon, Mar 23, 2026 at 06:37:07PM +1100, David Gibson wrote:
Here's a new draft of dynamic updates. This now can successfully update rules, though I've not tested it very extensively. Essentially this is just barely enough to work, it still could do with rather a lot of polish.
Patches 1..12/22 are preliminary reworks that make moderate sense even without pesto - feel free to apply if you're happy with them.
Oops, as well as miscounting here, I forgot to update the overall subject. This is no longer read-only. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Mon, 23 Mar 2026 18:37:14 +1100
David Gibson
Currently we store the inbound (PIF_HOST) and outbound (PIF_SPLICE) forwarding tables in separate fields of struct ctx. In a number of places this requires somewhat awkward if or switch constructs to select the right table for updates. Conceptually simplify that by using an index of forwarding tables by pif, which as a bonus keeps track generically which pifs have implemented forwarding tables so far.
For now this doesn't simplify a lot textually, because many places that need this also have other special cases to apply by pif. It does simplify a few crucial places though, and we expect it will become more useful as the flexibility of the forwarding table is improved.
Signed-off-by: David Gibson
--- conf.c | 58 +++++++++++++++++++++++++++++++------------------- flow.c | 22 +++++++------------ fwd.c | 65 ++++++++++++++++++++++++++++++--------------------------- fwd.h | 4 ++-- passt.h | 6 ++---- 5 files changed, 83 insertions(+), 72 deletions(-) diff --git a/conf.c b/conf.c index b1ebb4a4..6ca61b74 100644 --- a/conf.c +++ b/conf.c @@ -1252,11 +1252,17 @@ dns6: } }
- info("Inbound forwarding:"); - fwd_rules_print(&c->fwd_in); - if (c->mode == MODE_PASTA) { - info("Outbound forwarding:"); - fwd_rules_print(&c->fwd_out); + for (i = 0; i < PIF_NUM_TYPES; i++) { + const char *dir = "Outbound"; + + if (!c->fwd[i]) + continue; + + if (i == PIF_HOST) + dir = "Inbound"; + + info("%s forwarding rules (%s):", dir, pif_name(i)); + fwd_rules_print(c->fwd[i]); } }
@@ -2154,18 +2160,24 @@ void conf(struct ctx *c, int argc, char **argv)
/* Forwarding options can be parsed now, after IPv4/IPv6 settings */ fwd_probe_ephemeral(); + fwd_rule_init(c); optind = 0; do { name = getopt_long(argc, argv, optstring, options, NULL);
- if (name == 't') - conf_ports(c, name, optarg, &c->fwd_in, &tcp_in_mode); - else if (name == 'u') - conf_ports(c, name, optarg, &c->fwd_in, &udp_in_mode); - else if (name == 'T') - conf_ports(c, name, optarg, &c->fwd_out, &tcp_out_mode); - else if (name == 'U') - conf_ports(c, name, optarg, &c->fwd_out, &udp_out_mode); + if (name == 't') { + conf_ports(c, name, optarg, c->fwd[PIF_HOST], + &tcp_in_mode); + } else if (name == 'u') { + conf_ports(c, name, optarg, c->fwd[PIF_HOST], + &udp_in_mode); + } else if (name == 'T') { + conf_ports(c, name, optarg, c->fwd[PIF_SPLICE], + &tcp_out_mode); + } else if (name == 'U') { + conf_ports(c, name, optarg, c->fwd[PIF_SPLICE], + &udp_out_mode); + } } while (name != -1);
if (c->mode == MODE_PASTA) @@ -2224,20 +2236,24 @@ void conf(struct ctx *c, int argc, char **argv) udp_out_mode = fwd_default;
if (tcp_in_mode == FWD_MODE_AUTO) { - conf_ports_range_except(c, 't', "auto", &c->fwd_in, NULL, NULL, - 1, NUM_PORTS - 1, NULL, 1, FWD_SCAN); + conf_ports_range_except(c, 't', "auto", c->fwd[PIF_HOST], + NULL, NULL, 1, NUM_PORTS - 1, NULL, 1, + FWD_SCAN); } if (tcp_out_mode == FWD_MODE_AUTO) { - conf_ports_range_except(c, 'T', "auto", &c->fwd_out, NULL, "lo", - 1, NUM_PORTS - 1, NULL, 1, FWD_SCAN); + conf_ports_range_except(c, 'T', "auto", c->fwd[PIF_SPLICE], + NULL, "lo", 1, NUM_PORTS - 1, NULL, 1, + FWD_SCAN); } if (udp_in_mode == FWD_MODE_AUTO) { - conf_ports_range_except(c, 'u', "auto", &c->fwd_in, NULL, NULL, - 1, NUM_PORTS - 1, NULL, 1, FWD_SCAN); + conf_ports_range_except(c, 'u', "auto", c->fwd[PIF_HOST], + NULL, NULL, 1, NUM_PORTS - 1, NULL, 1, + FWD_SCAN); } if (udp_out_mode == FWD_MODE_AUTO) { - conf_ports_range_except(c, 'U', "auto", &c->fwd_out, NULL, "lo", - 1, NUM_PORTS - 1, NULL, 1, FWD_SCAN); + conf_ports_range_except(c, 'U', "auto", c->fwd[PIF_SPLICE], + NULL, "lo", 1, NUM_PORTS - 1, NULL, 1, + FWD_SCAN); }
if (!c->quiet) diff --git a/flow.c b/flow.c index c84857b2..2972ab87 100644 --- a/flow.c +++ b/flow.c @@ -503,10 +503,10 @@ struct flowside *flow_target(const struct ctx *c, union flow *flow, { char estr[INANY_ADDRSTRLEN], ostr[INANY_ADDRSTRLEN]; struct flow_common *f = &flow->f; + const struct fwd_table *fwd = c->fwd[f->pif[INISIDE]]; const struct flowside *ini = &f->side[INISIDE]; struct flowside *tgt = &f->side[TGTSIDE]; const struct fwd_rule *rule = NULL; - const struct fwd_table *fwd; uint8_t tgtpif = PIF_NONE;
assert(flow_new_entry == flow && f->state == FLOW_STATE_INI); @@ -514,6 +514,11 @@ struct flowside *flow_target(const struct ctx *c, union flow *flow, assert(f->pif[INISIDE] != PIF_NONE && f->pif[TGTSIDE] == PIF_NONE); assert(flow->f.state == FLOW_STATE_INI);
+ if (fwd) { + if (!(rule = fwd_rule_search(fwd, ini, proto, rule_hint))) + goto norule; + } + switch (f->pif[INISIDE]) { case PIF_TAP: memcpy(f->tap_omac, MAC_UNDEF, ETH_ALEN); @@ -521,20 +526,10 @@ struct flowside *flow_target(const struct ctx *c, union flow *flow, break;
case PIF_SPLICE: - fwd = &c->fwd_out; - - if (!(rule = fwd_rule_search(fwd, ini, proto, rule_hint))) - goto norule; -
Coverity Scan doesn't like this: /home/sbrivio/passt/flow.c:528:3: Type: Explicit null dereferenced (FORWARD_NULL) /home/sbrivio/passt/flow.c:508:2: 1. assign_zero: Assigning: "rule" = "NULL". /home/sbrivio/passt/flow.c:511:2: 2. path: Condition "flow_new_entry == flow", taking true branch. /home/sbrivio/passt/flow.c:511:2: 3. path: Condition "f->state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:512:2: 4. path: Condition "f->type == FLOW_TYPE_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 5. path: Condition "f->pif[0] != PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 6. path: Condition "f->pif[1] == PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:514:2: 7. path: Condition "flow->f.state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:516:2: 8. path: Condition "fwd", taking false branch. /home/sbrivio/passt/flow.c:521:2: 9. path: Switch case value "PIF_SPLICE". /home/sbrivio/passt/flow.c:528:3: 10. var_deref_model: Passing null pointer "rule" to "fwd_nat_from_splice", which dereferences it. /home/sbrivio/passt/fwd.c:1095:2: 10.1. path: Condition "!inany_is_loopback(&ini->eaddr)", taking false branch. /home/sbrivio/passt/fwd.c:1095:2: 10.2. path: Condition "!inany_is_loopback(&ini->oaddr)", taking false branch. /home/sbrivio/passt/fwd.c:1112:2: 10.3. path: Condition "proto == IPPROTO_UDP", taking true branch. /home/sbrivio/passt/fwd.c:1116:2: 10.4. dereference: Dereferencing pointer "rule".
tgtpif = fwd_nat_from_splice(rule, proto, ini, tgt); break;
case PIF_HOST: - fwd = &c->fwd_in; - - if (!(rule = fwd_rule_search(fwd, ini, proto, rule_hint))) - goto norule; -
...and not this either: /home/sbrivio/passt/flow.c:532:3: Type: Explicit null dereferenced (FORWARD_NULL) /home/sbrivio/passt/flow.c:508:2: 1. assign_zero: Assigning: "rule" = "NULL". /home/sbrivio/passt/flow.c:511:2: 2. path: Condition "flow_new_entry == flow", taking true branch. /home/sbrivio/passt/flow.c:511:2: 3. path: Condition "f->state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:512:2: 4. path: Condition "f->type == FLOW_TYPE_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 5. path: Condition "f->pif[0] != PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 6. path: Condition "f->pif[1] == PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:514:2: 7. path: Condition "flow->f.state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:516:2: 8. path: Condition "fwd", taking false branch. /home/sbrivio/passt/flow.c:521:2: 9. path: Switch case value "PIF_HOST". /home/sbrivio/passt/flow.c:532:3: 10. var_deref_model: Passing null pointer "rule" to "fwd_nat_from_host", which dereferences it. /home/sbrivio/passt/fwd.c:1173:2: 10.1. dereference: Dereferencing pointer "rule". I haven't checked why.
tgtpif = fwd_nat_from_host(c, rule, proto, ini, tgt); fwd_neigh_mac_get(c, &tgt->oaddr, f->tap_omac); break;
[...]
-- Stefano
Nits:
On Mon, 23 Mar 2026 18:37:15 +1100
David Gibson
fwd_rule_add() takes a flags parameter, but it only allows the FWD_WEAK and FWD_SCAN flags to be specified there. It doesn't allow the FWD_DUAL_STACK_ANY flag to be set, instead expecting a [*] address to be indicated by passing NULL as @addr.
However, for upcoming dynamic rule updates, it's more convenient to be able to explicitly pass FWD_DUAL_STACK_ANY along with an address of ::. Allow that mode of calling.
Signed-off-by: David Gibson
--- fwd.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fwd.c b/fwd.c index 3395a28e..d73b7ca7 100644 --- a/fwd.c +++ b/fwd.c @@ -362,18 +362,21 @@ void fwd_rule_add(struct fwd_table *fwd, uint8_t proto, uint8_t flags, in_port_t first, in_port_t last, in_port_t to) { /* Flags which can be set from the caller */ - const uint8_t allowed_flags = FWD_WEAK | FWD_SCAN; + const uint8_t allowed_flags = FWD_WEAK | FWD_SCAN | FWD_DUAL_STACK_ANY; unsigned num = (unsigned)last - first + 1; struct fwd_rule *new; unsigned i, port;
assert(!(flags & ~allowed_flags)); -
Spurious change, I think? The extra newline here looks good for readability. Maybe we should group the second assert with this one?
if (fwd->count >= ARRAY_SIZE(fwd->rules)) die("Too many port forwarding ranges"); if ((fwd->sock_count + num) > ARRAY_SIZE(fwd->socks)) die("Too many listening sockets");
+ /* Passing a non-wildcard address with DUAL_STACK_ANY is a bug */
Extra whitespace after DUAL_STACK_ANY.
+ assert(!(flags & FWD_DUAL_STACK_ANY) || !addr || + inany_equals(addr, &inany_any6)); + /* Check for any conflicting entries */ for (i = 0; i < fwd->count; i++) { char newstr[INANY_ADDRSTRLEN], rulestr[INANY_ADDRSTRLEN];
-- Stefano
On Mon, 23 Mar 2026 18:37:20 +1100
David Gibson
Add a control socket to passt/pasta for updating configuration at runtime. Add a tool (pesto) for communicating with the control socket.
For now this is a stub implementation that forms the connection and sends some version information, but nothing more. Example: ./pasta --debug --config-net -c /tmp/pasta.conf -t none ./pesto /tmp/pasta.conf
Signed-off-by: Stefano Brivio
[dwg: Based on an early draft from Stefano] Signed-off-by: David Gibson --- .gitignore | 2 + Makefile | 30 +++++++---- common.h | 14 +++++ conf.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++++++- conf.h | 2 + epoll_type.h | 4 ++ log.c | 1 + passt.1 | 5 ++ passt.c | 9 ++++ passt.h | 6 +++ pesto.1 | 46 ++++++++++++++++ pesto.c | 113 ++++++++++++++++++++++++++++++++++++++ pesto.h | 34 ++++++++++++ serialise.c | 3 ++ util.h | 3 -- 15 files changed, 407 insertions(+), 15 deletions(-) create mode 100644 common.h create mode 100644 pesto.1 create mode 100644 pesto.c create mode 100644 pesto.h diff --git a/.gitignore b/.gitignore index 3c16adc7..3e40d9f7 100644 --- a/.gitignore +++ b/.gitignore @@ -4,9 +4,11 @@ /pasta /pasta.avx2 /passt-repair +/pesto /qrap /pasta.1 /seccomp.h +/seccomp_pesto.h /seccomp_repair.h /c*.json README.plain.md diff --git a/Makefile b/Makefile index 5b6891d7..d085c9c1 100644 --- a/Makefile +++ b/Makefile @@ -44,17 +44,19 @@ PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c epoll_ctl.c \ udp.c udp_flow.c udp_vu.c util.c vhost_user.c virtio.c vu_common.c QRAP_SRCS = qrap.c PASST_REPAIR_SRCS = passt-repair.c -SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) +PESTO_SRCS = pesto.c serialise.c +SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) $(PESTO_SRCS)
-MANPAGES = passt.1 pasta.1 qrap.1 passt-repair.1 +MANPAGES = passt.1 pasta.1 pesto.1 qrap.1 passt-repair.1
+PESTO_HEADERS = common.h pesto.h serialise.h PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h epoll_ctl.h \ flow.h fwd.h flow_table.h icmp.h icmp_flow.h inany.h iov.h ip.h \ isolation.h lineread.h log.h migrate.h ndp.h netlink.h packet.h \ - passt.h pasta.h pcap.h pif.h repair.h serialise.h siphash.h tap.h tcp.h \ - tcp_buf.h tcp_conn.h tcp_internal.h tcp_splice.h tcp_vu.h udp.h \ - udp_flow.h udp_internal.h udp_vu.h util.h vhost_user.h virtio.h \ - vu_common.h + passt.h pasta.h pcap.h pif.h repair.h siphash.h tap.h tcp.h tcp_buf.h \ + tcp_conn.h tcp_internal.h tcp_splice.h tcp_vu.h udp.h udp_flow.h \ + udp_internal.h udp_vu.h util.h vhost_user.h virtio.h vu_common.h \ + $(PESTO_HEADERS) HEADERS = $(PASST_HEADERS) seccomp.h
C := \#include
\nint main(){int a=getrandom(0, 0, 0);} @@ -75,9 +77,9 @@ mandir ?= $(datarootdir)/man man1dir ?= $(mandir)/man1 ifeq ($(TARGET_ARCH),x86_64) -BIN := passt passt.avx2 pasta pasta.avx2 qrap passt-repair +BIN := passt passt.avx2 pasta pasta.avx2 qrap passt-repair pesto else -BIN := passt pasta qrap passt-repair +BIN := passt pasta qrap passt-repair pesto endif
all: $(BIN) $(MANPAGES) docs @@ -91,6 +93,9 @@ seccomp.h: seccomp.sh $(PASST_SRCS) $(PASST_HEADERS) seccomp_repair.h: seccomp.sh $(PASST_REPAIR_SRCS) @ ARCH="$(TARGET_ARCH)" CC="$(CC)" ./seccomp.sh seccomp_repair.h $(PASST_REPAIR_SRCS)
+seccomp_pesto.h: seccomp.sh $(PESTO_SRCS) + @ ARCH="$(TARGET_ARCH)" CC="$(CC)" ./seccomp.sh seccomp_pesto.h $(PESTO_SRCS) + passt: $(PASST_SRCS) $(HEADERS) $(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(PASST_SRCS) -o passt $(LDFLAGS)
@@ -110,6 +115,9 @@ qrap: $(QRAP_SRCS) passt.h passt-repair: $(PASST_REPAIR_SRCS) seccomp_repair.h $(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(PASST_REPAIR_SRCS) -o passt-repair $(LDFLAGS)
+pesto: $(PESTO_SRCS) $(PESTO_HEADERS) seccomp_pesto.h + $(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(PESTO_SRCS) -o pesto $(LDFLAGS) + valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction \ rt_sigreturn getpid gettid kill clock_gettime \ mmap|mmap2 munmap open unlink gettimeofday futex \ @@ -119,7 +127,7 @@ valgrind: all
.PHONY: clean clean: - $(RM) $(BIN) *~ *.o seccomp.h seccomp_repair.h pasta.1 \ + $(RM) $(BIN) *~ *.o seccomp.h seccomp_repair.h seccomp_pesto.h pasta.1 \ passt.tar passt.tar.gz *.deb *.rpm \ passt.pid README.plain.md
@@ -173,11 +181,11 @@ docs: README.md done < README.md; \ ) > README.plain.md
-clang-tidy: $(PASST_SRCS) +clang-tidy: $(PASST_SRCS) $(PESTO_SRCS) clang-tidy $^ -- $(filter-out -pie,$(FLAGS) $(CFLAGS) $(CPPFLAGS)) \ -DCLANG_TIDY_58992
-cppcheck: $(PASST_SRCS) $(HEADERS) +cppcheck: $(PASST_SRCS) $(PESTO_SRCS) $(HEADERS) if cppcheck --check-level=exhaustive /dev/null > /dev/null 2>&1; then \ CPPCHECK_EXHAUSTIVE="--check-level=exhaustive"; \ else \ diff --git a/common.h b/common.h new file mode 100644 index 00000000..76a95609 --- /dev/null +++ b/common.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * Copyright Red Hat + * Author: David Gibson
+ * + * Definitions used by both passt/pasta and other tools + */ + +#ifndef COMMON_H +#define COMMON_H + +/* FPRINTF() intentionally silences cert-err33-c clang-tidy warnings */ +#define FPRINTF(f, ...) (void)fprintf(f, __VA_ARGS__) + +#endif /* _COMMON_H */ diff --git a/conf.c b/conf.c index d4c2a013..5e8bc665 100644 --- a/conf.c +++ b/conf.c @@ -35,6 +35,7 @@ #include #include +#include "common.h" #include "util.h" #include "ip.h" #include "passt.h" @@ -47,6 +48,10 @@ #include "isolation.h" #include "log.h" #include "vhost_user.h" +#include "epoll_ctl.h" +#include "conf.h" +#include "pesto.h" +#include "serialise.h"
#define NETNS_RUN_DIR "/run/netns"
@@ -888,6 +893,7 @@ static void usage(const char *name, FILE *f, int status) " --runas UID|UID:GID Run as given UID, GID, which can be\n" " numeric, or login and group names\n" " default: drop to user \"nobody\"\n" + " -c, --conf-path PATH Configuration socket path\n" " -h, --help Display this help message and exit\n" " --version Show version and exit\n");
@@ -1425,6 +1431,17 @@ static void conf_open_files(struct ctx *c) if (c->pidfile_fd < 0) die_perror("Couldn't open PID file %s", c->pidfile); } + + c->fd_conf = -1; + if (*c->conf_path) { + c->fd_conf_listen = sock_unix(c->conf_path); + if (c->fd_conf_listen < 0) { + die_perror("Couldn't open control socket %s", + c->conf_path); + } + } else { + c->fd_conf_listen = -1; + } }
/** @@ -1460,6 +1477,25 @@ fail: die("Invalid MAC address: %s", str); }
+/** + * conf_sock_listen() - Start listening for connections on configuration socket + * @c: Execution context + */ +static void conf_sock_listen(const struct ctx *c) +{ + union epoll_ref ref = { .type = EPOLL_TYPE_CONF_LISTEN }; + + if (c->fd_conf_listen < 0) + return; + + if (listen(c->fd_conf_listen, 0)) + die_perror("Couldn't listen on configuration socket"); + + ref.fd = c->fd_conf_listen; + if (epoll_add(c->epollfd, EPOLLIN | EPOLLET, ref)) + die_perror("Couldn't add configuration socket to epoll"); +} + /** * conf() - Process command-line arguments and set configuration * @c: Execution context @@ -1542,9 +1578,10 @@ void conf(struct ctx *c, int argc, char **argv) {"migrate-exit", no_argument, NULL, 29 }, {"migrate-no-linger", no_argument, NULL, 30 }, {"stats", required_argument, NULL, 31 }, + {"conf-path", required_argument, NULL, 'c' }, { 0 }, }; - const char *optstring = "+dqfel:hs:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:"; + const char *optstring = "+dqfel:hs:c:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:"; const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt"; char userns[PATH_MAX] = { 0 }, netns[PATH_MAX] = { 0 }; bool copy_addrs_opt = false, copy_routes_opt = false; @@ -1808,6 +1845,13 @@ void conf(struct ctx *c, int argc, char **argv)
c->fd_tap = -1; break; + case 'c': + ret = snprintf(c->conf_path, sizeof(c->conf_path), "%s", + optarg); + if (ret <= 0 || ret >= (int)sizeof(c->sock_path)) + die("Invalid configuration path: %s", optarg); + c->fd_conf_listen = c->fd_conf = -1; + break; case 'F': errno = 0; fd_tap_opt = strtol(optarg, NULL, 0); @@ -2252,4 +2296,108 @@ void conf(struct ctx *c, int argc, char **argv)
if (!c->quiet) conf_print(c); + + conf_sock_listen(c); +} + +/** + * conf_listen_handler() - Handle events on configuration listening socket + * @c: Execution context + * @events: epoll events + */ +void conf_listen_handler(struct ctx *c, uint32_t events) +{ + struct pesto_hello hello = { + .magic = PESTO_SERVER_MAGIC, + .version = htonl(PESTO_PROTOCOL_VERSION), + }; + union epoll_ref ref = { .type = EPOLL_TYPE_CONF }; + struct ucred uc = { 0 }; + socklen_t len = sizeof(uc); + int fd, rc; + + if (events != EPOLLIN) { + err("Unexpected event 0x%04x on configuration socket", events); + return; + } + + fd = accept4(c->fd_conf_listen, NULL, NULL, SOCK_NONBLOCK); + if (fd < 0) { + warn_perror("accept4() on configuration listening socket"); + return; + } + + if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &uc, &len) < 0) + warn_perror("Can't get configuration client credentials"); + + /* Another client is already connected: accept and close right away. */ + if (c->fd_conf != -1) { + info("Discarding configuration client, PID %i", uc.pid); + goto fail; + } + + ref.fd = fd; + rc = epoll_add(c->epollfd, EPOLLIN | EPOLLET, ref); + if (rc < 0) { + warn_perror("epoll_ctl() on configuration socket"); + goto fail; + } + + rc = write_all_buf(fd, &hello, sizeof(hello)); + if (rc < 0) { + warn_perror("Error writing configuration protocol hello"); + goto fail; + } + + c->fd_conf = fd; + info("Accepted configuration client, PID %i", uc.pid); + if (!PESTO_PROTOCOL_VERSION) { + warn( +"Warning: Using experimental unsupported configuration protocol"); + } + + return; + +fail: + close(fd); +} + +/** + * conf_handler() - Handle events on configuration socket + * @c: Execution context + * @events: epoll events + */ +void conf_handler(struct ctx *c, uint32_t events) +{ + if (events & EPOLLIN) { + char discard[BUFSIZ]; + ssize_t n; + + do { + n = read(c->fd_conf, discard, sizeof(discard)); + if (n > 0) + debug("Discarded %zd bytes of config data", n); + } while (n > 0); + if (n == 0) { + debug("Configuration client EOF"); + goto close; + } + if (errno != EAGAIN && errno != EWOULDBLOCK) { + err_perror("Error reading config data"); + goto close; + } + } + + if (events & EPOLLHUP) { + debug("Configuration client hangup"); + goto close; + } + + return; + +close: + debug("Closing configuration socket"); + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_conf, NULL); + close(c->fd_conf); + c->fd_conf = -1; } diff --git a/conf.h b/conf.h index b45ad746..16f97189 100644 --- a/conf.h +++ b/conf.h @@ -8,5 +8,7 @@
enum passt_modes conf_mode(int argc, char *argv[]); void conf(struct ctx *c, int argc, char **argv); +void conf_listen_handler(struct ctx *c, uint32_t events); +void conf_handler(struct ctx *c, uint32_t events);
#endif /* CONF_H */ diff --git a/epoll_type.h b/epoll_type.h index a90ffb67..061325aa 100644 --- a/epoll_type.h +++ b/epoll_type.h @@ -46,6 +46,10 @@ enum epoll_type { EPOLL_TYPE_REPAIR, /* Netlink neighbour subscription socket */ EPOLL_TYPE_NL_NEIGH, + /* Configuration listening socket */ + EPOLL_TYPE_CONF_LISTEN, + /* Configuration socket */ + EPOLL_TYPE_CONF,
EPOLL_NUM_TYPES, }; diff --git a/log.c b/log.c index 21e3673e..99404e25 100644 --- a/log.c +++ b/log.c @@ -26,6 +26,7 @@ #include
#include +#include "common.h" #include "linux_dep.h" #include "log.h" #include "util.h" diff --git a/passt.1 b/passt.1 index 13e8df9d..32d70c8d 100644 --- a/passt.1 +++ b/passt.1 @@ -127,6 +127,11 @@ login name and group name can be passed. This requires privileges (either initial effective UID 0 or CAP_SETUID capability) to work. Default is to change to user \fInobody\fR if started as root.
+.TP +.BR \-c ", " \-\-conf-path " " \fIpath " " (EXPERIMENTAL) +Path for configuration and control socket used by \fBpesto\fR(1) to +dynamically update passt or pasta's configuration. + .TP .BR \-h ", " \-\-help Display a help message and exit. diff --git a/passt.c b/passt.c index f84419c7..d37beef3 100644 --- a/passt.c +++ b/passt.c @@ -36,6 +36,7 @@ #include
#include +#include "common.h" #include "util.h" #include "passt.h" #include "dhcp.h" @@ -80,6 +81,8 @@ char *epoll_type_str[] = { [EPOLL_TYPE_REPAIR_LISTEN] = "TCP_REPAIR helper listening socket", [EPOLL_TYPE_REPAIR] = "TCP_REPAIR helper socket", [EPOLL_TYPE_NL_NEIGH] = "netlink neighbour notifier socket", + [EPOLL_TYPE_CONF_LISTEN] = "configuration listening socket", + [EPOLL_TYPE_CONF] = "configuration socket", }; static_assert(ARRAY_SIZE(epoll_type_str) == EPOLL_NUM_TYPES, "epoll_type_str[] doesn't match enum epoll_type"); @@ -303,6 +306,12 @@ static void passt_worker(void *opaque, int nfds, struct epoll_event *events) case EPOLL_TYPE_NL_NEIGH: nl_neigh_notify_handler(c); break; + case EPOLL_TYPE_CONF_LISTEN: + conf_listen_handler(c, eventmask); + break; + case EPOLL_TYPE_CONF: + conf_handler(c, eventmask); + break; default: /* Can't happen */ assert(0); diff --git a/passt.h b/passt.h index 62b8dcdf..c38bb5ae 100644 --- a/passt.h +++ b/passt.h @@ -158,6 +158,7 @@ struct ip6_ctx { * @foreground: Run in foreground, don't log to stderr by default * @nofile: Maximum number of open files (ulimit -n) * @sock_path: Path for UNIX domain socket + * @conf_path: Path for configuration UNIX domain socket * @repair_path: TCP_REPAIR helper path, can be "none", empty for default * @pcap: Path for packet capture file * @pidfile: Path to PID file, empty string if not configured @@ -169,6 +170,8 @@ struct ip6_ctx { * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket + * @fd_conf_listen: Listening configuration socket, if any + * @fd_conf: Configuration socket, if any * @fd_repair_listen: File descriptor for listening TCP_REPAIR socket, if any * @fd_repair: Connected AF_UNIX socket for TCP_REPAIR helper * @our_tap_mac: Pasta/passt's MAC on the tap link @@ -223,6 +226,7 @@ struct ctx { int foreground; int nofile; char sock_path[UNIX_PATH_MAX]; + char conf_path[UNIX_PATH_MAX]; char repair_path[UNIX_PATH_MAX]; char pcap[PATH_MAX];
@@ -240,6 +244,8 @@ struct ctx { int epollfd; int fd_tap_listen; int fd_tap; + int fd_conf_listen; + int fd_conf; int fd_repair_listen; int fd_repair; unsigned char our_tap_mac[ETH_ALEN]; diff --git a/pesto.1 b/pesto.1 new file mode 100644 index 00000000..338fb8a6 --- /dev/null +++ b/pesto.1 @@ -0,0 +1,46 @@ +.\" SPDX-License-Identifier: GPL-2.0-or-later +.\" Copyright Red Hat +.\" Author: David Gibson
+.TH pesto 1 + +.SH NAME +.B pesto +\- Configure a running \fBpasst\fR(1) or \fBpasta\fR(1) instance. + +.SH SYNOPSIS +.B pesto +\fIPATH\fR + +.SH DESCRIPTION + +.B pesto +is an experimental client to view and update the port forwarding +configuration of a running \fBpasst\fR(1) or \fBpasta\fR(1) instance. + +\fIPATH\fR gives the path to the UNIX domain socket created by +\fBpasst\fR or \fBpasta\fR. It should match the \fB-c\fR command line +option given to that instance. + +.SH AUTHORS + +Stefano Brivio , +David Gibson . + +.SH REPORTING BUGS + +Please report issues on the bug tracker at https://bugs.passt.top/, or +send a message to the passt-user@passt.top mailing list, see +https://lists.passt.top/. + +.SH COPYRIGHT + +Copyright Red Hat + +\fBpesto\fR is free software: you can redistribute them and/or modify +them under the terms of the GNU General Public License as published by +the Free Software Foundation, either version 2 of the License, or (at +your option) any later version. + +.SH SEE ALSO + +\fBpasst\fR(1), \fBpasta\fR(1), \fBunix\fR(7). diff --git a/pesto.c b/pesto.c new file mode 100644 index 00000000..6d066084 --- /dev/null +++ b/pesto.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * + * pesto.c - Main program (it's not actually extensible) + * + * Copyright (c) 2026 Red Hat GmbH + * Author: Stefano Brivio + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "common.h" +#include "seccomp_pesto.h" +#include "serialise.h" +#include "pesto.h" + +#define die(...) \ + do { \ + FPRINTF(stderr, __VA_ARGS__); \ + FPRINTF(stderr, "\n"); \ + exit(EXIT_FAILURE); \ + } while (0) + +/** + * main() - Entry point and whole program with loop + * @argc: Argument count + * @argv: Arguments: socket path, operation, port specifiers + * + * Return: 0 on success, won't return on failure + * + * #syscalls:pesto connect write close exit_group fstat brk
In case we end up dropping the malloc() in the next patches, as a reminder, we should drop brk here as well.
+ * #syscalls:pesto socket s390x:socketcall i686:socketcall + * #syscalls:pesto recvfrom recvmsg arm:recv ppc64le:recv + * #syscalls:pesto sendto sendmsg arm:send ppc64le:send + */ +int main(int argc, char **argv) +{ + struct sockaddr_un a = { AF_UNIX, "" }; + struct pesto_hello hello; + struct sock_fprog prog; + uint32_t s_version; + int ret, s; + + prctl(PR_SET_DUMPABLE, 0); + + prog.len = (unsigned short)sizeof(filter_pesto) / + sizeof(filter_pesto[0]); + prog.filter = filter_pesto; + if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) || + prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) + die("Failed to apply seccomp filter"); + + if (argc < 2) + die("Usage: %s CONTROLPATH", argv[0]); + + if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) + die("Failed to create AF_UNIX socket: %s", strerror(errno)); + + ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[1]); + if (ret <= 0 || ret >= (int)sizeof(a.sun_path)) + die("Invalid socket path \"%s\"", argv[1]); + + ret = connect(s, (struct sockaddr *)&a, sizeof(a)); + if (ret < 0) { + die("Failed to connect to %s: %s", + a.sun_path, strerror(errno)); + } + + ret = read_all_buf(s, &hello, sizeof(hello)); + if (ret < 0) + die("Couldn't read server greeting: %s", strerror(errno)); + + if (memcmp(hello.magic, PESTO_SERVER_MAGIC, sizeof(hello.magic))) + die("Bad magic number from server"); + + s_version = ntohl(hello.version); + + if (s_version > PESTO_PROTOCOL_VERSION) { + die("Unknown server protocol version %"PRIu32" > %"PRIu32"\n", + s_version, PESTO_PROTOCOL_VERSION); + } + + /* cppcheck-suppress knownConditionTrueFalse */ + if (!s_version) { + if (PESTO_PROTOCOL_VERSION) + die("Unsupported experimental server protocol"); + FPRINTF(stderr, +"Warning: Using experimental protocol version, client and server must match\n"); + } + + exit(0); +} diff --git a/pesto.h b/pesto.h new file mode 100644 index 00000000..92d4df3a --- /dev/null +++ b/pesto.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * Copyright Red Hat + * Author: David Gibson
+ * + * Definitions and functions used by both client and server of the configuration + * update protocol (pesto). + */ + +#ifndef PESTO_H +#define PESTO_H + +#include +#include + +#define PESTO_SERVER_MAGIC "pesto:s"
Note to self: sneakily change this to basil:s on merge.
+ +/* Version 0 is reserved for unreleased / unsupported experimental versions */ +#define PESTO_PROTOCOL_VERSION 0 + +/** + * struct pesto_hello - Server introduction message + * @magic: PESTO_SERVER_MAGIC + * @version: Version number + */ +struct pesto_hello { + char magic[8]; + uint32_t version; +} __attribute__ ((__packed__)); + +static_assert(sizeof(PESTO_SERVER_MAGIC) + == sizeof(((struct pesto_hello *)0)->magic), + "PESTO_SERVER_MAGIC has wrong size"); + +#endif /* PESTO_H */ diff --git a/serialise.c b/serialise.c index d6c3396a..e3ea86e3 100644 --- a/serialise.c +++ b/serialise.c @@ -6,6 +6,9 @@ * PASTA - Pack A Subtle Tap Abstraction * for network namespace/tap device mode * + * PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * * serialise.c - Serialisation of data structures over bytestreams * * Copyright Red Hat diff --git a/util.h b/util.h index cb669105..e7993f4d 100644 --- a/util.h +++ b/util.h @@ -317,9 +317,6 @@ static inline bool mod_between(unsigned x, unsigned i, unsigned j, unsigned m) return mod_sub(x, i, m) < mod_sub(j, i, m); }
-/* FPRINTF() intentionally silences cert-err33-c clang-tidy warnings */ -#define FPRINTF(f, ...) (void)fprintf(f, __VA_ARGS__) - void raw_random(void *buf, size_t buflen);
/*
-- Stefano
On Mon, 23 Mar 2026 18:37:21 +1100
David Gibson
Add basic command line option parsing using getopt_long() to pesto. Implement --help, --version, --quiet and --verbose options, along with a
Two questions: - do we really need --quiet? - shouldn't we stick to the passt(1) convention where we avoided -v altogether to avoid confusion, and we just have --version and -d / --debug?
debug() macro to print debugging messages when given the verbose option.
Signed-off-by: David Gibson
--- common.h | 8 ++++++ conf.c | 45 +++++++++++++++++---------------- passt.h | 12 ++++----- pesto.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++------ util.h | 8 ------ 5 files changed, 108 insertions(+), 42 deletions(-) diff --git a/common.h b/common.h index 76a95609..927d20a4 100644 --- a/common.h +++ b/common.h @@ -8,6 +8,14 @@ #ifndef COMMON_H #define COMMON_H
+#define VERSION_BLOB \ + VERSION "\n" \ + "Copyright Red Hat\n" \ + "GNU General Public License, version 2 or later\n" \ + " https://www.gnu.org/licenses/old-licenses/gpl-2.0.html\n" \ + "This is free software: you are free to change and redistribute it.\n" \ + "There is NO WARRANTY, to the extent permitted by law.\n\n" + /* FPRINTF() intentionally silences cert-err33-c clang-tidy warnings */ #define FPRINTF(f, ...) (void)fprintf(f, __VA_ARGS__)
diff --git a/conf.c b/conf.c index 5e8bc665..7b960fe9 100644 --- a/conf.c +++ b/conf.c @@ -1140,6 +1140,9 @@ static void conf_print(const struct ctx *c) char bufmac[ETH_ADDRSTRLEN], ifn[IFNAMSIZ]; int i;
+ if (c->fd_control_listen >= 0) + info("Configuration socket: %s", c->control_path); + if (c->ifi4 > 0 || c->ifi6 > 0) { info("Template interface: %s%s%s%s%s", c->ifi4 > 0 ? if_indextoname(c->ifi4, ifn) : "", @@ -1432,15 +1435,15 @@ static void conf_open_files(struct ctx *c) die_perror("Couldn't open PID file %s", c->pidfile); }
- c->fd_conf = -1; - if (*c->conf_path) { - c->fd_conf_listen = sock_unix(c->conf_path); - if (c->fd_conf_listen < 0) { + c->fd_control = -1; + if (*c->control_path) {
For later: maybe I guess we should squash the renaming of this file descriptor with the earlier patch adding it.
+ c->fd_control_listen = sock_unix(c->control_path); + if (c->fd_control_listen < 0) { die_perror("Couldn't open control socket %s", - c->conf_path); + c->control_path); } } else { - c->fd_conf_listen = -1; + c->fd_control_listen = -1; } }
@@ -1485,13 +1488,13 @@ static void conf_sock_listen(const struct ctx *c) { union epoll_ref ref = { .type = EPOLL_TYPE_CONF_LISTEN };
- if (c->fd_conf_listen < 0) + if (c->fd_control_listen < 0) return;
- if (listen(c->fd_conf_listen, 0)) + if (listen(c->fd_control_listen, 0)) die_perror("Couldn't listen on configuration socket");
- ref.fd = c->fd_conf_listen; + ref.fd = c->fd_control_listen; if (epoll_add(c->epollfd, EPOLLIN | EPOLLET, ref)) die_perror("Couldn't add configuration socket to epoll"); } @@ -1846,11 +1849,11 @@ void conf(struct ctx *c, int argc, char **argv) c->fd_tap = -1; break; case 'c': - ret = snprintf(c->conf_path, sizeof(c->conf_path), "%s", - optarg); + ret = snprintf(c->control_path, sizeof(c->control_path), + "%s", optarg); if (ret <= 0 || ret >= (int)sizeof(c->sock_path)) die("Invalid configuration path: %s", optarg); - c->fd_conf_listen = c->fd_conf = -1; + c->fd_control_listen = c->fd_control = -1; break; case 'F': errno = 0; @@ -2294,10 +2297,10 @@ void conf(struct ctx *c, int argc, char **argv) FWD_SCAN); }
+ conf_sock_listen(c); + if (!c->quiet) conf_print(c); - - conf_sock_listen(c); }
/** @@ -2321,7 +2324,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) return; }
- fd = accept4(c->fd_conf_listen, NULL, NULL, SOCK_NONBLOCK); + fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_NONBLOCK); if (fd < 0) { warn_perror("accept4() on configuration listening socket"); return; @@ -2331,7 +2334,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) warn_perror("Can't get configuration client credentials");
/* Another client is already connected: accept and close right away. */ - if (c->fd_conf != -1) { + if (c->fd_control != -1) { info("Discarding configuration client, PID %i", uc.pid); goto fail; } @@ -2349,7 +2352,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) goto fail; }
- c->fd_conf = fd; + c->fd_control = fd; info("Accepted configuration client, PID %i", uc.pid); if (!PESTO_PROTOCOL_VERSION) { warn( @@ -2374,7 +2377,7 @@ void conf_handler(struct ctx *c, uint32_t events) ssize_t n;
do { - n = read(c->fd_conf, discard, sizeof(discard)); + n = read(c->fd_control, discard, sizeof(discard)); if (n > 0) debug("Discarded %zd bytes of config data", n); } while (n > 0); @@ -2397,7 +2400,7 @@ void conf_handler(struct ctx *c, uint32_t events)
close: debug("Closing configuration socket"); - epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_conf, NULL); - close(c->fd_conf); - c->fd_conf = -1; + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_control, NULL); + close(c->fd_control); + c->fd_control = -1; } diff --git a/passt.h b/passt.h index c38bb5ae..b3f049de 100644 --- a/passt.h +++ b/passt.h @@ -158,7 +158,7 @@ struct ip6_ctx { * @foreground: Run in foreground, don't log to stderr by default * @nofile: Maximum number of open files (ulimit -n) * @sock_path: Path for UNIX domain socket - * @conf_path: Path for configuration UNIX domain socket + * @control_path: Path for control/configuration UNIX domain socket * @repair_path: TCP_REPAIR helper path, can be "none", empty for default * @pcap: Path for packet capture file * @pidfile: Path to PID file, empty string if not configured @@ -170,8 +170,8 @@ struct ip6_ctx { * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket - * @fd_conf_listen: Listening configuration socket, if any - * @fd_conf: Configuration socket, if any + * @fd_control_listen: Listening control/configuration socket, if any + * @fd_control: Control/configuration socket, if any * @fd_repair_listen: File descriptor for listening TCP_REPAIR socket, if any * @fd_repair: Connected AF_UNIX socket for TCP_REPAIR helper * @our_tap_mac: Pasta/passt's MAC on the tap link @@ -226,7 +226,7 @@ struct ctx { int foreground; int nofile; char sock_path[UNIX_PATH_MAX]; - char conf_path[UNIX_PATH_MAX]; + char control_path[UNIX_PATH_MAX]; char repair_path[UNIX_PATH_MAX]; char pcap[PATH_MAX];
@@ -244,8 +244,8 @@ struct ctx { int epollfd; int fd_tap_listen; int fd_tap; - int fd_conf_listen; - int fd_conf; + int fd_control_listen; + int fd_control; int fd_repair_listen; int fd_repair; unsigned char our_tap_mac[ETH_ALEN]; diff --git a/pesto.c b/pesto.c index 6d066084..f8c9e01d 100644 --- a/pesto.c +++ b/pesto.c @@ -15,6 +15,7 @@ #include
#include #include +#include #include #include #include @@ -42,6 +43,32 @@ exit(EXIT_FAILURE); \ } while (0) +#define debug(...) \ + do { \ + if (verbosity > 1) { \ + FPRINTF(stderr, __VA_ARGS__); \ + FPRINTF(stderr, "\n"); \ + } \ + } while (0) + +/** + * usage() - Print usage, exit with given status code + * @name: Executable name + * @f: Stream to print usage info to + * @status: Status code for exit(2) + */ +static void usage(const char *name, FILE *f, int status) +{ + FPRINTF(f, "Usage: %s [OPTION]... PATH\n", name); + FPRINTF(f, + "\n" + " -v, --verbose Be more verbose\n" + " -q, --quiet Be less verbose\n" + " -h, --help Display this help message and exit\n" + " --version Show version and exit\n"); + exit(status); +} + /** * main() - Entry point and whole program with loop * @argc: Argument count @@ -56,13 +83,20 @@ */ int main(int argc, char **argv) { + const struct option options[] = { + {"quiet", no_argument, NULL, 'q' }, + {"verbose", no_argument, NULL, 'v' }, + {"help", no_argument, NULL, 'h' }, + {"version", no_argument, NULL, 1 }, + { 0 }, + }; struct sockaddr_un a = { AF_UNIX, "" }; + const char *optstring = "vh"; struct pesto_hello hello; struct sock_fprog prog; + int optname, ret, s; uint32_t s_version; - int ret, s; - - prctl(PR_SET_DUMPABLE, 0);
I would, at least by default, keep this exactly in line with the paranoid stuff I added to passt, for consistency and ease of auditing: what will happen if we run *this* tool as root? If we choose to drop to nobody in that case, removing this will be a problem. And we'll probably not remember to add it back.
+ int verbosity = 1;
If we skip --quiet, this can be simplified.
prog.len = (unsigned short)sizeof(filter_pesto) / sizeof(filter_pesto[0]); @@ -71,15 +105,40 @@ int main(int argc, char **argv) prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) die("Failed to apply seccomp filter");
- if (argc < 2) - die("Usage: %s CONTROLPATH", argv[0]);
I was suggesting to use PATH instead of CONTROLPATH in the earlier patch, but this goes away here and we'll probably want to squash this anyway so it shouldn't matter.
+ do { + optname = getopt_long(argc, argv, optstring, options, NULL); + + switch (optname) { + case -1: + case 0: + break; + case 'h': + usage(argv[0], stdout, EXIT_SUCCESS); + break; + case 'q': + verbosity--; + break; + case 'v': + verbosity++; + break; + case 1: + FPRINTF(stdout, "pesto "); + FPRINTF(stdout, VERSION_BLOB); + exit(EXIT_SUCCESS); + default: + usage(argv[0], stderr, EXIT_FAILURE); + } + } while (optname != -1); + + if (argc - optind != 1) + usage(argv[0], stderr, EXIT_FAILURE);
if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) die("Failed to create AF_UNIX socket: %s", strerror(errno));
- ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[1]); + ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[optind]); if (ret <= 0 || ret >= (int)sizeof(a.sun_path)) - die("Invalid socket path \"%s\"", argv[1]); + die("Invalid socket path \"%s\"", argv[optind]);
ret = connect(s, (struct sockaddr *)&a, sizeof(a)); if (ret < 0) { @@ -87,6 +146,8 @@ int main(int argc, char **argv) a.sun_path, strerror(errno)); }
+ debug("Connected to passt/pasta control socket"); + ret = read_all_buf(s, &hello, sizeof(hello)); if (ret < 0) die("Couldn't read server greeting: %s", strerror(errno)); @@ -96,6 +157,8 @@ int main(int argc, char **argv)
s_version = ntohl(hello.version);
+ debug("Server protocol version: %"PRIu32, s_version); + if (s_version > PESTO_PROTOCOL_VERSION) { die("Unknown server protocol version %"PRIu32" > %"PRIu32"\n", s_version, PESTO_PROTOCOL_VERSION); diff --git a/util.h b/util.h index e7993f4d..d7c397f6 100644 --- a/util.h +++ b/util.h @@ -21,14 +21,6 @@
#include "log.h"
-#define VERSION_BLOB \ - VERSION "\n" \ - "Copyright Red Hat\n" \ - "GNU General Public License, version 2 or later\n" \ - " https://www.gnu.org/licenses/old-licenses/gpl-2.0.html\n" \ - "This is free software: you are free to change and redistribute it.\n" \ - "There is NO WARRANTY, to the extent permitted by law.\n\n" - #ifndef SECCOMP_RET_KILL_PROCESS #define SECCOMP_RET_KILL_PROCESS SECCOMP_RET_KILL #endif
-- Stefano
On Mon, 23 Mar 2026 18:37:22 +1100
David Gibson
Extend the dynamic update protocol to expose the pif indices and names from a running passt/pasta to the pesto tool. pesto records that data and (for now) prints it out.
Signed-off-by: David Gibson
--- conf.c | 38 +++++++++++++++++++++ pesto.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++- serialise.c | 21 ++++++++++++ serialise.h | 3 ++ 4 files changed, 159 insertions(+), 1 deletion(-) diff --git a/conf.c b/conf.c index 7b960fe9..b878db94 100644 --- a/conf.c +++ b/conf.c @@ -2303,6 +2303,41 @@ void conf(struct ctx *c, int argc, char **argv) conf_print(c); }
+/** + * conf_send_pifs() - Send list of pifs to dynamic update client (pesto) + * @c: Execution context + * @fd: Socket to the client + * + * Return: 0 on success, -1 on failure + */ +static int conf_send_pifs(const struct ctx *c, int fd) +{ + uint32_t num = 0; + unsigned pif; + + /* First count the number of pifs with tables */ + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + if (c->fwd[pif]) + num++; + } + + if (write_u32(fd, num)) + return -1; + + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + if (!c->fwd[pif]) + continue; + + if (write_u8(fd, pif)) + return -1; + + if (write_str(fd, pif_name(pif)) < 0) + return -1; + } + + return 0; +} + /** * conf_listen_handler() - Handle events on configuration listening socket * @c: Execution context @@ -2359,6 +2394,9 @@ void conf_listen_handler(struct ctx *c, uint32_t events) "Warning: Using experimental unsupported configuration protocol"); }
+ if (conf_send_pifs(c, fd) < 0) + goto fail; + return;
fail: diff --git a/pesto.c b/pesto.c index f8c9e01d..ed4f2aab 100644 --- a/pesto.c +++ b/pesto.c @@ -36,6 +36,8 @@ #include "serialise.h" #include "pesto.h"
+static int verbosity = 1; + #define die(...) \ do { \ FPRINTF(stderr, __VA_ARGS__); \ @@ -51,6 +53,19 @@ } \ } while (0)
+/** + * xmalloc() - Allocate memory, with fatal error on failure + * @size: Number of bytes to allocate + */ +static void *xmalloc(size_t size) +{ + void *p = malloc(size);
I would really really prefer to skip this unless it's absolutely necessary, for a number of reasons: 1. this thing is talking to passt which talks to untrusted workloads, and there's a risk of arbitrary code execution, which implies the possibility of specially crafted messages that might make us allocate: a. in specific regions and build a heap spraying attack based on that b. a lot, and cause a denial of service ...and, while this process should be short lived, we can't assume that, if somebody achieves arbitrary code execution in passt and malicious messages, because we risk looping on rules, or having passt send us information very slowly (minutes / hours), etc. 2. besides, we might want to add a "daemon" mode one day, and then it's not short lived anymore, which opens the door to leaks and double-frees (adding to a. and b. above) 3. we might need to reuse functions from passt and not notice right away that they call this xmalloc(), or have to eventually adapt them anyway 4. consistency, in messaging ("passt doesn't allocate dynamic memory", without caveats) and for auditing reasons Strings can be 32 bytes, or 1024 if we want to splurge. Unless we need to pile a lot of them on the stack (but it's not the case in pesto) a malloc() doesn't really bring any advantage compared to static buffers here and there. It's not megabytes.
+ + if (!p) + die("Memory allocation failure"); + return p; +} + /** * usage() - Print usage, exit with given status code * @name: Executable name @@ -69,6 +84,83 @@ static void usage(const char *name, FILE *f, int status) exit(status); }
+/** + * pesto_recv_str() - Receive a string from passt/pasta + * @fd: Control socket + * + * Return: pointer to malloc()ed string + */ +static const char *pesto_recv_str(int fd) +{ + uint32_t len; + char *buf; + + if (read_u32(fd, &len) < 0) + die("Error reading from control socket"); + + buf = xmalloc(len); + if (read_all_buf(fd, buf, len) < 0) + die("Error reading from control socket"); + + return buf; +} + +struct pif_state { + uint8_t pif; + const char *name; +}; + +struct conf_state {
More as a to-do list item rather than as a review comment: we need documentation for those structs.
+ uint32_t npifs; + struct pif_state pif[]; +}; + +/** + * pesto_read_pifs() - Read pif names and IDs from passt/pasta + * @fd: Control socket + */ +static const struct conf_state *pesto_read_pifs(int fd) +{ + uint32_t num; + struct conf_state *state; + unsigned i; + + if (read_u32(fd, &num) < 0) + die("Error reading from control socket"); + + debug("Receiving %"PRIu32" interface names", num); + + state = xmalloc(sizeof(*state) + num * sizeof(struct pif_state)); + state->npifs = num; + + for (i = 0; i < num; i++) { + struct pif_state *ps = &state->pif[i]; + + if (read_u8(fd, &ps->pif) < 0) + die("Error reading from control socket"); + ps->name = pesto_recv_str(fd); + + debug("%u: %s", ps->pif, ps->name); + } + + return state; +} + +/** + * show_state() - Show current rule state obtained from passt/pasta + * @pifs: PIF name information + */ +static void show_state(const struct conf_state *state) +{ + unsigned i; + + for (i = 0; i < state->npifs; i++) { + const struct pif_state *ps = &state->pif[i]; + printf("Forwarding rules for %s interface\n", ps->name); + printf("\tTBD\n"); + } +} + /** * main() - Entry point and whole program with loop * @argc: Argument count @@ -91,12 +183,12 @@ int main(int argc, char **argv) { 0 }, }; struct sockaddr_un a = { AF_UNIX, "" }; + const struct conf_state *state; const char *optstring = "vh"; struct pesto_hello hello; struct sock_fprog prog; int optname, ret, s; uint32_t s_version; - int verbosity = 1;
prog.len = (unsigned short)sizeof(filter_pesto) / sizeof(filter_pesto[0]); @@ -172,5 +264,9 @@ int main(int argc, char **argv) "Warning: Using experimental protocol version, client and server must match\n"); }
+ state = pesto_read_pifs(s); + + show_state(state); + exit(0); } diff --git a/serialise.c b/serialise.c index e3ea86e3..94c34d3c 100644 --- a/serialise.c +++ b/serialise.c @@ -19,6 +19,7 @@ #include
#include #include +#include #include #include "serialise.h" @@ -121,6 +122,26 @@ int write_all_buf(int fd, const void *buf, size_t len) return write_all_buf(fd, &beval, sizeof(beval)); \ }
+#define be8toh(x) (x) +#define htobe8(x) (x) + +SERIALISE_UINT(8) SERIALISE_UINT(32)
#undef SERIALISE_UNIT + +/** + * write_str() - Write a string to an fd in length/value format + * @fd: Socket to the client + * @s: String to send + * + * Return: 0 on success, -1 on error + */ +int write_str(int fd, const char *s) +{ + uint32_t len = strlen(s) + 1; /* Include \0 */ + + if (write_u32(fd, len) < 0) + return -1;
Even if we don't need to allocate here, I would feel more comfortable if we passed fixed-size strings only to passt to have a slightly lower risk of buffer overflows, in general.
+ return write_all_buf(fd, s, len); +} diff --git a/serialise.h b/serialise.h index 0a4ed086..7ef35786 100644 --- a/serialise.h +++ b/serialise.h @@ -16,6 +16,9 @@ int write_all_buf(int fd, const void *buf, size_t len); int read_u##bits(int fd, uint##bits##_t *val); \ int write_u##bits(int fd, uint##bits##_t val);
+SERIALISE_UINT_DECL(8) SERIALISE_UINT_DECL(32)
+int write_str(int fd, const char *s); + #endif /* _SERIALISE_H */
-- Stefano
On Mon, 23 Mar 2026 18:37:27 +1100
David Gibson
Move the logic for formatting forwarding rules into strings from fwd_rules_print() into fwd_rule.c where it can be shared with pesto. We also make the function explicitly construct a string, rather than directly printing with info(), for greater flexibility.
Signed-off-by: David Gibson
--- Makefile | 11 ++++---- fwd.c | 40 +++--------------------------- fwd_rule.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ fwd_rule.h | 11 ++++++++ 4 files changed, 94 insertions(+), 41 deletions(-) create mode 100644 fwd_rule.c diff --git a/Makefile b/Makefile index bc325482..44c396e7 100644 --- a/Makefile +++ b/Makefile @@ -38,13 +38,14 @@ FLAGS += -DVERSION=\"$(VERSION)\" FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c epoll_ctl.c \ - flow.c fwd.c icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c \ - log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c \ - pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c tcp_splice.c tcp_vu.c \ - udp.c udp_flow.c udp_vu.c util.c vhost_user.c virtio.c vu_common.c + flow.c fwd.c fwd_rule.c icmp.c igmp.c inany.c iov.c ip.c isolation.c \ + lineread.c log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c \ + pasta.c pcap.c pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c \ + tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c vhost_user.c \ + virtio.c vu_common.c QRAP_SRCS = qrap.c PASST_REPAIR_SRCS = passt-repair.c -PESTO_SRCS = pesto.c inany.c ip.c serialise.c +PESTO_SRCS = pesto.c fwd_rule.c inany.c ip.c serialise.c SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) $(PESTO_SRCS)
MANPAGES = passt.1 pasta.1 pesto.1 qrap.1 passt-repair.1 diff --git a/fwd.c b/fwd.c index a32d0a20..20409c62 100644 --- a/fwd.c +++ b/fwd.c @@ -304,20 +304,6 @@ parse_err: warn("Unable to parse %s", PORT_RANGE_SYSCTL); }
-/** - * fwd_rule_addr() - Return match address for a rule - * @rule: Forwarding rule - * - * Return: matching address for rule, NULL if it matches all addresses - */ -static const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) -{ - if (rule->flags & FWD_DUAL_STACK_ANY) - return NULL; - - return &rule->addr; -} - /** * fwd_port_map_ephemeral() - Mark ephemeral ports in a bitmap * @map: Bitmap to update @@ -497,28 +483,10 @@ void fwd_rules_print(const struct fwd_table *fwd) unsigned i;
for (i = 0; i < fwd->count; i++) { - const struct fwd_rule *rule = &fwd->rules[i].rule; - const char *percent = *rule->ifname ? "%" : ""; - const char *weak = "", *scan = ""; - char addr[INANY_ADDRSTRLEN]; - - inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); - if (rule->flags & FWD_WEAK) - weak = " (best effort)"; - if (rule->flags & FWD_SCAN) - scan = " (auto-scan)"; - - if (rule->first == rule->last) { - info(" %s [%s]%s%s:%hu => %hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->to, weak, scan); - } else { - info(" %s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->last, - rule->to, rule->last - rule->first + rule->to, - weak, scan); - } + char rulestr[FWD_RULE_STRLEN]; + + info(" %s", fwd_rule_ntop(&fwd->rules[i].rule, + rulestr, sizeof(rulestr))); } }
diff --git a/fwd_rule.c b/fwd_rule.c new file mode 100644 index 00000000..dfbdf683 --- /dev/null +++ b/fwd_rule.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* PASST - Plug A Simple Socket Transport + * for qemu/UNIX domain socket mode + * + * PASTA - Pack A Subtle Tap Abstraction + * for network namespace/tap device mode + * + * PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * + * fwd_rule.c - Helpers for working with forwarding rule specifications + * + * Copyright Red Hat + * Author: David Gibson
+ */ + +#include + +#include "fwd_rule.h" + +/** + * fwd_rule_addr() - Return match address for a rule + * @rule: Forwarding rule + * + * Return: matching address for rule, NULL if it matches all addresses + */ +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) +{ + if (rule->flags & FWD_DUAL_STACK_ANY) + return NULL; + + return &rule->addr; +} + +/** + * fwd_rule_ntop() - Format forwarding rule as a string + * @rule: Rule to format + * @dst: Buffer to store output (should have FWD_RULE_STRLEN bytes) + * @size: Size of @dst + */ +const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size) +{ + const char *percent = *rule->ifname ? "%" : ""; + const char *weak = "", *scan = ""; + char addr[INANY_ADDRSTRLEN]; + int len; + + inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); + if (rule->flags & FWD_WEAK) + weak = " (best effort)"; + if (rule->flags & FWD_SCAN) + scan = " (auto-scan)"; + + if (rule->first == rule->last) { + len = snprintf(dst, size, + "%s [%s]%s%s:%hu => %hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->to, weak, scan); + } else { + in_port_t tolast = rule->last - rule->first + rule->to; + len = snprintf(dst, size, + "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->last, + rule->to, tolast, weak, scan); + } + + if (len < 0 || (size_t)len >= size) + return NULL; + + return dst; +} diff --git a/fwd_rule.h b/fwd_rule.h index 84ec5cbe..59db0e95 100644 --- a/fwd_rule.h +++ b/fwd_rule.h @@ -42,4 +42,15 @@ struct fwd_rule { uint8_t flags; }; +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule); + +#define FWD_RULE_STRLEN \ + (IPPROTO_STRLEN - 1 \ + + INANY_ADDRSTRLEN - 1 \ + + IFNAMSIZ - 1 \ + + sizeof(" (best effort)") - 1 \ + + sizeof(" (auto-scan)") - 1 \ + + 15)
I'm not quite sure about the reason for the 15 here, is there some constant we can use perhaps?
+const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size); + #endif /* FWD_RULE_H */
-- Stefano
On Mon, 23 Mar 2026 18:37:28 +1100
David Gibson
Implement serialisation of our current forwarding rules in conf.c, deserialising it to display in the pesto client.
Signed-off-by: David Gibson
--- conf.c | 44 +++++++++++++++++++++++++++++++ fwd_rule.c | 40 ++++++++++++++++++++++++++++ fwd_rule.h | 3 +++ pesto.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++--- 4 files changed, 160 insertions(+), 4 deletions(-) diff --git a/conf.c b/conf.c index b878db94..7f311914 100644 --- a/conf.c +++ b/conf.c @@ -2338,6 +2338,47 @@ static int conf_send_pifs(const struct ctx *c, int fd) return 0; }
+/** + * conf_send_rules() - Send current forwarding rules to dynamic update client (pesto)
What about: * conf_send_rules() - Send forwarding rules to configuration client (pesto) ?
+ * @c: Execution context + * @fd: Socket to the client + * + * Return: 0 on success, -1 on failure + */ +static int conf_send_rules(const struct ctx *c, int fd) +{ + unsigned pif; + + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + const struct fwd_table *fwd = c->fwd[pif]; + unsigned i; + + if (!fwd) + continue; + + assert(pif); + + /* PIF id */ + if (write_u8(fd, pif)) + return -1; + + /* Number of rules */ + if (write_u32(fd, fwd->count)) + return -1; + + for (i = 0; i < fwd->count; i++) { + if (fwd_rule_write(fd, &fwd->rules[i].rule)) + return -1; + } + } + + /* Write 0 PIF id to finish */ + if (write_u8(fd, 0))
We have PIF_NONE, shouldn't we use that to make it obvious that it's an invalid value?
+ return -1; + + return 0; +}
[...]
I stopped reviewing here, the next patches in my series are anyway mine and after those I guess it doesn't make sense, yet, that I review in fine detail, as we'll probably need to change a bunch of things. -- Stefano
On Mon, 23 Mar 2026 18:37:07 +1100
David Gibson
Here's a new draft of dynamic updates. This now can successfully update rules, though I've not tested it very extensively. Essentially this is just barely enough to work, it still could do with rather a lot of polish.
Patches 1..12/22 are preliminary reworks that make moderate sense even without pesto - feel free to apply if you're happy with them.
I'm applying (in a moment, tests are running now): - 1/25, 2/25, - not 3/25 (comments pending) - not 4/25 (depends on 3/25) - 5/25, 6/25, - not 7/25 (Coverity Scan warnings), - not 8/25 (comments pending), - 9/25, 10/25, 11/25, 12/25. -- Stefano
On Wed, 25 Mar 2026 01:56:51 +0100
Stefano Brivio
On Mon, 23 Mar 2026 18:37:07 +1100 David Gibson
wrote: Here's a new draft of dynamic updates. This now can successfully update rules, though I've not tested it very extensively. Essentially this is just barely enough to work, it still could do with rather a lot of polish.
Patches 1..12/22 are preliminary reworks that make moderate sense even without pesto - feel free to apply if you're happy with them.
I'm applying (in a moment, tests are running now):
- 1/25, 2/25, - not 3/25 (comments pending) - not 4/25 (depends on 3/25) - 5/25, 6/25, - not 7/25 (Coverity Scan warnings), - not 8/25 (comments pending), - 9/25, 10/25, 11/25, 12/25.
Never mind, most of the "TCP/IPv4: ns to host" tests fail, probably I missed something while cherry-picking patches from this series. I guess a re-spin up to 12/25 would be more convenient in this case. -- Stefano
On Wed, Mar 25, 2026 at 01:56:34AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:27 +1100 David Gibson
wrote: Move the logic for formatting forwarding rules into strings from fwd_rules_print() into fwd_rule.c where it can be shared with pesto. We also make the function explicitly construct a string, rather than directly printing with info(), for greater flexibility.
Signed-off-by: David Gibson
--- Makefile | 11 ++++---- fwd.c | 40 +++--------------------------- fwd_rule.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ fwd_rule.h | 11 ++++++++ 4 files changed, 94 insertions(+), 41 deletions(-) create mode 100644 fwd_rule.c diff --git a/Makefile b/Makefile index bc325482..44c396e7 100644 --- a/Makefile +++ b/Makefile @@ -38,13 +38,14 @@ FLAGS += -DVERSION=\"$(VERSION)\" FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c epoll_ctl.c \ - flow.c fwd.c icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c \ - log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c \ - pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c tcp_splice.c tcp_vu.c \ - udp.c udp_flow.c udp_vu.c util.c vhost_user.c virtio.c vu_common.c + flow.c fwd.c fwd_rule.c icmp.c igmp.c inany.c iov.c ip.c isolation.c \ + lineread.c log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c \ + pasta.c pcap.c pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c \ + tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c vhost_user.c \ + virtio.c vu_common.c QRAP_SRCS = qrap.c PASST_REPAIR_SRCS = passt-repair.c -PESTO_SRCS = pesto.c inany.c ip.c serialise.c +PESTO_SRCS = pesto.c fwd_rule.c inany.c ip.c serialise.c SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) $(PESTO_SRCS)
MANPAGES = passt.1 pasta.1 pesto.1 qrap.1 passt-repair.1 diff --git a/fwd.c b/fwd.c index a32d0a20..20409c62 100644 --- a/fwd.c +++ b/fwd.c @@ -304,20 +304,6 @@ parse_err: warn("Unable to parse %s", PORT_RANGE_SYSCTL); }
-/** - * fwd_rule_addr() - Return match address for a rule - * @rule: Forwarding rule - * - * Return: matching address for rule, NULL if it matches all addresses - */ -static const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) -{ - if (rule->flags & FWD_DUAL_STACK_ANY) - return NULL; - - return &rule->addr; -} - /** * fwd_port_map_ephemeral() - Mark ephemeral ports in a bitmap * @map: Bitmap to update @@ -497,28 +483,10 @@ void fwd_rules_print(const struct fwd_table *fwd) unsigned i;
for (i = 0; i < fwd->count; i++) { - const struct fwd_rule *rule = &fwd->rules[i].rule; - const char *percent = *rule->ifname ? "%" : ""; - const char *weak = "", *scan = ""; - char addr[INANY_ADDRSTRLEN]; - - inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); - if (rule->flags & FWD_WEAK) - weak = " (best effort)"; - if (rule->flags & FWD_SCAN) - scan = " (auto-scan)"; - - if (rule->first == rule->last) { - info(" %s [%s]%s%s:%hu => %hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->to, weak, scan); - } else { - info(" %s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->last, - rule->to, rule->last - rule->first + rule->to, - weak, scan); - } + char rulestr[FWD_RULE_STRLEN]; + + info(" %s", fwd_rule_ntop(&fwd->rules[i].rule, + rulestr, sizeof(rulestr))); } }
diff --git a/fwd_rule.c b/fwd_rule.c new file mode 100644 index 00000000..dfbdf683 --- /dev/null +++ b/fwd_rule.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* PASST - Plug A Simple Socket Transport + * for qemu/UNIX domain socket mode + * + * PASTA - Pack A Subtle Tap Abstraction + * for network namespace/tap device mode + * + * PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * + * fwd_rule.c - Helpers for working with forwarding rule specifications + * + * Copyright Red Hat + * Author: David Gibson
+ */ + +#include + +#include "fwd_rule.h" + +/** + * fwd_rule_addr() - Return match address for a rule + * @rule: Forwarding rule + * + * Return: matching address for rule, NULL if it matches all addresses + */ +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) +{ + if (rule->flags & FWD_DUAL_STACK_ANY) + return NULL; + + return &rule->addr; +} + +/** + * fwd_rule_ntop() - Format forwarding rule as a string + * @rule: Rule to format + * @dst: Buffer to store output (should have FWD_RULE_STRLEN bytes) + * @size: Size of @dst + */ +const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size) +{ + const char *percent = *rule->ifname ? "%" : ""; + const char *weak = "", *scan = ""; + char addr[INANY_ADDRSTRLEN]; + int len; + + inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); + if (rule->flags & FWD_WEAK) + weak = " (best effort)"; + if (rule->flags & FWD_SCAN) + scan = " (auto-scan)"; + + if (rule->first == rule->last) { + len = snprintf(dst, size, + "%s [%s]%s%s:%hu => %hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->to, weak, scan); + } else { + in_port_t tolast = rule->last - rule->first + rule->to; + len = snprintf(dst, size, + "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->last, + rule->to, tolast, weak, scan); + } + + if (len < 0 || (size_t)len >= size) + return NULL; + + return dst; +} diff --git a/fwd_rule.h b/fwd_rule.h index 84ec5cbe..59db0e95 100644 --- a/fwd_rule.h +++ b/fwd_rule.h @@ -42,4 +42,15 @@ struct fwd_rule { uint8_t flags; }; +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule); + +#define FWD_RULE_STRLEN \ + (IPPROTO_STRLEN - 1 \ + + INANY_ADDRSTRLEN - 1 \ + + IFNAMSIZ - 1 \ + + sizeof(" (best effort)") - 1 \ + + sizeof(" (auto-scan)") - 1 \ + + 15)
I'm not quite sure about the reason for the 15 here, is there some constant we can use perhaps?
It's the spacing and punctuation around the other displayed parameters, roughly the number of non-format characters in "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s" (plus one for the \0). I don't love it either, but I wasn't sure how to make it less obscure. Is sizeof(" []%:- => - ") an improvement? I'm not sure. Oh... that made me realise I'm missing 4 * (UINT16_STRLEN - 1). I've fixed that now.
+const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size); + #endif /* FWD_RULE_H */
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 01:54:24AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:14 +1100 David Gibson
wrote: Currently we store the inbound (PIF_HOST) and outbound (PIF_SPLICE) forwarding tables in separate fields of struct ctx. In a number of places this requires somewhat awkward if or switch constructs to select the right table for updates. Conceptually simplify that by using an index of forwarding tables by pif, which as a bonus keeps track generically which pifs have implemented forwarding tables so far.
For now this doesn't simplify a lot textually, because many places that need this also have other special cases to apply by pif. It does simplify a few crucial places though, and we expect it will become more useful as the flexibility of the forwarding table is improved.
Signed-off-by: David Gibson
[snip]
/home/sbrivio/passt/flow.c:508:2: 1. assign_zero: Assigning: "rule" = "NULL". /home/sbrivio/passt/flow.c:511:2: 2. path: Condition "flow_new_entry == flow", taking true branch. /home/sbrivio/passt/flow.c:511:2: 3. path: Condition "f->state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:512:2: 4. path: Condition "f->type == FLOW_TYPE_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 5. path: Condition "f->pif[0] != PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 6. path: Condition "f->pif[1] == PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:514:2: 7. path: Condition "flow->f.state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:516:2: 8. path: Condition "fwd", taking false branch. /home/sbrivio/passt/flow.c:521:2: 9. path: Switch case value "PIF_SPLICE". /home/sbrivio/passt/flow.c:528:3: 10. var_deref_model: Passing null pointer "rule" to "fwd_nat_from_splice", which dereferences it. /home/sbrivio/passt/fwd.c:1095:2: 10.1. path: Condition "!inany_is_loopback(&ini->eaddr)", taking false branch. /home/sbrivio/passt/fwd.c:1095:2: 10.2. path: Condition "!inany_is_loopback(&ini->oaddr)", taking false branch. /home/sbrivio/passt/fwd.c:1112:2: 10.3. path: Condition "proto == IPPROTO_UDP", taking true branch. /home/sbrivio/passt/fwd.c:1116:2: 10.4. dereference: Dereferencing pointer "rule".
tgtpif = fwd_nat_from_splice(rule, proto, ini, tgt); break;
case PIF_HOST: - fwd = &c->fwd_in; - - if (!(rule = fwd_rule_search(fwd, ini, proto, rule_hint))) - goto norule; -
...and not this either:
/home/sbrivio/passt/flow.c:532:3: Type: Explicit null dereferenced (FORWARD_NULL)
/home/sbrivio/passt/flow.c:508:2: 1. assign_zero: Assigning: "rule" = "NULL". /home/sbrivio/passt/flow.c:511:2: 2. path: Condition "flow_new_entry == flow", taking true branch. /home/sbrivio/passt/flow.c:511:2: 3. path: Condition "f->state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:512:2: 4. path: Condition "f->type == FLOW_TYPE_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 5. path: Condition "f->pif[0] != PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:513:2: 6. path: Condition "f->pif[1] == PIF_NONE", taking true branch. /home/sbrivio/passt/flow.c:514:2: 7. path: Condition "flow->f.state == FLOW_STATE_INI", taking true branch. /home/sbrivio/passt/flow.c:516:2: 8. path: Condition "fwd", taking false branch. /home/sbrivio/passt/flow.c:521:2: 9. path: Switch case value "PIF_HOST". /home/sbrivio/passt/flow.c:532:3: 10. var_deref_model: Passing null pointer "rule" to "fwd_nat_from_host", which dereferences it. /home/sbrivio/passt/fwd.c:1173:2: 10.1. dereference: Dereferencing pointer "rule".
I haven't checked why.
tgtpif = fwd_nat_from_host(c, rule, proto, ini, tgt); fwd_neigh_mac_get(c, &tgt->oaddr, f->tap_omac); break;
Oops. That's because we're relying on the fact that we _do_ have populated tables for HOST and SPLICE. That's constructed a very long way away, so of course coverity can't figure that out. I've fixed it with assert()s for now, and those should go away if/when we implement forwarding tables for TAP. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 01:55:07AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:21 +1100 David Gibson
wrote: Add basic command line option parsing using getopt_long() to pesto. Implement --help, --version, --quiet and --verbose options, along with a
Two questions:
- do we really need --quiet?
- shouldn't we stick to the passt(1) convention where we avoided -v altogether to avoid confusion, and we just have --version and -d / --debug?
Fair point. I've reworked it that way.
debug() macro to print debugging messages when given the verbose option.
Signed-off-by: David Gibson
--- common.h | 8 ++++++ conf.c | 45 +++++++++++++++++---------------- passt.h | 12 ++++----- pesto.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++------ util.h | 8 ------ 5 files changed, 108 insertions(+), 42 deletions(-) diff --git a/common.h b/common.h index 76a95609..927d20a4 100644 --- a/common.h +++ b/common.h @@ -8,6 +8,14 @@ #ifndef COMMON_H #define COMMON_H
+#define VERSION_BLOB \ + VERSION "\n" \ + "Copyright Red Hat\n" \ + "GNU General Public License, version 2 or later\n" \ + " https://www.gnu.org/licenses/old-licenses/gpl-2.0.html\n" \ + "This is free software: you are free to change and redistribute it.\n" \ + "There is NO WARRANTY, to the extent permitted by law.\n\n" + /* FPRINTF() intentionally silences cert-err33-c clang-tidy warnings */ #define FPRINTF(f, ...) (void)fprintf(f, __VA_ARGS__)
diff --git a/conf.c b/conf.c index 5e8bc665..7b960fe9 100644 --- a/conf.c +++ b/conf.c @@ -1140,6 +1140,9 @@ static void conf_print(const struct ctx *c) char bufmac[ETH_ADDRSTRLEN], ifn[IFNAMSIZ]; int i;
+ if (c->fd_control_listen >= 0) + info("Configuration socket: %s", c->control_path); + if (c->ifi4 > 0 || c->ifi6 > 0) { info("Template interface: %s%s%s%s%s", c->ifi4 > 0 ? if_indextoname(c->ifi4, ifn) : "", @@ -1432,15 +1435,15 @@ static void conf_open_files(struct ctx *c) die_perror("Couldn't open PID file %s", c->pidfile); }
- c->fd_conf = -1; - if (*c->conf_path) { - c->fd_conf_listen = sock_unix(c->conf_path); - if (c->fd_conf_listen < 0) { + c->fd_control = -1; + if (*c->control_path) {
For later: maybe I guess we should squash the renaming of this file descriptor with the earlier patch adding it.
Oops, that was my intention, but clearly I updated the wrong patch.
+ c->fd_control_listen = sock_unix(c->control_path); + if (c->fd_control_listen < 0) { die_perror("Couldn't open control socket %s", - c->conf_path); + c->control_path); } } else { - c->fd_conf_listen = -1; + c->fd_control_listen = -1; } }
@@ -1485,13 +1488,13 @@ static void conf_sock_listen(const struct ctx *c) { union epoll_ref ref = { .type = EPOLL_TYPE_CONF_LISTEN };
- if (c->fd_conf_listen < 0) + if (c->fd_control_listen < 0) return;
- if (listen(c->fd_conf_listen, 0)) + if (listen(c->fd_control_listen, 0)) die_perror("Couldn't listen on configuration socket");
- ref.fd = c->fd_conf_listen; + ref.fd = c->fd_control_listen; if (epoll_add(c->epollfd, EPOLLIN | EPOLLET, ref)) die_perror("Couldn't add configuration socket to epoll"); } @@ -1846,11 +1849,11 @@ void conf(struct ctx *c, int argc, char **argv) c->fd_tap = -1; break; case 'c': - ret = snprintf(c->conf_path, sizeof(c->conf_path), "%s", - optarg); + ret = snprintf(c->control_path, sizeof(c->control_path), + "%s", optarg); if (ret <= 0 || ret >= (int)sizeof(c->sock_path)) die("Invalid configuration path: %s", optarg); - c->fd_conf_listen = c->fd_conf = -1; + c->fd_control_listen = c->fd_control = -1; break; case 'F': errno = 0; @@ -2294,10 +2297,10 @@ void conf(struct ctx *c, int argc, char **argv) FWD_SCAN); }
+ conf_sock_listen(c); + if (!c->quiet) conf_print(c); - - conf_sock_listen(c); }
/** @@ -2321,7 +2324,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) return; }
- fd = accept4(c->fd_conf_listen, NULL, NULL, SOCK_NONBLOCK); + fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_NONBLOCK); if (fd < 0) { warn_perror("accept4() on configuration listening socket"); return; @@ -2331,7 +2334,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) warn_perror("Can't get configuration client credentials");
/* Another client is already connected: accept and close right away. */ - if (c->fd_conf != -1) { + if (c->fd_control != -1) { info("Discarding configuration client, PID %i", uc.pid); goto fail; } @@ -2349,7 +2352,7 @@ void conf_listen_handler(struct ctx *c, uint32_t events) goto fail; }
- c->fd_conf = fd; + c->fd_control = fd; info("Accepted configuration client, PID %i", uc.pid); if (!PESTO_PROTOCOL_VERSION) { warn( @@ -2374,7 +2377,7 @@ void conf_handler(struct ctx *c, uint32_t events) ssize_t n;
do { - n = read(c->fd_conf, discard, sizeof(discard)); + n = read(c->fd_control, discard, sizeof(discard)); if (n > 0) debug("Discarded %zd bytes of config data", n); } while (n > 0); @@ -2397,7 +2400,7 @@ void conf_handler(struct ctx *c, uint32_t events)
close: debug("Closing configuration socket"); - epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_conf, NULL); - close(c->fd_conf); - c->fd_conf = -1; + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_control, NULL); + close(c->fd_control); + c->fd_control = -1; } diff --git a/passt.h b/passt.h index c38bb5ae..b3f049de 100644 --- a/passt.h +++ b/passt.h @@ -158,7 +158,7 @@ struct ip6_ctx { * @foreground: Run in foreground, don't log to stderr by default * @nofile: Maximum number of open files (ulimit -n) * @sock_path: Path for UNIX domain socket - * @conf_path: Path for configuration UNIX domain socket + * @control_path: Path for control/configuration UNIX domain socket * @repair_path: TCP_REPAIR helper path, can be "none", empty for default * @pcap: Path for packet capture file * @pidfile: Path to PID file, empty string if not configured @@ -170,8 +170,8 @@ struct ip6_ctx { * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket - * @fd_conf_listen: Listening configuration socket, if any - * @fd_conf: Configuration socket, if any + * @fd_control_listen: Listening control/configuration socket, if any + * @fd_control: Control/configuration socket, if any * @fd_repair_listen: File descriptor for listening TCP_REPAIR socket, if any * @fd_repair: Connected AF_UNIX socket for TCP_REPAIR helper * @our_tap_mac: Pasta/passt's MAC on the tap link @@ -226,7 +226,7 @@ struct ctx { int foreground; int nofile; char sock_path[UNIX_PATH_MAX]; - char conf_path[UNIX_PATH_MAX]; + char control_path[UNIX_PATH_MAX]; char repair_path[UNIX_PATH_MAX]; char pcap[PATH_MAX];
@@ -244,8 +244,8 @@ struct ctx { int epollfd; int fd_tap_listen; int fd_tap; - int fd_conf_listen; - int fd_conf; + int fd_control_listen; + int fd_control; int fd_repair_listen; int fd_repair; unsigned char our_tap_mac[ETH_ALEN]; diff --git a/pesto.c b/pesto.c index 6d066084..f8c9e01d 100644 --- a/pesto.c +++ b/pesto.c @@ -15,6 +15,7 @@ #include
#include #include +#include #include #include #include @@ -42,6 +43,32 @@ exit(EXIT_FAILURE); \ } while (0) +#define debug(...) \ + do { \ + if (verbosity > 1) { \ + FPRINTF(stderr, __VA_ARGS__); \ + FPRINTF(stderr, "\n"); \ + } \ + } while (0) + +/** + * usage() - Print usage, exit with given status code + * @name: Executable name + * @f: Stream to print usage info to + * @status: Status code for exit(2) + */ +static void usage(const char *name, FILE *f, int status) +{ + FPRINTF(f, "Usage: %s [OPTION]... PATH\n", name); + FPRINTF(f, + "\n" + " -v, --verbose Be more verbose\n" + " -q, --quiet Be less verbose\n" + " -h, --help Display this help message and exit\n" + " --version Show version and exit\n"); + exit(status); +} + /** * main() - Entry point and whole program with loop * @argc: Argument count @@ -56,13 +83,20 @@ */ int main(int argc, char **argv) { + const struct option options[] = { + {"quiet", no_argument, NULL, 'q' }, + {"verbose", no_argument, NULL, 'v' }, + {"help", no_argument, NULL, 'h' }, + {"version", no_argument, NULL, 1 }, + { 0 }, + }; struct sockaddr_un a = { AF_UNIX, "" }; + const char *optstring = "vh"; struct pesto_hello hello; struct sock_fprog prog; + int optname, ret, s; uint32_t s_version; - int ret, s; - - prctl(PR_SET_DUMPABLE, 0);
I would, at least by default, keep this exactly in line with the paranoid stuff I added to passt, for consistency and ease of auditing: what will happen if we run *this* tool as root? If we choose to drop to nobody in that case, removing this will be a problem. And we'll probably not remember to add it back.
So, I made that change as a temporary debugging hack which I forgot to revert. That said, I don't really see the case for disabling dump in pesto - if you can trace pesto you could just connect directly to the control socket. The possibility of dropping to nobody is the closest to a case I've seen... but I don't really see a reason we'd want to do that. It would be quite awkward for one thing, since nobody probably won't have permission to the control socket. Running this as root wouldn't be the wisest choice, but unlike for passt/pasta I don't think it's scary enough that we should actively try to stop the user from shooting themselves in the foot. That said, it obviously shouldn't be removed in this patch, having just been added in the previous one. I've left it in for now.
+ int verbosity = 1;
If we skip --quiet, this can be simplified.
Done.
prog.len = (unsigned short)sizeof(filter_pesto) / sizeof(filter_pesto[0]); @@ -71,15 +105,40 @@ int main(int argc, char **argv) prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) die("Failed to apply seccomp filter");
- if (argc < 2) - die("Usage: %s CONTROLPATH", argv[0]);
I was suggesting to use PATH instead of CONTROLPATH in the earlier patch, but this goes away here and we'll probably want to squash this anyway so it shouldn't matter.
Ah, yes, another update in the wrong patch. I've fixed it in the earlier patch anyway, to save you the bother of remembering it goes away when you next look.
+ do { + optname = getopt_long(argc, argv, optstring, options, NULL); + + switch (optname) { + case -1: + case 0: + break; + case 'h': + usage(argv[0], stdout, EXIT_SUCCESS); + break; + case 'q': + verbosity--; + break; + case 'v': + verbosity++; + break; + case 1: + FPRINTF(stdout, "pesto "); + FPRINTF(stdout, VERSION_BLOB); + exit(EXIT_SUCCESS); + default: + usage(argv[0], stderr, EXIT_FAILURE); + } + } while (optname != -1); + + if (argc - optind != 1) + usage(argv[0], stderr, EXIT_FAILURE);
if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) die("Failed to create AF_UNIX socket: %s", strerror(errno));
- ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[1]); + ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[optind]); if (ret <= 0 || ret >= (int)sizeof(a.sun_path)) - die("Invalid socket path \"%s\"", argv[1]); + die("Invalid socket path \"%s\"", argv[optind]);
ret = connect(s, (struct sockaddr *)&a, sizeof(a)); if (ret < 0) { @@ -87,6 +146,8 @@ int main(int argc, char **argv) a.sun_path, strerror(errno)); }
+ debug("Connected to passt/pasta control socket"); + ret = read_all_buf(s, &hello, sizeof(hello)); if (ret < 0) die("Couldn't read server greeting: %s", strerror(errno)); @@ -96,6 +157,8 @@ int main(int argc, char **argv)
s_version = ntohl(hello.version);
+ debug("Server protocol version: %"PRIu32, s_version); + if (s_version > PESTO_PROTOCOL_VERSION) { die("Unknown server protocol version %"PRIu32" > %"PRIu32"\n", s_version, PESTO_PROTOCOL_VERSION); diff --git a/util.h b/util.h index e7993f4d..d7c397f6 100644 --- a/util.h +++ b/util.h @@ -21,14 +21,6 @@
#include "log.h"
-#define VERSION_BLOB \ - VERSION "\n" \ - "Copyright Red Hat\n" \ - "GNU General Public License, version 2 or later\n" \ - " https://www.gnu.org/licenses/old-licenses/gpl-2.0.html\n" \ - "This is free software: you are free to change and redistribute it.\n" \ - "There is NO WARRANTY, to the extent permitted by law.\n\n" - #ifndef SECCOMP_RET_KILL_PROCESS #define SECCOMP_RET_KILL_PROCESS SECCOMP_RET_KILL #endif
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 01:56:22AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:22 +1100 David Gibson
wrote: Extend the dynamic update protocol to expose the pif indices and names from a running passt/pasta to the pesto tool. pesto records that data and (for now) prints it out.
Signed-off-by: David Gibson
--- conf.c | 38 +++++++++++++++++++++ pesto.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++- serialise.c | 21 ++++++++++++ serialise.h | 3 ++ 4 files changed, 159 insertions(+), 1 deletion(-) diff --git a/conf.c b/conf.c index 7b960fe9..b878db94 100644 --- a/conf.c +++ b/conf.c @@ -2303,6 +2303,41 @@ void conf(struct ctx *c, int argc, char **argv) conf_print(c); }
+/** + * conf_send_pifs() - Send list of pifs to dynamic update client (pesto) + * @c: Execution context + * @fd: Socket to the client + * + * Return: 0 on success, -1 on failure + */ +static int conf_send_pifs(const struct ctx *c, int fd) +{ + uint32_t num = 0; + unsigned pif; + + /* First count the number of pifs with tables */ + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + if (c->fwd[pif]) + num++; + } + + if (write_u32(fd, num)) + return -1; + + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + if (!c->fwd[pif]) + continue; + + if (write_u8(fd, pif)) + return -1; + + if (write_str(fd, pif_name(pif)) < 0) + return -1; + } + + return 0; +} + /** * conf_listen_handler() - Handle events on configuration listening socket * @c: Execution context @@ -2359,6 +2394,9 @@ void conf_listen_handler(struct ctx *c, uint32_t events) "Warning: Using experimental unsupported configuration protocol"); }
+ if (conf_send_pifs(c, fd) < 0) + goto fail; + return;
fail: diff --git a/pesto.c b/pesto.c index f8c9e01d..ed4f2aab 100644 --- a/pesto.c +++ b/pesto.c @@ -36,6 +36,8 @@ #include "serialise.h" #include "pesto.h"
+static int verbosity = 1; + #define die(...) \ do { \ FPRINTF(stderr, __VA_ARGS__); \ @@ -51,6 +53,19 @@ } \ } while (0)
+/** + * xmalloc() - Allocate memory, with fatal error on failure + * @size: Number of bytes to allocate + */ +static void *xmalloc(size_t size) +{ + void *p = malloc(size);
I would really really prefer to skip this unless it's absolutely necessary, for a number of reasons:
1. this thing is talking to passt which talks to untrusted workloads, and there's a risk of arbitrary code execution, which implies the possibility of specially crafted messages that might make us allocate:
a. in specific regions and build a heap spraying attack based on that
b. a lot, and cause a denial of service
Eh, I suppose.
...and, while this process should be short lived, we can't assume that, if somebody achieves arbitrary code execution in passt and malicious messages, because we risk looping on rules, or having passt send us information very slowly (minutes / hours), etc.
...ah, those are good points.
2. besides, we might want to add a "daemon" mode one day, and then it's not short lived anymore, which opens the door to leaks and double-frees (adding to a. and b. above)
I really don't see any reason we'd want that, but I guess that's irrelevant given that previous paragraph.
3. we might need to reuse functions from passt and not notice right away that they call this xmalloc(), or have to eventually adapt them anyway
I mean.. when we test the path in question, we should hit the seccomp filter, which should make it pretty obvious.
4. consistency, in messaging ("passt doesn't allocate dynamic memory", without caveats) and for auditing reasons
Yeah, fair.
Strings can be 32 bytes, or 1024 if we want to splurge. Unless we need to pile a lot of them on the stack (but it's not the case in pesto) a malloc() doesn't really bring any advantage compared to static buffers here and there. It's not megabytes.
So, you've convinced me in principle, but I'm not putting this at the top of my priorities. Using malloc() makes things a bit easier while we're playing around with the protocol a bunch. Once we look reasonably close to a v1 protocol, I'll do the malloc() removal.
+ + if (!p) + die("Memory allocation failure"); + return p; +} + /** * usage() - Print usage, exit with given status code * @name: Executable name @@ -69,6 +84,83 @@ static void usage(const char *name, FILE *f, int status) exit(status); }
+/** + * pesto_recv_str() - Receive a string from passt/pasta + * @fd: Control socket + * + * Return: pointer to malloc()ed string + */ +static const char *pesto_recv_str(int fd) +{ + uint32_t len; + char *buf; + + if (read_u32(fd, &len) < 0) + die("Error reading from control socket"); + + buf = xmalloc(len); + if (read_all_buf(fd, buf, len) < 0) + die("Error reading from control socket"); + + return buf; +} + +struct pif_state { + uint8_t pif; + const char *name; +}; + +struct conf_state {
More as a to-do list item rather than as a review comment: we need documentation for those structs.
Ah, yes.
+ uint32_t npifs; + struct pif_state pif[]; +}; + +/** + * pesto_read_pifs() - Read pif names and IDs from passt/pasta + * @fd: Control socket + */ +static const struct conf_state *pesto_read_pifs(int fd) +{ + uint32_t num; + struct conf_state *state; + unsigned i; + + if (read_u32(fd, &num) < 0) + die("Error reading from control socket"); + + debug("Receiving %"PRIu32" interface names", num); + + state = xmalloc(sizeof(*state) + num * sizeof(struct pif_state)); + state->npifs = num; + + for (i = 0; i < num; i++) { + struct pif_state *ps = &state->pif[i]; + + if (read_u8(fd, &ps->pif) < 0) + die("Error reading from control socket"); + ps->name = pesto_recv_str(fd); + + debug("%u: %s", ps->pif, ps->name); + } + + return state; +} + +/** + * show_state() - Show current rule state obtained from passt/pasta + * @pifs: PIF name information + */ +static void show_state(const struct conf_state *state) +{ + unsigned i; + + for (i = 0; i < state->npifs; i++) { + const struct pif_state *ps = &state->pif[i]; + printf("Forwarding rules for %s interface\n", ps->name); + printf("\tTBD\n"); + } +} + /** * main() - Entry point and whole program with loop * @argc: Argument count @@ -91,12 +183,12 @@ int main(int argc, char **argv) { 0 }, }; struct sockaddr_un a = { AF_UNIX, "" }; + const struct conf_state *state; const char *optstring = "vh"; struct pesto_hello hello; struct sock_fprog prog; int optname, ret, s; uint32_t s_version; - int verbosity = 1;
prog.len = (unsigned short)sizeof(filter_pesto) / sizeof(filter_pesto[0]); @@ -172,5 +264,9 @@ int main(int argc, char **argv) "Warning: Using experimental protocol version, client and server must match\n"); }
+ state = pesto_read_pifs(s); + + show_state(state); + exit(0); } diff --git a/serialise.c b/serialise.c index e3ea86e3..94c34d3c 100644 --- a/serialise.c +++ b/serialise.c @@ -19,6 +19,7 @@ #include
#include #include +#include #include #include "serialise.h" @@ -121,6 +122,26 @@ int write_all_buf(int fd, const void *buf, size_t len) return write_all_buf(fd, &beval, sizeof(beval)); \ }
+#define be8toh(x) (x) +#define htobe8(x) (x) + +SERIALISE_UINT(8) SERIALISE_UINT(32)
#undef SERIALISE_UNIT + +/** + * write_str() - Write a string to an fd in length/value format + * @fd: Socket to the client + * @s: String to send + * + * Return: 0 on success, -1 on error + */ +int write_str(int fd, const char *s) +{ + uint32_t len = strlen(s) + 1; /* Include \0 */ + + if (write_u32(fd, len) < 0) + return -1;
Even if we don't need to allocate here, I would feel more comfortable if we passed fixed-size strings only to passt to have a slightly lower risk of buffer overflows, in general.
That's fair. I'd basically already decided to move to fixed length pif names, just haven't implemented it yet.
+ return write_all_buf(fd, s, len); +} diff --git a/serialise.h b/serialise.h index 0a4ed086..7ef35786 100644 --- a/serialise.h +++ b/serialise.h @@ -16,6 +16,9 @@ int write_all_buf(int fd, const void *buf, size_t len); int read_u##bits(int fd, uint##bits##_t *val); \ int write_u##bits(int fd, uint##bits##_t val);
+SERIALISE_UINT_DECL(8) SERIALISE_UINT_DECL(32)
+int write_str(int fd, const char *s); + #endif /* _SERIALISE_H */
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 01:54:33AM +0100, Stefano Brivio wrote:
Nits:
On Mon, 23 Mar 2026 18:37:15 +1100 David Gibson
wrote: fwd_rule_add() takes a flags parameter, but it only allows the FWD_WEAK and FWD_SCAN flags to be specified there. It doesn't allow the FWD_DUAL_STACK_ANY flag to be set, instead expecting a [*] address to be indicated by passing NULL as @addr.
However, for upcoming dynamic rule updates, it's more convenient to be able to explicitly pass FWD_DUAL_STACK_ANY along with an address of ::. Allow that mode of calling.
Signed-off-by: David Gibson
--- fwd.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fwd.c b/fwd.c index 3395a28e..d73b7ca7 100644 --- a/fwd.c +++ b/fwd.c @@ -362,18 +362,21 @@ void fwd_rule_add(struct fwd_table *fwd, uint8_t proto, uint8_t flags, in_port_t first, in_port_t last, in_port_t to) { /* Flags which can be set from the caller */ - const uint8_t allowed_flags = FWD_WEAK | FWD_SCAN; + const uint8_t allowed_flags = FWD_WEAK | FWD_SCAN | FWD_DUAL_STACK_ANY; unsigned num = (unsigned)last - first + 1; struct fwd_rule *new; unsigned i, port;
assert(!(flags & ~allowed_flags)); -
Spurious change, I think? The extra newline here looks good for readability.
Oops, yes.
Maybe we should group the second assert with this one?
Good idea, done.
if (fwd->count >= ARRAY_SIZE(fwd->rules)) die("Too many port forwarding ranges"); if ((fwd->sock_count + num) > ARRAY_SIZE(fwd->socks)) die("Too many listening sockets");
+ /* Passing a non-wildcard address with DUAL_STACK_ANY is a bug */
Extra whitespace after DUAL_STACK_ANY.
Fixed.
+ assert(!(flags & FWD_DUAL_STACK_ANY) || !addr || + inany_equals(addr, &inany_any6)); + /* Check for any conflicting entries */ for (i = 0; i < fwd->count; i++) { char newstr[INANY_ADDRSTRLEN], rulestr[INANY_ADDRSTRLEN];
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 01:56:39AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:28 +1100 David Gibson
wrote: Implement serialisation of our current forwarding rules in conf.c, deserialising it to display in the pesto client.
Signed-off-by: David Gibson
--- conf.c | 44 +++++++++++++++++++++++++++++++ fwd_rule.c | 40 ++++++++++++++++++++++++++++ fwd_rule.h | 3 +++ pesto.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++--- 4 files changed, 160 insertions(+), 4 deletions(-) diff --git a/conf.c b/conf.c index b878db94..7f311914 100644 --- a/conf.c +++ b/conf.c @@ -2338,6 +2338,47 @@ static int conf_send_pifs(const struct ctx *c, int fd) return 0; }
+/** + * conf_send_rules() - Send current forwarding rules to dynamic update client (pesto)
What about:
* conf_send_rules() - Send forwarding rules to configuration client (pesto)
Done.
+ * @c: Execution context + * @fd: Socket to the client + * + * Return: 0 on success, -1 on failure + */ +static int conf_send_rules(const struct ctx *c, int fd) +{ + unsigned pif; + + for (pif = 0; pif < PIF_NUM_TYPES; pif++) { + const struct fwd_table *fwd = c->fwd[pif]; + unsigned i; + + if (!fwd) + continue; + + assert(pif); + + /* PIF id */ + if (write_u8(fd, pif)) + return -1; + + /* Number of rules */ + if (write_u32(fd, fwd->count)) + return -1; + + for (i = 0; i < fwd->count; i++) { + if (fwd_rule_write(fd, &fwd->rules[i].rule)) + return -1; + } + } + + /* Write 0 PIF id to finish */ + if (write_u8(fd, 0))
We have PIF_NONE, shouldn't we use that to make it obvious that it's an invalid value?
Done.
+ return -1; + + return 0; +}
[...]
I stopped reviewing here, the next patches in my series are anyway mine and after those I guess it doesn't make sense, yet, that I review in fine detail, as we'll probably need to change a bunch of things.
Understood. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 02:00:23AM +0100, Stefano Brivio wrote:
On Wed, 25 Mar 2026 01:56:51 +0100 Stefano Brivio
wrote: On Mon, 23 Mar 2026 18:37:07 +1100 David Gibson
wrote: Here's a new draft of dynamic updates. This now can successfully update rules, though I've not tested it very extensively. Essentially this is just barely enough to work, it still could do with rather a lot of polish.
Patches 1..12/22 are preliminary reworks that make moderate sense even without pesto - feel free to apply if you're happy with them.
I'm applying (in a moment, tests are running now):
- 1/25, 2/25, - not 3/25 (comments pending) - not 4/25 (depends on 3/25) - 5/25, 6/25, - not 7/25 (Coverity Scan warnings), - not 8/25 (comments pending), - 9/25, 10/25, 11/25, 12/25.
Never mind, most of the "TCP/IPv4: ns to host" tests fail, probably I missed something while cherry-picking patches from this series. I guess a re-spin up to 12/25 would be more convenient in this case.
Will do. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, 25 Mar 2026 15:34:02 +1100
David Gibson
On Wed, Mar 25, 2026 at 01:56:22AM +0100, Stefano Brivio wrote:
[...]
So, you've convinced me in principle, but I'm not putting this at the top of my priorities. Using malloc() makes things a bit easier while we're playing around with the protocol a bunch. Once we look reasonably close to a v1 protocol, I'll do the malloc() removal.
Sure, it makes sense, it's definitely not needed right now. I guess it will mostly be a matter of passing a number to the _str() functions and share constants between client and server. -- Stefano
On Wed, 25 Mar 2026 15:42:00 +1100
David Gibson
On Wed, Mar 25, 2026 at 01:56:34AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:27 +1100 David Gibson
wrote: Move the logic for formatting forwarding rules into strings from fwd_rules_print() into fwd_rule.c where it can be shared with pesto. We also make the function explicitly construct a string, rather than directly printing with info(), for greater flexibility.
Signed-off-by: David Gibson
--- Makefile | 11 ++++---- fwd.c | 40 +++--------------------------- fwd_rule.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ fwd_rule.h | 11 ++++++++ 4 files changed, 94 insertions(+), 41 deletions(-) create mode 100644 fwd_rule.c diff --git a/Makefile b/Makefile index bc325482..44c396e7 100644 --- a/Makefile +++ b/Makefile @@ -38,13 +38,14 @@ FLAGS += -DVERSION=\"$(VERSION)\" FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c epoll_ctl.c \ - flow.c fwd.c icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c \ - log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c \ - pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c tcp_splice.c tcp_vu.c \ - udp.c udp_flow.c udp_vu.c util.c vhost_user.c virtio.c vu_common.c + flow.c fwd.c fwd_rule.c icmp.c igmp.c inany.c iov.c ip.c isolation.c \ + lineread.c log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c \ + pasta.c pcap.c pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c \ + tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c vhost_user.c \ + virtio.c vu_common.c QRAP_SRCS = qrap.c PASST_REPAIR_SRCS = passt-repair.c -PESTO_SRCS = pesto.c inany.c ip.c serialise.c +PESTO_SRCS = pesto.c fwd_rule.c inany.c ip.c serialise.c SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) $(PESTO_SRCS)
MANPAGES = passt.1 pasta.1 pesto.1 qrap.1 passt-repair.1 diff --git a/fwd.c b/fwd.c index a32d0a20..20409c62 100644 --- a/fwd.c +++ b/fwd.c @@ -304,20 +304,6 @@ parse_err: warn("Unable to parse %s", PORT_RANGE_SYSCTL); }
-/** - * fwd_rule_addr() - Return match address for a rule - * @rule: Forwarding rule - * - * Return: matching address for rule, NULL if it matches all addresses - */ -static const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) -{ - if (rule->flags & FWD_DUAL_STACK_ANY) - return NULL; - - return &rule->addr; -} - /** * fwd_port_map_ephemeral() - Mark ephemeral ports in a bitmap * @map: Bitmap to update @@ -497,28 +483,10 @@ void fwd_rules_print(const struct fwd_table *fwd) unsigned i;
for (i = 0; i < fwd->count; i++) { - const struct fwd_rule *rule = &fwd->rules[i].rule; - const char *percent = *rule->ifname ? "%" : ""; - const char *weak = "", *scan = ""; - char addr[INANY_ADDRSTRLEN]; - - inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); - if (rule->flags & FWD_WEAK) - weak = " (best effort)"; - if (rule->flags & FWD_SCAN) - scan = " (auto-scan)"; - - if (rule->first == rule->last) { - info(" %s [%s]%s%s:%hu => %hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->to, weak, scan); - } else { - info(" %s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->last, - rule->to, rule->last - rule->first + rule->to, - weak, scan); - } + char rulestr[FWD_RULE_STRLEN]; + + info(" %s", fwd_rule_ntop(&fwd->rules[i].rule, + rulestr, sizeof(rulestr))); } }
diff --git a/fwd_rule.c b/fwd_rule.c new file mode 100644 index 00000000..dfbdf683 --- /dev/null +++ b/fwd_rule.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* PASST - Plug A Simple Socket Transport + * for qemu/UNIX domain socket mode + * + * PASTA - Pack A Subtle Tap Abstraction + * for network namespace/tap device mode + * + * PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * + * fwd_rule.c - Helpers for working with forwarding rule specifications + * + * Copyright Red Hat + * Author: David Gibson
+ */ + +#include + +#include "fwd_rule.h" + +/** + * fwd_rule_addr() - Return match address for a rule + * @rule: Forwarding rule + * + * Return: matching address for rule, NULL if it matches all addresses + */ +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) +{ + if (rule->flags & FWD_DUAL_STACK_ANY) + return NULL; + + return &rule->addr; +} + +/** + * fwd_rule_ntop() - Format forwarding rule as a string + * @rule: Rule to format + * @dst: Buffer to store output (should have FWD_RULE_STRLEN bytes) + * @size: Size of @dst + */ +const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size) +{ + const char *percent = *rule->ifname ? "%" : ""; + const char *weak = "", *scan = ""; + char addr[INANY_ADDRSTRLEN]; + int len; + + inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); + if (rule->flags & FWD_WEAK) + weak = " (best effort)"; + if (rule->flags & FWD_SCAN) + scan = " (auto-scan)"; + + if (rule->first == rule->last) { + len = snprintf(dst, size, + "%s [%s]%s%s:%hu => %hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->to, weak, scan); + } else { + in_port_t tolast = rule->last - rule->first + rule->to; + len = snprintf(dst, size, + "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->last, + rule->to, tolast, weak, scan); + } + + if (len < 0 || (size_t)len >= size) + return NULL; + + return dst; +} diff --git a/fwd_rule.h b/fwd_rule.h index 84ec5cbe..59db0e95 100644 --- a/fwd_rule.h +++ b/fwd_rule.h @@ -42,4 +42,15 @@ struct fwd_rule { uint8_t flags; }; +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule); + +#define FWD_RULE_STRLEN \ + (IPPROTO_STRLEN - 1 \ + + INANY_ADDRSTRLEN - 1 \ + + IFNAMSIZ - 1 \ + + sizeof(" (best effort)") - 1 \ + + sizeof(" (auto-scan)") - 1 \ + + 15)
I'm not quite sure about the reason for the 15 here, is there some constant we can use perhaps?
It's the spacing and punctuation around the other displayed parameters, roughly the number of non-format characters in "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s" (plus one for the \0). I don't love it either, but I wasn't sure how to make it less obscure. Is sizeof(" []%:- => - ") an improvement? I'm not sure.
I think it definitely is. I can't see a better explanation than that sizeof(), and it also looks like a good way to double check the number of additional characters. If we want to do overdo with robustness, we could perhaps think of separately defining the format string and using it here for the calculation as well, which makes future (albeit unlikely) updates of the format string safer. But, in that case, dropping characters for each format specifier is something we would have to do manually anyway... or not do at all and rely on the size of the format string as it is as an upper bound, perhaps. -- Stefano
On Wed, Mar 25, 2026 at 09:18:00AM +0100, Stefano Brivio wrote:
On Wed, 25 Mar 2026 15:34:02 +1100 David Gibson
wrote: On Wed, Mar 25, 2026 at 01:56:22AM +0100, Stefano Brivio wrote:
[...]
So, you've convinced me in principle, but I'm not putting this at the top of my priorities. Using malloc() makes things a bit easier while we're playing around with the protocol a bunch. Once we look reasonably close to a v1 protocol, I'll do the malloc() removal.
Sure, it makes sense, it's definitely not needed right now. I guess it will mostly be a matter of passing a number to the _str() functions and share constants between client and server.
I've actually already eliminated the _str() functions in my latest draft. The less convenient case is allocating the arrays of pifs and rules, I'll tackle that when I get to it. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Mar 25, 2026 at 09:18:07AM +0100, Stefano Brivio wrote:
On Wed, 25 Mar 2026 15:42:00 +1100 David Gibson
wrote: On Wed, Mar 25, 2026 at 01:56:34AM +0100, Stefano Brivio wrote:
On Mon, 23 Mar 2026 18:37:27 +1100 David Gibson
wrote: Move the logic for formatting forwarding rules into strings from fwd_rules_print() into fwd_rule.c where it can be shared with pesto. We also make the function explicitly construct a string, rather than directly printing with info(), for greater flexibility.
Signed-off-by: David Gibson
--- Makefile | 11 ++++---- fwd.c | 40 +++--------------------------- fwd_rule.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ fwd_rule.h | 11 ++++++++ 4 files changed, 94 insertions(+), 41 deletions(-) create mode 100644 fwd_rule.c diff --git a/Makefile b/Makefile index bc325482..44c396e7 100644 --- a/Makefile +++ b/Makefile @@ -38,13 +38,14 @@ FLAGS += -DVERSION=\"$(VERSION)\" FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c epoll_ctl.c \ - flow.c fwd.c icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c \ - log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c \ - pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c tcp_splice.c tcp_vu.c \ - udp.c udp_flow.c udp_vu.c util.c vhost_user.c virtio.c vu_common.c + flow.c fwd.c fwd_rule.c icmp.c igmp.c inany.c iov.c ip.c isolation.c \ + lineread.c log.c mld.c ndp.c netlink.c migrate.c packet.c passt.c \ + pasta.c pcap.c pif.c repair.c serialise.c tap.c tcp.c tcp_buf.c \ + tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c vhost_user.c \ + virtio.c vu_common.c QRAP_SRCS = qrap.c PASST_REPAIR_SRCS = passt-repair.c -PESTO_SRCS = pesto.c inany.c ip.c serialise.c +PESTO_SRCS = pesto.c fwd_rule.c inany.c ip.c serialise.c SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS) $(PESTO_SRCS)
MANPAGES = passt.1 pasta.1 pesto.1 qrap.1 passt-repair.1 diff --git a/fwd.c b/fwd.c index a32d0a20..20409c62 100644 --- a/fwd.c +++ b/fwd.c @@ -304,20 +304,6 @@ parse_err: warn("Unable to parse %s", PORT_RANGE_SYSCTL); }
-/** - * fwd_rule_addr() - Return match address for a rule - * @rule: Forwarding rule - * - * Return: matching address for rule, NULL if it matches all addresses - */ -static const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) -{ - if (rule->flags & FWD_DUAL_STACK_ANY) - return NULL; - - return &rule->addr; -} - /** * fwd_port_map_ephemeral() - Mark ephemeral ports in a bitmap * @map: Bitmap to update @@ -497,28 +483,10 @@ void fwd_rules_print(const struct fwd_table *fwd) unsigned i;
for (i = 0; i < fwd->count; i++) { - const struct fwd_rule *rule = &fwd->rules[i].rule; - const char *percent = *rule->ifname ? "%" : ""; - const char *weak = "", *scan = ""; - char addr[INANY_ADDRSTRLEN]; - - inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); - if (rule->flags & FWD_WEAK) - weak = " (best effort)"; - if (rule->flags & FWD_SCAN) - scan = " (auto-scan)"; - - if (rule->first == rule->last) { - info(" %s [%s]%s%s:%hu => %hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->to, weak, scan); - } else { - info(" %s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", - ipproto_name(rule->proto), addr, percent, - rule->ifname, rule->first, rule->last, - rule->to, rule->last - rule->first + rule->to, - weak, scan); - } + char rulestr[FWD_RULE_STRLEN]; + + info(" %s", fwd_rule_ntop(&fwd->rules[i].rule, + rulestr, sizeof(rulestr))); } }
diff --git a/fwd_rule.c b/fwd_rule.c new file mode 100644 index 00000000..dfbdf683 --- /dev/null +++ b/fwd_rule.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* PASST - Plug A Simple Socket Transport + * for qemu/UNIX domain socket mode + * + * PASTA - Pack A Subtle Tap Abstraction + * for network namespace/tap device mode + * + * PESTO - Programmable Extensible Socket Translation Orchestrator + * front-end for passt(1) and pasta(1) forwarding configuration + * + * fwd_rule.c - Helpers for working with forwarding rule specifications + * + * Copyright Red Hat + * Author: David Gibson
+ */ + +#include + +#include "fwd_rule.h" + +/** + * fwd_rule_addr() - Return match address for a rule + * @rule: Forwarding rule + * + * Return: matching address for rule, NULL if it matches all addresses + */ +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule) +{ + if (rule->flags & FWD_DUAL_STACK_ANY) + return NULL; + + return &rule->addr; +} + +/** + * fwd_rule_ntop() - Format forwarding rule as a string + * @rule: Rule to format + * @dst: Buffer to store output (should have FWD_RULE_STRLEN bytes) + * @size: Size of @dst + */ +const char *fwd_rule_ntop(const struct fwd_rule *rule, char *dst, size_t size) +{ + const char *percent = *rule->ifname ? "%" : ""; + const char *weak = "", *scan = ""; + char addr[INANY_ADDRSTRLEN]; + int len; + + inany_ntop(fwd_rule_addr(rule), addr, sizeof(addr)); + if (rule->flags & FWD_WEAK) + weak = " (best effort)"; + if (rule->flags & FWD_SCAN) + scan = " (auto-scan)"; + + if (rule->first == rule->last) { + len = snprintf(dst, size, + "%s [%s]%s%s:%hu => %hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->to, weak, scan); + } else { + in_port_t tolast = rule->last - rule->first + rule->to; + len = snprintf(dst, size, + "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s", + ipproto_name(rule->proto), addr, percent, + rule->ifname, rule->first, rule->last, + rule->to, tolast, weak, scan); + } + + if (len < 0 || (size_t)len >= size) + return NULL; + + return dst; +} diff --git a/fwd_rule.h b/fwd_rule.h index 84ec5cbe..59db0e95 100644 --- a/fwd_rule.h +++ b/fwd_rule.h @@ -42,4 +42,15 @@ struct fwd_rule { uint8_t flags; }; +const union inany_addr *fwd_rule_addr(const struct fwd_rule *rule); + +#define FWD_RULE_STRLEN \ + (IPPROTO_STRLEN - 1 \ + + INANY_ADDRSTRLEN - 1 \ + + IFNAMSIZ - 1 \ + + sizeof(" (best effort)") - 1 \ + + sizeof(" (auto-scan)") - 1 \ + + 15)
I'm not quite sure about the reason for the 15 here, is there some constant we can use perhaps?
It's the spacing and punctuation around the other displayed parameters, roughly the number of non-format characters in "%s [%s]%s%s:%hu-%hu => %hu-%hu %s%s" (plus one for the \0). I don't love it either, but I wasn't sure how to make it less obscure. Is sizeof(" []%:- => - ") an improvement? I'm not sure.
I think it definitely is. I can't see a better explanation than that sizeof(), and it also looks like a good way to double check the number of additional characters.
Ok, done.
If we want to do overdo with robustness, we could perhaps think of separately defining the format string and using it here for the calculation as well, which makes future (albeit unlikely) updates of the format string safer.
Eh, it's a tradeoff with two poor options. Put the format next to the printf and we have the current trouble. Or, put the fomat next to the length calculation, but then it's more awkward looking at the printf to check if the parameters are correct.
But, in that case, dropping characters for each format specifier is something we would have to do manually anyway... or not do at all and rely on the size of the format string as it is as an upper bound, perhaps.
Right. Or a snprintf wrapper macro + external tool that derives a bound from the format string and builds a header with it. Sounds nice, but more work than I care to do at this time :). -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
participants (2)
-
David Gibson
-
Stefano Brivio