[PATCH v2 0/9] Improve handling of MTU limits
After considerable lead up, this corrects the handling of the --mtu option so that it will respect limits imposed by the backend tap layers, correctly accounting for L2 headers. This incorporates the earlier mode setting patches, on which the rest depend. There's been no change in those patches, there just included here for self-containedness. v2: * More static_assert() validation of frame limits in 6/9 * Cosmetic fixups in 6/9 David Gibson (9): conf: Use the same optstring for passt and pasta modes conf: Move mode detection into helper function conf: Detect vhost-user mode earlier packet: Give explicit name to maximum packet size packet: Remove redundant TAP_BUF_BYTES define tap: Use explicit defines for maximum length of L2 frame Simplify sizing of pkt_buf pcap: Correctly set snaplen based on tap backend type conf: Limit maximum MTU based on backend frame size conf.c | 98 +++++++++++++++++++++++++++++++++++++++++--------------- conf.h | 1 + packet.c | 4 +-- packet.h | 3 ++ passt.c | 16 ++------- passt.h | 7 ++-- pcap.c | 46 +++++++++++++------------- tap.c | 44 ++++++++++++++++++++++--- tap.h | 26 +++++++++++++++ util.h | 3 -- 10 files changed, 173 insertions(+), 75 deletions(-) -- 2.48.1
Currently we rely on detecting our mode first and use different sets of
(single character) options for each. This means that if you use an option
valid in only one mode in another you'll get the generic usage() message.
We can give more helpful errors with little extra effort by combining all
the options into a single value of the option string and giving bespoke
messages if an option for the wrong mode is used; in fact we already did
this for some single mode options like '-1'.
Signed-off-by: David Gibson
One of the first things we need to do is determine if we're in passt mode
or pasta mode. Currently this is open-coded in main(), by examining
argv[0]. We want to complexify this a bit in future to cover vhost-user
mode as well. Prepare for this, by moving the mode detection into a new
conf_mode() function.
Signed-off-by: David Gibson
We detect our operating mode in conf_mode(), unless we're using vhost-user
mode, in which case we change it later when we parse the --vhost-user
option. That means we need to delay parsing the --repair-path option (for
vhost-user only) until still later.
However, there are many other places in the main option parsing loop which
also rely on mode. We get away with those, because they happen to be able
to treat passt and vhost-user modes identically. This is potentially
confusing, though. So, move setting of MODE_VU into conf_mode() so
c->mode always has its final value from that point onwards.
To match, we move the parsing of --repair-path back into the main option
parsing loop.
Signed-off-by: David Gibson
We verify that every packet we store in a pool (and every partial packet
we retreive from it) has a length no longer than UINT16_MAX. This
originated in the older packet pool implementation which stored packet
lengths in a uint16_t. Now, that packets are represented by a struct
iovec with its size_t length, this check serves only as a sanity / security
check that we don't have some wildly out of range length due to a bug
elsewhere.
We have may reasons to (slightly) increase this limit in future, so in
preparation, give this quantity an explicit name - PACKET_MAX_LEN.
Signed-off-by: David Gibson
Currently we define both TAP_BUF_BYTES and PKT_BUF_BYTES as essentially
the same thing. They'll be different only if TAP_BUF_BYTES is negative,
which makes no sense. So, remove TAP_BUF_BYTES and just use PKT_BUF_BYTES.
In addition, most places we use this to just mean the size of the main
packet buffer (pkt_buf) for which we can just directly use sizeof.
Signed-off-by: David Gibson
Currently in tap.c we (mostly) use ETH_MAX_MTU as the maximum length of
an L2 frame. This define comes from the kernel, but it's badly named and
used confusingly.
First, it doesn't really have anything to do with Ethernet, which has no
structural limit on frame lengths. It comes more from either a) IP which
imposes a 64k datagram limit or b) from internal buffers used in various
places in the kernel (and in passt).
Worse, MTU generally means the maximum size of the IP (L3) datagram which
may be transferred, _not_ counting the L2 headers. In the kernel
ETH_MAX_MTU is sometimes used that way, but sometimes seems to be used as
a maximum frame length, _including_ L2 headers. In tap.c we're mostly
using it in the second way.
Finally, each of our tap backends could have different limits on the frame
size imposed by the mechanisms they're using.
Start clearing up this confusion by replacing it in tap.c with new
L2_MAX_LEN_* defines which specifically refer to the maximum L2 frame
length for each backend.
Signed-off-by: David Gibson
We define the size of pkt_buf as large enough to hold 128 maximum size
packets. Well, approximately, since we round down to the page size. We
don't have any specific reliance on how many packets can fit in the buffer,
we just want it to be big enough to allow reasonable batching. The
current definition relies on the confusingly named ETH_MAX_MTU and adds
in sizeof(uint32_t) rather non-obviously for the pseudo-physical header
used by the qemu socket (passt mode) protocol.
Instead, just define it to be 8MiB, which is what that complex calculation
works out to.
Signed-off-by: David Gibson
The pcap header includes a value indicating how much of each frame is
captured. We always capture the entire frame, so we want to set this to
the maximum possible frame size. Currently we do that by setting it to
ETH_MAX_MTU, but that's a confusingly named constant which might not always
be correct depending on the details of our tap backend.
Instead add a tap_l2_max_len() function that explicitly returns the maximum
frame size for the current mode and use that to set snaplen. While we're
there, there's no particular need for the pcap header to be defined in a
global; make it local to pcap_init() instead.
Signed-off-by: David Gibson
The -m option controls the MTU, that is the maximum transmissible L3
datagram, not including L2 headers. We currently limit it to ETH_MAX_MTU
which sounds like it makes sense. But ETH_MAX_MTU is confusing: it's not
consistently used as to whether it means the maximum L3 datagram size or
the maximum L2 frame size. Even within conf() we explicitly account for
the L2 header size when computing the default --mtu value, but not when
we compute the maximum --mtu value.
Clean this up by reworking the maximum MTU computation to be the minimum of
IP_MAX_MTU (65535) and the maximum sized IP datagram which can fit into
our L2 frames when we account for the L2 header. The latter can vary
depending on our tap backend, although it doesn't right now.
Link: https://bugs.passt.top/show_bug.cgi?id=66
Signed-off-by: David Gibson
On Wed, 12 Mar 2025 13:18:30 +1100
David Gibson
After considerable lead up, this corrects the handling of the --mtu option so that it will respect limits imposed by the backend tap layers, correctly accounting for L2 headers.
This incorporates the earlier mode setting patches, on which the rest depend. There's been no change in those patches, there just included here for self-containedness.
v2: * More static_assert() validation of frame limits in 6/9 * Cosmetic fixups in 6/9
Applied. -- Stefano
participants (2)
-
David Gibson
-
Stefano Brivio