[PATCH v2 00/14] Clean up checksum and header generation for inbound packets
The main packet "fast paths" for UDP and TCP mostly just forward packets rather than generating them from scratch. However the control paths for ICMP and DHCP sometimes generate packets more or less from scratch. Because these are relatively rare, it's not performance critical. The paths for sending these packets have some duplication of the header generation. There's also some layering violation in tap_ip_send() which both generates IP headers and updates the L4 (UDP or UCMP) checksum. Finally that checksum generation is a little awkward: it temporarily generates the IP pseudo header (or something close enough to serve) in the place of the actual header, generates the checksum, then replaces it with the real IP header. This approach seems to be causing miscompiles with some LTO optimization, because the stores to the pseudo header are being moved or elided across the code calculating the checksum. This series addresses all of these. We consolidate and clarify the packet sending helpers, and use them in some places there was previously duplicated code. In the process we use new checksum generation helpers which take a different approach which should avoid the LTO problems (this aspect I haven't tested yet though). Changes since v1: * Numerous minor style changes * Rename header generation helpers to make their behaviour clearer * Added several missing function doc comments * Corrected some erroneous statements and terms in comments David Gibson (14): Add csum_icmp6() helper for calculating ICMPv6 checksums Add csum_icmp4() helper for calculating ICMP checksums Add csum_udp6() helper for calculating UDP over IPv6 checksums Add csum_udp4() helper for calculating UDP over IPv4 checksums Add csum_ip4_header() helper to calculate IPv4 header checksums Add helpers for normal inbound packet destination addresses Remove support for TCP packets from tap_ip_send() tap: Remove unhelpeful vnet_pre optimization from tap_send() Split tap_ip_send() into IPv4 and IPv6 specific functions tap: Split tap_ip6_send() into UDP and ICMP variants ndp: Remove unneeded eh_source parameter ndp: Use tap_icmp6_send() helper tap: Split tap_ip4_send() into UDP and ICMP variants dhcp: Use tap_udp4_send() helper in dhcp() arp.c | 2 +- checksum.c | 120 ++++++++++++++++----- checksum.h | 15 ++- dhcp.c | 19 +--- dhcpv6.c | 21 +--- icmp.c | 12 +-- ndp.c | 28 +---- ndp.h | 3 +- tap.c | 312 ++++++++++++++++++++++++++++++++++------------------- tap.h | 19 +++- 10 files changed, 345 insertions(+), 206 deletions(-) -- 2.37.3
At least two places in passt calculate ICMPv6 checksums, ndp() and
tap_ip_send(). Add a helper to handle this calculation in both places.
For future flexibility, the new helper takes parameters for the fields in
the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to
be explicitly constructed. It also allows the ICMPv6 header and payload to
be in separate buffers, although we don't use this yet.
Signed-off-by: David Gibson
Although tap_ip_send() is currently the only place calculating ICMP
checksums, create a helper function for symmetry with ICMPv6. For
future flexibility it allows the ICMPv6 header and payload to be in
separate buffers.
Signed-off-by: David Gibson
Add a helper for calculating UDP checksums when used over IPv6
For future flexibility, the new helper takes parameters for the fields in
the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to
be explicitly constructed. It also allows the UDP header and payload to
be in separate buffers, although we don't use this yet.
Signed-off-by: David Gibson
At least two places in passt fill in UDP over IPv4 checksums, although
since UDP checksums are optional with IPv4 that just amounts to storing
a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in
dhcp()). For consistency, add a helper for this "calculation".
Just for the heck of it, add the option (compile time disabled for now) to
calculate real UDP checksums.
Signed-off-by: David Gibson
We calculate IPv4 header checksums in at least two places, in dhcp() and
in tap_ip_send. Add a helper to handle this calculation in both places.
Signed-off-by: David Gibson
tap_ip_send() doesn't take a destination address, because it's specifically
for inbound packets, and the IP addresses of the guest/namespace are
already known to us. Rather than open-coding this destination address
logic, make helper functions for it which will enable some later cleanups.
Signed-off-by: David Gibson
tap_ip_send() is never used for TCP packets, we're unlikely to use it for
that in future, and the handling of TCP packets makes other cleanups
unnecessarily awkward. Remove it.
This is the only user of csum_tcp4(), so we can remove that as well.
Signed-off-by: David Gibson
Callers of tap_send() can optionally use a small optimization by adding
extra space for the 4 byte length header used on the qemu socket interface.
tap_ip_send() is currently the only user of this, but this is used only
for "slow path" ICMP and DHCP packets, so there's not a lot of value to
the optimization.
Worse, having the two paths here complicates the interface and makes future
cleanups difficult, so just remove it. I have some plans to bring back the
optimization in a more general way in future, but for now it's just in the
way.
Signed-off-by: David Gibson
The IPv4 and IPv6 paths in tap_ip_send() have very little in common, and
it turns out that every caller (statically) knows if it is using IPv4 or
IPv6. So split into separate tap_ip4_send() and tap_ip6_send() functions.
Use a new tap_l2_hdr() function for the very small common part.
While we're there, make some minor cleanups:
- We were double writing some fields in the IPv6 header, so that it
temporary matched the pseudo-header for checksum calculation. With
recent checksum reworks, this isn't neccessary any more.
- We don't use any IPv4 header options, so use some sizeof() constructs
instead of some open coded values for header length.
- The comment used to say that the flow label was for TCP over IPv6, but
in fact the only thing we used it for was DHCPv6 over UDP traffic
Signed-off-by: David Gibson
tap_ip6_send() has special case logic to compute the checksums for UDP
and ICMP packets, which is a mild layering violation. By using a suitable
helper we can split it into tap_udp6_send() and tap_icmp6_send() functions
without greatly increasing the code size, this removing that layering
violation.
We make some small changes to the interface while there. In both cases
we make the destination IPv6 address a parameter, which will be useful
later. For the UDP variant we make it take just the UDP payload, and it
will generate the UDP header. For the ICMP variant we pass in the ICMP
header as before. The inconsistency is because that's what seems to be
the more natural way to invoke the function in the callers in each case.
Signed-off-by: David Gibson
ndp() takes a parameter giving the ethernet source address of the packet
it is to respond to, which it uses to determine the destination address to
send the reply packet to.
This is not necessary, because the address will always be the guest's
MAC address. Even if the guest has just changed MAC address, then either
tap_handler_passt() or tap_handler_pasta() - which are the only call paths
leading to ndp() will have updated c->mac_guest with the new value.
So, remove the parameter, and just use c->mac_guest, making it more
consistent with other paths where we construct packets to send inwards.
Signed-off-by: David Gibson
We send ICMPv6 packets to the guest from both icmp.c and from ndp.c. The
case in ndp() manually constructs L2 and IPv6 headers, unlike the version
in icmp.c which uses the tap_icmp6_send() helper from tap.c Now that we've
broaded the parameters of tap_icmp6_send() we can use it in ndp() as well
saving some duplicated logic.
Signed-off-by: David Gibson
tap_ip4_send() has special case logic to compute the checksums for UDP
and ICMP packets, which is a mild layering violation. By using a suitable
helper we can split it into tap_udp4_send() and tap_icmp4_send() functions
without greatly increasing the code size, this removing that layering
violation.
We make some small changes to the interface while there. In both cases
we make the destination IPv4 address a parameter, which will be useful
later. For the UDP variant we make it take just the UDP payload, and it
will generate the UDP header. For the ICMP variant we pass in the ICMP
header as before. The inconsistency is because that's what seems to be
the more natural way to invoke the function in the callers in each case.
Signed-off-by: David Gibson
The IPv4 specific dhcp() manually constructs L2 and IP headers to send its
DHCP reply packet, unlike its IPv6 equivalent in dhcpv6.c which uses the
tap_udp6_send() helper. Now that we've broaded the parameters to
tap_udp4_send() we can use it in dhcp() to avoid some duplicated logic.
Signed-off-by: David Gibson
On Wed, 19 Oct 2022 11:43:43 +1100
David Gibson
The main packet "fast paths" for UDP and TCP mostly just forward packets rather than generating them from scratch. However the control paths for ICMP and DHCP sometimes generate packets more or less from scratch. Because these are relatively rare, it's not performance critical.
The paths for sending these packets have some duplication of the header generation. There's also some layering violation in tap_ip_send() which both generates IP headers and updates the L4 (UDP or UCMP) checksum.
Finally that checksum generation is a little awkward: it temporarily generates the IP pseudo header (or something close enough to serve) in the place of the actual header, generates the checksum, then replaces it with the real IP header. This approach seems to be causing miscompiles with some LTO optimization, because the stores to the pseudo header are being moved or elided across the code calculating the checksum.
This series addresses all of these. We consolidate and clarify the packet sending helpers, and use them in some places there was previously duplicated code. In the process we use new checksum generation helpers which take a different approach which should avoid the LTO problems (this aspect I haven't tested yet though).
Changes since v1: * Numerous minor style changes * Rename header generation helpers to make their behaviour clearer * Added several missing function doc comments * Corrected some erroneous statements and terms in comments
Thanks, it looks good to me! I'm travelling, I'll apply in a bit. -- Stefano
On Wed, 19 Oct 2022 11:43:43 +1100
David Gibson
The main packet "fast paths" for UDP and TCP mostly just forward packets rather than generating them from scratch. However the control paths for ICMP and DHCP sometimes generate packets more or less from scratch. Because these are relatively rare, it's not performance critical.
The paths for sending these packets have some duplication of the header generation. There's also some layering violation in tap_ip_send() which both generates IP headers and updates the L4 (UDP or UCMP) checksum.
Finally that checksum generation is a little awkward: it temporarily generates the IP pseudo header (or something close enough to serve) in the place of the actual header, generates the checksum, then replaces it with the real IP header. This approach seems to be causing miscompiles with some LTO optimization, because the stores to the pseudo header are being moved or elided across the code calculating the checksum.
This series addresses all of these. We consolidate and clarify the packet sending helpers, and use them in some places there was previously duplicated code. In the process we use new checksum generation helpers which take a different approach which should avoid the LTO problems (this aspect I haven't tested yet though).
Changes since v1: * Numerous minor style changes * Rename header generation helpers to make their behaviour clearer * Added several missing function doc comments * Corrected some erroneous statements and terms in comments
Applied now, thanks, and sorry for the delay. -- Stefano
participants (2)
-
David Gibson
-
Stefano Brivio