[PATCH v2] tcp: probe for SO_PEEK_OFF both in tcpv4 and tcp6
From: Jon Maloy
On Tue, Jul 23, 2024 at 12:09:37AM +0200, Stefano Brivio wrote:
From: Jon Maloy
Based on an original patch by Jon Maloy:
Reviewed-by: David Gibson
On Tue, 23 Jul 2024 00:09:37 +0200
Stefano Brivio
From: Jon Maloy
Based on an original patch by Jon Maloy:
-- The recently added socket option SO_PEEK_OFF is not supported for TCP/IPv6 sockets. Until we get that support into the kernel, we need to test for support in both protocols to set the global 'peek_offset_cap' to true. --
Compared to the original patch:
- only check for SO_PEEK_OFF support for enabled IP versions
- use sa_family_t instead of int to pass the address family around
Fixes: e63d281871ef ("tcp: leverage support of SO_PEEK_OFF socket option when available")
...so, with this, the probing issue is solved: on a 6.10 kernel, SO_PEEK_OFF is not used, unless I disable IPv6 (with --ipv4-only / -4).

However, if I disable it, for some reason, resorting to IPv4, at least together with the flow table (applying just this patch to HEAD), I get something that looks like one of the "old" TCP stalls. On the host:

  $ ./passt -f -t 10000 -4

and in the guest:

  # ip link set dev eth0 up
  # dhclient eth0
  # iperf3 -s -p 10000

back to the host:

  $ iperf3 -c 127.0.0.1 -p 10000
  Connecting to host 127.0.0.1, port 10000
  [  5] local 127.0.0.1 port 39046 connected to 127.0.0.1 port 10000
  [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
  [  5]   0.00-1.00   sec  11.2 MBytes  94.3 Mbits/sec    0   5.50 MBytes
  [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
  [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes

...the transfer never recovers.

I didn't really have time to debug this further.

At the moment I would be inclined to temporarily revert commit e63d281871ef ("tcp: leverage support of SO_PEEK_OFF socket option when available"), but it's not a good idea if this happens to be hiding some (unlikely?) issue with the flow table.

-- 
Stefano
On Tue, Jul 23, 2024 at 10:29:36PM +0200, Stefano Brivio wrote:
> On Tue, 23 Jul 2024 00:09:37 +0200
> Stefano Brivio wrote:
>
> > From: Jon Maloy
> >
> > Based on an original patch by Jon Maloy:
>
> ...so, with this, the probing issue is solved: on a 6.10 kernel, SO_PEEK_OFF is not used, unless I disable IPv6 (with --ipv4-only / -4).
>
> However, if I disable it, for some reason, resorting to IPv4, at least together with the flow table (applying just this patch to HEAD), I get something that looks like one of the "old" TCP stalls. On the host:
>
>   $ ./passt -f -t 10000 -4
>
> and in the guest:
>
>   # ip link set dev eth0 up
>   # dhclient eth0
>   # iperf3 -s -p 10000
>
> back to the host:
>
>   $ iperf3 -c 127.0.0.1 -p 10000
>   Connecting to host 127.0.0.1, port 10000
>   [  5] local 127.0.0.1 port 39046 connected to 127.0.0.1 port 10000
>   [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
>   [  5]   0.00-1.00   sec  11.2 MBytes  94.3 Mbits/sec    0   5.50 MBytes
>   [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
>   [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
>
> ...the transfer never recovers.
Bother. I've reproduced and am debugging now.
> I didn't really have time to debug this further.
>
> At the moment I would be inclined to temporarily revert commit e63d281871ef ("tcp: leverage support of SO_PEEK_OFF socket option when available"), but it's not a good idea if this happens to be hiding some (unlikely?) issue with the flow table.
-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Wed, Jul 24, 2024 at 10:40:15AM +1000, David Gibson wrote:
> On Tue, Jul 23, 2024 at 10:29:36PM +0200, Stefano Brivio wrote:
> > On Tue, 23 Jul 2024 00:09:37 +0200
> > Stefano Brivio wrote:
> >
> > > From: Jon Maloy
> > >
> > > Based on an original patch by Jon Maloy:
> >
> > ...so, with this, the probing issue is solved: on a 6.10 kernel, SO_PEEK_OFF is not used, unless I disable IPv6 (with --ipv4-only / -4).
> >
> > However, if I disable it, for some reason, resorting to IPv4, at least together with the flow table (applying just this patch to HEAD), I get something that looks like one of the "old" TCP stalls. On the host:
> >
> >   $ ./passt -f -t 10000 -4
> >
> > and in the guest:
> >
> >   # ip link set dev eth0 up
> >   # dhclient eth0
> >   # iperf3 -s -p 10000
> >
> > back to the host:
> >
> >   $ iperf3 -c 127.0.0.1 -p 10000
> >   Connecting to host 127.0.0.1, port 10000
> >   [  5] local 127.0.0.1 port 39046 connected to 127.0.0.1 port 10000
> >   [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> >   [  5]   0.00-1.00   sec  11.2 MBytes  94.3 Mbits/sec    0   5.50 MBytes
> >   [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
> >   [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
> >
> > ...the transfer never recovers.
>
> Bother. I've reproduced and am debugging now.
Found it. Looks like one of the cases where we need to set SO_PEEK_OFF was lost somewhere in the refactorings :(.
> > I didn't really have time to debug this further.
> >
> > At the moment I would be inclined to temporarily revert commit e63d281871ef ("tcp: leverage support of SO_PEEK_OFF socket option when available"), but it's not a good idea if this happens to be hiding some (unlikely?) issue with the flow table.
-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Wed, 24 Jul 2024 13:31:49 +1000
David Gibson wrote:

> On Wed, Jul 24, 2024 at 10:40:15AM +1000, David Gibson wrote:
> > On Tue, Jul 23, 2024 at 10:29:36PM +0200, Stefano Brivio wrote:
> > > On Tue, 23 Jul 2024 00:09:37 +0200
> > > Stefano Brivio wrote:
> > >
> > > > From: Jon Maloy
> > > >
> > > > Based on an original patch by Jon Maloy:
> > >
> > > ...so, with this, the probing issue is solved: on a 6.10 kernel, SO_PEEK_OFF is not used, unless I disable IPv6 (with --ipv4-only / -4).
> > >
> > > However, if I disable it, for some reason, resorting to IPv4, at least together with the flow table (applying just this patch to HEAD), I get something that looks like one of the "old" TCP stalls. On the host:
> > >
> > >   $ ./passt -f -t 10000 -4
> > >
> > > and in the guest:
> > >
> > >   # ip link set dev eth0 up
> > >   # dhclient eth0
> > >   # iperf3 -s -p 10000
> > >
> > > back to the host:
> > >
> > >   $ iperf3 -c 127.0.0.1 -p 10000
> > >   Connecting to host 127.0.0.1, port 10000
> > >   [  5] local 127.0.0.1 port 39046 connected to 127.0.0.1 port 10000
> > >   [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > >   [  5]   0.00-1.00   sec  11.2 MBytes  94.3 Mbits/sec    0   5.50 MBytes
> > >   [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
> > >   [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   5.50 MBytes
> > >
> > > ...the transfer never recovers.
> >
> > Bother. I've reproduced and am debugging now.
>
> Found it. Looks like one of the cases where we need to set SO_PEEK_OFF was lost somewhere in the refactorings :(.
Hah, great, thanks, it fixes the issue on my setup as well. Re-running all tests now...

-- 
Stefano
participants (2)
- David Gibson
- Stefano Brivio