Re: [PATCH] udp: Fix breakage of UDP error handling by PKTINFO support

14 Apr 2025

      A couple of further observations:

On Mon, 14 Apr 2025 11:56:41 +0200
Stefano Brivio  wrote:
...
On Mon, 14 Apr 2025 13:58:53 +1000
David Gibson  wrote:
...
We recently enabled the IP_PKTINFO / IPV6_RECVPKTINFO socket options on our
UDP sockets.  This lets us obtain and properly handle the specific local
address used when we're "listening" with a socket on 0.0.0.0 or ::.
However, the PKTINFO cmsgs this option generates appear on error queue
messages as well as regular datagrams.  udp_sock_recverr() doesn't expect
this and so flags an unrecoverable error when it can't parse the control
message.
Correct this by adding space in udp_sock_recverr()s control buffer for the
additional PKTINFO data, and scan through all cmsgs for the RECVERR, rather
than only looking at the first one.
Link: https://bugs.passt.top/show_bug.cgi?id=99
Fixes: f4b0dd8b06 ("udp: Use PKTINFO cmsgs to get destination address..")
Reported-by: Stefano Brivio 
Signed-off-by: David Gibson 
The patch looks good to me, but I'm hitting something in my tests (with
my recent DNS fix) that's perhaps not intended.
On the host:
$ cat /etc/resolv.conf
  nameserver 127.0.0.1
...and nobody is listening on that address. Podman passes
--dns-forward 169.254.1.1 (default) and in the container:
$ podman run --net=pasta:-d,-l,/tmp/pasta.log --rm -ti alpine sh
I do:
/ # nslookup google.com 169.254.1.1
  ;; connection timed out; no servers could be reached
which is expected. But logs show a warning:
49.5377:          Flow 0 (NEW): FREE -> NEW
49.5377:          Flow 0 (INI): NEW -> INI
49.5377:          Flow 0 (INI): TAP [88.198.0.164]:43458 -> [169.254.1.1]:53 => ?
49.5377:          Flow 0 (TGT): INI -> TGT
49.5377:          Flow 0 (TGT): TAP [88.198.0.164]:43458 -> [169.254.1.1]:53 => HOST [0.0.0.0]:43458 -> [127.0.0.1]:53
49.5377:          Flow 0 (UDP flow): TGT -> TYPED
49.5377:          Flow 0 (UDP flow): TAP [88.198.0.164]:43458 -> [169.254.1.1]:53 => HOST [0.0.0.0]:43458 -> [127.0.0.1]:53
49.5378:          Flow 0 (UDP flow): Side 0 hash table insert: bucket: 148325
49.5378:          Flow 0 (UDP flow): Side 1 hash table insert: bucket: 309967
49.5378:          Flow 0 (UDP flow): TYPED -> ACTIVE
49.5378:          Flow 0 (UDP flow): TAP [88.198.0.164]:43458 -> [169.254.1.1]:53 => HOST [127.0.0.1]:43458 -> [127.0.0.1]:53
49.5378: WARNING: Error peeking at socket address: Connection refused
49.5379:          ICMP error on UDP socket 208: Connection refused
49.5379:          ICMP error on UDP socket 208: Connection refused
52.0404:          ICMP error on UDP socket 208: Connection refused
52.0404:          ICMP error on UDP socket 208: Connection refused
and I'm not sure if that warning is intended.
The warning actually comes from the previous series, not from "[PATCH
0/3] Properly preseve local addresses for UDP flows", but rather from
commit 84ab1305faba ("udp: Polish udp_vu_sock_info() and remove from
vu specific code").

I couldn't figure out why we get that error *there* but I didn't try
hard. If it's unavoidable, I guess we should change that to a debug()
message, otherwise a guest / container doing a couple of UDP port scans
could flood host logs.
...
By the way, I have the
feeling that it now takes longer (with the whole IP_PKTINFO thing) for
nslookup to fail, as if those ICMP errors were not relayed anymore, but
I'm not sure about this, and I didn't investigate yet.
I couldn't reproduce the "fast" nslookup failure I'm fairly sure I had
seen at some point. I tried reverting my fixes, and both of your series
separately, but I can't get that to work.

I'm attaching a packet capture of the current behaviour. All the
relevant details are the same as without your two series (and with or
without my fix).

I'm not sure if we should map the source address of the ICMP messages
(Jon, thoughts?) instead of leaving it as 127.0.0.1, in case
fwd_nat_from_tap() changed the outbound source address.

That is, if we have:

4.2401:          Flow 0 (UDP flow): TAP [88.198.0.164]:34843 -> [169.254.1.1]:53 => HOST [0.0.0.0]:34843 -> [127.0.0.1]:53
4.2402:          ICMP error on UDP socket 209: Connection refused

we currently send an ICMP message from 127.0.0.1 (I guess ignored)
instead of using 169.254.1.1 as source. Should udp_send_tap_icmp4()
call tap_icmp4_send() with 'fromside->oaddr' as source (instead of
'saddr'), where 'fromside' is the flow side opposite to 'toside'?

I'm not entirely sure if those ICMP messages are ignored, though.
Maybe it's all fine the way it is and it's just ports that need to
match, and it's normal that nslookup retries after a while...

-- 
Stefano