On Mon, Mar 17, 2025 at 01:40:51PM -0400, Jon Maloy wrote:On 2025-03-16 22:58, David Gibson wrote:Urgh. I mean, yes it works for (the current implementation of) traceroute, but doing it like this really isn't correct for anything else that manipulates TTL. I mean, yes, traceroute is *probably* the only thing to do that, but I think we should endeavour to be more generally correct if it's not too complicated. And in this case, I don't think it's too complicated.On Sat, Mar 15, 2025 at 11:32:45AM -0400, Jon Maloy wrote: > Now that ICMP pass-through from socket-to-tap is in place, it is > easy to support UDP based traceroute functionality in direction > tap-to-socket.[...]I forgot one thing at our discussion about this at our meeting. Traceroute works by sending three packets with ttl=1, then three with ttl=2, then three with ttl=3 and so forth. For each packet sent it steps the destination port number, starting at 33334, and in the most extreme case (assuming it stops at ttl=32, which linux traceroute does in practice) stopping at port number 33524. The whole port range [33434:33535] is unassigned by IANA, and no sane developer would use it for his application, for obvious reasons. This means that every single traceroute/udp message will become a separate flow, and there is no need to complicate things by storing the ttl in the flow or use it as criteria. It also means that my current approach works fine. I don't even need to restore the ttl in the outgoing socket, since that, just like the associated flow, is a single-shot affair.+ uint8_t ttl, const struct pool *p, int idx, + const struct timespec *now) { const struct flowside *toside; struct mmsghdr mm[UIO_MAXIOV]; @@ -933,6 +935,12 @@ int udp_tap_handler(const struct ctx *c, uint8_t pif, mm[i].msg_hdr.msg_controllen = 0; mm[i].msg_hdr.msg_flags = 0; + if (ttl <= 30) { + if (setsockopt(s, IPPROTO_IP, IP_TTL, + &ttl, sizeof(ttl)) < 0)AFAIK this will set the TTL on every subsequent packet, not just the next one, so this isn't quite right. To use this I guess we'd have to store the correct TTL in the flow, and do the sockopt if it's different from the guest side one.Thinking about this more, I don't actually think setting the TTL per-packet with a cmsg is particularly useful anyway. For traceroute to a small extent - and I suspect in the unlikely case of anything else manipulating TTL - it seems like the guest is likely to use the same TTL for multiple packets in a row. That means just calling the setsockopt() when it changes from the current value should work well. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibsonUnless there's a way to set TTL via cmsg. Maybe there is, but I haven't spotted it in a quick glance.I seems I can add it as ancillary data if I add a control buffer to struct mmsghdr.msghdr. However, it is hard to see any benefit of this, given the above. It will cost more performance on the critical code path than the simple test I am doing now.