On Thu, 10 Oct 2024 20:11:57 -0400
Jon Maloy <jmaloy(a)redhat.com> wrote:

> Hi all,
> I added the addressing/routing workarounds suggested by Stefano, and

For context: there were/are two issues in the tests with Jon's setup
(private IPv6 address and route on the host):

1. this private address was assigned with a /40 netmask, but in the
   pasta throughput tests via tap, namespace to host, to find out a
   local, non-loopback address to use, we do:

     ip -j -6 addr show|jq -rM '.[] | select(.ifname == "eth0").addr_info[] | select(.scope == "global" and .prefixlen == 64).local'

   ...I don't remember if there's a valid reason why we filter on /64
   addresses. I guess we should drop that if not needed.

   Workaround for this: assign the address as /64.

2. the default gateway for IPv6 wasn't a link-local address. In ndp(),
   we use our_tap_ll as source address for advertisements (and before
   the introduction of our_tap_ll, this was conceptually the same).

   However, with --config-net (which is not used in these tests,
   because we want to test NDP and DHCPv6), we would copy routes,
   including the default gateway, from the host, and the default
   gateway copied from the host is the gateway address we also expect
   in the container.

   I quickly tried to change this logic (I'm not sure if we really
   *need* to use a link-local address as source for the advertisement,
   hence as router address), but if I use a non-link-local address, the
   kernel refuses to assign it.

   Workaround: use a link-local address as gateway address.

> the performance measurements now seem to be working flawlessly, even
> the one Stefano said failed in his runs.
>
> I made 5 runs from the master branch, and 5 with my two patches
> applied. You can observe the results at
> https://drive.google.com/drive/folders/1xGcWJ79smELbWOPwcJdsmyvoIrTz9R56

...kind of, one would need a Google account and specific access. Anyway,
I attached your logs to this email.

> However, there seems to be a systematic decrease in throughput.
> If we take the average over the runs for IPv6 ns -> host via tap, we
> get 33.56 Gb/s vs 31.84 Gb/s, i.e. a 5% difference.

That's not on the path that's directly affected by your patches: that's
namespace to host, but the queues are used in the host to namespace (or
guest) direction. On the other hand, acknowledgement segments are
actually using those queues.

> I don't really know what to make of this, and would like to know if
> anybody else can confirm or falsify this.

It's quite hard to get statistically significant figures with those
tests (transfers last one second) -- those are there just to check that
there's nothing seriously wrong (that is, a massive decrease in
throughput).

To understand if this is an actual decrease in throughput, I would
suggest running a manual test, much longer (at least 20-30 seconds),
with pasta or passt running under perf(1). Then, check throughput and
cycles spent on the various system calls involved.

-- 
Stefano
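
As a quick sanity check of the 5% figure quoted above, here is a minimal
sketch that computes the relative change between the two averages with
awk. Note that rel_diff is a hypothetical helper written for this
example, not something from the passt test suite, and it only compares
two averages: it says nothing about whether the difference is
statistically significant.

```shell
#!/bin/sh
# rel_diff BASE NEW: print the relative change from BASE to NEW, in percent.
# Hypothetical helper for illustration, not part of the passt test suite.
rel_diff() {
	awk -v base="$1" -v new="$2" \
		'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Averages quoted above, in Gb/s, for IPv6 ns -> host via tap:
# master branch first, patched build second.
rel_diff 33.56 31.84
```

The same helper could be reused on the averages from a longer manual
run, where a figure of this size would be easier to trust.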