On Mon, Nov 04, 2024 at 12:16:58PM +1100, David Gibson wrote:On Fri, Nov 01, 2024 at 08:51:32PM -0400, Jon Maloy wrote: > In order to reduce static memory and code footprint, we merge > the array for l2 flag frames into the one for payload frames. > > This change also ensures that no flag message will be sent out > over the l2 media bypassing already queued payload messages. > > Performance measurements with iperf3, where we force all > traffic via the tap queue, show no significant difference:Spotted a couple of extra things I missed in my previous mail. [snip]> @@ -107,13 +96,6 @@ void tcp_sock_iov_init(const struct ctx *c) > tcp_payload[i].th.ack = 1; > } > > - for (i = 0; i < ARRAY_SIZE(tcp_flags); i++) { > - tcp6_flags_ip[i] = ip6; > - tcp4_flags_ip[i] = iph; > - tcp_flags[i].th.doff = sizeof(struct tcphdr) / 4; > - tcp_flags[i].th.ack = 1; > - } > - > for (i = 0; i < TCP_FRAMES_MEM; i++) { > struct iovec *iov = tcp_l2_iov[i]; >Because you're now setting all the flags fields for every frame, you should remove the initialisation of th.ack from iov_init. [snip]Actually, looking at this again, you shouldn't need to update doff in any case - it's the same for flag frames and data frames. Of course, it's plausible there's no point pre-filling the doff (or any other) header fields, but I don't think changing that should be in scope for this patch. I hope to post this afternoon a bunch of different buffer cleanups, along with vhost rebased on those and other changes. Unfortunately those will conflict with this patch, sorry. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson@@ -274,6 +237,9 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn, iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used]); iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src; } + payload = iov[TCP_IOV_PAYLOAD].iov_base; + flags = &payload->th.window - 1; + *(flags) = PAYLOAD_FLAGS;I think this is likely to run afoul of TBAA rules, which could cause miscompiles because cc assumes *flags is not aliased with payload->th. Although it's more bulky, I think it's better to explicitly assign each of the fields in struct tcphdr. The compiler should be able to boil it down to the same instructions in any case. Note that now we're using the netinet version of struct tcphdr, it has an anonymous union so you can use th_flags, rather than the one bit fields (doesn't include doffset, though).