On Thu, Dec 04, 2025 at 08:45:39AM +0100, Stefano Brivio wrote:
...under two conditions:
- the remote peer is advertising a bigger value to us, meaning that a bigger sending buffer is likely to benefit throughput, AND
I think this condition is redundant: if the remote peer is advertising less, we'll clamp new_wnd_to_tap to that value anyway.
- this is not a short-lived connection, where the latency cost of retransmissions would be otherwise unacceptable.
By doing this, we can reliably trigger TCP buffer size auto-tuning (as long as it's available) on bulk data transfers.
Signed-off-by: Stefano Brivio
---
 tcp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tcp.c b/tcp.c
index 2220059..454df69 100644
--- a/tcp.c
+++ b/tcp.c
@@ -353,6 +353,13 @@ enum {
 #define LOW_RTT_TABLE_SIZE		8
 #define LOW_RTT_THRESHOLD		10 /* us */
+/* Try to avoid retransmissions to improve latency on short-lived connections */
+#define SHORT_CONN_BYTES		(16ULL * 1024 * 1024)
+
+/* Temporarily exceed available sending buffer to force TCP auto-tuning */
+#define SNDBUF_BOOST_FACTOR		150 /* % */
+#define SNDBUF_BOOST(x)			((x) * SNDBUF_BOOST_FACTOR / 100)
For the short term, the fact this works empirically is enough. For the longer term, it would be nice to have a better understanding of what this "overcommit" amount is actually estimating.

I think what we're looking for is an estimate of the number of bytes that will have left the buffer by the time the guest gets back to us. So:

	<connection throughput> * <guest-side RTT>

Alas, I don't see a way to estimate either of those from the information we already track - we'd need additional bookkeeping.
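To make that estimate concrete, here is a minimal sketch. It is not passt code: the function name and both parameters are hypothetical, since, as noted, neither quantity is tracked today.

```c
#include <stdint.h>

/* Hypothetical helper, not part of passt: estimate how many bytes will have
 * left the sending buffer during one guest-side round trip.
 *
 * Both inputs are assumptions: they would need additional bookkeeping.
 */
static uint64_t overcommit_estimate(uint64_t throughput_bytes_per_s,
				    uint64_t guest_rtt_us)
{
	/* bytes/s * us, scaled back from microseconds to seconds */
	return throughput_bytes_per_s * guest_rtt_us / 1000000ULL;
}
```

For instance, 1 Gbit/s (125000000 bytes/s) with a 200 us guest-side RTT gives 25000 bytes.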
 #define ACK_IF_NEEDED			0		/* See tcp_send_flag() */

 #define CONN_IS_CLOSING(conn)						\
@@ -1137,6 +1144,9 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
 	if ((int)sendq > SNDBUF_GET(conn)) /* Due to memory pressure? */
 		limit = 0;
+	else if ((int)tinfo->tcpi_snd_wnd > SNDBUF_GET(conn) &&
+		 tinfo->tcpi_bytes_acked > SHORT_CONN_BYTES)
This is pretty subtle, I think it would be worth having some rationale in a comment, not just the commit message.
+		limit = SNDBUF_BOOST(SNDBUF_GET(conn)) - (int)sendq;
 	else
 		limit = SNDBUF_GET(conn) - (int)sendq;
-- 
2.43.0
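For reference, the limit selection in the hunk above can be sketched as a standalone function. This is only an illustration: sndbuf_limit() and its plain parameters are hypothetical stand-ins for SNDBUF_GET(conn) and the struct tcp_info fields, and it assumes values small enough that the percentage multiplication doesn't overflow an int.

```c
#include <stdint.h>

#define SHORT_CONN_BYTES	(16ULL * 1024 * 1024)
#define SNDBUF_BOOST_FACTOR	150 /* % */
#define SNDBUF_BOOST(x)		((x) * SNDBUF_BOOST_FACTOR / 100)

/* Hypothetical stand-in for the logic in tcp_update_seqack_wnd():
 * @sndbuf replaces SNDBUF_GET(conn), @snd_wnd and @bytes_acked replace
 * tinfo->tcpi_snd_wnd and tinfo->tcpi_bytes_acked.
 */
static int sndbuf_limit(int sndbuf, int sendq, uint32_t snd_wnd,
			uint64_t bytes_acked)
{
	if (sendq > sndbuf)		/* Due to memory pressure? */
		return 0;

	/* Peer advertises more than we can buffer, and the connection
	 * has already moved enough data not to count as short-lived:
	 * exceed the buffer size to nudge auto-tuning.
	 */
	if ((int)snd_wnd > sndbuf && bytes_acked > SHORT_CONN_BYTES)
		return SNDBUF_BOOST(sndbuf) - sendq;

	return sndbuf - sendq;
}
```

With a 100000-byte buffer and 20000 bytes queued, this returns 130000 once both conditions hold, and 80000 otherwise.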
-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson