On Wed, 10 Sep 2025 20:12:13 -0400
Jon Maloy
On 2025-09-09 14:16, Stefano Brivio wrote:
If the peer shrinks the window to zero, we'll skip storing the new window, as a convenient
Is this really convenient? It looks more like an inconsistency with potential for future trouble to me.
See your own reasoning in commit a740e16fd1b9 ("tcp: handle shrunk window advertisements from guest"). You described it as "extremely simple" back then, and now this is becoming marginally more complicated, but not because of this patch.
Wouldn't it be better with just a SEND_WIN_PROBE flag or similar to be reset as soon as the window goes non-zero again?
We could also store the actual window value, instead of an explicit flag, but yes, it's becoming clear for other reasons that we need to introduce a couple of new flags to simplify all this, see: https://archives.passt.top/passt-dev/20250910083754.323c0b1e@elisabeth/
way to cause window probes (which exceed any zero-sized window, strictly speaking) if we don't get window updates in a while.
As we do so, though, we need to ensure we don't try to queue more data from the socket right after we process this window update, as the entire point of a zero-window advertisement is to keep us from sending more data.
Signed-off-by: Stefano Brivio
Reviewed-by: David Gibson --- tcp.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/tcp.c b/tcp.c index b83510b..9c70a25 100644 --- a/tcp.c +++ b/tcp.c @@ -1271,8 +1271,10 @@ static void tcp_get_tap_ws(struct tcp_tap_conn *conn, * @c: Execution context * @conn: Connection pointer * @wnd: Window value, host order, unscaled + * + * Return: false on zero window (not stored to wnd_from_tap), true otherwise */ -static void tcp_tap_window_update(const struct ctx *c, +static bool tcp_tap_window_update(const struct ctx *c, struct tcp_tap_conn *conn, unsigned wnd) { wnd = MIN(MAX_WINDOW, wnd << conn->ws_from_tap); @@ -1285,13 +1287,14 @@ static void tcp_tap_window_update(const struct ctx *c, */ if (!wnd && SEQ_LT(conn->seq_ack_from_tap, conn->seq_to_tap)) { tcp_rewind_seq(c, conn); - return; + return false; }
conn->wnd_from_tap = MIN(wnd >> conn->ws_from_tap, USHRT_MAX);
/* FIXME: reflect the tap-side receiver's window back to the sock-side * sender by adjusting SO_RCVBUF? */
Not so sure. That sender will stop in due time anyway, with no harm done. Starting fiddling with SO_RCVBUF sounds like something to avoid.
This is all not related to this patch, but see commit cf3eeba6c0d7 ("tcp: Don't use TCP_WINDOW_CLAMP") for the reasoning. It's not about harm, it's rather about making our behaviour as transparent as possible.
+ return true; }
/** @@ -2101,9 +2104,8 @@ int tcp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, if (!th->ack) goto reset;
- tcp_tap_window_update(c, conn, ntohs(th->window)); - - tcp_data_from_sock(c, conn); + if (tcp_tap_window_update(c, conn, ntohs(th->window))) + tcp_data_from_sock(c, conn);
if (p->count - idx == 1) return 1; @@ -2113,8 +2115,8 @@ int tcp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, if (conn->events & TAP_FIN_RCVD) { tcp_sock_consume(conn, ntohl(th->ack_seq)); tcp_update_seqack_from_tap(c, conn, ntohl(th->ack_seq)); - tcp_tap_window_update(c, conn, ntohs(th->window)); - tcp_data_from_sock(c, conn); + if (tcp_tap_window_update(c, conn, ntohs(th->window))) + tcp_data_from_sock(c, conn);
if (conn->seq_ack_from_tap == conn->seq_to_tap) { if (th->ack && conn->events & TAP_FIN_SENT)
-- Stefano