On Fri, Oct 24, 2025 at 01:04:31AM +0200, Stefano Brivio wrote:
On Fri, 17 Oct 2025 14:28:37 +0800 Yumei Huang
wrote: If a client connects while guest is not connected or ready yet, resend SYN instead of just resetting connection after 10 seconds.
Use the same backoff calculation for the timeout as linux kernel.
Linux.
Link: https://bugs.passt.top/show_bug.cgi?id=153 Signed-off-by: Yumei Huang
--- tcp.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++-------- tcp.h | 5 +++++ 2 files changed, 52 insertions(+), 8 deletions(-) diff --git a/tcp.c b/tcp.c index 2ec4b0c..9385132 100644 --- a/tcp.c +++ b/tcp.c @@ -179,9 +179,11 @@ * * Timeouts are implemented by means of timerfd timers, set based on flags: * - * - SYN_TIMEOUT: if no ACK is received from tap/guest during handshake (flag - * ACK_FROM_TAP_DUE without ESTABLISHED event) within this time, reset the - * connection + * - SYN_TIMEOUT_INIT: if no ACK is received from tap/guest during handshake + * (flag ACK_FROM_TAP_DUE without ESTABLISHED event) within this time, resend + * SYN. It's the starting timeout for the first SYN retry. If this persists
"If this persists" makes sense for the existing ACK_TIMEOUT description but not here, because it looks like it refers to "starting timeout".
Coupled with the next patch, it becomes increasingly difficult to understand what "this" persisting thing is.
Yeah. This was my suggested wording, based on the existing wording for ACK_TIMEOUT. It's not great, but I struggled a bit to find better wording.
Maybe directly say "Retry for ..., then reset the connection"? It's shorter and clearer.
There it is :). "Retry (NNN) times, then reset the connection". [snip]
@@ -2409,8 +2419,17 @@ void tcp_timer_handler(const struct ctx *c, union epoll_ref ref) tcp_timer_ctl(c, conn); } else if (conn->flags & ACK_FROM_TAP_DUE) { if (!(conn->events & ESTABLISHED)) { - flow_dbg(conn, "handshake timeout"); - tcp_rst(c, conn); + if (conn->retries >= TCP_MAX_RETRIES || + conn->retries >= (c->tcp.tcp_syn_retries + + c->tcp.syn_linear_timeouts)) { + flow_dbg(conn, "handshake timeout"); + tcp_rst(c, conn); + } else { + flow_trace(conn, "SYN timeout, retry"); + tcp_send_flag(c, conn, SYN); + conn->retries++;
I think I already raised this point on a previous revision: this needs to be zeroed as the connection is established, but I don't see that in the current version.
Yes, you raised that, but then I realised it's already handled. I think I put that in the thread, not just direct to Yumei, but maybe not? Or it just got lost in the minutiae. When we receive a SYN-ACK, it will have th->ack_seq advanced a byte acknowledging the SYN. tcp_tap_handler() calls tcp_update_seqack_from_tap() in the !ESTABLISHED case which will see the new ack_seq and clear retries (retrans before this series).
+ tcp_timer_ctl(c, conn); + } } else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) { flow_dbg(conn, "FIN timeout"); tcp_rst(c, conn); @@ -2766,6 +2785,24 @@ static socklen_t tcp_probe_tcp_info(void) return sl; }
+/** + * tcp_syn_params_init() - Get initial SYN parameters for inbound connection
They're not initial, they'll be used for all the connections if I understand correctly.
Maybe "Get SYN retries sysctl values"? I think the _init() in the function name is also somewhat misleading.
"Get host kernel RTO parameters"? Since we're thinking of extending this to cover the RTO upper bound as well as the SYN specific parameters.
+ * @c: Execution context +*/ +void tcp_syn_params_init(struct ctx *c) +{ + intmax_t tcp_syn_retries, syn_linear_timeouts; + + tcp_syn_retries = read_file_integer(TCP_SYN_RETRIES, 8);
Why 8? Perhaps a #define would help?
+ syn_linear_timeouts = read_file_integer(TCP_SYN_LINEAR_TIMEOUTS, 1); + + c->tcp.tcp_syn_retries = MIN(tcp_syn_retries, UINT8_MAX); + c->tcp.syn_linear_timeouts = MIN(syn_linear_timeouts, UINT8_MAX); + + debug("TCP SYN parameters: retries=%"PRIu8", linear_timeouts=%"PRIu8,
Similar to the comment above: these are not parameters of SYN segments (which would seem to imply TCP options, such as the MSS).
We typically don't print C assignments, rather human-readable messages, so that could be "Read sysctl values tcp_syn_retries: ..., syn_linear_timeouts: ...".
+ c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts); +} + /** * tcp_init() - Get initial sequence, hash secret, initialise per-socket data * @c: Execution context @@ -2776,6 +2813,8 @@ int tcp_init(struct ctx *c) { ASSERT(!c->no_tcp);
+ tcp_syn_params_init(c); + tcp_sock_iov_init(c);
memset(init_sock_pool4, 0xff, sizeof(init_sock_pool4)); diff --git a/tcp.h b/tcp.h index 234a803..4369b52 100644 --- a/tcp.h +++ b/tcp.h @@ -59,12 +59,17 @@ union tcp_listen_epoll_ref { * @fwd_out: Port forwarding configuration for outbound packets * @timer_run: Timestamp of most recent timer run * @pipe_size: Size of pipes for spliced connections + * @tcp_syn_retries: Number of SYN retries during handshake + * @syn_linear_timeouts: Number of SYN retries using linear backoff timeout + * before switching to exponential backoff timeout
Maybe more compact:
* @syn_linear_timeouts: SYN retries before using exponential timeout
*/ struct tcp_ctx { struct fwd_ports fwd_in; struct fwd_ports fwd_out; struct timespec timer_run; size_t pipe_size; + uint8_t tcp_syn_retries; + uint8_t syn_linear_timeouts; };
#endif /* TCP_H */
-- Stefano
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson