[PATCH v2 0/3] Fix errors in FIN timeout logic
While investigating bug 179, I found a number of things that confused
me about the TCP timer handling.  One of them, I think I figured out
what's going on and what should be done about it.  So, here are the
changes.  This is mostly about FIN handling and only tangentially about
the timer, but it does at least slightly simplify the timer handling
while I figure out the rest of it.

Changes in v2:
 * Stefano pointed out some errors in my guesses at the history of
   things, revised commit message of 2/3 accordingly
 * Added 3/3 checking for shutdown(2) failures

David Gibson (3):
  tcp: Retransmit FINs like data segments
  tcp: Eliminate FIN_TIMEOUT
  tcp, tcp_splice: Check for failures of shutdown(2)

 tcp.c        | 49 ++++++++++++++++++++++++-------------------------
 tcp_buf.c    |  1 +
 tcp_splice.c |  3 ++-
 tcp_vu.c     |  1 +
 4 files changed, 28 insertions(+), 26 deletions(-)

--
2.52.0

[PATCH v2 2/3] tcp: Eliminate FIN_TIMEOUT

Despite the name and its value of 60s, FIN_TIMEOUT is not related to
the kernel's net.ipv4.tcp_fin_timeout sysctl.  Indeed, we can't make
an equivalent to that, since it relies on information that endpoint
kernels have, but we do not.

Neither is it simply the time to wait for an ACK to a FIN.  It may
have been intended as that at some point, but the implementation has
not matched that for some time.  In any case RFC9293 makes no
distinction between ACKs to FIN segments and ACKs to data segments, so
we now implement handling of ACKs to FINs with the same code path as
ACKs to data segments.

The theory of operation describes FIN_TIMEOUT thus:

  - FIN_TIMEOUT: if a FIN segment was acknowledged by tap/guest and a FIN
    segment (write shutdown) was sent via socket (events SOCK_FIN_SENT and
    TAP_FIN_ACKED), but no socket activity is detected from the socket within
    this time, reset the connection

In other words, it's attempting to handle the case that we
shutdown(SHUT_WR) on the socket side (causing the kernel to send a
FIN), but the kernel never responds with an EPOLLHUP event indicating
the peer has acked the FIN.

The description doesn't match what the code does: in tcp_timer_ctl()
we only set FIN_TIMEOUT on our timer when ACK_FROM_TAP_DUE is
unset, but we only act on the FIN_TIMEOUT if ACK_FROM_TAP_DUE *is*
(also) set.
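
To make the mismatch concrete, here is a minimal standalone model of the
two conditions described above.  It is a sketch, not the project's code:
the flag value, the enum, the fin_pending parameter and all function
names are hypothetical, and it assumes the flags do not change between
arming the timer and its expiry.  Under those assumptions, no state both
arms the FIN timeout and acts on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit for illustration; not the project's real value. */
#define ACK_FROM_TAP_DUE	(1u << 0)

/* Hypothetical timer kinds, standing in for the timeout chosen when arming. */
enum timer_kind { TIMER_NONE, TIMER_ACK, TIMER_FIN };

/* Arming side of the model: the FIN timeout is chosen only when
 * ACK_FROM_TAP_DUE is unset.  fin_pending stands in for the
 * SOCK_FIN_SENT and TAP_FIN_ACKED condition.
 */
static enum timer_kind arm_timer(unsigned flags, bool fin_pending)
{
	if (flags & ACK_FROM_TAP_DUE)
		return TIMER_ACK;
	if (fin_pending)
		return TIMER_FIN;
	return TIMER_NONE;
}

/* Expiry side of the model: the FIN timeout is acted on only if
 * ACK_FROM_TAP_DUE is set.
 */
static bool fin_timeout_fires(unsigned flags, enum timer_kind armed)
{
	return armed == TIMER_FIN && (flags & ACK_FROM_TAP_DUE);
}

/* Enumerate every modelled state and count those where the FIN timeout
 * is both armed and acted on.
 */
int count_reachable_fin_timeouts(void)
{
	int n = 0;

	for (unsigned flags = 0; flags <= ACK_FROM_TAP_DUE; flags++) {
		for (int fin = 0; fin <= 1; fin++) {
			enum timer_kind armed = arm_timer(flags, fin);

			/* Arming never selects the FIN timer while an ACK is due */
			assert(armed != TIMER_FIN ||
			       !(flags & ACK_FROM_TAP_DUE));

			if (fin_timeout_fires(flags, armed))
				n++;
		}
	}

	return n;
}
```

In this model count_reachable_fin_timeouts() returns 0: the arming and
expiry conditions are mutually exclusive, so the reset path is dead.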

In fact, there's no need to handle this case.  Once we've called
shutdown(SHUT_WR), it's the kernel's responsibility to resend FINs as
needed (and reset the connection if that times out).  Therefore,
entirely remove the FIN_TIMEOUT related logic.

Signed-off-by: David Gibson
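
As an illustration of the half-close that the commit message relies on,
and of checking shutdown(2) for failure as 3/3 does, here is a small
sketch.  It is not code from the series: half_close() and
demo_half_close() are hypothetical helpers, and the demo uses an AF_UNIX
socketpair(2) rather than a TCP connection, so it shows only the API
semantics (the peer sees end-of-file, the other direction stays usable),
not FIN retransmission by the kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Half-close fd for writing, reporting failure instead of ignoring it.
 * Returns 0 on success, negative errno on failure.
 */
int half_close(int fd)
{
	if (shutdown(fd, SHUT_WR) < 0) {
		int err = errno;

		fprintf(stderr, "shutdown(%d, SHUT_WR): %s\n",
			fd, strerror(err));
		return -err;
	}

	return 0;
}

/* After a half-close, the peer reads end-of-file, while data can still
 * flow in the opposite direction.
 * Returns 0 if that is what we observe, -1 otherwise.
 */
int demo_half_close(void)
{
	char buf[8];
	int ret = -1;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	if (half_close(sv[0]) == 0 &&
	    read(sv[1], buf, sizeof(buf)) == 0 &&	/* peer sees EOF */
	    write(sv[1], "ok", 2) == 2 &&		/* reverse path open */
	    read(sv[0], buf, sizeof(buf)) == 2)
		ret = 0;

	close(sv[0]);
	close(sv[1]);

	return ret;
}
```

On a TCP socket, a successful shutdown(SHUT_WR) hands the FIN, and any
retransmission of it, to the kernel; the caller only needs to deal with
the case where the call itself fails.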

On Fri, 30 Jan 2026 15:41:01 +1100
David Gibson wrote:

> While investigating bug 179, I found a number of things that confused
> me about the TCP timer handling.  One of them, I think I figured out
> what's going on and what should be done about it.  So, here are the
> changes.  This is mostly about FIN handling and only tangentially about
> the timer, but it does at least slightly simplify the timer handling
> while I figure out the rest of it.
>
> Changes in v2:
>  * Stefano pointed out some errors in my guesses at the history of
>    things, revised commit message of 2/3 accordingly
>  * Added 3/3 checking for shutdown(2) failures
>
> David Gibson (3):
>   tcp: Retransmit FINs like data segments
>   tcp: Eliminate FIN_TIMEOUT
>   tcp, tcp_splice: Check for failures of shutdown(2)

Applied.

--
Stefano

participants (2)
 - David Gibson
 - Stefano Brivio