[PATCH 0/6] tcp: Improve error handling around socket pools
To reduce latency, the TCP code maintains several pools of ready to connect TCP sockets. This patch series contains a number of improvements to improve error handling and reporting when we're unable to refill these pools, or unable to obtain a socket from these pools. David Gibson (6): treewide: Use sa_family_t for address family variables tcp: Don't stop refilling socket pool if we find a filled entry tcp: Stop on first error when refilling socket pools tcp, tcp_splice: Issue warnings if unable to refill socket pool tcp, tcp_splice: Helpers for getting sockets from the pools tcp: Don't store errnos in socket pool icmp.c | 6 ++--- icmp.h | 4 +-- inany.h | 3 ++- tcp.c | 73 ++++++++++++++++++++++++++++++++++++++++------------ tcp.h | 2 +- tcp_conn.h | 4 +-- tcp_splice.c | 71 +++++++++++++++++++++++++++++++------------------- udp.c | 2 +- udp.h | 2 +- util.c | 2 +- util.h | 2 +- 11 files changed, 114 insertions(+), 57 deletions(-) -- 2.43.2
Sometimes we use sa_family_t for variables and parameters containing a
socket address family, other times we use a plain int. Since sa_family_t
is what's actually used in struct sockaddr and friends, standardise on
that.
Signed-off-by: David Gibson
Currently tcp_sock_refill_pool() stops as soon as it finds an entry in the
pool with a valid fd. This appears to makes sense: we always use fds from
the front of the pool, so if we find a filled one, the rest of the pool
should be filled as well.
However, that's not quite correct. If a previous refill hit errors trying
to open new sockets, it could leave gaps between blocks of valid fds. We're
going to add some changes that could make that more likely.
So, for robustness, instead skip over the filled entry but still try to
refill the rest of the array. We expect simply iterating over the pool to
be of small cost compared to even a single system call, so this shouldn't
have much impact.
Signed-off-by: David Gibson
On Mon, 19 Feb 2024 18:56:47 +1100
David Gibson
Currently tcp_sock_refill_pool() stops as soon as it finds an entry in the pool with a valid fd. This appears to makes sense: we always use fds from the front of the pool, so if we find a filled one, the rest of the pool should be filled as well.
However, that's not quite correct. If a previous refill hit errors trying to open new sockets, it could leave gaps between blocks of valid fds. We're going to add some changes that could make that more likely.
Uh oh, good catch, I didn't think of the possibility that with 3/6 we could otherwise stop in the middle of a block of "empty" slots, with filled slots occurring after them. -- Stefano
On Wed, Feb 21, 2024 at 10:08:49PM +0100, Stefano Brivio wrote:
On Mon, 19 Feb 2024 18:56:47 +1100 David Gibson
wrote: Currently tcp_sock_refill_pool() stops as soon as it finds an entry in the pool with a valid fd. This appears to makes sense: we always use fds from the front of the pool, so if we find a filled one, the rest of the pool should be filled as well.
However, that's not quite correct. If a previous refill hit errors trying to open new sockets, it could leave gaps between blocks of valid fds. We're going to add some changes that could make that more likely.
Uh oh, good catch, I didn't think of the possibility that with 3/6 we could otherwise stop in the middle of a block of "empty" slots, with filled slots occurring after them.
Right. The consequences aren't particularly dire if we get this wrong: we don't fill the pool as full as we should, but we should still have fds to work with. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Currently if we get an error opening a new socket while refilling a socket
pool, we carry on to the next slot and try again. This isn't very useful,
since by far the most likely cause of an error is some sort of resource
exhaustion. Trying again will probably just hit the same error, and maybe
even make things worse.
So, instead stop on the first error while refilling the pool, making do
with however many sockets we managed to open before the error.
Signed-off-by: David Gibson
Currently if tcp_sock_refill_pool() is unable to fill all the slots in the
pool, it will silently exit. This might lead to a later attempt to get
fds from the pool to fail at which point it will be harder to tell what
originally went wrong.
Instead add warnings if we're unable to refill any of the socket pools when
requested. We have tcp_sock_refill_pool() return an error and report it
in the callers, because those callers have more context allowing for a
more useful message.
Signed-off-by: David Gibson
On Mon, 19 Feb 2024 18:56:49 +1100
David Gibson
Currently if tcp_sock_refill_pool() is unable to fill all the slots in the pool, it will silently exit. This might lead to a later attempt to get fds from the pool to fail at which point it will be harder to tell what originally went wrong.
Instead add warnings if we're unable to refill any of the socket pools when requested. We have tcp_sock_refill_pool() return an error and report it in the callers, because those callers have more context allowing for a more useful message.
Signed-off-by: David Gibson
--- tcp.c | 24 ++++++++++++++++++------ tcp_conn.h | 2 +- tcp_splice.c | 16 ++++++++++++---- 3 files changed, 31 insertions(+), 11 deletions(-) diff --git a/tcp.c b/tcp.c index d49210bc..ad56ffc3 100644 --- a/tcp.c +++ b/tcp.c @@ -3007,8 +3007,10 @@ static int tcp_ns_socks_init(void *arg) * @c: Execution context * @pool: Pool of sockets to refill * @af: Address family to use + * + * Return: 0 on success, -ve error code if there was at least one error
Is -ve an abbreviation for something or just a typo? It sounds like the voltage of the emitter in a BJT transistor. Must be a typo, the rest of the patch looks good to me, I can also fix this up while applying. -- Stefano
On Wed, Feb 21, 2024 at 10:09:10PM +0100, Stefano Brivio wrote:
On Mon, 19 Feb 2024 18:56:49 +1100 David Gibson
wrote: Currently if tcp_sock_refill_pool() is unable to fill all the slots in the pool, it will silently exit. This might lead to a later attempt to get fds from the pool to fail at which point it will be harder to tell what originally went wrong.
Instead add warnings if we're unable to refill any of the socket pools when requested. We have tcp_sock_refill_pool() return an error and report it in the callers, because those callers have more context allowing for a more useful message.
Signed-off-by: David Gibson
--- tcp.c | 24 ++++++++++++++++++------ tcp_conn.h | 2 +- tcp_splice.c | 16 ++++++++++++---- 3 files changed, 31 insertions(+), 11 deletions(-) diff --git a/tcp.c b/tcp.c index d49210bc..ad56ffc3 100644 --- a/tcp.c +++ b/tcp.c @@ -3007,8 +3007,10 @@ static int tcp_ns_socks_init(void *arg) * @c: Execution context * @pool: Pool of sockets to refill * @af: Address family to use + * + * Return: 0 on success, -ve error code if there was at least one error
Is -ve an abbreviation for something or just a typo? It sounds like the voltage of the emitter in a BJT transistor.
Oh, it's supposed to be an abbreviation for "negative".
Must be a typo, the rest of the patch looks good to me, I can also fix this up while applying.
Go ahead, please. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 22 Feb 2024 08:44:48 +1100
David Gibson
On Wed, Feb 21, 2024 at 10:09:10PM +0100, Stefano Brivio wrote:
On Mon, 19 Feb 2024 18:56:49 +1100 David Gibson
wrote: Currently if tcp_sock_refill_pool() is unable to fill all the slots in the pool, it will silently exit. This might lead to a later attempt to get fds from the pool to fail at which point it will be harder to tell what originally went wrong.
Instead add warnings if we're unable to refill any of the socket pools when requested. We have tcp_sock_refill_pool() return an error and report it in the callers, because those callers have more context allowing for a more useful message.
Signed-off-by: David Gibson
--- tcp.c | 24 ++++++++++++++++++------ tcp_conn.h | 2 +- tcp_splice.c | 16 ++++++++++++---- 3 files changed, 31 insertions(+), 11 deletions(-) diff --git a/tcp.c b/tcp.c index d49210bc..ad56ffc3 100644 --- a/tcp.c +++ b/tcp.c @@ -3007,8 +3007,10 @@ static int tcp_ns_socks_init(void *arg) * @c: Execution context * @pool: Pool of sockets to refill * @af: Address family to use + * + * Return: 0 on success, -ve error code if there was at least one error
Is -ve an abbreviation for something or just a typo? It sounds like the voltage of the emitter in a BJT transistor.
Oh, it's supposed to be an abbreviation for "negative".
Funny, I just found https://en.wiktionary.org/wiki/-ve#English, but I never used "-ve" like that. Well, for consistency, I would change this to "negative" anyway. -- Stefano
We maintain pools of ready-to-connect sockets in both the original and
(for pasta) guest namespace to reduce latency when starting new TCP
connections. If we exhaust those pools we have to take a higher
latency path to get a new socket.
Currently we open-code that fallback in the places we need it. To improve
clarity encapsulate that into helper functions. While we're at it, give
those helpers clearer error reporting.
Signed-off-by: David Gibson
If tcp_sock_refill_pool() gets an error opening new sockets, it stores the
negative errno of that error in the socket pool. This isn't especially
useful:
* It's inconsistent with the initial state of the pool (all -1)
* It's inconsistent with the state of an entry that was valid and was
then consumed (also -1)
* By the time we did anything with this error code, it's now far removed
from the situation in which the error occurred, making it difficult to
report usefully
We now have error reporting closer to when failures happen on the refill
paths, so just leave a pool slot we can't fill as -1.
Signed-off-by: David Gibson
On Mon, 19 Feb 2024 18:56:51 +1100
David Gibson
If tcp_sock_refill_pool() gets an error opening new sockets, it stores the negative errno of that error in the socket pool. This isn't especially useful: * It's inconsistent with the initial state of the pool (all -1) * It's inconsistent with the state of an entry that was valid and was then consumed (also -1) * By the time we did anything with this error code, it's now far removed from the situation in which the error occurred, making it difficult to report usefully
We now have error reporting closer to when failures happen on the refill paths, so just leave a pool slot we can't fill as -1.
Signed-off-by: David Gibson
--- tcp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tcp.c b/tcp.c index 34e32641..27865691 100644 --- a/tcp.c +++ b/tcp.c @@ -3039,11 +3039,13 @@ int tcp_sock_refill_pool(const struct ctx *c, int pool[], sa_family_t af) int i;
for (i = 0; i < TCP_SOCK_POOL_SIZE; i++) { + int fd;
Usual newline between declaration and code. Rest of the series looks good to me, same as 4/6, if you agree I can fix it up, or respin, you choose. -- Stefano
On Wed, Feb 21, 2024 at 10:09:34PM +0100, Stefano Brivio wrote:
On Mon, 19 Feb 2024 18:56:51 +1100 David Gibson
wrote: If tcp_sock_refill_pool() gets an error opening new sockets, it stores the negative errno of that error in the socket pool. This isn't especially useful: * It's inconsistent with the initial state of the pool (all -1) * It's inconsistent with the state of an entry that was valid and was then consumed (also -1) * By the time we did anything with this error code, it's now far removed from the situation in which the error occurred, making it difficult to report usefully
We now have error reporting closer to when failures happen on the refill paths, so just leave a pool slot we can't fill as -1.
Signed-off-by: David Gibson
--- tcp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tcp.c b/tcp.c index 34e32641..27865691 100644 --- a/tcp.c +++ b/tcp.c @@ -3039,11 +3039,13 @@ int tcp_sock_refill_pool(const struct ctx *c, int pool[], sa_family_t af) int i;
for (i = 0; i < TCP_SOCK_POOL_SIZE; i++) { + int fd;
Usual newline between declaration and code. Rest of the series looks good to me, same as 4/6, if you agree I can fix it up, or respin, you choose.
Oops. Go ahead and fix on merge, please. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Mon, 19 Feb 2024 18:56:45 +1100
David Gibson
To reduce latency, the TCP code maintains several pools of ready to connect TCP sockets. This patch series contains a number of improvements to improve error handling and reporting when we're unable to refill these pools, or unable to obtain a socket from these pools.
David Gibson (6): treewide: Use sa_family_t for address family variables tcp: Don't stop refilling socket pool if we find a filled entry tcp: Stop on first error when refilling socket pools tcp, tcp_splice: Issue warnings if unable to refill socket pool tcp, tcp_splice: Helpers for getting sockets from the pools tcp: Don't store errnos in socket pool
Applied (with minor changes as mentioned for 4/6 and 6/6). -- Stefano
participants (2)
-
David Gibson
-
Stefano Brivio