[PATCH v2 0/4] TCP hash table changes, in preparation for flow table
I now have an in-progress draft of a unified hash table to go with the
unified flow table.  This turns out to be easier if we first make some
preliminary changes to the structure of the TCP hash table.  So, here
are those.

Changes since v1:
 * Use while loops instead of some equivalent, but hard to read for
   loops for the hash probing.
 * Switch from probing forwards through hash buckets to probing
   backwards.  This makes the code closer to the version in Knuth it's
   based on, and thus easier to see if we've made a mistake in
   adaptation.
 * Improve the helpers for modular arithmetic in use
 * Correct an error where we had things exactly the wrong way around
   when finding entries to move during removal.
 * Add a patch fixing a conceptual / documentation problem in some
   adjacent code

David Gibson (4):
  tcp: Fix conceptually incorrect byte-order switch in tcp_tap_handler()
  tcp: Switch hash table to linear probing instead of chaining
  tcp: Implement hash table with indices rather than pointers
  tcp: Don't account for hash table size in tcp_hash()

 flow.h     |  11 +++++
 tcp.c      | 143 ++++++++++++++++++++++++++++-------------------------
 tcp_conn.h |   2 -
 util.h     |  28 +++++++++++
 4 files changed, 114 insertions(+), 70 deletions(-)

-- 
2.43.0

[PATCH v2 1/4] tcp: Fix conceptually incorrect byte-order switch in tcp_tap_handler()

tcp_hash_lookup() expects the port numbers in host order, but the TCP
header, of course, has them in network order, so we need to switch them.
However we call htons() (host to network) instead of ntohs() (network to
host).  This works because those do the same thing in practice (they
would differ only on very strange theoretical platforms which are
neither big nor little endian).

But, having this the "wrong" way around is misleading, so switch it around.

Signed-off-by: David Gibson
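
To illustrate the direction of the conversion described above, here is a
minimal sketch in C.  The helper name port_from_wire() is hypothetical
and is not taken from the patch; it only shows that a port read out of a
TCP header is in network order and wants ntohs() before being used as a
host-order value.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the patch: read a 16-bit port from the
 * raw bytes of a TCP header and return it in host byte order.
 */
static uint16_t port_from_wire(const unsigned char *p)
{
	uint16_t be;

	memcpy(&be, p, sizeof(be));	/* value is still in network order */
	return ntohs(be);		/* network to host: the right direction */
}
```

On big and little endian hosts htons() would give the same result here,
which is why the original code worked; ntohs() simply states the intent
correctly.
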
[PATCH v2 2/4] tcp: Switch hash table to linear probing instead of chaining

Currently we deal with hash collisions by letting a hash bucket contain
multiple entries, forming a linked list using an index in the connection
structure.

That's a pretty standard and simple approach, but in our case we can use
an even simpler one: linear probing.  Here if a hash bucket is occupied
we just move on to the next one until we find a free one.  This slightly
simplifies lookup and more importantly saves some precious bytes in the
connection structure by removing the need for a link.  It does require some
additional complexity for hash removal.

This approach can perform poorly when hash table load is high.  However, we
already size our hash table of pointers larger than the connection table,
which puts an upper bound on the load.  It's relatively cheap to decrease
that bound if we find we need to.

I adapted the linear probing operations from Knuth's The Art of Computer
Programming, Volume 3, 2nd Edition.  Specifically Algorithm L and Algorithm
R in Section 6.4.  Note that there is an error in Algorithm R as printed,
see errata at [0].

[0] https://www-cs-faculty.stanford.edu/~knuth/all3-prepre.ps.gz

Signed-off-by: David Gibson
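
The scheme described above can be sketched roughly as follows.  This is
an illustration only, not the code from the patch: the table stores
plain integer keys, the "hash" is just the key modulo a tiny
TABLE_SIZE, and the names mod_sub(), mod_between(), table_probe() and
friends are assumptions made for this example.  It probes backwards, as
the cover letter describes, and it assumes the table always keeps at
least one free bucket, otherwise the probe loop would not terminate.

```c
#include <stdint.h>

#define TABLE_SIZE	8u		/* hypothetical, tiny for illustration */
#define EMPTY		UINT32_MAX	/* marks a free bucket; not a valid key */

static uint32_t table[TABLE_SIZE];

/* (a - b) mod m, for a, b < m */
static unsigned mod_sub(unsigned a, unsigned b, unsigned m)
{
	return (a + m - b) % m;
}

/* Is x in the cyclic range i..j (mod m), including i but excluding j? */
static int mod_between(unsigned x, unsigned i, unsigned j, unsigned m)
{
	return mod_sub(x, i, m) < mod_sub(j, i, m);
}

static void table_init(void)
{
	unsigned i;

	for (i = 0; i < TABLE_SIZE; i++)
		table[i] = EMPTY;
}

/* Toy hash for the example: home bucket is the key modulo the size */
static unsigned bucket_of(uint32_t key)
{
	return key % TABLE_SIZE;
}

/* Return the bucket holding key, or the free bucket where it would go */
static unsigned table_probe(uint32_t key)
{
	unsigned b = bucket_of(key);

	while (table[b] != EMPTY && table[b] != key)
		b = mod_sub(b, 1, TABLE_SIZE);	/* step backwards, wrapping */
	return b;
}

static void table_insert(uint32_t key)
{
	table[table_probe(key)] = key;
}

static int table_lookup(uint32_t key)
{
	return table[table_probe(key)] == key;
}

static void table_remove(uint32_t key)
{
	unsigned b = table_probe(key), s;

	if (table[b] != key)
		return;				/* not present */

	/* Scan the rest of the cluster, pulling entries into the hole */
	for (s = mod_sub(b, 1, TABLE_SIZE); table[s] != EMPTY;
	     s = mod_sub(s, 1, TABLE_SIZE)) {
		unsigned h = bucket_of(table[s]);

		/* If the home bucket h lies in s..b, a probe for this
		 * entry never passes the hole at b, so it can stay put.
		 */
		if (!mod_between(h, s, b, TABLE_SIZE)) {
			table[b] = table[s];	/* fill the hole */
			b = s;			/* the hole moves to s */
		}
	}
	table[b] = EMPTY;
}
```

With keys 1, 9 and 17 (all with home bucket 1), insertion places them in
buckets 1, 0 and 7.  Removing key 1 then shifts 9 into bucket 1 and 17
into bucket 0, so both remain reachable from their home bucket.
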
[PATCH v2 3/4] tcp: Implement hash table with indices rather than pointers

We implement our hash table with pointers to the entry for each bucket (or
NULL).  However, the entries are always allocated within the flow table,
meaning that a flow index will suffice, halving the size of the hash table.

For TCP, just a flow index would be enough, but future uses will want to
expand the hash table to cover indexing either side of a flow, so use a
flow_sidx_t as the type for each hash bucket.

Signed-off-by: David Gibson
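
As a rough sketch of the idea above: if every entry lives in one
statically allocated table, a bucket can hold a small index (plus a side
bit) instead of a pointer.  Everything below is hypothetical and only
stands in for the real types: sidx_t is not flow_sidx_t, and FLOW_MAX,
flowtab, SIDX_NONE, sidx_of() and flow_at() are names invented for this
example.

```c
#include <stddef.h>

#define FLOW_MAX	1024u	/* hypothetical capacity of the flow table */

struct flow {
	int dummy;		/* placeholder for real per-flow state */
};

static struct flow flowtab[FLOW_MAX];

/* Hypothetical stand-in for a "side index": which flow, and which side */
typedef struct {
	unsigned side : 1;
	unsigned flow : 31;
} sidx_t;

/* An out-of-range index plays the role NULL played for pointers */
static const sidx_t SIDX_NONE = { 0, FLOW_MAX };

/* Build a bucket value from an entry in flowtab and a side (0 or 1) */
static sidx_t sidx_of(const struct flow *f, unsigned side)
{
	sidx_t s = { side & 1u, (unsigned)(f - flowtab) };

	return s;
}

/* Resolve a bucket value back to its entry, or NULL if the bucket is empty */
static struct flow *flow_at(sidx_t s)
{
	return s.flow < FLOW_MAX ? &flowtab[s.flow] : NULL;
}
```

On a typical 64-bit platform a bucket like this takes 4 bytes where a
pointer takes 8, which is where the halving comes from.
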
[PATCH v2 4/4] tcp: Don't account for hash table size in tcp_hash()

Currently tcp_hash() returns the hash bucket for a value, that is, the hash
modulo the size of the hash table.  Usually it's a bit more flexible to
have hash functions return a "raw" hash value and perform the modulus in
the callers.  That allows the same hash function to be used for multiple
tables of different sizes, or to re-use the hash for other purposes.

We don't do anything like that with tcp_hash() at present, but we have some
plans to do so.  Prepare for that by making tcp_hash() and tcp_conn_hash()
return raw hash values.

Signed-off-by: David Gibson
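
A small sketch of the split described above, under the assumption of a
toy hash over a pair of ports.  raw_hash() is a made-up stand-in and is
not tcp_hash(); bucket_for() and the two table sizes used in the note
below are likewise hypothetical.  The point is only that the hash
function knows nothing about any table, and each caller reduces the raw
value to its own table size.

```c
#include <stdint.h>

/* Toy raw hash (illustration only): mixes two ports into 64 bits and
 * knows nothing about any table size.
 */
static uint64_t raw_hash(uint16_t sport, uint16_t dport)
{
	uint64_t h = ((uint64_t)sport << 16) | dport;

	h *= 0x9e3779b97f4a7c15ull;	/* multiplicative mixing constant */
	return h ^ (h >> 32);
}

/* The caller picks the table size and performs the modulus itself */
static unsigned bucket_for(uint64_t hash, unsigned table_size)
{
	return (unsigned)(hash % table_size);
}
```

The same raw value can then index tables of different sizes, for example
bucket_for(h, 4096) for one table and bucket_for(h, 251) for another,
without touching the hash function.
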

On Thu, 7 Dec 2023 16:53:49 +1100
David Gibson wrote:

> I now have an in-progress draft of a unified hash table to go with the
> unified flow table.  This turns out to be easier if we first make some
> preliminary changes to the structure of the TCP hash table.  So, here
> are those.
>
> Changes since v1:
>  * Use while loops instead of some equivalent, but hard to read for
>    loops for the hash probing.
>  * Switch from probing forwards through hash buckets to probing
>    backwards.  This makes the code closer to the version in Knuth it's
>    based on, and thus easier to see if we've made a mistake in
>    adaptation.
>  * Improve the helpers for modular arithmetic in use
>  * Correct an error where we had things exactly the wrong way around
>    when finding entries to move during removal.
>  * Add a patch fixing a conceptual / documentation problem in some
>    adjacent code
>
> David Gibson (4):
>   tcp: Fix conceptually incorrect byte-order switch in tcp_tap_handler()
>   tcp: Switch hash table to linear probing instead of chaining
>   tcp: Implement hash table with indices rather than pointers
>   tcp: Don't account for hash table size in tcp_hash()

Applied.

-- 
Stefano