[PATCH v3 00/20] Introduce discontiguous frames management
This series introduces iov_tail to convey frame information between functions. This is only an API change, for the moment the memory pool is only able to store contiguous buffer, so, except for vhost-user in a special case, we only play with iovec array with only one entry. Laurent Vivier (20): arp: Don't mix incoming and outgoing buffers iov: Introduce iov_slice(), iov_tail_slice() and iov_tail_drop(). iov: Update IOV_REMOVE_HEADER() and IOV_PEEK_HEADER() tap: Use iov_tail with tap_add_packet() packet: Use iov_tail with packet_add() packet: Add packet_data() arp: Convert to iov_tail ndp: Convert to iov_tail icmp: Convert to iov_tail udp: Convert to iov_tail tcp: Convert tcp_tap_handler() to use iov_tail tcp: Convert tcp_data_from_tap() to use iov_tail dhcpv6: move offset initialization out of dhcpv6_opt() dhcpv6: Extract sending of NotOnLink status dhcpv6: Convert to iov_tail dhcpv6: Use iov_tail in dhcpv6_opt() dhcp: Convert to iov_tail ip: Use iov_tail in ipv6_l4hdr() tap: Convert tap4_handler() to iov_tail tap: Convert tap6_handler() to iov_tail arp.c | 90 ++++++++++++++------- dhcp.c | 45 ++++++----- dhcpv6.c | 222 ++++++++++++++++++++++++++++++++-------------------- icmp.c | 25 +++--- iov.c | 165 +++++++++++++++++++++++++++++++++++--- iov.h | 61 +++++++++++---- ip.c | 27 +++---- ip.h | 3 +- ndp.c | 7 +- packet.c | 84 ++++++++------------ packet.h | 21 ++--- pcap.c | 1 + tap.c | 88 ++++++++++++++------- tap.h | 3 +- tcp.c | 61 ++++++++++----- tcp_buf.c | 2 +- udp.c | 34 +++++--- vu_common.c | 26 ++---- 18 files changed, 634 insertions(+), 331 deletions(-) -- 2.49.0
Don't use the memory of the incoming packet to build the outgoing buffer
as it can be memory of the TX queue in the case of vhost-user.
Moreover with vhost-user, the packet can be splitted accross several
iovec and it's easier to rebuild it in a buffer than updating an
existing iovec array.
Signed-off-by: Laurent Vivier
iov_tail_drop() discards a header from the iov_tail,
iov_slice() makes a new iov referencing a subset of the data
in the original iov.
iov_tail_slice() is a wrapper for iov_slice() using iov_tail
Signed-off-by: Laurent Vivier
On Tue, Apr 15, 2025 at 11:42:42AM +0200, Laurent Vivier wrote:
iov_tail_drop() discards a header from the iov_tail,
iov_slice() makes a new iov referencing a subset of the data in the original iov.
iov_tail_slice() is a wrapper for iov_slice() using iov_tail
Signed-off-by: Laurent Vivier
--- iov.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ iov.h | 6 ++++ 2 files changed, 98 insertions(+) diff --git a/iov.c b/iov.c index 8c63b7ea6e31..60ab1f9af973 100644 --- a/iov.c +++ b/iov.c @@ -156,6 +156,59 @@ size_t iov_size(const struct iovec *iov, size_t iov_cnt) return len; }
+/** + * iov_slice - Make a new iov referencing a subset of the data in the original iov + * + * @dst_iov: Pointer to the destination array of struct iovec describing + * the scatter/gather I/O vector to copy to.
The indentation seems a bit wonky here, looks like some lines are using spaces and others tabs.
+ * @dst_iov_cnt: Maximum number of elements in the destination iov array. + * @iov: Pointer to the source array of struct iovec describing + * the scatter/gather I/O vector to copy from. + * @iov_cnt: Maximum number of elements in the source iov array. + * @offset: Offset within the source iov from where copying should start. + * @bytes: On input, total number of bytes to copy from @iov to @dst_iov, + * if @bytes is NULL, copy all the content of @iov, + * on output actual number of bytes copied. + * + * Returns: The number of elements successfully copied to the destination + * iov array, a negative value if there is not enough room in the + * destination iov array + */ +/* cppcheck-suppress staticFunction */ +int iov_slice(struct iovec *dst_iov, size_t dst_iov_cnt, + const struct iovec *iov, size_t iov_cnt, + size_t offset, size_t *bytes)
The return value, is (usually) an iov count. Since that's given as a size_t in most places, it probably makes sense to use an ssize_t in the return value, rather than plain int.
+{ + unsigned int i, j; + size_t remaining = bytes ? *bytes : SIZE_MAX; + + i = iov_skip_bytes(iov, iov_cnt, offset, &offset); + + /* create a new iov array referencing a subset of the source one */ + for (j = 0; i < iov_cnt && j < dst_iov_cnt && remaining; i++) { + size_t len = MIN(remaining, iov[i].iov_len - offset); + + dst_iov[j].iov_base = (char *)iov[i].iov_base + offset; + dst_iov[j].iov_len = len; + j++; + remaining -= len; + offset = 0; + } + if (j == dst_iov_cnt) { + if (bytes) { + if (remaining) + return -1; + } else if (i != iov_cnt) { + return -1; + } + } + + if (bytes) + *bytes -= remaining; + + return j; +} + /** * iov_tail_prune() - Remove any unneeded buffers from an IOV tail * @tail: IO vector tail (modified) @@ -191,6 +244,21 @@ size_t iov_tail_size(struct iov_tail *tail) return iov_size(tail->iov, tail->cnt) - tail->off; }
+/** + * iov_tail_drop() - Discard a header from an IOV tail + * @tail: IO vector tail + * @len: length to move the head of the tail + * + * Returns: true if the tail still contains any bytes, otherwise false + */ +/* cppcheck-suppress unusedFunction */ +bool iov_tail_drop(struct iov_tail *tail, size_t len) +{ + tail->off = tail->off + len; + + return iov_tail_prune(tail); +} + /** * iov_peek_header_() - Get pointer to a header from an IOV tail * @tail: IOV tail to get header from @@ -247,3 +315,27 @@ void *iov_remove_header_(struct iov_tail *tail, size_t len, size_t align) tail->off = tail->off + len; return p; } + +/** + * iov_tail_slice - Make a new iov referencing a subset of the data + * in an iov_tail + * + * @dst_iov: Pointer to the destination array of struct iovec describing + * the scatter/gather I/O vector to copy to. + * @dst_iov_cnt: Maximum number of elements in the destination iov array. + * @tail: Pointer to the source iov_tail + * @bytes: On input, total number of bytes to copy from @iov to @dst_iov, + * if @bytes is NULL, copy all the content of @iov, + * on output actual number of bytes copied.
Do you have any use cases that need this parameter? The ones I remember from v2 all wanted to copy the entire remainder of the iov_tail.
+ * Returns: The number of elements successfully copied to the destination + * iov array, a negative value if there is not enough room in the + * destination iov array + */ +/* cppcheck-suppress unusedFunction */ +int iov_tail_slice(struct iovec *dst_iov, size_t dst_iov_cnt, + struct iov_tail *tail, size_t *bytes) +{ + return iov_slice(dst_iov, dst_iov_cnt, &tail->iov[0], tail->cnt, + tail->off, bytes); +} diff --git a/iov.h b/iov.h index 9855bf0c0c32..4bbdbf23f27b 100644 --- a/iov.h +++ b/iov.h @@ -28,6 +28,9 @@ size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt, size_t iov_to_buf(const struct iovec *iov, size_t iov_cnt, size_t offset, void *buf, size_t bytes); size_t iov_size(const struct iovec *iov, size_t iov_cnt); +int iov_slice(struct iovec *dst_iov, size_t dst_iov_cnt, + const struct iovec *iov, size_t iov_cnt, + size_t offset, size_t *bytes);
/* * DOC: Theory of Operation, struct iov_tail @@ -72,8 +75,11 @@ struct iov_tail {
bool iov_tail_prune(struct iov_tail *tail); size_t iov_tail_size(struct iov_tail *tail); +bool iov_tail_drop(struct iov_tail *tail, size_t len); void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align); void *iov_remove_header_(struct iov_tail *tail, size_t len, size_t align); +int iov_tail_slice(struct iovec *dst_iov, size_t dst_iov_cnt, + struct iov_tail *tail, size_t *bytes);
/** * IOV_PEEK_HEADER() - Get typed pointer to a header from an IOV tail
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
Provide a temporary variable of the wanted type to store
the header if the memory in the iovec array is not contiguous.
Signed-off-by: Laurent Vivier
On Tue, Apr 15, 2025 at 11:42:43AM +0200, Laurent Vivier wrote:
Provide a temporary variable of the wanted type to store the header if the memory in the iovec array is not contiguous.
Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
--- iov.c | 53 ++++++++++++++++++++++++++++++++++++++++++++--------- iov.h | 52 ++++++++++++++++++++++++++++++++++++++-------------- tcp_buf.c | 2 +- 3 files changed, 83 insertions(+), 24 deletions(-)
diff --git a/iov.c b/iov.c index 60ab1f9af973..6ae792a1bf8f 100644 --- a/iov.c +++ b/iov.c @@ -108,7 +108,7 @@ size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt, * * Returns: The number of bytes successfully copied. */ -/* cppcheck-suppress unusedFunction */ +/* cppcheck-suppress [staticFunction] */ size_t iov_to_buf(const struct iovec *iov, size_t iov_cnt, size_t offset, void *buf, size_t bytes) { @@ -126,6 +126,7 @@ size_t iov_to_buf(const struct iovec *iov, size_t iov_cnt, /* copying data */ for (copied = 0; copied < bytes && i < iov_cnt; i++) { size_t len = MIN(iov[i].iov_len - offset, bytes - copied); + /* NOLINTNEXTLINE(clang-analyzer-core.NonNullParamChecker) */ memcpy((char *)buf + copied, (char *)iov[i].iov_base + offset,
... it would be nice to have a clearer idea what's going on with this lint.
len); copied += len; @@ -260,7 +261,7 @@ bool iov_tail_drop(struct iov_tail *tail, size_t len) }
/** - * iov_peek_header_() - Get pointer to a header from an IOV tail + * iov_check_header() - Check if a header can be accessed * @tail: IOV tail to get header from * @len: Length of header to get, in bytes * @align: Required alignment of header, in bytes @@ -271,8 +272,7 @@ bool iov_tail_drop(struct iov_tail *tail, size_t len) * overruns the IO vector, is not contiguous or doesn't have the * requested alignment. */ -/* cppcheck-suppress [staticFunction,unmatchedSuppression] */ -void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align) +static void *iov_check_header(struct iov_tail *tail, size_t len, size_t align) { char *p;
@@ -292,27 +292,62 @@ void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align) return p; }
+/** + * iov_peek_header_() - Get pointer to a header from an IOV tail + * @tail: IOV tail to get header from + * @v: Temporary memory to use if the memory in @tail + * is discontinuous + * @len: Length of header to get, in bytes + * @align: Required alignment of header, in bytes + * + * @tail may be pruned, but will represent the same bytes as before. + * + * Returns: Pointer to the first @len logical bytes of the tail, or to + * a copy if that overruns the IO vector, is not contiguous or + * doesn't have the requested alignment. NULL if that overruns the + * IO vector. + */ +/* cppcheck-suppress [staticFunction,unmatchedSuppression] */ +void *iov_peek_header_(struct iov_tail *tail, void *v, size_t len, size_t align) +{ + char *p = iov_check_header(tail, len, align); + size_t l; + + if (p) + return p; + + l = iov_to_buf(tail->iov, tail->cnt, tail->off, v, len); + if (l != len) + return NULL; + + return v; +} + /** * iov_remove_header_() - Remove a header from an IOV tail * @tail: IOV tail to remove header from (modified) + * @v: Temporary memory to use if the memory in @tail + * is discontinuous * @len: Length of header to remove, in bytes * @align: Required alignment of header, in bytes * * On success, @tail is updated so that it longer includes the bytes of the * returned header. * - * Returns: Pointer to the first @len logical bytes of the tail, NULL if that - * overruns the IO vector, is not contiguous or doesn't have the - * requested alignment. + * Returns: Pointer to the first @len logical bytes of the tail, or to + * a copy if that overruns the IO vector, is not contiguous or + * doesn't have the requested alignment. NULL if that overruns the + * IO vector. */ -void *iov_remove_header_(struct iov_tail *tail, size_t len, size_t align) +void *iov_remove_header_(struct iov_tail *tail, void *v, size_t len, size_t align) { - char *p = iov_peek_header_(tail, len, align); + char *p = iov_peek_header_(tail, v, len, align);
if (!p) return NULL;
tail->off = tail->off + len; + return p; }
diff --git a/iov.h b/iov.h index 4bbdbf23f27b..03c85bd1a465 100644 --- a/iov.h +++ b/iov.h @@ -73,41 +73,65 @@ struct iov_tail { #define IOV_TAIL(iov_, cnt_, off_) \ (struct iov_tail){ .iov = (iov_), .cnt = (cnt_), .off = (off_) }
+/** + * IOV_TAIL_FROM_BUF() - Create a new IOV tail from a buffer + * @buf_: Buffer address to use in the iovec + * @len_: Buffer size + * @off_: Byte offset in the buffer where the tail begins + */ +#define IOV_TAIL_FROM_BUF(buf_, len_, off_) \ + IOV_TAIL((&(const struct iovec){ .iov_base = (buf_), .iov_len = (len_) }), 1, (off_)) + bool iov_tail_prune(struct iov_tail *tail); size_t iov_tail_size(struct iov_tail *tail); bool iov_tail_drop(struct iov_tail *tail, size_t len); -void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align); -void *iov_remove_header_(struct iov_tail *tail, size_t len, size_t align); +void *iov_peek_header_(struct iov_tail *tail, void *v, size_t len, size_t align); +void *iov_remove_header_(struct iov_tail *tail, void *v, size_t len, size_t align); int iov_tail_slice(struct iovec *dst_iov, size_t dst_iov_cnt, struct iov_tail *tail, size_t *bytes);
/** * IOV_PEEK_HEADER() - Get typed pointer to a header from an IOV tail * @tail_: IOV tail to get header from - * @type_: Data type of the header + * @var_: Temporary buffer of the type of the header to use if + * the memory in the iovec array is not contiguous. * * @tail_ may be pruned, but will represent the same bytes as before. * - * Returns: Pointer of type (@type_ *) located at the start of @tail_, NULL if - * we can't get a contiguous and aligned pointer. + * Returns: Pointer of type (@type_ *) located at the start of @tail_ + * or to @var_ if iovec memory is not contiguous, NULL if + * that overruns the iovec. */ -#define IOV_PEEK_HEADER(tail_, type_) \ - ((type_ *)(iov_peek_header_((tail_), \ - sizeof(type_), __alignof__(type_)))) + +#define IOV_PEEK_HEADER(tail_, var_) \ + ((__typeof__(var_) *)(iov_peek_header_((tail_), &(var_), \ + sizeof(var_), __alignof__(var_))))
/** * IOV_REMOVE_HEADER() - Remove and return typed header from an IOV tail * @tail_: IOV tail to remove header from (modified) - * @type_: Data type of the header to remove + * @var_: Temporary buffer of the type of the header to use if + * the memory in the iovec array is not contiguous. * * On success, @tail_ is updated so that it longer includes the bytes of the * returned header. * - * Returns: Pointer of type (@type_ *) located at the old start of @tail_, NULL - * if we can't get a contiguous and aligned pointer. + * Returns: Pointer of type (@type_ *) located at the start of @tail_ + * or to @var_ if iovec memory is not contiguous, NULL if + * that overruns the iovec. + */ + +#define IOV_REMOVE_HEADER(tail_, var_) \ + ((__typeof__(var_) *)(iov_remove_header_((tail_), &(var_), \ + sizeof(var_), __alignof__(var_)))) + +/** IOV_DROP_HEADER() -- Remove a typed header from an IOV tail + * @tail_: IOV tail to remove header from (modified) + * @type_: Data type of the header to remove + * + * Returns: true if the tail still contains any bytes, otherwise false */ -#define IOV_REMOVE_HEADER(tail_, type_) \ - ((type_ *)(iov_remove_header_((tail_), \ - sizeof(type_), __alignof__(type_)))) +#define IOV_DROP_HEADER(tail_, type_) \ + iov_tail_drop((tail_), sizeof(type_))
#endif /* IOVEC_H */ diff --git a/tcp_buf.c b/tcp_buf.c index 05305636b503..4bcc1acb245a 100644 --- a/tcp_buf.c +++ b/tcp_buf.c @@ -160,7 +160,7 @@ static void tcp_l2_buf_fill_headers(const struct tcp_tap_conn *conn, uint32_t seq, bool no_tcp_csum) { struct iov_tail tail = IOV_TAIL(&iov[TCP_IOV_PAYLOAD], 1, 0); - struct tcphdr *th = IOV_REMOVE_HEADER(&tail, struct tcphdr); + struct tcphdr thc, *th = IOV_REMOVE_HEADER(&tail, thc); struct tap_hdr *taph = iov[TCP_IOV_TAP].iov_base; const struct flowside *tapside = TAPFLOW(conn); const struct in_addr *a4 = inany_v4(&tapside->oaddr);
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
Use IOV_PEEK_HEADER() to get the ethernet header from the iovec.
Move the workaround about multiple iovec array from vu_handle_tx() to
tap_add_packet(). Removing the offset out of the iovec array should
reduce the iovec count to 1.
Signed-off-by: Laurent Vivier
Modify the interface of packet_add_do() to take an iov_tail
rather than a memory pointer aand length.
Internally it only supports iovec array with only one entries,
after being pruned. We can accept iovec array with several
entries if the offset allows the function to reduce the number
of entries to 1.
tap4_handler() is updated to create an iov_tail value using
IOV_TAIL_FROM_BUF() from the buffer and the length.
Signed-off-by: Laurent Vivier
packet_data() gets the data range from a packet descriptor from a
given pool.
It uses iov_tail to return the packet memory.
Signed-off-by: Laurent Vivier
On Tue, Apr 15, 2025 at 11:42:46AM +0200, Laurent Vivier wrote:
packet_data() gets the data range from a packet descriptor from a given pool.
It uses iov_tail to return the packet memory.
Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
--- packet.c | 41 +++++++++++++++++++++++++++++++++++++++++ packet.h | 5 +++++ 2 files changed, 46 insertions(+)
diff --git a/packet.c b/packet.c index 98ded4e27aae..675bccf4d787 100644 --- a/packet.c +++ b/packet.c @@ -190,6 +190,47 @@ void *packet_get_do(const struct pool *p, const size_t idx, return r; }
+/** + * packet_data_do() - Get data range from packet descriptor from given pool + * @p: Packet pool + * @idx: Index of packet descriptor in pool + * @data: IOV tail to store the address of the data (output) + * @func: For tracing: name of calling function, NULL means no trace() + * @line: For tracing: caller line of function call + * + * Return: true if @data contains valid data, false otherwise + */ +/* cppcheck-suppress unusedFunction */ +bool packet_data_do(const struct pool *p, size_t idx, + struct iov_tail *data, + const char *func, int line) +{ + size_t i; + + ASSERT_WITH_MSG(p->count <= p->size, + "Corrupt pool count: %zu, size: %zu, %s:%i", + p->count, p->size, func, line); + + if (idx >= p->count) { + debug("packet %zu from pool size: %zu, count: %zu, " + "%s:%i", idx, p->size, p->count, func, line); + return false; + } + + data->cnt = 1; + data->off = 0; + data->iov = &p->pkt[idx]; + + for (i = 0; i < data->cnt; i++) { + ASSERT_WITH_MSG(!packet_check_range(p, data->iov[i].iov_base, + data->iov[i].iov_len, + func, line), + "Corrupt packet pool, %s:%i", func, line); + } + + return true; +} + /** * pool_flush() - Flush a packet pool * @p: Pointer to packet pool diff --git a/packet.h b/packet.h index af40b39b5251..062afb978124 100644 --- a/packet.h +++ b/packet.h @@ -39,6 +39,9 @@ void *packet_get_try_do(const struct pool *p, const size_t idx, void *packet_get_do(const struct pool *p, const size_t idx, size_t offset, size_t len, size_t *left, const char *func, int line); +bool packet_data_do(const struct pool *p, const size_t idx, + struct iov_tail *data, + const char *func, int line); bool pool_full(const struct pool *p); void pool_flush(struct pool *p);
@@ -49,6 +52,8 @@ void pool_flush(struct pool *p); packet_get_try_do(p, idx, offset, len, left, __func__, __LINE__) #define packet_get(p, idx, offset, len, left) \ packet_get_do(p, idx, offset, len, left, __func__, __LINE__) +#define packet_data(p, idx, data) \ + packet_data_do(p, idx, data, __func__, __LINE__)
#define PACKET_POOL_DECL(_name, _size, _buf) \ struct _name ## _t { \
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Introduce iov_tail_msghdr() to convert iov_tail to msghdr
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and iov_peek_header_() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
No functional change.
Currently, if dhcpv6_opt() is called with offset set to 0, it will set the
offset to point to DHCPv6 options offset.
To simplify the use of iovec_tail in a later patch, move the initialization
out of the function. Replace all the call using 0 by a call using
the offset of the DHCPv6 options.
Signed-off-by: Laurent Vivier
Extract code from dhcpv6() into a new function, dhcpv6_send_ia_notonlink()
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Remove packet_get() as it is not used anymore.
Signed-off-by: Laurent Vivier
participants (2)
-
David Gibson
-
Laurent Vivier