I did some more cleaning up of the migration series, including handling
SO_PEEK_OFF and some other fixes.  I expect to post the latest series in
an hour or two.

However, I then hit another bug, and have made some, but not a lot of,
progress figuring it out.  It occurs when, after the migration, I send
data inbound from a peer before I send anything outbound from the guest.
The data seems to get lost, and after that outbound traffic doesn't seem
to work either, at least not reliably.  Sometimes the data seems to
eventually come through, but not always, and I haven't figured out a
specific pattern yet.

I had a look at packet captures, and discovered that what passt is
recording isn't the same as what I see with a tcpdump running in the
guest itself.  This disagreement at L2 makes me suspect the problem is
something in the guts of vhost-user.  Do we need to be migrating some of
the ring pointers from vu_virtq?  Or are those supposed to be reset by
the various config commands that come in to the target side with the
migration?  Laurent, any insight?

I'm going to tackle something else for the remainder of today.

I'm attaching the packet captures, in hopes that it helps one of you:

source.pcap   - The packet capture recorded by the source side passt
                instance
target.pcap   - The packet capture recorded by the target side passt
                instance
internal.pcap - The packet capture recorded by tcpdump running within
                the guest

The relevant stream is the one from port 55314 to port 5555.  The data
flow _should_ look like:

	guest to peer: "first\n"
	peer to guest: "before\n"
	-- migration occurs --
	peer to guest: "after\n"
	guest to peer: "return\n"

The "after" packet appears in both traces, but from passt's view it gets
no reply and is retransmitted.  From the guest's view it is acked
promptly.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
On Thu, Feb 06, 2025 at 03:33:04PM +1100, David Gibson wrote:
> I did some more cleaning up of the migration series, including handling
> SO_PEEK_OFF and some other fixes.  I expect to post the latest series in
> an hour or two.

After posting that, I added migration of addr_seen and guest_mac.  That
seems to have partially improved things.  The data sent first from peer
to guest now seems to get through.  But further data from the guest
outwards still seems to be delayed or re-ordered with respect to the
data coming the other way.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
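
To make the ring pointer question from the first message concrete, here
is a minimal sketch in C of why a back end's private ring index matters
across a migration.  Everything in it is an assumption for illustration:
struct ring_state, ring_pending(), ring_resume() and ring_reset() are
hypothetical names, and the struct is not passt's actual vu_virtq
layout.  The vhost-user protocol does define VHOST_USER_GET_VRING_BASE
and VHOST_USER_SET_VRING_BASE for carrying the next available index
between back ends; whether those cover the case described here is the
open question, not something this sketch answers.

```c
#include <stdint.h>

/* Hypothetical, simplified per-queue state held privately by a back end.
 * This is NOT passt's real struct vu_virtq. */
struct ring_state {
	uint16_t last_avail_idx;	/* next available-ring entry to consume */
	uint16_t used_idx;		/* next used-ring entry to publish */
};

/* How many buffers the back end thinks are waiting, given the driver's
 * avail index (which lives in guest memory and so migrates with the
 * guest).  Ring indices are free-running 16-bit counters, so the
 * subtraction is done modulo 2^16. */
uint16_t ring_pending(const struct ring_state *s, uint16_t avail_idx)
{
	return (uint16_t)(avail_idx - s->last_avail_idx);
}

/* Assumed case 1: the target resumes from the source's private state. */
struct ring_state ring_resume(const struct ring_state *source)
{
	return *source;
}

/* Assumed case 2: the target starts from scratch with zeroed indices. */
struct ring_state ring_reset(void)
{
	struct ring_state s = { 0, 0 };

	return s;
}
```

With the driver's avail index at 300 and a source that had consumed up
to 298, ring_pending() gives 2 for the resumed state and 300 for the
reset state.  In the reset case the back end and the guest disagree
about which ring entries are current, which is one way the two sides
could end up seeing different traffic.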