Re: [PATCH v2 0/2] More graceful handling of migration without passt-repair

27 Feb 2025


      On Thu, 27 Feb 2025 12:43:41 +1100
David Gibson  wrote:
...
On Wed, Feb 26, 2025 at 12:24:12PM +0100, Stefano Brivio wrote:
...
On Wed, 26 Feb 2025 19:51:11 +1100
David Gibson  wrote:
...
On Wed, Feb 26, 2025 at 09:09:48AM +0100, Stefano Brivio wrote:
...
On Wed, 26 Feb 2025 11:27:32 +1100
David Gibson  wrote:
...
On Tue, Feb 25, 2025 at 06:43:16PM +0100, Stefano Brivio wrote:
...
On Tue, 25 Feb 2025 16:51:30 +1100
David Gibson  wrote:
> From Red Hat internal testing we've had some reports that if
> attempting to migrate without passt-repair, the failure mode is uglier
> than we'd like.  The migration fails, which is somewhat expected, but
> we don't correctly roll things back on the source, so it breaks
> network there as well.
> 
> Handle this more gracefully allowing the migration to proceed in this
> case, but allow TCP connections to break
> 
> I've now tested this reasonably:
>  * I get a clean migration if there are now active flows
>  * Migration completes, although connections are broken if
>    passt-repair isn't connected
>  * Basic test suite (minus perf)
> 
> I didn't manage to test with libvirt yet, but I'm pretty convinced the
> behaviour should be better than it was.
I did, and it is. The series looks good to me and I would apply it as
it is, but I'm waiting a bit longer in case you want to try out some
variations based on my tests as well. Here's what I did.
[snip]
Thanks for the detailed instructions.  More complex than I might have
liked, but oh well.
...
$ virsh migrate --verbose --p2p --live --unsafe alpine --tunneled qemu+ssh://88.198.0.161:10951/session
  Migration: [97.59 %]error: End of file while reading data: : Input/output error
...despite --verbose the error doesn't tell much (perhaps I need
LIBVIRT_DEBUG=1 instead?), but passt terminates at this point. With
this series (I just used 'make install' from the local build), migration
succeeds instead:
$ virsh migrate --verbose --p2p --live --unsafe alpine --tunneled qemu+ssh://88.198.0.161:10951/session
  Migration: [100.00 %]
Now, on the target, I still have to figure out how to tell libvirt
to start QEMU and prepare for the migration (equivalent of '-incoming'
as we use in our tests), instead of just starting a new instance like
it does. Otherwise, I have no chance to start passt-repair there.
Perhaps it has something to do with persistent mode described here:
Ah.  So I'm pretty sure virsh migrate will automatically start qemu
with --incoming on the target.
("-incoming"), yes, see src/qemu/qemu_migration.c,
qemuMigrationDstPrepare().
...
IIUC the problem here is more about
timing: we want it to start it early, so that we have a chance to
start passt-repair and let it connect before the migration actually
happens.
For the timing itself, we could actually wait for passt-repair to be
there, with a timeout (say, 100ms).
I guess.  That still requires some way for KubeVirt (or whatever) to
know at least roughly when it needs to launch passt-repair, and I'm
not sure if that's something we can currently get from libvirt.
KubeVirt sets up the target pod, and that's when it should be done (if
we have an inotify mechanism or similar). I can't point to an exact code
path yet but there's something like that.
Right, but that approach does require inotify and starting
passt-repair before passt, which might be fine, but I have the concern
noted below.  To avoid that we'd need notification after passt & qemu
are started on the target, but before the migration is actually
initiated which I don't think libvirt provides.
...
...
...
We could also modify passt-repair to set up an inotify watcher if the
socket isn't there yet.
Maybe, yes.  This kind of breaks our "passt starts first, passt-repair
connects to it" model though, and I wonder if we need to revisit the
security implications of that.
I don't think it actually breaks that model for security purposes,
because the guest doesn't have anyway a chance to cause a connection to
passt-repair. The guest is still suspended (or missing) at that point.
I wasn't thinking of threat models coming from the guest, but an
attack from an unrelated process impersonating passt in order to
access passt-repair's superpowers.
Then an inotify watch shouldn't substantially change things. The
attacker could create the socket earlier and obtain the same outcome.
...
[...]
...
We could even think of deferring switching repair mode off until the
right address is there, by the way. That would make a difference to
us.
Do you mean by blocking?  Or by returning to normal operation with the
flow flagged somehow to be "woken up" by a netlink monitor?
The latter. I don't think we should block connectivity (with new
addresses) meanwhile.

-- 
Stefano