On Wed, Oct 29, 2025 at 05:52:59AM +0100, Stefano Brivio wrote:
On Wed, 29 Oct 2025 11:35:29 +1100 David Gibson wrote:
On Wed, Oct 29, 2025 at 12:13:30AM +0100, Stefano Brivio wrote:
On Mon, 20 Oct 2025 20:17:10 +1100 David Gibson wrote:
On Mon, Oct 20, 2025 at 07:11:07AM +0200, Stefano Brivio wrote:
On Mon, 20 Oct 2025 11:20:19 +1100 David Gibson wrote:
[snip]
Rather than the local link I was thinking of whatever monitor or liveness probe in KubeVirt which might have a 60-second period, or some firewall agent, or how long it typically takes for guests to stop and resume again in KubeVirt.
Right, I hadn't considered those. Although.. do those actually re-use a single connection? I would have guessed they use a new connection each time, making the timeouts here irrelevant.
It depends on the definition of "each time", because we don't time out host-side connections immediately.
Hm, ok. Is your concern that getting a negative answer from the probe will take too long?
More like getting a positive answer taking too long, because we retry so infrequently.
Right, but it will only be slow if we lose the first probe, which should be very rare.
No, because again, that might be due to the guest doing something with its firewall or stopping/resuming/getting online etc. It's not necessarily rare.
Hmmm... I'd think if interruption due to coming up / firewall frobbing / whatever is *not* rare, then that constitutes flaky availability that arguably the probe *should* fail on.
If that situation persists for at least 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds, then without a clamp the next retry comes only at 127 seconds, and the one after that at 255 seconds. In this case, to me, it looks more reasonable to retry every minute instead.
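To make the arithmetic of the doubling schedule easy to check, here is a minimal sketch. It assumes a 1-second initial timeout that doubles on every retry, and a 60-second clamp taken from "retry every minute" above. The function names `retry_intervals` and `elapsed` are made up for illustration and are not taken from any patch in this thread.

```python
from itertools import accumulate


def retry_intervals(initial=1, retries=8, clamp=None):
    """Wait (in seconds) before each retry, doubling from `initial`.

    `clamp` is an assumed upper bound on a single wait; None means
    plain exponential backoff with no bound.
    """
    intervals = []
    wait = initial
    for _ in range(retries):
        intervals.append(wait if clamp is None else min(wait, clamp))
        wait *= 2
    return intervals


def elapsed(intervals):
    """Cumulative time (in seconds) at which each retry wait ends."""
    return list(accumulate(intervals))


unclamped = retry_intervals()
clamped = retry_intervals(clamp=60)

print("unclamped waits:  ", unclamped)
print("unclamped elapsed:", elapsed(unclamped))
print("clamped waits:    ", clamped)
print("clamped elapsed:  ", elapsed(clamped))
```

With the clamp, every wait after the 32-second one is capped at 60 seconds, so a peer that comes back late is retried at a steady rate instead of at ever longer gaps.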
Yeah, I guess so.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson