[libvirt PATCH] qemu: allow passt to self-daemonize
I initially had the passt process being started in an identical
fashion to the slirp-helper - libvirt was daemonizing the new process
and recording its pid in a pidfile. The problem with this is that,
since it is daemonized immediately, any startup error in passt happens
after the daemonization, and thus isn't seen by libvirt - libvirt
believes that the process has started successfully and continues on
its merry way. The result was that sometimes a guest would be started,
but there would be no passt process for qemu to use for network
traffic.
Instead, we should be starting passt in the same manner we start
dnsmasq - we just exec it as normal (along with a request that passt
create the pidfile, which is just another option on the passt
commandline) and wait for the child process to exit; passt then has a
chance to parse its commandline and complete all the setup prior to
daemonizing itself; if it encounters an error and exits with a non-0
code, libvirt will see the code and know about the failure. We can
then grab the output from stderr, log that so the "user" has some idea
of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
On Wed, Feb 08, 2023 at 18:13:10 -0500, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
[..]
if (cmdret < 0 || exitstatus != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, - _("Could not start 'passt'. exitstatus: %d"), exitstatus); + _("Could not start 'passt': %s"), errbuf); goto error; }
So the 'passt' binary doesn't do any logging later on during runtime
which we'd have to capture into a specific log file?
For this patch:
Reviewed-by: Peter Krempa
On 2/9/23 09:36, Peter Krempa wrote:
On Wed, Feb 08, 2023 at 18:13:10 -0500, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) [..]
if (cmdret < 0 || exitstatus != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, - _("Could not start 'passt'. exitstatus: %d"), exitstatus); + _("Could not start 'passt': %s"), errbuf); goto error; }
So the 'passt' binary doesn't do any logging later on during runtime which we'd have to capture into a specific log file?
It does: -e, --stderr Log to stderr too default: log to system logger only if started from a TTY -l, --log-file PATH Log (only) to given file --log-size BYTES Maximum size of log file default: 1 MiB Maybe, we can keep the errfd and let our event loop read from it? But that looks like a stretch, unnecessary - what would we do with the error if it's reported after guest is started, there's no client connected and no API running? The best we could do is to relay the error into our logs. Which is probably as good as '-l' option then. BTW: I don't see us passing --stderr. Is that intentional? Maybe I don't understand the default. Michal
On Thu, Feb 09, 2023 at 09:59:54 +0100, Michal Prívozník wrote:
On 2/9/23 09:36, Peter Krempa wrote:
On Wed, Feb 08, 2023 at 18:13:10 -0500, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) [..]
if (cmdret < 0 || exitstatus != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, - _("Could not start 'passt'. exitstatus: %d"), exitstatus); + _("Could not start 'passt': %s"), errbuf); goto error; }
So the 'passt' binary doesn't do any logging later on during runtime which we'd have to capture into a specific log file?
It does:
-e, --stderr Log to stderr too default: log to system logger only if started from a TTY -l, --log-file PATH Log (only) to given file --log-size BYTES Maximum size of log file default: 1 MiB
Maybe, we can keep the errfd and let our event loop read from it? But that looks like a stretch, unnecessary - what would we do with the error if it's reported after guest is started, there's no client connected and no API running? The best we could do is to relay the error into our logs. Which is probably as good as '-l' option then.
Well, the stdout/err FDs can be passed to virtlogd so that the output is in the appropriate log file and rotated as needed.
BTW: I don't see us passing --stderr. Is that intentional? Maybe I don't understand the default.
I don't know the default either, but in this case logging to the system journal would be not very good as it would be hard for the user to identify which instance the log belongs to.
On Thu, 9 Feb 2023 10:09:38 +0100
Peter Krempa
On Thu, Feb 09, 2023 at 09:59:54 +0100, Michal Prívozník wrote:
On 2/9/23 09:36, Peter Krempa wrote:
On Wed, Feb 08, 2023 at 18:13:10 -0500, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) [..]
if (cmdret < 0 || exitstatus != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, - _("Could not start 'passt'. exitstatus: %d"), exitstatus); + _("Could not start 'passt': %s"), errbuf); goto error; }
So the 'passt' binary doesn't do any logging later on during runtime which we'd have to capture into a specific log file?
It does:
-e, --stderr Log to stderr too default: log to system logger only if started from a TTY -l, --log-file PATH Log (only) to given file --log-size BYTES Maximum size of log file default: 1 MiB
...yes, but that's already handled earlier in qemuPasstStart(): if (net->backend.logFile) virCommandAddArgList(cmd, "--log-file", net->backend.logFile, NULL);
Maybe, we can keep the errfd and let our event loop read from it? But that looks like a stretch, unnecessary - what would we do with the error if it's reported after guest is started, there's no client connected and no API running? The best we could do is to relay the error into our logs. Which is probably as good as '-l' option then.
Well, the stdout/err FDs can be passed to virtlogd so that the output is in the appropriate log file and rotated as needed.
I don't think libvirt needs to do that. My assumption is that passt would keep working (and logging) as long as the guest is alive and connected, even if libvirtd and virtlogd terminate -- hence the need for a separate log file.
BTW: I don't see us passing --stderr. Is that intentional? Maybe I don't understand the default.
I don't know the default either, but in this case logging to the system journal would be not very good as it would be hard for the user to identify which instance the log belongs to.
It's just what it says in the man page: -e, --stderr Log to standard error too. Default is to log to system logger only, if started from an interactive terminal, and to both sys‐ tem logger and standard error otherwise. so you don't need to pass that explicitly. Then, if --log-file is passed: -l, --log-file PATH Log to file PATH, not to standard error, and not to the system logger. we'll stop logging to standard error and to the system logger, but this only happens (and we should clarify that in the man page, I guess) once options have been parsed, so that any issue with the command line preventing passt from starting is still reported to stderr -- and ends up in 'errbuf', after this patch. For context: Laine just sent a series, for passt: https://archives.passt.top/passt-dev/20230208174838.1680517-1-laine@redhat.c... moving the point where we stop logging to stderr a bit later: not once passt is done processing options, but once it daemonises. There would be a few possible error messages (and abort paths) not covered, otherwise. About correlating entries from a system log: if libvirtd logs the PID of any given instance, I think we're all set. -- Stefano
On Wed, Feb 08, 2023 at 06:13:10PM -0500, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
Reviewed-by: Martin Kletzander
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, "--mac-addr", virMacAddrFormat(&net->mac, macaddr), + "--pid", pidfile,
The only problem with this approach is that our virPidFile*() functions rely on locking the very first byte. And when reading the pidfile, we try to lock the file and if we succeeded it means the file wasn't locked which means the process holding the lock died and thus the pid in the pidfile is stale. Now, I don't see passt locking the pidfile at all. So effectively, after this patch qemuPasstStop() would do nothing (well, okay, it'll remove the pidfile), qemuPasstSetupCgroup() does nothing, etc. What we usually do in this case, is: we let our code write the pidfile (just like the current code does), but then have a loop that waits a bit for socket to show up. If it doesn't in say 5 seconds we kill the child process (which we know the PID of). You can take inspiration from: qemuDBusStart() or qemuProcessStartManagedPRDaemon(). Michal
On Thu, Feb 09, 2023 at 09:52:00AM +0100, Michal Prívozník wrote:
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, "--mac-addr", virMacAddrFormat(&net->mac, macaddr), + "--pid", pidfile,
The only problem with this approach is that our virPidFile*() functions rely on locking the very first byte. And when reading the pidfile, we try to lock the file and if we succeeded it means the file wasn't locked which means the process holding the lock died and thus the pid in the pidfile is stale.
Now, I don't see passt locking the pidfile at all. So effectively, after this patch qemuPasstStop() would do nothing (well, okay, it'll remove the pidfile), qemuPasstSetupCgroup() does nothing, etc.
What we usually do in this case, is: we let our code write the pidfile (just like the current code does), but then have a loop that waits a bit for socket to show up. If it doesn't in say 5 seconds we kill the child process (which we know the PID of). You can take inspiration from: qemuDBusStart() or qemuProcessStartManagedPRDaemon().
Busy waiting for sockets is nasty though. Depending on how passt is written it might not be needed. If passt creates the listen() socket and does all the important initialization steps that are liable to fail, *before* it daemonizes, then we can synchronize without busy waiting. ie waitpid() for passt leader process to exit. Then check if the socket exists. If it does, then passt has daemonized and is listening and running, if it does not, then passt failed. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 2/9/23 10:56, Daniel P. Berrangé wrote:
On Thu, Feb 09, 2023 at 09:52:00AM +0100, Michal Prívozník wrote:
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, "--mac-addr", virMacAddrFormat(&net->mac, macaddr), + "--pid", pidfile,
The only problem with this approach is that our virPidFile*() functions rely on locking the very first byte. And when reading the pidfile, we try to lock the file and if we succeeded it means the file wasn't locked which means the process holding the lock died and thus the pid in the pidfile is stale.
Now, I don't see passt locking the pidfile at all. So effectively, after this patch qemuPasstStop() would do nothing (well, okay, it'll remove the pidfile), qemuPasstSetupCgroup() does nothing, etc.
What we usually do in this case, is: we let our code write the pidfile (just like the current code does), but then have a loop that waits a bit for socket to show up. If it doesn't in say 5 seconds we kill the child process (which we know the PID of). You can take inspiration from: qemuDBusStart() or qemuProcessStartManagedPRDaemon().
Busy waiting for sockets is nasty though. Depending on how passt is written it might not be needed. If passt creates the listen() socket and does all the important initialization steps that are liable to fail, *before* it daemonizes, then we can synchronize without busy waiting. ie waitpid() for passt leader process to exit. Then check if the socket exists. If it does, then passt has daemonized and is listening and running, if it does not, then passt failed.
That still requires passt to hold the pidfile open and locked, neither of which is happening with the current code. Michal
On Thu, 9 Feb 2023 11:10:21 +0100
Michal Prívozník
On 2/9/23 10:56, Daniel P. Berrangé wrote:
On Thu, Feb 09, 2023 at 09:52:00AM +0100, Michal Prívozník wrote:
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, "--mac-addr", virMacAddrFormat(&net->mac, macaddr), + "--pid", pidfile,
The only problem with this approach is that our virPidFile*() functions rely on locking the very first byte. And when reading the pidfile, we try to lock the file and if we succeeded it means the file wasn't locked which means the process holding the lock died and thus the pid in the pidfile is stale.
Now, I don't see passt locking the pidfile at all. So effectively, after this patch qemuPasstStop() would do nothing (well, okay, it'll remove the pidfile), qemuPasstSetupCgroup() does nothing, etc.
What we usually do in this case, is: we let our code write the pidfile (just like the current code does), but then have a loop that waits a bit for socket to show up. If it doesn't in say 5 seconds we kill the child process (which we know the PID of). You can take inspiration from: qemuDBusStart() or qemuProcessStartManagedPRDaemon().
Busy waiting for sockets is nasty though. Depending on how passt is written it might not be needed. If passt creates the listen() socket and does all the important initialization steps that are liable to fail, *before* it daemonizes, then we can synchronize without busy waiting.
It does. In my opinion it could simply be handled like it's done for dnsmasq -- from networkStartDhcpDaemon(): if (virCommandRun(cmd, NULL) < 0) return -1; /* * There really is no race here - when dnsmasq daemonizes, its * leader process stays around until its child has actually * written its pidfile. So by time virCommandRun exits it has * waitpid'd and guaranteed the proess has started and written a * pid */
ie waitpid() for passt leader process to exit. Then check if the socket exists. If it does, then passt has daemonized and is listening and running, if it does not, then passt failed.
That still requires passt to hold the pidfile open and locked, neither of which is happening with the current code.
...is this still a requirement even if qemuPasstStop() just needs to remove the PID file? -- Stefano
On Thu, 9 Feb 2023 09:52:00 +0100
Michal Prívozník
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, "--mac-addr", virMacAddrFormat(&net->mac, macaddr), + "--pid", pidfile,
The only problem with this approach is that our virPidFile*() functions rely on locking the very first byte. And when reading the pidfile, we try to lock the file and if we succeeded it means the file wasn't locked which means the process holding the lock died and thus the pid in the pidfile is stale.
Now, I don't see passt locking the pidfile at all. So effectively, after this patch qemuPasstStop() would do nothing (well, okay, it'll remove the pidfile), qemuPasstSetupCgroup() does nothing, etc.
And it doesn't need to do anything, actually! passt is started with the --one-off option: -1, --one-off Quit after handling a single client connection, that is, once the client closes the socket, or once we get a socket error. well, removing the PID file is nice (passt can't do it as it won't see the filesystem after starting up), but that's about it. -- Stefano
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
OOOPS, somehow I've accidentally merged this. Let me post follow up patches.
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off",
BTW: we definitely need something better than this. IF, something goes wrong after we've executed passt but before we execute QEMU, then passt just hangs there. This is because passt clone()-s itself (i.e. creates a child process), but QEMU that would connect to the socket never comes around. Thus, the child process never sees the EOF on the socket and just hangs in there thinking there will be somebody connecting, soon. I thought this could be solved by just killing the whole process group, but the child process calls setsid(), which creates its own process group. I've managed to work around this by passing --foreground, but I'm unclear about the consequences. Though, it looks like it's still dropping caps, creating its own namespaces, etc. So this may actually be the way to go. But in general, we may need to make our pid handling better. I remember Jano mentioned some problems with virtiofsd which also fork()-s. Michal
On Tue, 14 Feb 2023 09:01:39 +0100
Michal Prívozník
On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) OOOPS, somehow I've accidentally merged this. Let me post follow up patches.
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off",
BTW: we definitely need something better than this. IF, something goes wrong after we've executed passt but before we execute QEMU, then passt just hangs there. This is because passt clone()-s itself (i.e. creates a child process), but QEMU that would connect to the socket never comes around. Thus, the child process never sees the EOF on the socket and just hangs in there thinking there will be somebody connecting, soon.
Okay, I see the point now -- I thought libvirtd would start passt only once it knows for sure that the guest will connect to it.
I thought this could be solved by just killing the whole process group, but the child process calls setsid(), which creates its own process group. I've managed to work around this by passing --foreground, but I'm unclear about the consequences. Though, it looks like it's still dropping caps, creating its own namespaces, etc. So this may actually be the way to go.
I wouldn't recommend that: --foreground is really intended for interactive usage and we won't be able, for example, to spawn a child in a new PID namespace, which is a nice security feature, I think. I already suggested this to Laine offline: can libvirt just connect() to the socket and close() it, in case QEMU doesn't start? Then passt will terminate. It should be a few (~5) lines of code, instead of all the complexity potentially involved in tracking PIDs and avoiding related races, and design-wise it looks clean to me (libvirtd plays for a moment the QEMU role, because QEMU is not around). -- Stefano
On 2/14/23 11:08, Stefano Brivio wrote:
On Tue, 14 Feb 2023 09:01:39 +0100 Michal Prívozník
wrote: On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) OOOPS, somehow I've accidentally merged this. Let me post follow up patches.
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off",
BTW: we definitely need something better than this. IF, something goes wrong after we've executed passt but before we execute QEMU, then passt just hangs there. This is because passt clone()-s itself (i.e. creates a child process), but QEMU that would connect to the socket never comes around. Thus, the child process never sees the EOF on the socket and just hangs in there thinking there will be somebody connecting, soon.
Okay, I see the point now -- I thought libvirtd would start passt only once it knows for sure that the guest will connect to it.
I'm failing to see how that would be possible. Starting a guest involves many actions, each one of can fail. From defensive coding POV it's better we have the option to kill passt.
I thought this could be solved by just killing the whole process group, but the child process calls setsid(), which creates its own process group. I've managed to work around this by passing --foreground, but I'm unclear about the consequences. Though, it looks like it's still dropping caps, creating its own namespaces, etc. So this may actually be the way to go.
I wouldn't recommend that: --foreground is really intended for interactive usage and we won't be able, for example, to spawn a child in a new PID namespace, which is a nice security feature, I think.
Well, it's clone() that brings all the problems (well, in combination with setsid()).
I already suggested this to Laine offline: can libvirt just connect() to the socket and close() it, in case QEMU doesn't start? Then passt will terminate.
That relies on the fact that passt isn't stuck and responds to the EOF. We certainly can do that if passt needs graceful shutdown, but mustn't rely on that.
It should be a few (~5) lines of code, instead of all the complexity potentially involved in tracking PIDs and avoiding related races, and design-wise it looks clean to me (libvirtd plays for a moment the QEMU role, because QEMU is not around).
Well, we can place all these helper processes into a CGroup and let it trace PIDs. That should be race free. Michal
On Tue, 14 Feb 2023 12:13:28 +0100
Michal Prívozník
On 2/14/23 11:08, Stefano Brivio wrote:
On Tue, 14 Feb 2023 09:01:39 +0100 Michal Prívozník
wrote: On 2/9/23 00:13, Laine Stump wrote:
I initially had the passt process being started in an identical fashion to the slirp-helper - libvirt was daemonizing the new process and recording its pid in a pidfile. The problem with this is that, since it is daemonized immediately, any startup error in passt happens after the daemonization, and thus isn't seen by libvirt - libvirt believes that the process has started successfully and continues on its merry way. The result was that sometimes a guest would be started, but there would be no passt process for qemu to use for network traffic.
Instead, we should be starting passt in the same manner we start dnsmasq - we just exec it as normal (along with a request that passt create the pidfile, which is just another option on the passt commandline) and wait for the child process to exit; passt then has a chance to parse its commandline and complete all the setup prior to daemonizing itself; if it encounters an error and exits with a non-0 code, libvirt will see the code and know about the failure. We can then grab the output from stderr, log that so the "user" has some idea of what went wrong, and then fail the guest startup.
Signed-off-by: Laine Stump
--- src/qemu/qemu_passt.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) OOOPS, somehow I've accidentally merged this. Let me post follow up patches.
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 0f09bf3db8..f640a69c00 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -141,24 +141,23 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; pid_t pid = (pid_t) -1; int exitstatus = 0; int cmdret = 0; - VIR_AUTOCLOSE errfd = -1;
cmd = virCommandNew(PASST);
virCommandClearCaps(cmd); - virCommandSetPidFile(cmd, pidfile); - virCommandSetErrorFD(cmd, &errfd); - virCommandDaemonize(cmd); + virCommandSetErrorBuffer(cmd, &errbuf);
virCommandAddArgList(cmd, "--one-off",
BTW: we definitely need something better than this. IF, something goes wrong after we've executed passt but before we execute QEMU, then passt just hangs there. This is because passt clone()-s itself (i.e. creates a child process), but QEMU that would connect to the socket never comes around. Thus, the child process never sees the EOF on the socket and just hangs in there thinking there will be somebody connecting, soon.
Okay, I see the point now -- I thought libvirtd would start passt only once it knows for sure that the guest will connect to it.
I'm failing to see how that would be possible. Starting a guest involves many actions, each one of can fail. From defensive coding POV it's better we have the option to kill passt.
I don't know exactly, I thought the "probing" phase would be considered enough -- I'm not saying it's possible, just that it was my (flawed, then) assumption.
I thought this could be solved by just killing the whole process group, but the child process calls setsid(), which creates its own process group. I've managed to work around this by passing --foreground, but I'm unclear about the consequences. Though, it looks like it's still dropping caps, creating its own namespaces, etc. So this may actually be the way to go.
I wouldn't recommend that: --foreground is really intended for interactive usage and we won't be able, for example, to spawn a child in a new PID namespace, which is a nice security feature, I think.
Well, it's clone() that brings all the problems (well, in combination with setsid()).
Yes, but other than being a security feature, that's how non-interactive executables are typically implemented.
I already suggested this to Laine offline: can libvirt just connect() to the socket and close() it, in case QEMU doesn't start? Then passt will terminate.
That relies on the fact that passt isn't stuck and responds to the EOF.
There's no need for an end-of-file, just closing the socket is enough. Any other method of terminating the process relies on passt to do or not do something specific anyway, such as writing the correct PID file, writing a PID file at all, not blocking SIGTERM (in case you use that), etc. Even if you run it with --foreground, you still rely on it on correctly parsing options and not creating new processes in new sessions. Connecting to the socket and closing it is in the same class of reliability, I think. Statistically speaking, we had one (embarrassing) issue with the contents of the PID file being wrong, see passt commit 3ec02c097536 ("passt: Truncate PID file on open()"), and (so far) zero reported issues with passt not terminating on EPOLLHUP on its socket with --one-off.
We certainly can do that if passt needs graceful shutdown, but mustn't rely on that.
It doesn't need that -- it does absolutely nothing on shutdown. I'm just saying you can use that to terminate passt, only in case QEMU doesn't start.
It should be a few (~5) lines of code, instead of all the complexity potentially involved in tracking PIDs and avoiding related races, and design-wise it looks clean to me (libvirtd plays for a moment the QEMU role, because QEMU is not around).
Well, we can place all these helper processes into a CGroup and let it trace PIDs. That should be race free.
The problem is where you get those PIDs from, at least if you just rely on PID files. If you don't use PID file descriptors ("pidfd", which I don't see used anywhere in libvirt), you could add the PID of another process (which had its PID recycled from a passt process that terminated meanwhile) to the cgroup, and later terminate something unrelated. -- Stefano
participants (6)
-
Daniel P. Berrangé
-
Laine Stump
-
Martin Kletzander
-
Michal Prívozník
-
Peter Krempa
-
Stefano Brivio