Apologies for the delay.
On Wed, 9 Jul 2025 01:54:36 +0200, Lisa Gnedt wrote:
Hi Stefano,
Thanks for your fast feedback!
On 2025-07-07 18:19, Stefano Brivio wrote:
For context, "this" is: always join the user namespace owning a network namespace.
Yes, exactly.
It looks reasonable (and desirable) to me, but I'm not sure how / why it breaks the --userns parameter.
When a PID is supplied, the patch misuses the userns variable: it sets it to the path of the network namespace and then always calls the NS_GET_USERNS ioctl to get the owning user namespace. However, the userns variable might also be set manually via the --userns option, and we expect users to set this parameter to the path of a user namespace. When a user namespace is given, the NS_GET_USERNS ioctl returns its parent user namespace. Therefore, with this patch, we join the parent user namespace instead of the given user namespace.
Ah, I see. Well, in that case, I guess we could simply skip the NS_GET_USERNS ioctl() if --userns is given.
We should probably never do this when --netns-only is given (that's Podman's case, for example).
I agree, it does not make sense to join a user namespace when a user explicitly wants to join only the network namespace.
It would be good to have a way to "cleanly" exclude this new behaviour, but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly get us back to the previous behaviour. What about --userns-from-pid or something like that? That name isn't great though.
I agree that this would be one of the possible solutions. It would enable the use either with PID or with the --netns option. Maybe --userns-from-netns would be better? Somehow it would be cool to include the relationship between userns and netns more concretely like --join-owning-userns-from-netns, but on the other hand it is also a bit too long.
Would --userns-from-netns imply that the PID given on the command line always refers to the network namespace, and the user namespace comes from it? If that's the case, the name looks fitting (but it needs a bit of explanation in the man page and usage message).
However, I am not sure whether a separate CLI option is really necessary. Maybe it would also be a fine new default behavior just for the case when a PID is specified, but no network or user namespace is explicitly given. If anyone really needs the old behavior, it is still possible to specify the user namespace explicitly and thereby deactivate the new behavior. It seems that podman uses the --netns option, so it should be fully unaffected by this proposed change of default behavior.
Right, Podman shouldn't be affected at all. I wonder about rootlesskit (used by moby / Docker) though: https://github.com/rootless-containers/rootlesskit/blob/3c8213d359b54284f4f0... from what I understand, --netns is passed to pasta only if the user gives an explicit --detach-netns. Now, even with the change you propose, things should always work, but I guess we should test it at least in the common use case (Docker starting a container).
I am fine with both solutions. I thought a bit more about the current and changed behavior by iterating through the possible combinations of options with my code changes in mind. I hope this is not too much, but it also uncovered a few already existing strange edge cases.
CLI options -> behavior
----------------------------------------------
PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID)
--netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) ***2
It looks like this is currently already a strange behavior, as it would get the netns and userns from PID.
I'm not sure about this part: the intended behaviour is to only care about a target network namespace, because who starts pasta already joined / detached the intended user namespace. You mention it's broken but I'm not sure why. I don't think the behaviour should change here.
--userns X PID -> existing behavior (netns from PID, userns from option)
--userns X --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID - strange) ***1
It looks like this is currently already a strange behavior, as it would also get the userns from PID.
--netns-only --userns X PID -> existing behavior (netns from PID, userns from option - strange) ***1
--netns X PID -> existing behavior (invalid)
(skipping further combination with --netns and PID)
COMMAND -> existing behavior (new netns, new userns)
--netns-only COMMAND -> existing behavior (new netns, no userns)
--userns X COMMAND -> existing behavior (new netns, userns from option)
--userns X --netns-only COMMAND -> existing behavior (new netns, no userns - a bit strange) ***1
--netns-only --userns X COMMAND -> existing behavior (new netns, userns from option - strange) ***1
--netns X COMMAND -> existing behavior (invalid)
(skipping further combination with --netns and COMMAND)

--netns X -> existing behavior (netns from option, no userns)
--netns X --netns-only -> existing behavior (netns from option, no userns)
--netns X --userns Y -> existing behavior (netns from option, userns from option)
--netns X --userns Y --netns-only -> existing behavior (netns from option, no userns - a bit strange) ***1
--netns X --netns-only --userns Y -> existing behavior (netns from option, userns from option - strange) ***1
Thanks for the table, it's really helpful, and everything else makes sense to me.
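One possible reading of the table, with the edge cases cleaned up as discussed in this thread, can be sketched as a small decision function. This is purely illustrative: struct ns_opts, enum userns_source and pick_userns_source() are hypothetical names, not pasta's, and the sketch assumes that --netns-only takes priority over everything else.

```c
#include <stdbool.h>

/* Where the user namespace to join comes from (hypothetical names) */
enum userns_source {
	USERNS_NONE,		/* don't join any user namespace */
	USERNS_NEW,		/* create a new user namespace */
	USERNS_FROM_OPTION,	/* path given with --userns */
	USERNS_FROM_NETNS_OWNER	/* NS_GET_USERNS on the target netns */
};

/* Which options were given on the command line (hypothetical struct) */
struct ns_opts {
	bool pid;		/* target given as PID */
	bool netns;		/* --netns given */
	bool userns;		/* --userns given */
	bool netns_only;	/* --netns-only given */
};

/* Assumed policy: --netns-only wins, then an explicit --userns, then the
 * proposed new default for a bare PID. A bare --netns joins no userns, and
 * anything else (COMMAND or default shell) gets a new userns.
 */
enum userns_source pick_userns_source(const struct ns_opts *o)
{
	if (o->netns_only)
		return USERNS_NONE;
	if (o->userns)
		return USERNS_FROM_OPTION;
	if (o->pid)
		return USERNS_FROM_NETNS_OWNER;
	if (o->netns)
		return USERNS_NONE;

	return USERNS_NEW;
}
```

With this ordering, "--netns-only PID" joins only the network namespace, and "--userns X PID" keeps the existing behavior because the ioctl path is never reached.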
Although it is not directly related to the change I am proposing, it might make sense to clean up the CLI option behavior a bit. I would argue to forbid --userns in combination with --netns-only completely (everything marked with ***1).
Right, that's probably a good idea.
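Rejecting the ***1 combinations could be as simple as the check below. This is only a sketch: check_userns_opts() is a hypothetical name, and where such a check would live in pasta's option parsing is left open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check, not pasta's code: return 0 if the combination of
 * options is acceptable, -1 if --userns was given together with
 * --netns-only, regardless of the order on the command line.
 */
int check_userns_opts(const char *userns_path, bool netns_only)
{
	if (netns_only && userns_path != NULL)
		return -1;

	return 0;
}
```

Doing the check after all options are parsed, rather than while parsing each one, would make the result independent of option order, which is what makes the ***1 rows look strange today.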
By the way, I'd suggest checking with David Gibson
Furthermore, --netns-only PID seems to be currently broken (marked with ***2). I think the netns_only variable (or use_userns, as it is called inside isolate.c) should most likely get higher priority than the userns variable itself. This should fix the behavior to only use the netns from PID and no userns.
I'm not quite sure what the current problem is.
Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a while ago, then a few things were (inadvertently) broken. But it "almost" does. Couldn't we just add a fallback for the case where NS_GET_USERNS fails? You're already handling the error. You could just print a warning and continue instead of calling die_perror()...
Yes, it makes sense to implement a fallback when changing the default behavior. If it becomes a separate option, it seems counter-intuitive to have an automatic fallback. I just thought it might be best to first discuss the desired behavior before starting to implement more complex changes.
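The fallback for kernels without NS_GET_USERNS (it needs Linux 4.9 or later) might look roughly like this. It is a sketch under assumptions: userns_fd_or_fallback() is a hypothetical name, and plain fprintf() stands in for whatever warning helper pasta would actually use in place of die_perror().

```c
#include <errno.h>
#include <linux/nsfs.h>		/* NS_GET_USERNS, Linux 4.9 and later */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper, not pasta's code: try to get the user namespace
 * owning the network namespace referred to by netns_fd. If the ioctl fails
 * (for example on an older kernel), print a warning and return fallback_fd,
 * assumed here to refer to the user namespace of the target PID.
 */
int userns_fd_or_fallback(int netns_fd, int fallback_fd)
{
	int fd = ioctl(netns_fd, NS_GET_USERNS);

	if (fd < 0) {
		fprintf(stderr,
			"NS_GET_USERNS failed (%s), using userns of PID\n",
			strerror(errno));
		return fallback_fd;
	}

	return fd;
}
```

If the behavior ends up behind a separate option instead, the warning branch would more naturally become a hard error, since the user asked for the owning namespace explicitly.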
-- Stefano