On Sat, 17 May 2025 03:34:42 -0600
Max Chernoff
Hi Stefano
On Fri, 2025-05-16 at 18:11 +0200, Stefano Brivio wrote:
Max, could it be that you're running stuff with some customised SELinux policy? By the way, with "unconfined disabled":
Simpler than that: I was testing something with SELinux permissive, and I forgot to reenable it. Whoops. I'm getting the same results as you now.
Running with SELinux in permissive mode, I'm getting:
# cat /var/log/audit/audit.log
type=AVC msg=audit(1747410763.621:130615): avc: denied { search } for pid=1352409 comm="pasta.avx2" name="1352408" dev="proc" ino=7022238 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.621:130616): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=lnk_file permissive=1
type=AVC msg=audit(1747410763.622:130617): avc: denied { read } for pid=1352409 comm="pasta.avx2" scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=file permissive=1
type=AVC msg=audit(1747410763.622:130618): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.622:130619): avc: denied { open } for pid=1352409 comm="pasta.avx2" path="/proc/1352408/ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410764.622:130620): avc: denied { read } for pid=1352417 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c609,c838 tclass=lnk_file permissive=1
and:
# audit2allow -a
#============= pasta_t ==============
allow pasta_t container_runtime_t:dir { open read search };
allow pasta_t container_runtime_t:file read;
allow pasta_t container_runtime_t:lnk_file read;
allow pasta_t container_t:lnk_file read;
If I add those rules, everything works
Yes, adding those rules also fixes things for me.
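For anyone following along who wants to see how the denials above map onto those four rules, here is a rough Python sketch of the grouping step. It is not audit2allow's actual implementation, only an illustration: the regular expression, `selinux_type()` and `allow_rules()` are names made up for this example, and it handles only plain "avc: denied" records like the ones quoted in this thread.

```python
import re
from collections import defaultdict

# Hypothetical pattern for this sketch: it captures the denied permissions,
# the source and target contexts, and the object class of one AVC record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field of a user:role:type:level context string."""
    return context.split(":")[2]


def allow_rules(log_text):
    """Group denials by (source type, target type, class) and emit allow rules."""
    grouped = defaultdict(set)
    for match in AVC_RE.finditer(log_text):
        key = (
            selinux_type(match["scontext"]),
            selinux_type(match["tcontext"]),
            match["tclass"],
        )
        grouped[key].update(match["perms"].split())

    rules = []
    for (source, target, tclass), perms in sorted(grouped.items()):
        names = sorted(perms)
        # A single permission is written bare, several go inside braces.
        perm_text = names[0] if len(names) == 1 else "{ " + " ".join(names) + " }"
        rules.append(f"allow {source} {target}:{tclass} {perm_text};")
    return rules
```

Fed the six records quoted above, this should produce the same four allow lines that audit2allow printed. It also makes visible why the rules are broad: only the SELinux types and the object class survive the grouping, while the PID and path from each record are dropped.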
To me those denials look reasonable, in the sense that I would expect the namespace links to have container_runtime_t type.
I'm a little surprised that "container_runtime_t:file read" is necessary since I thought that "container_runtime_t:lnk_file read" would be sufficient to get the target of the link, but it indeed does not work without it.
(well, I'm not saying that's the solution...).
I guess the options are:
1. Add the above rules to the pasta SELinux policy
2. Have Podman change the context of /proc/self/ns/net to pasta_t
3. Have Podman pass a file descriptor to the netns instead of the path to the netns.
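Option 3 relies on ordinary file descriptor inheritance: the parent opens the path once and the child only ever sees an already-open descriptor. A minimal Python sketch of that mechanism follows. It does not show how Podman or pasta actually work, and it says nothing about how any LSM treats the inherited descriptor. `run_with_inherited_fd()` and `make_argv` are hypothetical names for this example, and no existing pasta option for receiving a descriptor is implied.

```python
import os
import subprocess


def run_with_inherited_fd(path, make_argv):
    """Open `path` in the parent and hand the open descriptor to a child.

    `make_argv` is a hypothetical callback that receives the descriptor
    number and returns the child's argument list, so the child can be told
    which descriptor to use instead of being given a path.
    """
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        # pass_fds keeps this descriptor open, under the same number, in
        # the child; all other descriptors stay close-on-exec.
        return subprocess.run(
            make_argv(fd),
            pass_fds=(fd,),
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        os.close(fd)
```

In the scenario discussed here, `path` would be something like /proc/self/ns/net in the parent, and the child would use the descriptor directly (for example with setns()) without ever looking the path up itself.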
(1) is arguably the least secure, but is probably fine in practice?
Well:
2. is probably the most restrictive but it doesn't really feel correct to me (pasta is not, at least conceptually, the exclusive user of the network namespace link)
3. is pretty much a way to dodge LSM policies (SELinux / AppArmor can't see this, done)
...so I would opt for 1. I see why you mention it's less secure: we didn't really want to be able to open and read *any* container_runtime_t:dir or container_t:lnk_file. But that's not really the part of "fine-grained" security that we typically delegate to SELinux anyway.
Max, could it be that you're running stuff with some customised SELinux policy? By the way, with "unconfined disabled":
https://bugzilla.redhat.com/show_bug.cgi?id=2330512
we seem to have unconfined_t as type for those links:
type=AVC msg=audit(1733378482.320:31258): avc: denied { open } for pid=651955 comm="pasta.avx2" path="/proc/651954/ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
...but I'm not sure at which point in time exactly.
Ah, I wonder if that might be related to this:
https://github.com/containers/buildah/issues/6160
But with the workaround documented there, and the rules from above, "podman build" works as expected with the unconfined module disabled.
Ah, great, then I guess we don't need to fix something that's not broken.
Wait a moment. I don't think something SELinux-specific belongs in pasta's man page, because that's not relevant for all users and distributions.
We could maintain that as an addition for Fedora and perhaps Gentoo, but I wonder if it's really worth the effort.
+1
...so I guess the only remaining point, other than adding those rules, is to figure out why %selinux_relabel_post isn't enough and what we can add to the spec file instead. I'll try to have a look at it within a couple of days unless you find an explanation / solution before then.
--
Stefano