What’s Next for Containers? User Namespaces

What are user namespaces? Sticking with the apartment complex analogy, the numbering of users and groups has historically been the same in every container and on the underlying host, just as public channel 10 is generally the same in every unit of an apartment building.

But imagine that people in different apartments get their television signal from different cable and satellite companies. Channel 10 is now different for each person: it might be sports for one person and news for another.

Historically, the Linux kernel had a single data structure that held users and groups. User namespaces were implemented starting in kernel version 3.8. A separate user namespace can be created by calling the clone() system call with the CLONE_NEWUSER flag. Think of these as nested data structures within a new namespace. In this new namespace, there is a virtual set of users and groups. These users and groups, beginning with uid/gid 0, are mapped to non-trusted (not root) uids/gids outside the namespace.
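
To make the mapping idea concrete, here is a minimal Python sketch. It assumes the three-column layout the kernel exposes in /proc/<pid>/uid_map (first uid inside the namespace, first uid outside, length of the range). The function name and the sample numbers are made up for illustration, and the sketch only does the arithmetic; it does not create a namespace.

```python
from typing import Optional


def map_uid_to_host(uid_map_text: str, inside_uid: int) -> Optional[int]:
    """Translate a uid inside a user namespace to the uid seen outside it.

    Each line of uid_map_text holds three integers:
    <first uid inside> <first uid outside> <number of uids in the range>.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside_start, outside_start, length = (int(f) for f in fields)
        if inside_start <= inside_uid < inside_start + length:
            return outside_start + (inside_uid - inside_start)
    return None  # this uid has no mapping outside the namespace


# Hypothetical map: uids 0-65535 inside correspond to 100000-165535 outside.
sample_map = "0 100000 65536\n"
print(map_uid_to_host(sample_map, 0))      # 100000
print(map_uid_to_host(sample_map, 500))    # 100500
print(map_uid_to_host(sample_map, 70000))  # None
```

Root inside the namespace (uid 0) comes out as an ordinary, unprivileged uid on the host, which is the whole point of the feature.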

In modern Linux kernels, user IDs are unsigned 32-bit integers, so administrators can create about four billion users; the highest usable uid is 4,294,967,294, to be specific. These four billion users can be divided among user namespaces, giving administrators plenty of scalability.

useradd -u 4294967294 testusr

Try 4294967295 yourself….
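
The arithmetic behind that limit can be checked quickly. This is a small Python sketch, not anything taken from the useradd tooling, and the constant names are made up for illustration. It assumes the usual Linux convention that the all-ones value, (uid_t)-1, is kept as a "leave unchanged" sentinel by calls such as chown() and setreuid(), which is why it is not available as a real uid.

```python
UID_BITS = 32

# All 32 bits set: the value of (uid_t)-1, assumed reserved as a sentinel.
RESERVED_UID = 2**UID_BITS - 1
HIGHEST_USABLE_UID = RESERVED_UID - 1


def is_usable_uid(uid: int) -> bool:
    """Return True if uid fits in 32 bits and is not the reserved value."""
    return 0 <= uid <= HIGHEST_USABLE_UID


print(RESERVED_UID)               # 4294967295
print(HIGHEST_USABLE_UID)         # 4294967294
print(is_usable_uid(4294967294))  # True
print(is_usable_uid(4294967295))  # False
```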

User Namespaces and Containers

With a user namespace, different containers can have completely different user (uid) and group (gid) numbers. User 500 in container A may map to user 1500 outside the container, and user 500 in container B may map to user 2500 outside the container.

So, why would I want to do this? Well, this is especially useful for providing root access inside of a container. Imagine that the root user (uid 0) in container A maps to uid 1000, and that root in container B maps to user id 2000 outside the container. Similar to network port mapping, this allows the administrator to give someone uid 0 (root) in the container without giving them uid 0 on the underlying system. It also allows a user to freely add/delete users inside the container.
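
The numbers in this example line up with a simple offset per container, which can be sketched in a few lines of Python. The UidRange class and the range length of 1000 are assumptions made for illustration (the text only gives the starting points 1000 and 2000), and nothing here talks to the kernel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UidRange:
    """A contiguous block of host uids handed to one container."""
    host_start: int
    length: int

    def to_host(self, container_uid: int) -> Optional[int]:
        """Map a uid inside the container to the uid seen on the host."""
        if 0 <= container_uid < self.length:
            return self.host_start + container_uid
        return None

    def to_container(self, host_uid: int) -> Optional[int]:
        """Map a host uid back into the container, if it is in range."""
        offset = host_uid - self.host_start
        if 0 <= offset < self.length:
            return offset
        return None


# Assumed ranges: 1000 uids each, so the two containers never overlap.
container_a = UidRange(host_start=1000, length=1000)
container_b = UidRange(host_start=2000, length=1000)

print(container_a.to_host(0), container_b.to_host(0))            # 1000 2000
print(container_a.to_host(500), container_b.to_host(500))        # 1500 2500
print(container_a.to_container(0), container_b.to_container(0))  # None None
```

The last line is the interesting one: host uid 0 has no counterpart in either container, so root inside a container never corresponds to real root outside it.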

This may sound good at first, but there’s more to the story. Going back to the apartment complex analogy, imagine that each renter could modify their own electrical wiring and plumbing. Each person would be their own miniature superintendent. If a single renter wired their apartment without using the proper gauge of wire, it could create risk for all of the other renters.

Work to Be Done

So, checking the kernel commit logs, user namespaces were added in kernel version 3.8, and Red Hat Enterprise Linux 7 has kernel 3.10, but user namespaces don’t work in Red Hat Enterprise Linux 7.1. What gives? Well, through the Fedora Project, the wider Linux community, and internally, Red Hat has been working on user namespaces for quite some time, and we view them as a great feature for encouraging container adoption. That said, Red Hat disabled them because we think that user namespaces need to incubate in the upstream community longer, so that we can fully understand the security implications and mitigate or remediate any exploits or attack vectors that could expose our customers to malicious activity. Put differently, as with all of Red Hat’s enterprise products, including our solutions that focus on Linux containers (like Red Hat Enterprise Linux Atomic Host and OpenShift Enterprise 3), we don’t enable features until we are sure that they are ready for enterprise use.

As suggested above, the upstream Fedora community has enabled this feature in the latest versions of Fedora Cloud, Fedora Server and Fedora Workstation as part of the project’s commitment to leading-edge Linux technologies. This provides an excellent test bed for how the feature behaves in practice and allows for continued community innovation around it, a vital component of the enterprise hardening process.

It’s also important to note that for user namespaces to be easily consumable by end users, this feature must also be enabled in Docker. Currently (July 2015), user namespaces are not enabled in Docker, but Red Hat is working with the upstream community to enable them.

The goal of Red Hat Enterprise Linux and all of its specialized offerings is to provide customers with a stable, secure, and easy-to-manage operating environment on which they can deploy their applications. Red Hat’s approach of enabling user namespaces only for the root user is a reflection of the incredibly high bar that we set for product security. Software is changing rapidly, there is a constant struggle between features and security, and Red Hat is hard at work driving both forward.

Update: As of September 2015, user namespaces have been enabled as a Technology Preview in the beta release of Red Hat Enterprise Linux 7.2.

  1. In case someone else comes here from google trying to figure out how to use user namespaces on rhel/centos 7.2, you have to do `sudo grubby --args="user_namespace.enable=1" --update-kernel=/boot/vmlinuz-3.10.0-327.3.1.el7.x86_64` and then reboot to enable them. (obviously you have to change the kernel name if you’re running another kernel)

    1. I need to write an update. My tweet was being a bit snarky and that probably needs some explaining. User namespaces are a way of mapping root in the container to a non-root user outside of the container. Most people know and understand that.

      The typical security tool lowers risk, it does not incur new risk.

      In a lot of people’s opinion, using root in the container raises the risk rather than lowering it. Namespaces mitigate this risk to an extent. The question becomes: is it secure enough for your liking? Namespaces are about incurring new risk, but in a limited fashion.

      Namespaces allow for convenience, not security. Your risk still goes up, but hopefully, not TOO much for your liking. That’s why I refer to it as a management tool, not a security tool. It’s more about the convenience of running root containers.

      We STILL recommend that you only run root containers when you absolutely need to. For example, I think it’s reasonable to have root containers for systems administration tasks. Running those in a namespace might even be better in certain cases.

      Imagine a network scanning container that has a special seccomp profile, and runs as root in a namespace. That’s pretty cool, now my network administrator can’t reboot the box – as it should be.

      Namespaces are cool, but they shouldn’t be used to just download stuff off the Internet and run it as root. That is STILL not a good idea. Make sense?

  2. Yes, this is helpful feedback – the picture is beginning to emerge for me. What still isn’t clear is the nature of the risk of running a container as root in an environment that is using user namespacing so that root in the container is not root on the host.

    I had previously assumed that with user namespacing, a container running as root might put itself at risk of compromise, but this had no particular bearing on the host itself being compromised. I had understood this to be a key concern with running containers as root in a pre-user namespaced world. In those olden times (more than a year ago?) a container running as root was using the same “root” as the host. If the process namespace or some other containerization technology could be broken, there was a theoretical risk that the container user might get root access on the host.

    Your example of a network scanning container makes me think my previous assumption is wrong. Would the network scanning container actually need to run as root in its namespace to do its job? Couldn’t a well designed network scanner avoid requiring root as well? Or does it require root to take advantage of kernel permissions provided through a seccomp profile?

    I’m asking because there are people in my organization who, in fact, would like to download stuff off the Internet (or dockerhub) and run it as root (because some dockerhub images seem to require this). I’m trying to understand the nature of the risk of letting people do that. Are they putting only their containers at risk, or does this practice potentially put the host at risk as well?

    1. Doug, you are actually nailing it. In reality, my example is imperfect. A network scanner in particular could run with CAP_NET_ADMIN privilege [1] and hence avoid being run as root. There are actually very few things that NEED to run as root. In fact, off the top of my head, I am having trouble coming up with a real reason why you might want that. Perhaps you want to let systems administrators run a tools container; that might be a valid reason to run as root. It would be flexible, etc.

      At the end of the day, you are also nailing that people think that they can just be lazy, download software off the Internet and run it as root. In reality, there are two problems. First, downloading stuff off the Internet and running it as a regular user is bad enough, but running it as root is TERRIBLE – at best user namespaces take the risk back down to running as a regular user [4] – at worst, they provide code paths that are only exploitable as root [2]. Second, people are just building images WRONG. There is no reason to have a web server run as root. With Kubernetes/OpenShift (and even Docker) it is SO easy to run a web server on port 2222 or 1234 or anything over 1024, then map it back to 80. In this case, a major use case removes root. The same can be done for almost any network service [3].

      I hope that helps. One final note. Containers can attack each other, and the host. If a root container finds a code path to exploit, it’s now real root. That’s the biggest fear with running containers as root, even with namespaces……

      [1]: http://man7.org/linux/man-pages/man7/capabilities.7.html
      [2]: It is expected that there are exploits out there in the wild that are ONLY exploitable as root. Nobody ever cared, because well, you were already root. With Namespaces, this could get hairy. Worse, there is debate and nobody actually knows if it’s possible, but many people think it is. End of the day, I don’t have warm and fuzzies.
      [3]: You may be asking yourself, then why do people do this? They are lazy, or they don’t know. At the end of the day, it’s a small change in the config somewhere in the software (Apache, Bind, etc).
      [4]: And we all know there are user privilege escalation exploits out there. Defense in depth can help here (SECCOMP, SELinux, Capabilities, etc), but it’s still best not to remove one of the technical controls unless necessary.
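
The suggestion in the thread above, running a network service on a high port instead of port 80 so it does not need root, can be sketched in Python. The helper names and port 8080 are chosen for illustration only, and the sketch assumes the traditional rule that ports below 1024 require root or an extra capability. Mapping the high port back to 80 is left to the container platform and is not shown here.

```python
import socket

# Ports below this traditionally need root (or an extra capability) to bind.
PRIVILEGED_PORT_LIMIT = 1024


def needs_privilege(port: int) -> bool:
    """Return True for ports an unprivileged process normally cannot bind."""
    return 0 < port < PRIVILEGED_PORT_LIMIT


def open_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Open a TCP listener, refusing ports that would force the service to run as root."""
    if needs_privilege(port):
        raise ValueError(f"port {port} is privileged; pick one at or above {PRIVILEGED_PORT_LIMIT}")
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    # Runs as a regular user; no root needed for a port like 8080.
    with open_listener(8080) as listener:
        print("listening on", listener.getsockname())
```

Refusing low ports outright is a design choice for this sketch: it makes the "no root needed" property explicit instead of failing later with a permission error.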
