Architecting Containers Part 3: How the User Space Affects Your Application

In Architecting Containers Part 1 we explored the difference between the user space and kernel space.  In Architecting Containers Part 2 we explored why the user space matters to developers, administrators, and architects. In today’s post we will highlight a handful of important ways the choice of the user space can affect application deployment and maintenance.

While there are many ways for a given container architecture to affect and/or influence your application, the user space provides tooling that is often overlooked, namely:

  1. Language runtimes
  2. Debug and management tools

Application Containers vs. Super Privileged Containers

Before we start, let’s distinguish between two main types of “container workloads” – application containers and Super Privileged Containers (SPCs). As we will see, they differ in how much direct communication with, and privileged access to, the underlying container host they require.

Application containers (aka service containers) provide tooling for applications and batch processing – examples include: web servers with Python or Ruby, JVMs, or even Hadoop or HPC tooling. Application containers are what developers are trying to move into production or onto a cluster to meet the needs of the business.

I often say that containers can be viewed as “fancy processes” that typically run in their own user space, with extra isolation added. What most people don’t realize is that much of this isolation can be removed to complete various systems administration tasks.

Figure: processes vs. containers. Containers are really just fancy processes which mount a separate disk image, enabled by kernel namespaces. Service containers typically run with SELinux and CGroups enabled, while Super Privileged Containers often run without these constraints.

Enter Super Privileged Containers (SPCs), which provide tooling to manage and debug other containers or the container host itself. For example, SPCs can be used to load kernel modules, sniff network traffic, or run SystemTap. Super Privileged Containers are similar to giving systems administrators and senior developers sudo (root) access to support applications.

Figure: User Space vs. Kernel Space, Super Privileged Containers vs. Service Containers. Developers and systems administrators have different concerns.

Application containers and Super Privileged Containers serve different purposes, and building them requires a different thought process. But enough background… how does this affect your application?

Well, it affects the decisions you will make about how to build, debug, manage, and deploy your application across environments, from the developer laptop to the production server. Standardizing on a single container user space (toolchain) across environments simplifies communication between development and operations, which speeds up deployments. Building skills and understanding around a single toolchain also simplifies training and troubleshooting (…if and when things don’t go as planned).

Application (aka Service) Containers

The user space affects applications because this is where all of the language runtimes exist. The user space is what gets packaged up inside the container image and shared between developers, architects, and system administrators. Every time a developer or architect starts a new project, they are implicitly choosing a user space (aka container image) based on what language runtimes and applications are available. Application availability is key when making a platform decision whether inside or outside of a container.

While developers and application architects tend to think more about what language runtimes are available, systems administrators are largely concerned with security and how long the platform will be supported.

Figure: DIY application container. Developers are concerned with deploying their code and data. Users are concerned with gaining access to these applications.

Also, remember that when application containers are created – they are, as mentioned above, really just fancy processes with extra isolation added in the kernel. This means that most of the normal rules apply to managing the contents of the application containers. They still need to be patched, they still need to be hardened, and they still need to be managed once they are up-and-running.

Having the application runtime and its dependencies packaged within the container image makes sharing your code easy. From collaborating on microservices internally (e.g. with a member of the operations team), to checking the homework project of a potential developer you are interviewing, collaboration is simplified when the entire application and all of its dependencies are packaged as a container image.

Development Runtimes & Servers

Red Hat Software Collections provides access to the latest stable versions (both as RPMs and as container images) of open source languages and databases (e.g. Ruby, Python, Node.js, PHP, GCC, Passenger, MySQL, MongoDB, etc.) as well as other utilities and servers (Varnish, httpd, nginx). Of note, the most popular Red Hat Software Collections are now available as container images via the Red Hat Customer Portal.

In addition, the latest versions of Red Hat certified third-party container images are available via the Red Hat Federated Registry. Developers, systems administrators and architects can use these Independent Software Vendor (ISV) provided images to more easily develop, deploy, and utilize pre-built solutions – all certified for use with Red Hat’s ecosystem of container hosts and platforms.

Figure: User Space vs. Kernel Space, Super Privileged Containers vs. Service Containers. Red Hat provides developers access to newer language runtimes and servers through Red Hat Software Collections and certified ISV applications. Red Hat provides systems administrators access to tools with the rhel-tools container. All of these containers are provided through either the Red Hat Container Registry or the Red Hat Federated Registry.

Introspection

Developers, architects, and systems administrators each make changes to the user space to get their work done. Looking at each image layer as part of a supply chain, formats like Docker provide tooling to quickly and easily determine what changes have been made in an upstream container image (user space).

Do you want to understand how the image was built? The metadata tells part of the story:

FROM rhel7
MAINTAINER Scott McCarty <smccarty@redhat.com>
RUN yum update -y;yum clean all # update the image

Since Docker images are built in layers, we can also determine what has changed between this image layer and the underlying image layer. Notice that the latest image layer was updated a week ago.

docker history rhel7-updated

IMAGE               CREATED             CREATED BY                                      SIZE
ff664e850f37        1 weeks ago        /bin/sh -c yum update -y                        632.9 MB
775344f011a7        1 weeks ago        /bin/sh -c #(nop) MAINTAINER Scott McCarty sm   0 B
edf056c07122        1 weeks ago        bash                                            111 MB
d822d9962e7c        1 months ago                                                       836.4 MB

We can even determine which files have changed between the image and the running container. This is especially useful while migrating legacy applications into containers. The engineer doing the migration can debug where the application is making changes in the user space. This allows the engineer to more easily separate where the logs, data, configurations, and code live – and to place components on the correct storage “device” (e.g. storage external to the container).

docker ps -a

CONTAINER ID  IMAGE                 COMMAND  CREATED         STATUS                    PORTS  NAMES
fd99f8f10a9a  rhel7-updated:latest  "bash"   11 seconds ago  Exited (0) 2 seconds ago           hubristic_stallman

Notice that the Bash history file has been updated:

docker diff fd99f8f10a9a

C /root
A /root/.bash_history
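
For a legacy application that writes to many places, the raw diff can be long. As a rough sketch (not part of Docker itself), the output can be grouped by change type and top-level directory to hint at where logs, data, and configuration live. The function name summarize_diff is hypothetical, and the sketch assumes paths without spaces:

```shell
#!/bin/sh
# summarize_diff (hypothetical helper): read `docker diff` output on stdin,
# one change per line (A = added, C = changed, D = deleted), and count the
# changes per change type and top-level directory.
summarize_diff() {
    awk '
        $1 ~ /^[ACD]$/ {
            # "/root/.bash_history" splits into "", "root", ".bash_history"
            split($2, parts, "/")
            key = $1 " /" parts[2]
            count[key]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' | LC_ALL=C sort -k3,3 -k2,2
}

# Sample input copied from the docker diff output above.
printf 'C /root\nA /root/.bash_history\n' | summarize_diff
# prints:
# 1 A /root
# 1 C /root
```

Against a real container, the same function could be fed with: docker diff fd99f8f10a9a | summarize_diff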

In fact, we’re only skimming the surface when it comes to the value of introspection. For example, consider how convenient this introspection tooling might be when migrating an application years after the developer who wrote it has moved on…

Simplified Application Deployment

While migrating an application into a container is typically a one-time project, running the application happens quite often. Tools like the atomic run command enable developers to embed complex startup logic into the metadata of a container image so that it doesn’t have to be specified every time the container is started. Using the RUN label, startup can be simplified to:

atomic run ntpd

instead of:

docker run -d --name ntpd --cap-add SYS_TIME ntp

The logic is instead neatly saved in the Dockerfile, which can be version controlled:

FROM rhel7
RUN yum -y install ntp; yum -y clean all
LABEL RUN="docker run -d --name=ntpd --cap-add=SYS_TIME --cap-add=NET_BIND_SERVICE IMAGE /usr/sbin/ntpd -d"
CMD /usr/sbin/ntpd
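
To make the idea concrete, here is a minimal sketch of the kind of substitution a tool could perform on the RUN label: the placeholder word IMAGE is swapped for a real image name before the command is executed. This illustrates the concept only and is not the atomic command’s actual implementation; the function name expand_run_label and the image name ntpd are assumptions.

```shell
#!/bin/sh
# expand_run_label LABEL IMAGE_NAME (hypothetical helper)
# Replace every standalone word "IMAGE" in the label text with IMAGE_NAME.
# Note: awk rebuilds the line, so runs of whitespace collapse to one space.
expand_run_label() {
    printf '%s\n' "$1" | awk -v image="$2" '{
        for (i = 1; i <= NF; i++) if ($i == "IMAGE") $i = image
        print
    }'
}

# Label text copied from the Dockerfile above; "ntpd" is an assumed image name.
label='docker run -d --name=ntpd --cap-add=SYS_TIME --cap-add=NET_BIND_SERVICE IMAGE /usr/sbin/ntpd -d'
expand_run_label "$label" ntpd
# prints:
# docker run -d --name=ntpd --cap-add=SYS_TIME --cap-add=NET_BIND_SERVICE ntpd /usr/sbin/ntpd -d
```

On a host with Docker installed, the label stored in an image can be read back with docker inspect --format '{{ index .Config.Labels "RUN" }}' followed by the image name.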

Life Cycle

Building your container image on the Red Hat Enterprise Linux 7 user space provides the application with a 10-year life cycle. During this supported life cycle, security and bug fixes are provided as a part of your Red Hat Enterprise Linux subscription. During this life cycle, Red Hat will provide newer container images in addition to RPMs – allowing customers to update their container images as they see fit.

Most Linux distributions allow important pieces of software to receive major version updates. As an example, Fedora tries to update the kernel as quickly as possible (rebasing). Fedora has the goal of innovating, which is great, but this can sometimes break things. Red Hat Enterprise Linux instead backports patches (rather than rebasing) to specific versions of libraries and binaries in the user space. For example, the version of the Apache Web server in Red Hat Enterprise Linux 7 will continue to be 2.4.6, but the engineering team will carefully patch it for security and bug fixes. This grants administrators peace of mind, as container images will continue to rebuild for years to come. In this example, configuration files and other software which depend on Apache will continue to work, even after updating the container image to apply security updates and bug fixes.

A simple FROM line in your Dockerfile will have a big impact on how much technical debt your development and operations teams incur when doing updates and upgrades over an application’s life cycle. Moreover, as these containers will run on hardware ranging from developer laptops to production servers, support and life cycle are key considerations when deciding which distribution to use inside the container image.

Super Privileged Containers

Once you have applications up and running in a containerized infrastructure (and especially once these applications are in production) it will likely be necessary to rethink how to troubleshoot, scan, back up, and manage applications, hosts, and data.

With Super Privileged Containers, the importance of the relationship between the user space and the kernel is clear. SPCs are executed with privileges similar to a regular process owned and executed by the root user. The tooling inside of an SPC often interacts directly with kernel data structures and the host file system – things like the TCP stack, process list, user list, etc. The RHEL Tools container makes many of these operations quite simple.

Figure: DIY SPC. Systems administrators are concerned with managing, maintaining, troubleshooting and scanning containers (…and the hosts they run on).

SPCs also demonstrate how easy it is to deploy system software and utilities, especially when an administrator only needs them temporarily to troubleshoot a problem. After the administrator is done troubleshooting, the utilities can simply and easily be removed. To make this operation painless, Red Hat has created a special tools container called the Atomic Tools Container. We will demonstrate its use in some of the following sections.

For now let’s dig into some additional use cases for SPCs…

Debugging Containers and Container Hosts

Once you have an application up and running how do you troubleshoot what is happening in the container? How do you determine if the problem is in the container or with the container host? The kernel resides and runs on the host, but the applications run in the container, so it’s natural to approach the debugging process holistically. As administrators, our first step is often to determine exactly where the problem “is” – is it in the host, or within the container?

To start, an administrator could ssh into the underlying container host platform (e.g. Red Hat Enterprise Linux 7) and run a bunch of tools…  as they normally would.  Alternatively, and more conveniently, they could also run these utilities from within a Super Privileged Container which makes them easy to remove afterwards and doesn’t pollute the container host with a bunch of software, files, and configurations.

Track System Calls

Often it is useful to see what system calls an application is making to the kernel, but how do we do this if the application is in a container? Some good news… the process is nearly identical to the approach used when you’re not using containers – first we call up a process list, then we attach to it with the strace command. The main difference is that we have to allow the container to see the host kernel’s process table:

docker run -it --pid=host registry.access.redhat.com/rhel7/rhel-tools ps -ef | grep yum
root      35114  34439  1 03:42 ?        00:00:01 /usr/bin/python /usr/bin/yum install
docker run -it --pid=host --privileged registry.access.redhat.com/rhel7/rhel-tools strace -p 35114
Process 35114 attached

The example commands (above) are fairly complicated because an administrator must remember a variety of command line options for docker, so you might consider using the Atomic command to simplify to the following:

atomic run rhel7/rhel-tools ps -ef | grep yum
root     12993 12974  5 19:56 ?        00:00:07 /usr/bin/python /usr/bin/yum upd
root     13405 12880  0 19:59 ?        00:00:00 grep --color=auto yum
atomic run rhel7/rhel-tools strace -p 12993
Process 12993 attached
read(0,

Network Sniffing

Another common scenario is network sniffing. How do we sniff packets coming from the container? By asking the kernel’s TCP stack, just like normal.

First, find the container’s ID and IP address:

docker ps

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
3ab5aa19b1be        rhel7               "httpd"              2 hours ago         Up About a minute                       clever_payne

Then inspect the container and filter for its IP address:

docker inspect 3ab5aa19b1be | grep IPAddress
"IPAddress": "172.17.0.13",
"SecondaryIPAddresses": null,

Now, sniff traffic like “normal”, but using an SPC:

atomic run --spc rhel7/rhel-tools tcpdump -Xs 1514 -i any -n host 172.17.0.13

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 1514 bytes
07:12:28.521162 IP 172.17.42.1.60854 > 172.17.0.13.http: Flags [S], seq 648584050, win 14600, options [mss 1460,sackOK,TS val 1956311198 ecr 0,nop,wscale 7], length 0
0x0000:  4500 003c 6bec 4000 4006 4c9f ac11 2a01  E..<k.@.@.L...*.
0x0010:  ac11 000d edb6 0050 26a8 9b72 0000 0000  .......P&..r....
0x0020:  a002 3908 825f 0000 0204 05b4 0402 080a  ..9.._..........
0x0030:  749a f09e 0000 0000 0103 0307            t...........
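
The IP address lookup above can also be scripted, so that the address feeds straight into the tcpdump command. The following is a minimal sketch, assuming the inspect output contains a line of the form shown earlier; the function name extract_ip is hypothetical:

```shell
#!/bin/sh
# extract_ip (hypothetical helper): read `docker inspect` output on stdin
# and print the value of the first "IPAddress" key it finds.
extract_ip() {
    sed -n 's/.*"IPAddress": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample input copied from the docker inspect output above.
printf '"IPAddress": "172.17.0.13",\n"SecondaryIPAddresses": null,\n' | extract_ip
# prints: 172.17.0.13
```

Against a live container this could be used as: docker inspect 3ab5aa19b1be | extract_ip. Docker can also print the field directly with docker inspect --format '{{ .NetworkSettings.IPAddress }}' 3ab5aa19b1be, which avoids text parsing when the container is attached to the default bridge network.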

SystemTap

Have you ever had an application experience a performance regression (aka run slower) after upgrading? Imagine you update your container hosts (especially the kernel) and an application’s performance tanks… how do you troubleshoot an issue like this? One place to start would be timing system calls between kernel versions. SystemTap can do this, but how do we install it in a containerized environment?

Well, luckily Red Hat’s Global Support Services has already solved it – with this Red Hat Knowledge Base Article, it’s easy to run SystemTap in a container. After you have a container image built, with stap and the debug symbols for the installed kernel, it’s easy to run SystemTap because you now have a user space and kernel that can work together:

docker run --privileged -v /lib/modules:/lib/modules --tty=true --interactive=true stap stap -e 'probe kernel.function("generic_make_request") { print_backtrace() }'

In fact, it was easy to measure the speed of system calls in the kernel with the syscalltimes script by changing the stap command to:

docker run --privileged -v /lib/modules:/lib/modules --tty=true --interactive=true stap stap $P_VERBOSE_STR -w "$P_PROCESS_STR" -e '

Then the script is run like normal, but all of the code is run from a Super Privileged Container:

./syscalltimes.sh -t
System Call       Count  Total ns  Avg ns   Min ns  Max ns
write             103    458792    4454     999     42182
readlink          4      26277     6569     3460    9715
set_tid_address   2      2094      1047     1019    1075
execve            2      218808    109404   108172  110636
access            10     44503     4450     2312    8385
getpeername       4      3804      951      752     1208
close             500    294474    588      340     17796
wait4             6      200706487 33451081 601     113411404
ftruncate         4      24773     6193     2823    9584
setpgid           4      4822      1205     718     1743
mprotect          60     131753    2195     977     5733
accept4           4      14854     3713     1880    5663
epoll_create1     2      4841      2420     2359    2482
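
When comparing kernel versions, it helps to rank the system calls by total time so the biggest contributors stand out. The sketch below is one way to do that with standard tools. It assumes the column layout shown above (header on the first line, “Total ns” in the third column), and the function name top_syscalls is hypothetical:

```shell
#!/bin/sh
# top_syscalls N (hypothetical helper): read syscalltimes output on stdin,
# skip the header row, and print the N system calls with the largest
# "Total ns" value (column 3), largest first.
top_syscalls() {
    tail -n +2 | LC_ALL=C sort -k3,3nr | head -n "$1" | awk '{ print $1, $3 }'
}

# A few rows copied from the output above.
top_syscalls 3 <<'EOF'
System Call       Count  Total ns  Avg ns   Min ns  Max ns
write             103    458792    4454     999     42182
execve            2      218808    109404   108172  110636
close             500    294474    588      340     17796
wait4             6      200706487 33451081 601     113411404
EOF
# prints:
# wait4 200706487
# write 458792
# close 294474
```

Saving the ranked output from each kernel version to a file makes it easy to compare the two runs with diff.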

Debugging Applications In the Container

Often it is useful to debug an application inside of a container:

Capturing Data From Containers

A lot of work has gone into the RHEL Atomic Tools container to provide tools like kdump, abrt, sosreport, and tools to analyze core files, etc.

Scanning or Managing Containers and Container Hosts

It is often useful to scan images for security purposes. A similar methodology could be used for backup agents, monitoring, and even logging.

As an example of an SPC, look at how easy it is to run the Docker CIS Benchmark to see if your container host and containers pass. Notice that this Super Privileged Container transparently scans both the host and the container images. Also note that the startup command is greatly simplified with the atomic RUN label:

atomic run fatherlinux/docker-bench-security

As another example, look at how easy it is to scan local containers for CVEs:

Conclusion

Thinking about containers from a workload perspective helps clarify the difference between application containers and Super Privileged Containers. Application containers support business logic, while Super Privileged Containers (SPC) support the applications and infrastructure.

There are a lot of things to think about when selecting container images and container hosts. Think through both what applications you need and how you will support and troubleshoot them during the full application life cycle.

When the operating system is broken up into a kernel (container host) and a user space (container image), many of the same support functions must be thought of differently, but the scope of what must be thought through really doesn’t change.

The container world is changing fast and I want to work on this together, so leave a comment below and I would be more than happy to reply…
