Container Live Migration Using runC and CRIU

In my previous article I wrote about how it was possible to move from checkpoint/restore to container migration with CRIU. This time I want to write about how to actually migrate a running container from one system to another. In this article I will migrate a runC based container, using runC’s built-in CRIU support to checkpoint the container on one host and restore it on another.

I have two virtual machines (rhel01 and rhel02) which host my container. The container runs Red Hat Enterprise Linux 7 and its root file system is located on an NFS share that both virtual machines have mounted. In addition, I am telling runC to mount the container’s root file system read-only.

runC Container Configuration

Just as on the two involved systems, I will also use Red Hat Enterprise Linux 7 in my container. I have mounted my NFS share at /runc and instead of using an OCI bundle I am installing the container’s root file system using yum:

# mkdir -p /runc/containers/rhel7-httpd/rootfs/var/lib/rpm/

# rpm --initdb --root /runc/containers/rhel7-httpd/rootfs

# yum install --releasever 7.3 --installroot \
/runc/containers/rhel7-httpd/rootfs bash coreutils \
procps-ng iptools httpd

Once the container’s root file system is installed I am using oci-runtime-tools to generate my runC configuration file (config.json). There is a blog post describing the use of oci-runtime-tools which I followed to install it. This blog post still uses the old name ‘ocitools’ but besides that the following steps are taken from that blog post:

# export GOPATH=/some/dir

# mkdir -p $GOPATH

# go get github.com/opencontainers/runtime-tools

# cd $GOPATH/src/github.com/opencontainers/runtime-tools/

# make

# make install

Now I can use oci-runtime-tools to create the configuration for my runC based container:

# cd /runc/containers/rhel7-httpd/

# oci-runtime-tool generate --args "/usr/sbin/httpd" --args "-DFOREGROUND" --network host --tmpfs /tmp --tmpfs /run --tmpfs /var/log/httpd --tmpfs /run/httpd --read-only --bind /tmp/rhel7-httpd:/var/www/html > config.json

# runc start rhel7-httpd -d &> /dev/null < /dev/null

The options used have the following meaning:

  • --args "/usr/sbin/httpd" --args "-DFOREGROUND"

    Start the httpd server in the foreground. (Otherwise the container would stop running immediately.)

  • --network host

    Use the host’s network interfaces directly.

  • --tmpfs /tmp --tmpfs /run --tmpfs /var/log/httpd --tmpfs /run/httpd

    Mount a few tmpfs file systems where read-write access is required; this data does not need to be persistent.

  • --read-only

    Make the container’s root file system read-only (except for the tmpfs mounts mentioned above).

  • --bind /tmp/rhel7-httpd:/var/www/html

    Bind mount a local directory containing an index.html file. The file is created by running hostname > /tmp/rhel7-httpd/index.html on both systems involved (rhel01 and rhel02), so it is easy to see where the container is running: the web server returns the hostname of the host running the container.

The last line starts the container. The redirection of stdin, stdout and stderr is necessary as runC currently does not correctly close those file descriptors when running in detached mode (-d).
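The index.html file behind the --bind option has to exist on every host that may run the container. A minimal sketch of that preparation step follows; the function name prepare_docroot is hypothetical, and the directory defaults to the path used in this article.

```shell
# Create the directory that gets bind mounted to /var/www/html and write
# this host's name into index.html.
# prepare_docroot is a hypothetical helper name, not part of runC.
prepare_docroot() {
    docroot="${1:-/tmp/rhel7-httpd}"   # default: the path used in the article
    mkdir -p "$docroot" || return 1
    hostname > "$docroot/index.html"
}
```

Running prepare_docroot without arguments on rhel01 and on rhel02 should leave each host with its own name in /tmp/rhel7-httpd/index.html.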

Once the container is running I can connect from a third host to the webserver in the container:

$ curl rhel0x

rhel01

The hostname rhel0x belongs to an IP address which moves with the container so that the client can connect to the same IP address no matter where the container is running.
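Because the client always talks to the same address, the response alone tells it which host currently serves the container. The following is a rough sketch of a polling helper that waits until a command prints an expected string; the name wait_for_output and the one-second interval are assumptions made for illustration.

```shell
# Poll a command until its output equals the expected string or the
# attempts run out. Returns 0 on a match, 1 otherwise.
# wait_for_output is a hypothetical helper name.
wait_for_output() {
    expected=$1
    attempts=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Example: wait up to 30 attempts for the web server to answer as rhel02.
# wait_for_output rhel02 30 curl -s --max-time 2 rhel0x
```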

Now that the container is successfully running it is time to migrate it from my first host (rhel01) to my second host (rhel02). The first step is to checkpoint the current state of the container on the host it is currently running on:

# cd /runc/containers/rhel7-httpd/

# runc checkpoint rhel7-httpd

In its default configuration runC writes the checkpoint data to a directory called checkpoint. The checkpoint contains all the information necessary to restore the processes in the container to the state they were in when the checkpoint was taken. This includes open files, memory content and the mounted tmpfs file systems. Once the checkpoint command has finished I can restore the container on my second host (rhel02):

# cd /runc/containers/rhel7-httpd/

# runc restore -d rhel7-httpd

The restore command reads the checkpoint from the directory checkpoint and restores everything to the state it was in when the checkpoint was taken. After that the container is running and can be accessed as before:

$ curl rhel0x

rhel02

The difference now is that the webserver returns the content of the file on the second host. As the container’s root file system and the checkpoint are on an NFS share I need to copy neither the checkpoint nor the root file system from my source system to my destination system.
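The two manual steps, checkpoint on the source and restore on the destination, can be tied together in a small script. This is only a sketch under assumptions the article does not set up: it is run from a machine with root SSH access to both hosts, and the RUN variable and the migrate function are hypothetical names. The redirections on the restore command mirror the ones used for runc start -d above, on the assumption that the SSH session might otherwise not return.

```shell
#!/bin/sh
# Sketch: migrate a runC container between two hosts that share the
# container directory over NFS. Root SSH access is an assumption.

BUNDLE=/runc/containers/rhel7-httpd   # container directory on the NFS share
RUN=${RUN-echo}                       # default: only print the commands

# migrate SRC DST ID (hypothetical helper)
migrate() {
    src=$1
    dst=$2
    id=$3
    # Step 1: write the container state to $BUNDLE/checkpoint on the source.
    $RUN ssh "root@$src" "cd $BUNDLE && runc checkpoint $id" || return 1
    # Step 2: recreate the processes from that checkpoint on the destination.
    $RUN ssh "root@$dst" "cd $BUNDLE && runc restore -d $id > /dev/null 2>&1 < /dev/null"
}

migrate rhel01 rhel02 rhel7-httpd
```

As written, the script only prints the two ssh commands. Setting RUN to an empty string in the environment makes it execute them instead.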

Everything described in this article is using:

# rpm -q runc criu

runc-0.1.1-5.el7.x86_64

criu-2.3-2.el7.x86_64
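Since both hosts need runC and CRIU, a quick pre-flight check on each of them can catch a missing tool before a migration is attempted. The sketch below uses a hypothetical helper, missing_tools, to list commands that are not on the PATH; criu check is CRIU’s own test of whether the running kernel provides what it needs.

```shell
# Print the names of the given commands that are not found on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# On each host: expect empty output, then let CRIU test the kernel.
# missing_tools runc criu
# criu check
```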

Questions on this process and/or on CRIU (in general)?  I encourage you to reach out using the comments section below!

  1. In `oci-runtime-tool` the read-only flag is now called `--rootfs-readonly`;
    also, in `runc` the command to start a container is now `runc run …`

    1. The go compiler is part of the golang package on my system:

      # rpm -qf `which go`
      golang-bin-1.6.3-2.el7.x86_64

      Try to install the go compiler.
