In my previous article I wrote about how it was possible to move from checkpoint/restore to container migration with CRIU. This time I want to write about how to actually migrate a running container from one system to another. In this article I will migrate a runC based container using runC’s built-in CRIU support to checkpoint and restore a container on different hosts.
I have two virtual machines (rhel01 and rhel02) which are hosting my container. My container is running Red Hat Enterprise Linux 7 and is located on a shared NFS export, which both of my virtual machines have mounted. In addition, I am telling runC to mount the container's root file system read-only.
runC Container Configuration
As on the two hosts involved, I will also use Red Hat Enterprise Linux 7 in my container. I have mounted my NFS share at /runc and, instead of using an OCI bundle, I am installing the container’s root file system using yum:
# mkdir -p /runc/containers/rhel7-httpd/rootfs/var/lib/rpm/
# rpm --initdb --root /runc/containers/rhel7-httpd/rootfs
# yum install --releasever 7.3 --installroot \
    /runc/containers/rhel7-httpd/rootfs bash coreutils \
    procps-ng iptools httpd
Once the container’s root file system is installed I am using oci-runtime-tools to generate my runC configuration file (config.json). There is a blog post describing the use of oci-runtime-tools which I followed to install it. This blog post still uses the old name ‘ocitools’ but besides that the following steps are taken from that blog post:
# export GOPATH=/some/dir
# mkdir -p $GOPATH
# go get github.com/opencontainers/runtime-tools
# cd $GOPATH/src/github.com/opencontainers/runtime-tools/
# make
# make install
Now I can use oci-runtime-tools to create the configuration for my runC based container:
# cd /runc/containers/rhel7-httpd/
# oci-runtime-tool generate --args "/usr/sbin/httpd" --args "-DFOREGROUND" \
    --network host --tmpfs /tmp --tmpfs /run --tmpfs /var/log/httpd \
    --tmpfs /run/httpd --read-only \
    --bind /tmp/rhel7-httpd:/var/www/html > config.json
# runc start rhel7-httpd -d &> /dev/null < /dev/null
The options used have the following meanings:
--args "/usr/sbin/httpd" --args "-DFOREGROUND"
Start the httpd server in the foreground. (Otherwise the container would stop running immediately.)
--network host
Use the host's network interfaces directly.
--tmpfs /tmp --tmpfs /run --tmpfs /var/log/httpd --tmpfs /run/httpd
Mount a few tmpfs file systems where read-write access is required; the data does not need to be persistent.
--read-only
Make the container read-only (except for the tmpfs mounts mentioned above).
--bind /tmp/rhel7-httpd:/var/www/html
Bind mount a local directory which contains an index.html file created with the following command:
# hostname > /tmp/rhel7-httpd/index.html
This command is run on both systems involved (rhel01 and rhel02). This makes it easy to see on which host the container is running, as the web server returns the hostname of the host running the container.
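The bind mount only works if the source directory already exists on the host where the container starts or is restored. As a minimal sketch, the preparation step could be wrapped in a small function that is run once on each host. The function name prepare_docroot is my own, and the fallback to uname -n is an assumption for systems without the hostname command:

```shell
# Sketch: create the directory that is bind mounted to /var/www/html and
# write this host's name into index.html. Run once on each host.
prepare_docroot() {
    # $1: bind-mount source directory (the article uses /tmp/rhel7-httpd)
    mkdir -p "$1" || return 1
    # uname -n is a fallback in case the hostname command is not installed
    { hostname 2>/dev/null || uname -n; } > "$1/index.html"
}
```

On rhel01 and rhel02 this would be called as prepare_docroot /tmp/rhel7-httpd.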
The last line starts the container. Redirecting stdin and stdout is necessary because runC currently does not correctly close those file descriptors when running in detached mode (-d).
Once the container is running I can connect from a third host to the webserver in the container:
$ curl rhel0x
rhel01
The hostname rhel0x belongs to an IP address which moves with the container so that the client can connect to the same IP address no matter where the container is running.
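How the address moves is not part of runC or CRIU, so it has to be handled separately. The following is only a sketch of one possible approach using the ip command: the address 192.0.2.10/24 and the interface eth0 are hypothetical placeholders, and the two helper functions merely print the commands that would be run as root on each host.

```shell
# Hypothetical values: the article names neither the address nor the interface.
SERVICE_IP="${SERVICE_IP:-192.0.2.10/24}"
SERVICE_DEV="${SERVICE_DEV:-eth0}"

# Print the command that releases the address on the source host.
release_ip_cmd() {
    printf 'ip addr del %s dev %s\n' "$SERVICE_IP" "$SERVICE_DEV"
}

# Print the command that claims the address on the destination host.
claim_ip_cmd() {
    printf 'ip addr add %s dev %s\n' "$SERVICE_IP" "$SERVICE_DEV"
}

release_ip_cmd   # to be run on the host the container is leaving
claim_ip_cmd     # to be run on the host the container is moving to
```

Printing the commands instead of executing them keeps the sketch harmless; in a real setup the move would have to happen while the container is stopped on the source host and before clients reconnect.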
Now that the container is successfully running, it is time to migrate it from my first host (rhel01) to my second host (rhel02). The first step is to checkpoint the current state of the container on the host it is currently running on:
# cd /runc/containers/rhel7-httpd/
# runc checkpoint rhel7-httpd
In its default configuration runC writes the checkpoint data to a directory called checkpoint. The checkpoint contains all the information necessary to restore all the processes in the container to the state they were in when the checkpoint was taken. This includes open files, memory content and the mounted tmpfs file systems. Once the checkpoint command has finished I can restore the container on my second host (rhel02):
# cd /runc/containers/rhel7-httpd/
# runc restore -d rhel7-httpd
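Because both hosts see the same NFS directory, the destination host can check that checkpoint data is actually present before attempting the restore. A minimal sketch of such a guard follows; the function name checkpoint_ready is my own, and it only tests that the directory exists and is not empty, without inspecting individual image files:

```shell
# Sketch: succeed only if the checkpoint directory exists and contains files.
checkpoint_ready() {
    # $1: checkpoint directory (runC's default name is "checkpoint")
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}
```

On rhel02 this could be used as checkpoint_ready /runc/containers/rhel7-httpd/checkpoint && runc restore -d rhel7-httpd.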
The restore command reads the checkpoint from the directory checkpoint and restores everything back to the state during checkpointing. After that the container is running and can be accessed as before:
$ curl rhel0x
rhel02
The difference now is that the webserver returns the content of the file on the second host. As the container’s root file system and the checkpoint are on an NFS share, I need to copy neither the checkpoint nor the root file system from my source system to my destination system.
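The two steps can also be driven from a third machine. The following is a rough sketch rather than a finished tool: the host names, the container name and the bundle path come from this article, while using ssh as root for transport and the RUN variable for a dry run are my own assumptions. By default the script only prints what it would do.

```shell
# Sketch of the two-step migration driven from a third machine.
BUNDLE="${BUNDLE:-/runc/containers/rhel7-httpd}"
CONTAINER="${CONTAINER:-rhel7-httpd}"
# Dry run by default: commands are echoed. Set RUN to an empty string
# (RUN= sh script.sh) to really execute them over ssh.
RUN="${RUN-echo}"

migrate() {
    # $1: source host, $2: destination host
    # Step 1: checkpoint on the source; stop here if that fails.
    $RUN ssh "root@$1" "cd $BUNDLE && runc checkpoint $CONTAINER" || return 1
    # Step 2: restore on the destination from the shared NFS directory.
    $RUN ssh "root@$2" "cd $BUNDLE && runc restore -d $CONTAINER"
}

migrate rhel01 rhel02
```

Aborting when the checkpoint fails matters here, because restoring from an incomplete or stale checkpoint directory would bring up the wrong state on the destination.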
Everything described in this article uses the following package versions:
# rpm -q runc criu
runc-0.1.1-5.el7.x86_64
criu-2.3-2.el7.x86_64
Questions on this process and/or on CRIU (in general)? I encourage you to reach out using the comments section below!