Container Live Migration Using runC and CRIU

In my previous article I wrote about how it was possible to move from checkpoint/restore to container migration with CRIU. This time I want to write about how to actually migrate a running container from one system to another. In this article I will migrate a runC based container using runC’s built-in CRIU support to checkpoint and restore a container on different hosts.

I have two virtual machines (rhel01 and rhel02) which are hosting my container. My container is running Red Hat Enterprise Linux 7 and is located on a shared NFS, which both of my virtual machines have mounted. In addition, I am telling runC to mount the container

Continue reading “Container Live Migration Using runC and CRIU”

From Checkpoint/Restore to Container Migration

The concept to save (i.e. checkpoint / dump) the state of a process, at a certain point in time, so that it may later be used to restore / restart the process (to the exact same state) has existed for many years. One of the most prominent motivations to develop and support checkpoint/restore functionality was to provide improved fault tolerance. For example, checkpoint/restore allows for processes to be restored from previously created checkpoints if, for one reason or another, these processes had been aborted.

Over the years there have been several different implementations of checkpoint/restore for Linux. Existing implementations of checkpoint/restore differ in terms of  “what level” (of the operating system) they are operating; the lowest level approaches focus on implementing checkpoint/restore directly in the kernel while other “higher level” approaches implement checkpoint/restore completely in user-space. While it would be difficult to unearth each and every approach /  implementation – it is likely fair to

Continue reading “From Checkpoint/Restore to Container Migration”