In this article I want to talk about a runC container which I want to migrate around the world while clients stay connected to the application.
In my previous Checkpoint/Restore In Userspace (CRIU) articles I introduced CRIU (From Checkpoint/Restore to Container Migration) and in the follow-up I gave an example how to use it in combination with containers (Container Live Migration Using runC and CRIU). Recently Christian Horn published an additional article about CRIU which is also a good starting point.
In my container I am running Xonotic. Xonotic calls itself ‘The Free and Fast Arena Shooter’. The part that is running in the container is the server part of the game to which multiple clients can connect to play together. In this article the client is running on my local system while the server and its container is live migrated around the world.
This article also gives detailed background information about
Continue reading “Container Migration Around The World”
In my previous article I wrote about how it was possible to move from checkpoint/restore to container migration with CRIU. This time I want to write about how to actually migrate a running container from one system to another. In this article I will migrate a runC based container using runC’s built-in CRIU support to checkpoint and restore a container on different hosts.
I have two virtual machines (rhel01 and rhel02) which are hosting my container. My container is running Red Hat Enterprise Linux 7 and is located on a shared NFS, which both of my virtual machines have mounted. In addition, I am telling runC to mount the container
Continue reading “Container Live Migration Using runC and CRIU”
The concept to save (i.e. checkpoint / dump) the state of a process, at a certain point in time, so that it may later be used to restore / restart the process (to the exact same state) has existed for many years. One of the most prominent motivations to develop and support checkpoint/restore functionality was to provide improved fault tolerance. For example, checkpoint/restore allows for processes to be restored from previously created checkpoints if, for one reason or another, these processes had been aborted.
Over the years there have been several different implementations of checkpoint/restore for Linux. Existing implementations of checkpoint/restore differ in terms of “what level” (of the operating system) they are operating; the lowest level approaches focus on implementing checkpoint/restore directly in the kernel while other “higher level” approaches implement checkpoint/restore completely in user-space. While it would be difficult to unearth each and every approach / implementation – it is likely fair to
Continue reading “From Checkpoint/Restore to Container Migration”