Docker Engine in Tutum: A Tale of Three Versions


Intro

This article walks through the features of, and the problems we at Tutum have encountered with, recent versions of the Docker engine, and explains the decisions we have made when dealing with each version since 1.5.0.

v1.5.0

On February 10th, 2015 Docker engine 1.5.0 was released. Its main new features were:

  • IPv6 support: you can allocate an IPv6 address to each container with the "--ipv6" daemon flag, and you can resolve IPv6 addresses from within a container.
  • Read-only containers: you can restrict the locations that an application inside a container can write files to with the "--read-only" flag.
  • Stats: a new API endpoint and CLI command that stream live CPU, memory, network IO and block IO for your containers.
  • The "-f" flag for build: specify the Dockerfile to use, rather than relying on the default "Dockerfile".
  • V1 image specification: documentation on how Docker currently builds and formats images and their configuration.
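A quick sketch of how these 1.5.0 features are used from the CLI (this assumes a running Docker 1.5.0+ daemon; "myapp" and "mycontainer" are hypothetical names):

```shell
# Read-only root filesystem: writes outside mounted volumes will fail.
docker run --read-only myapp

# Stream live CPU, memory and IO stats for a running container.
docker stats mycontainer

# Build from an alternative Dockerfile instead of the default "Dockerfile".
docker build -f Dockerfile.test -t myapp:test .
```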

At the time of writing, this is the latest Docker version that Tutum installs and manages in user nodes. Keep reading to find out why.

v1.6.0

On April 7th, 2015 Docker engine 1.6.0 was released. Its main new features were:

  • Container and image labels: you can now tag both with key-value metadata; for example, you can filter images or inspect containers by a target label.
  • Windows client preview: can be used just like the OS X client with a remote host; install it from the Chocolatey package manager or with Boot2Docker, a lightweight Linux virtual machine with Docker installed inside.
  • Logging drivers: an API that allows you to send container logs to third parties such as syslog.
  • Content-addressable image identifiers: you can now pull, run, build and refer to images by a new content-addressable identifier called a "digest", with the syntax "namespace/repo@digest". A digest is an immutable reference to the content inside an image.
  • The "--cgroup-parent" flag: support for custom cgroups; you can now pass a specific cgroup in which to run a container.
  • Ulimits: limit the resources of a given process. You can now specify default "ulimit" settings for all containers when setting up the daemon, or per container at creation time.
  • "commit --change" and "import --change": the ability to make changes to images on the fly, by specifying standard changes to be applied to the new image.
  • Also, using v1.6.0 with the new Registry v2.0 gives you faster and more reliable image pulls (the feature Tutum users requested most as a reason to upgrade to this version).
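A few of the 1.6.0 features above, sketched as CLI usage (illustrative only; assumes a running Docker 1.6.0 daemon, and "myapp", "web" and the label values are hypothetical):

```shell
# Tag a container with key=value metadata, then filter containers by label.
docker run -d --name web --label env=staging myapp
docker ps --filter label=env=staging

# Set a per-container ulimit (soft:hard) for open files.
docker run --ulimit nofile=1024:2048 myapp

# Run the container under a specific parent cgroup.
docker run --cgroup-parent=/myapp-group myapp

# Pull an image by its immutable content digest (requires Registry v2.0).
docker pull myrepo/myapp@sha256:<digest>
```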

Tutum: upgrade to v1.6.0

So 1.6.0 was released, and we at Tutum wanted to allow our users to enjoy the new features it offered. We tested it in our production environment and saw nothing unusual in the upgrade tests, so we went ahead and implemented support for it.

The transition was not as smooth as we had hoped. Actually, it wasn't smooth at all. Soon after pushing the update to our users, we noticed that every time a Tutum user tried to stop or force-destroy a container that had been created under an older Docker version (1.3.3, 1.4.0 or 1.5.0), the action failed with one of the following errors:

Stop container XXX with error: [2] Container does not exist: container destroyed

---

Could not kill running container, cannot remove - [2] Container does not exist: container destroyed

And the processes in question were left running on the host as "ghost containers". How could this be?

Sometimes, restarting the Docker daemon worked as a workaround. You can read more about the issue here, and it's easily reproducible following the steps that user jnummelin gives in this comment:

The processes appear to "escape" from Docker's control. With older Docker versions (up to 1.5), processes moved to PPID 1 when the daemon was stopped. When the daemon starts it actually kills all these "leftover" processes from stopped containers. The Docker 1.5 to 1.6 upgrade broke this, with the newer version not able to handle the leftover processes from the older version.
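On an affected host, one rough way to spot these escaped processes is to compare the PIDs the daemon still tracks with processes that have been re-parented to init (this is a diagnostic sketch, not an official procedure):

```shell
# PIDs of the containers the Docker daemon believes it owns.
docker ps -q | xargs docker inspect --format '{{.State.Pid}}'

# Processes whose parent is PID 1: candidates for "ghost" container
# processes that have escaped the daemon's control.
ps -eo pid,ppid,comm | awk '$2 == 1'
```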

We ultimately decided that an optimal user experience was more important than running the latest release, so we disabled the ability for users to upgrade their Docker engines to 1.6.0 and stayed on 1.5.0. For users who had already upgraded some of their nodes to 1.6.0, we did not offer a roll-back, as our tests showed that downgrading actually led to an even greater number of problems. Instead, we encouraged them to terminate those nodes and deploy new ones (which were then automatically provisioned with 1.5.0). Thankfully, because of how Tutum works, destroying an entire node cluster and getting your apps back up and running takes a matter of minutes, even in the worst case.

Two more 1.6 releases (1.6.1 and 1.6.2) came out before 1.7, but neither fixed the underlying problem; they focused on security and volume-related fixes, so we stayed on 1.5.0.

v1.7.X: the future

v1.7.0:

On June 16th, 2015 Docker engine v1.7.0 was released. Its main features were:

  • ZFS storage driver
  • Networking internals have been completely rewritten and split out into a separate reusable library called "libnetwork". The volumes system has also been rewritten for higher quality and cleanliness.
  • Several internal improvements that make the engine faster, more stable and easier to maintain.
  • Experimental releases: networking and plugin features have shipped in a separate experimental release of Docker, which lets users try out features early and provide feedback. You can read more about it here.

We had looked forward to this version for two reasons primarily. First and foremost, it fixed the stop/destroy container bug described above. It also fixed a second issue that our users and Tutum were running into: a user's node would unexpectedly become unreachable, and every new container deployed to it would fail to start.

After an exhaustive review of the affected nodes, we discovered that the Docker daemon, and the node itself, were running out of memory!

In some cases, the issue was so severe that we couldn't even access the nodes to troubleshoot. In other cases, rebooting the node made the problem disappear, temporarily. On the nodes we could troubleshoot, we looked at the Docker daemon's logs and system stats to find the problem…

It all started back in October 2014. Issue #8502, reported in the official GitHub Docker repository, stated that a user was getting the following error message:

XXXX/XX/XX XX:XX:XX Error response from daemon: Cannot start container XXX: [8] System error: fork/exec /usr/local/bin/docker: cannot allocate memory

The user was using Docker daemon version 1.2.0.

It looked like the engine was running out of memory. Restarting the daemon solved the problem, so the community immediately suspected a memory leak somewhere. A few more issues were opened, such as issue 8539, issue 12848, and the main tracking issue, issue 9139. Both users and maintainers of the project agreed that the principal cause was a bug in the logging process. It is easily reproducible the way user enix described in issue 9139 (comment):

When running a container based on the following Dockerfile, it doesn’t take very long for an “out of memory” crash:

FROM busybox
CMD while true; do echo -n =; done;

The bug mainly affects containers that log heavily to stdout/stderr. The client reads much more slowly than the output arrives, so the internal buffer keeps growing, leading to the leak.
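The growth mechanism can be sketched with a toy simulation (this is an illustration of the slow-reader problem, not Docker's actual buffering code):

```python
from collections import deque

def run_simulation(steps, produce_per_step, consume_per_step):
    """Return the buffer size after each step for a fast producer
    (container writing to stdout) and a slow consumer (log client)."""
    buffer = deque()
    sizes = []
    for _ in range(steps):
        for _ in range(produce_per_step):              # container logs a line
            buffer.append(b"=")
        for _ in range(min(consume_per_step, len(buffer))):  # client drains
            buffer.popleft()
        sizes.append(len(buffer))
    return sizes

sizes = run_simulation(steps=1000, produce_per_step=100, consume_per_step=10)
print(sizes[-1])  # 90000 lines backlogged: the buffer grows without bound
```

With production at 100 lines per step and consumption at 10, the backlog grows by 90 lines every step; in the real daemon that unbounded buffer is what eventually exhausted the node's memory.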

This issue has been seen in all Docker versions since 1.2.0. Many PRs have landed, but none has fixed the problem entirely; at best, they have mitigated it. Docker maintainers point to version 1.7.0 as the first one free of this bug. As a workaround, you can restart the Docker daemon or reboot the host. On Tutum, this means you can restart tutum-agent (sudo service tutum-agent restart) or reboot the node and the issue goes away, but it's just a matter of time before it happens again.

Although version 1.7.0 fixes the Docker engine's logging memory leak, it comes with leaks of its own, as we can read in issue 13470 (comment) and its official tracking issue 13552: the engine uses a lot of memory when the syslog driver is in use and the container logs many messages per second. This does not happen if you use the JSON log driver. It can be reproduced following the steps given here.

The good news is that version 1.7.0 has been running against Tutum's MongoDB instance for weeks now, and we have not seen any trace of severe leaks.

v1.7.1:

On July 14th, 2015 Docker engine v1.7.1 was released. It consists mainly of fixes to the previous 1.7.0 release. You can check them all here.

We at Tutum are testing this release as the next version to be deployed to user nodes. We decided to wait for 1.7.1 so that the disk usage and registry v1/v2 fallback issues present in 1.7.0 could be resolved.

Unlike with 1.6.0, support for 1.7.1 will first be rolled out to a subset of the Tutum user population: awesome members of the Docker and Tutum community who have volunteered to help ensure the transition is smooth before it reaches the general public. With thousands of nodes scattered across 40+ regions/data centers, Tutum runs one of the largest production Docker deployments in the world. Even though it only takes a handful of minutes to go from zero to running with Tutum, it's of the utmost importance that users don't experience a hiccup when upgrading their infrastructure to the latest and greatest Docker version.

Stay tuned for 1.7.1 as it will soon be coming to a node near you!

Thank you!

The team at Tutum would like to take this opportunity to give a shout-out to every Docker contributor, and in particular to the core maintainers. Keep up the fantastic work! Thank you!

Learn more

To learn more about the details of every version released by Docker, check the Docker Changelog and the official announcements:

  • Docker 1.5
  • Docker 1.6
  • Docker 1.7

2 comments on "Docker Engine in Tutum: A Tale of Three Versions"
  1. cazcade says:

    Excellent service and a sensible approach, just wish the Docker folk would slow down a little and focus on fixes not just new features.

  2. Thanks for the update, it’s really interesting to see what you are doing. We have been using a few of your tutum hub images for quite a while in pre-production, and I wanted to share a refinement we developed. We got really tired of dealing with all kinds of internal container management issues (not in your containers specifically, but overall), and we tried supervisord, runit, etc. Eventually, we built our own internal process manager that does a whole lot of stuff to solve logging, PID management, etc… and consumes very few resources. We’re looking for any feedback from experienced people like yourselves. Here is the documentation page: http://garywiz.github.io/chaperone/index.html
