The 5 Most Important Things I’ve Learned From Using Docker


Introduction

I’ve been using Docker for a little over a year, on several platforms – local
Linux installs as well as cloud providers – and in that time I’ve learned a decent amount
about how to manage my own images, build images flexible enough for any platform,
and even a bit about writing my own, not-intended-for-Docker-at-all programs.
I’ve tried to consolidate these into five usable, digestible items, to keep in
mind when starting a new Docker (or non-Docker!) project.

1. I need to be really, really specific and slightly paranoid when making images

I try to not run my applications as root. One nice thing about most distros:
they usually create a system user when you install any type of service. For
example, nearly every distro creates some type of http, apache, or
www-data user when Apache is installed.

I decided to make an image for ejabberd, from source – and part of my build
script would create an xmpp system user. And like many people, I used
Docker’s handy Automated Builds service, and set my image to automatically
rebuild when the base ubuntu image was updated.

I had an error in my image, though: instead of saying FROM ubuntu:12.04, I
simply said FROM ubuntu – so one day, my image auto-updated and switched to
Ubuntu 14.04, which ships with an extra default system user. That bumped my
xmpp user's UID up by one. I pulled my latest ejabberd build and it failed to
start: since I use a volume to store ejabberd's files, they were still owned
by the old UID, and the xmpp user could no longer read them.

This has led me to do two things:

  1. I use a specific distro version tag for any image I build (shown just
     below).
  2. I write start-up scripts for every application.
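
For the first item, the fix is a one-line change at the top of my Dockerfile
– pin the tag instead of floating on whatever latest happens to be:

    # Floating tag – the base image can silently change underneath you:
    # FROM ubuntu

    # Pinned tag – rebuilds stay on the release I actually tested against:
    FROM ubuntu:12.04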

Those start-up scripts normally launch as root and do a few things (there's
a sketch after this list):

  • They make sure needed config files exist in the first place – you never
    know when they'll be replaced by some empty volume!
  • They chown the configuration and data files to the user I'll run my app as.
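
Here's a minimal sketch of that kind of start-up script, using my ejabberd
image as the example; the paths, file names, and launch command are
illustrative stand-ins rather than my exact script:

    #!/bin/sh
    # Start-up script: runs as root, fixes up the environment, then
    # drops privileges before launching the application.
    set -e

    # Make sure a config file exists – a freshly-created volume is empty.
    if [ ! -f /etc/ejabberd/ejabberd.cfg ]; then
        cp /etc/ejabberd/ejabberd.cfg.default /etc/ejabberd/ejabberd.cfg
    fi

    # Re-own the config and data files, in case the xmpp UID changed
    # between builds or the volume was populated by another image.
    chown -R xmpp:xmpp /etc/ejabberd /var/lib/ejabberd

    # Finally, run the application as the unprivileged xmpp user
    # ("ejabberdctl foreground" is a placeholder launch command).
    exec su -s /bin/sh -c 'exec ejabberdctl foreground' xmpp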

This has saved me countless hours of frustration. I'm very specific about
what my images are based on, and I don't just use my main application as the
image's ENTRYPOINT – I write some type of script to make sure the
environment is sane.

2. I have no way of knowing what capabilities somebody’s system will have

Until recently, I ran strictly the latest and greatest version of Docker on
Ubuntu Linux. However, after trying out some cloud providers I’ve learned a few
things:

  • Somebody else might not be running the latest version of Docker.
  • Somebody else might not have every capability of Docker made available to them.
  • Somebody else might not have root access to the system running Docker.

This has drastically changed how I build my Docker images. I no longer write
instructions like "you have to launch a container using --volumes-from" or
"this requires a linked container named DB" – I don't know what another user
might be using to run my images! I try to go out of my way to make my images
as flexible as possible. If my image needs a MySQL database, I'll work with a
linked container, or an environment variable telling me where to connect, and
so on.
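
In practice, that flexibility lives in the start-up script. Here's a sketch
of the idea – MYSQL_HOST is a variable name I've made up for this example,
while DB_PORT_3306_TCP_ADDR is what Docker injects when a container is
linked under the alias db:

    # Prefer an explicitly-passed address, fall back to a linked
    # container named "db", and otherwise fail loudly.
    if [ -n "$MYSQL_HOST" ]; then
        :  # use $MYSQL_HOST exactly as given
    elif [ -n "$DB_PORT_3306_TCP_ADDR" ]; then
        MYSQL_HOST="$DB_PORT_3306_TCP_ADDR"
    else
        echo "set MYSQL_HOST or link a container as 'db'" >&2
        exit 1
    fi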

This adds a lot of work up front, but I think it's well worth it. Now I can
do all sorts of kooky things, like run an ambassador MySQL container that
connects to an actual MySQL database, and rely on the link alias to get a
consistent hostname when connecting. It's pretty neat!
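
A sketch of that ambassador setup – both image names here are hypothetical,
and the ambassador itself can be as simple as a container running socat:

    # The ambassador just forwards port 3306 to the real MySQL server,
    # which runs somewhere outside of Docker entirely.
    docker run -d --name mysql-ambassador \
        -e MYSQL_ADDR=db.example.com \
        myorg/mysql-ambassador

    # My application links to the ambassador; inside its container,
    # "mysql" is now a hostname that resolves to the ambassador.
    docker run -d --link mysql-ambassador:mysql myorg/webapp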

3. Dockerfiles might be a pain at first, but I love ’em now

There are two ways to create a Docker image. On one hand, I can spawn a
container and build it interactively; once the container is running the way
I'd like it to, I can commit it as a new image and tag it.

This is an easy way to build a container. I’ll be prompted when a package has an interactive step for setup. I can just edit configuration files with my
preferred text editor. It’s a very simple process.

Dockerfiles, on the other hand, can be something of a pain. What happens when
a package prompts me for something? How do I edit files non-interactively?
It's a tricky process; there's no way to make a part of it interactive –
everything has to be entirely automated.
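
Two tricks cover most of it: tell the package manager not to prompt, and
script the file edits. A sketch of both on an Ubuntu base:

    FROM ubuntu:14.04

    # Keep apt from asking interactive setup questions during the build.
    ENV DEBIAN_FRONTEND noninteractive
    RUN apt-get update && apt-get install -y nginx

    # "Edit" a config file non-interactively with sed instead of an editor.
    RUN sed -i 's/^worker_processes .*/worker_processes 2;/' /etc/nginx/nginx.conf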

I’ve found the benefits outweigh the cons, though. I can’t count how many times
I finally have a program running just the way I like, but can’t remember exactly
what I did. A Dockerfile on the other hand? That shows exactly what I did.
It’s all right there; even better, I can place it under version control.
Yes, it requires more work up front to build an image, but it’s the
only way I build Docker images now.

4. Spawning processes requires care – whether I use Docker or not!

It’s not unusual for an application to spawn a child process; I do it in my
own programs all the time. On most systems I can spawn a process and read its
output, check its exit code, whatever I need to do – and let the init system
handle cleaning up when the process exits. For years, I’ve been writing
programs this way without so much as a second thought.

In many cases, a Docker container will not be running an init system, so
nothing is around to reap orphaned children – any spawned processes left
behind can hang around as zombies, taking up space in the process table.
I've learned how to properly monitor and reap processes within my own
programs, just in case somebody ever decides to run my program in their own
Docker image. You never know when somebody will take a program you've
written and do something new with it!
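
The usual fix inside a container is to make whatever runs as PID 1 reap its
children. Here's a minimal shell sketch of the idea – myapp is a
hypothetical binary, and a production version would also re-check wait's
status after a trapped signal:

    #!/bin/sh
    # Tiny init-style entrypoint. As PID 1, this shell is what orphaned
    # children get re-parented to, and it reaps them as they exit.
    myapp &
    main=$!

    # Forward the SIGTERM sent by "docker stop" to the real process.
    trap 'kill -TERM "$main"' TERM INT

    # Block until the main process exits; the shell keeps reaping any
    # other children that terminate in the meantime.
    wait "$main"
    exit $?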

5. One task per container doesn’t have to mean one process per container

This is a thought that’s met with some controversy. Go into the #docker
channel on Freenode and ask for recommendations on process supervision inside
of containers; you’ll get a variety of different responses. Nearly
everybody agrees that containers should perform one task, but there’s a lot of
disagreement on whether that means one process or not.

I’ve come to the conclusion that using a process supervisor (like supervisord,
runit, or s6) is completely acceptable as long as I determine what my task
is, and only run services needed for that task. This is especially useful for
web applications, where I often need specific rewrite rules for Apache or
Nginx.

For example, if I have a web application that requires PHP-FPM, Nginx, Cron,
and MySQL, I'll run PHP-FPM+Nginx+Cron in one container but keep MySQL in
another container. I also make sure that if a key process exits or crashes,
the process supervisor follows suit and quits as well – preserving the
normal Docker behavior of a container exiting when its main process exits.
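
A real supervisor such as supervisord or s6 handles this more gracefully,
but here's a hand-rolled shell sketch of that exit-together behavior (the
binary names and flags are the Ubuntu ones and may differ elsewhere):

    #!/bin/sh
    # Run every service this task needs in the foreground and remember
    # the PIDs; as soon as any one of them dies, stop the others and
    # exit, so the container quits like a single-process one would.
    php5-fpm --nodaemonize &  pids="$!"
    nginx -g 'daemon off;' &  pids="$pids $!"
    cron -f &                 pids="$pids $!"

    while :; do
        for pid in $pids; do
            if ! kill -0 "$pid" 2>/dev/null; then
                kill $pids 2>/dev/null
                exit 1
            fi
        done
        sleep 5
    done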

Conclusion

I sincerely hope my lessons learned will be useful to those of you getting into
Docker (and those of you already pretty deep into it, too). Working with containers
has changed how I build software, and working with different Docker platforms has
changed how I build my containers.

If you have anything you've learned from using Docker, I'd love to hear about
it! Use the comment section below to share your stories, tips, and any other
feedback you may have – and thanks for reading!

John Regan is a System Administrator for the Oklahoma Mesonet, a world-class network of environmental monitoring stations, a joint project of the University of Oklahoma and Oklahoma State University.

Comments
  1. Very useful post, thanks! I think I just understood the value of using ambassador containers 🙂 Regarding your third point: do you use CM (Ansible / Puppet / …) in the Dockerfile, or just 'provision' with shell scripts?

    • John Regan says:

      Hi Wéber –

      The way I do it is probably a bit weird, but here’s how I handle provisioning:

      In my Dockerfile, I usually write some shell scripts that use environment variables to set up the application. Then I use Ansible to provision containers on my host machines. I don't use Ansible's own Docker integration; instead, I keep Upstart scripts as templates in Ansible, so Ansible places the Upstart script on my host box and starts the container with the correct set of environment variables.

  2. zwischenzugs says:

    Great post.
    Regarding point 1 – I had a very similar experience here: http://zwischenzugs.wordpress.com/2014/07/16/phoenix-deployment-pain-and-win/
    Regarding points 3 and 5 – this was why I built http://ianmiell.github.io/shutit/, as Dockerfiles were a great idea in principle but didn't cut it for our complex and dynamic build needs.

  3. Fergus Gallagher says:

    I'm curious why you have the supervisor exit, rather than respawning the "key process".

    • John Regan says:

      Most existing Docker images run a single process, so if the process exits, the container exits as well.

      If I have my supervisor respawn the process, I’m breaking that behavior – the container will (most likely) never exit, meaning my image now behaves really differently from most images in the current Docker ecosystem.

      Instead, I leave the choice to auto-restart a process up to the system admin. In my case, I use Upstart to automatically start/stop/restart my containers. When one goes down, I can have Upstart make a note of it (notify Nagios, write to a file, whatever makes sense) so I can come in later, look through the logs, and see why the process crashed.

      If my supervisor respawned the process, I'd have to come up with some other way to keep track of exits/crashes/etc.

      Idea: I could have an environment variable set up to determine whether the container should restart the process or exit, so people who want the respawning behavior can have it.


  4. Mike Dillion says:

    I found it interesting that you ran into the default Ubuntu user (ubuntu) when upgrading to 14.04. That default user has been around forever!

    But, instead of implementing the quick fix to support 14.04, you worked around the cause of the pain (and not the symptom) to reduce fragility. Nice! Great post.

