Docker and S6 – My New Favorite Process Supervisor

In my previous blog post, I wrote about how I like to use a process supervisor in my containers, and rattled a few off. I decided I ought to expand a bit on one in particular – S6, by Laurent Bercot.

Why S6, why not Supervisor?

I know a lot of people have been using Supervisor in their containers, and it’s a great system! It’s very easy to learn and has a lot of great features. Phusion produces a very popular, very solid base image for Ubuntu based around Runit.

UPDATE 2014-12-09: I mistakenly wrote that Phusion uses Supervisor, when they in fact use Runit.

When Docker launches a container, your ENTRYPOINT process is launched as “process id 1” – and on nearly any Linux system, whether it’s a physical install, a virtual machine, or a container, “process id 1” has a special role. When any orphaned/disowned process exits, PID 1 is supposed to clean up after it.

Supervisor explicitly mentions that it is not meant to be run as your init process. If you have some subprocess fork itself off, it won’t be cleaned up by Supervisor. Phusion dealt with this by writing their own init process, which is fine – I don’t see anything particularly alarming or bad in their code – but it seems like overkill compared to just using a process supervisor that can run as init.

S6 is meant to run as “process id 1”, so I figure it’s a good way to cut out the middle man. Another benefit to S6 – it’s written in C, so I can easily produce static binaries and use it with any image, even the busybox image.

Getting S6 into an image

There are two ways to get S6 into your image – either build it within your Dockerfile, or statically build it and include it via COPY or ADD directives. I prefer the latter, since I can reduce my image’s build time and keep the image size smaller.

I’ve become a fan of using Docker to create “build images,” where I create an image that compiles code and spits out a tarball. I have an image for S6 on GitHub that produces static binaries of the S6 suite. Feel free to look at that (or just use it) to get an idea of how to compile S6.
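
The general workflow looks something like this (the image tag and tarball name are illustrative, not necessarily what my builder actually uses – check the repo for the real details):

# build the "builder" image, then run it so it writes a tarball to stdout
docker build -t s6-builder ./s6-builder
docker run --rm s6-builder > s6-static.tar.gz

# unpack the static binaries into the root/ overlay described below
mkdir -p root
tar -xzf s6-static.tar.gz -C root/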

I usually start by making a base image that includes S6 and other programs I tend to use in most of my images (for example, I can’t live without curl installed), then continue building images from that base. I’ve taken to making a folder named root in my project directory, laying out my filesystem in it, then having COPY root / towards the end of my Dockerfile. This lets me bring in S6 and my own configuration files as one layer.

Here’s a simplified version of what my root directory looks like.

root
|-- etc
|   `-- s6
|       |-- cron
|       |   |-- finish
|       |   `-- run
|       |-- syslog
|       |   |-- finish
|       |   `-- run
|       `-- .s6-svscan
|           `-- finish
`-- usr
    `-- bin
        `-- (s6 binaries)

Like I said, when I run docker build, all these files are copied into the image as a single layer.
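
For reference, the Dockerfile side of this is short. Here’s a trimmed-down sketch – the base image and package list are just placeholders for whatever you actually need (Ubuntu 14.04 and curl happen to match my own habits):

FROM ubuntu:14.04

# the stuff I can't live without
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# bring in the s6 binaries and my service directories as one layer
COPY root /

The ENTRYPOINT and CMD come a bit further down.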

Using S6 to start services

S6’s main init-like program is s6-svscan – when launched, it will scan a directory for “service directories” and launch s6-supervise on each of them. In my example above, I’m using /etc/s6 as my “root” s6 directory, so cron and syslog are “service directories.” That .s6-svscan directory is not a service directory – it’s a control directory used by s6-svscan itself.

Each service directory has two files – run and finish. The s6-supervise program will call your run program, and when the run program exits, it will call your finish program, then start over (by calling your run program). The run program can be anything – a shell script, or if a program requires no arguments/setup, I can just symbolically link to it, and the same goes for the finish program. If I don’t have any particular clean-up to do when my run program exits, I’ll just make finish a symlink to /bin/true.
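
Creating that symlink inside the root overlay is a one-liner – for example, for the cron service shown above:

# nothing to clean up after cron exits, so finish is just /bin/true
ln -s /bin/true root/etc/s6/cron/finish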

When it comes to actually running a program, S6 is similar to Supervisor, Upstart, or Systemd – S6 will “hold on” to a program, instead of, say, writing out PID files like SysV Init does. So I have to make sure each of my run scripts launches programs in a foreground/non-daemonizing mode.

This is usually pretty easy to do – here’s my run script for cron:

#!/bin/sh
exec cron -f

My run script for syslog:

#!/bin/sh
exec rsyslogd -f /etc/rsyslog.conf -n

And here’s the ENTRYPOINT and CMD directives from my Dockerfile:

ENTRYPOINT ["/usr/bin/s6-svscan","/etc/s6"]
CMD []

That’s all! I’m now cruising along with S6, and running multiple processes inside a container.

Handling docker stop

Earlier, I mentioned that .s6-svscan directory, and it’s actually pretty important. When Docker stops a container, it sends a TERM signal to “process id 1”, which in my case is s6-svscan. When s6-svscan gets a TERM signal, it will send a TERM to all the running services, then try to execute .s6-svscan/finish. The important thing to note: it will not try to run the finish script in each of your service directories.

Since the container is about to be stopped (and probably destroyed), this isn’t a problem. I still like to run my ‘finish’ scripts, though, just in case I write one where I do something of importance. Here’s my .s6-svscan/finish script:

#!/usr/bin/env bash
for file in /etc/s6/*/finish; do
    "$file"
done

UPDATE 2015-03-01: Laurent reached out to me and pointed out I was incorrect – when s6-svscan gets a TERM signal, it will:

  • Send a TERM signal to each instance of s6-supervise (each of your monitored processes has a corresponding s6-supervise process).
  • s6-supervise will send a TERM signal to the monitored process, then execute your service’s finish script.
  • After that, s6-svscan will run your .s6-svscan/finish script.

When s6-supervise receives that TERM signal, it runs finish with stdin/stdout pointed to /dev/null – meaning you won’t see any text output from those finish scripts. But they are in fact running, which means the script above, where I manually call each finish script, is not necessary.

Laurent is going to try and come up with a solution for that, since that behavior is confusing.

END UPDATE

Playing nice in the Docker ecosystem

In my previous article, I mentioned that I like to pick some process and call that the “key” process – if that dies, then my container should exit. I do this because most Docker containers do exactly that – they run a single process, and if that process calls it quits, the container calls it quits, too.

For example, let’s say I’m running a NodeJS program (for kicks, I’ll go with Ghost), cron, and syslog in a container. I don’t particularly care if cron or syslog die – I’ll just have S6 restart the process. But if Ghost dies, I want the container to exit, and let my host machine handle alerting me and restarting it. So my finish script for Ghost would be:

#!/bin/sh
s6-svscanctl -t /etc/s6

This will instruct s6-svscan to bring everything down and exit.
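
For completeness, the matching run script for Ghost is just whatever starts it in the foreground – something along these lines, assuming Ghost is installed in /var/www/ghost (adjust the path and start command for your own setup):

#!/bin/sh
cd /var/www/ghost || exit 1
exec npm start --production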

Ideas for future projects

There are a few things I want to implement in the future.

Dependency-based startup

I think S6 is capable of this, I just haven’t figured out how!
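
One idea I’d like to try (untested, so treat it as a sketch): have a service’s run script block on s6-svwait until whatever it depends on is up. Something like this, where my-app is a stand-in for the dependent program:

#!/bin/sh
# block until syslog's supervisor reports it as up, then start the app
s6-svwait -u /etc/s6/syslog
exec my-app --foreground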

Efficient logging

S6 has an interesting way to handle logs – if I create a directory named log inside a service directory and place a run script in it, the output of my program is piped into that run script. There’s an s6-log program that’s meant to be used as that piped-into program – it handles log rotation, can pipe logs into other processes, and so on.
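
As a starting point, a log/run script for the cron service (i.e. /etc/s6/cron/log/run) might look something like this – the rotation settings and log directory are just guesses at sane defaults:

#!/bin/sh
# keep up to 10 rotated files of roughly 1MB each in /var/log/cron
mkdir -p /var/log/cron
exec s6-log -b n10 s1000000 /var/log/cron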

I see a lot of images that just dump all output to stdout and let Docker handle it. I think there’s potential to come up with something better with these tools – I’m not sure what “better” is yet, but it’s something I’m going to be thinking about.

Conclusion

I think S6 is a really interesting, efficient alternative to Supervisor, and I especially like that I can include it on any image, even the busybox image. I really hope you enjoyed reading this – do you have any neat ideas? Have you been working on something similar? Use the comments to let me know. Thanks so much!

Resources

If you’re interested in building on top of what I’ve created, I have a collection of images here.

I still consider everything except the “base” image pretty volatile right now. I keep all images in their own branches (and within a folder within that branch), so the latest version of the “base” image would be at /base in the “base-14.04” branch. You can find the base image here.

The base image is actually a bit more complicated than what I’ve written about here, but it still follows the same basic structure/layout – it just runs a few more services, and they have more complicated startup scripts.

You can find my Arch Linux image with s6 installed here.

Here are links to my Ubuntu and Arch images on the Docker registry.

John Regan is a System Administrator for the Oklahoma Mesonet, a world-class network of environmental monitoring stations, a joint project of the University of Oklahoma and Oklahoma State University.

Posted in Tutorial
37 comments on “Docker and S6 – My New Favorite Process Supervisor”
  1. kfei says:

    Great article!
    I think s6 is what I’m looking for! Thank you so much!

    You’ve mentioned that:
    “When s6-svscan gets a TERM signal, it will send a TERM to all the running services, then try to execute .s6-svscan/finish.”

    In my case the processes don’t even have time to ‘TERM’ themselves, because s6-svscan returns immediately after sending TERM to all running processes, and the container just stops.

    My workaround is to add a “sleep ” to the end of .s6-svscan/finish, giving running processes time to gracefully shut themselves down. But I don’t think sleeping for a fixed time is a good approach… I’ve also tried ‘wait’, but it doesn’t work – the container still stops instantly.

    Do you have a suggestion on that? Thanks.

    • John Regan says:

      Maybe you could try the `s6-svwait` program? You could try something like:

      #!/usr/bin/env bash

      for service in /etc/s6/* ; do
          s6-svwait -d $service
      done

      That will go through and block until each service enters a “down” state.

      Give that a try and let me know if it works!

      Here’s details on s6-svwait: http://skarnet.org/software/s6/s6-svwait.html

      • kfei says:

        Thank you.

        `s6-svwait` is the solution!

        Just to mention, in my images derived from debian:jessie there is no such file as `/etc/leapsecs.dat`. I have no idea what it is, but it makes `s6-svwait` error out. So my workaround is adding a `touch /etc/leapsecs.dat` before the `s6-svwait` call. But maybe you have a better way to handle that?

        Another question: it seems like your ‘jprjr/arch:latest’ is missing `/bin/sh`? Building the ‘arch-s6-builder’ repo fails at the first RUN instruction.

      • John Regan says:

        So the more correct way to deal with /etc/leapsecs.dat is to pull in the tar file made by my s6-builder image – it has /etc/leapsecs.dat now.

        Also, that was a really odd error with jprjr/arch:latest – the root filesystem is a .tar.xz file, and Docker seemed to just copy the file as-is to / instead of extracting it to /, like it’s supposed to.

        I just updated my git repo with a .tar.bz2 file instead, and it seems to be working as it should. Run “docker pull jprjr/arch” and try it again.

      • Aris Pikeas says:

        `s6-svwait` won’t work, because it only receives notifications and doesn’t poll. I tried waiting in .s6-svscan/finish and it just hangs, because by the time that code is run, the processes have already been termed.

      • John Regan says:

        Hi Aris – I have an update to the article (under “Handling docker stop”); you don’t need a wait command in the finish script at all. I had misunderstood how s6 handles TERMing processes after getting a TERM signal.

    • John Regan says:

      A quick heads-up – it turns out my image was missing /etc/leapsecs.dat which is required by some of the s6 programs (but not all), and s6-svwait fails without it.

      I’ve updated my build script to include it. My Ubuntu image is being rebuilt right now, you’ll want to do a docker pull on it if you’re still having problems.

  2. Abe Voelker says:

    Excellent post. I really agree with the concept of having the container die if the core supervised process dies and letting a host supervisor restart the container. However, I could never figure out how to make that work properly with runit or Supervisor – I’ll have to check out S6!

    One nitpick (unless I’m dumb) – I believe Phusion’s baseimage-docker uses “runit”, not Supervisor, which is actually designed to be an init system (but not designed to handle certain Dockerisms).

    • kfei says:

      Phusion’s baseimage uses runit only for process supervision. They implemented a Python wrapper called `my_init` for the init process.

    • John Regan says:

      You are absolutely correct – I’m not sure how I managed to make that mistake, thanks for letting me know!

      I’ve updated the article to state that Phusion is based around runit. That actually makes the whole thing more odd, since I’m pretty sure you can use runit as PID 1, though I’m having a hard time finding any documentation for or against that.

      It also seems silly to bring in Python as a dependency just for an init process, compared with a statically-compiled binary.

      • Jonathan Matthews says:

        They (Phusion) state that runit has some problems with being PID1, hence they wrote my_init. I /believe/ the only problem they explicitly mention is Runit not doing cleanup of zombie processes properly, which I’m pretty sure I’ve seen mentioned as working in post-2008(?) versions of Runit.

        s6 looks /very/ similar to Runit which is unsurprising given (I assume!) their joint DJB-esque heritage. Did you actively choose s6 over Runit, may I ask?

      • John Regan says:

        Hi Jonathan – I did actively choose s6 over runit. When Docker stops a container, it sends a TERM signal to that container’s PID1.

        The documentation for runit doesn’t list what happens when runit receives a TERM, and the runsvdir documentation makes it sound like runsvdir just quits when getting a TERM, instead of going into a shutdown procedure or sending a TERM signal to monitored services.

        The documentation for s6-svscan (basically) states that when it receives a TERM signal, it will send a TERM to supervised processes. So as far as I can tell, s6 will do a proper shutdown/exit when Docker stops a container.

  3. […] Docker and S6 – My New Favorite Process Supervisor […]

  4. gigablah says:

    Awesome article! I’ve created a phusion replacement using busybox and s6 and I’m really happy with the result (~30mb with nodejs and other bells and whistles installed).

    After reading the s6 documentation, it appears that the “finish” script in each service directory is optional? So you don’t have to symlink to /bin/true if you have nothing to run.

    Also, what do you think of adding the -t0 flag to s6-svscan? Since you’re unlikely to be adding new services to your Docker container at runtime.

    • John Regan says:

      Hi there! Thanks for reading! Your busybox+s6+nodejs image sounds really cool, do you have a link to it you could share?

      The documentation makes the “finish” script sound optional, but s6 winds up getting chatty/noisy when it can’t find a finish script. Also, I like the idea of keeping all my services the same – they all have a “run” and “finish” script, whether or not the “finish” script does anything. That way, I’m not surprised when I come back in 6 months and wonder where the heck my finish script went.

      Also, I think adding the -t0 flag is a great idea, because you’re right – it’s pretty unlikely that anybody will be adding services into a running container.
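
      For anyone following along, that just means changing the ENTRYPOINT from the article to something like:

      ENTRYPOINT ["/usr/bin/s6-svscan","-t0","/etc/s6"]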

      • gigablah says:

        Hello! Just an update, seems like Alpine Linux is a better solution since it has an up-to-date nodejs package. So I guess I’ll be switching to that instead of rolling my own BusyBox and manually compiling nodejs. Looks like you’re working on adding s6 to their package index 🙂

  5. Thank you for this article – it really helped me start getting our Docker environment set up properly. I was trying to find a solution for having the container restart its own processes, and this works really well. I will be using runit on the host OS and setting up a service to run the app. I have just started testing this and it is working without any issue. If the s6 service inside the container stops, exits, or quits, it restarts on its own. On the host, when running docker stop container-name while the runit service managing the container is up, runit will restart the container. And finally, running sv stop container-service-name stops everything. I am going to keep testing this with more containers to manage and with linked containers – hopefully it will all continue to work. I have also built an image from your Ubuntu image that includes Ruby 2.1. If you are interested, here is the link to build a Ruby app with: https://registry.hub.docker.com/u/tetheredge/s6-ruby2.1/

  6. michael says:

    Thanks for the article.
    How does s6 handle environment variables passed to the container?

  7. […] There are two goals, that we want to accomplish at the same time as finding a solution to the PID 1 problem.  These goals, both coincide with fixing the problem with PID 1 issue.  The first goal, is to have the host be able to detect that the container has an issue and needs to restart the container again, one specific instance we want to solve for is to auto start the container after reboot or shutdown of the host.  Yes, I know that Docker is able to do this, but I want more control over the containers as well and a more simpler interface to manage the start, stop and restarting of containers.  The second goal, is to have the container setup in a process supervisor so that it can recover from it’s own failure or to terminate the process it is running in and send that signal up the chain to the host.  Once the host receives the signal it will then restart the container.  Now Docker does not have this level of control, which is why I looked around to find a solution.  The answers came from the links I posted above, and the concept of for one of the answers, came from this blog post. […]

  8. […] past December I wrote about s6, a small process supervisor that works very, very well in Docker containers. It provides all the […]

  9. Alper says:

    I seem not to understand the purpose of process supervisors in containerized environments. As far as I recall, the Docker team itself encourages “one service per container”. Therefore, the service should be started directly in non-daemon mode. If you need more than one process, you should split the services. Is that too idealistic, or did I miss the point?

    • John Regan says:

      Hi Alper! I don’t think splitting up containers into individual services is “too idealistic” at all, and it’s what I encourage.

      The way I think of it is this: what defines a service? I think a single service is not necessarily the same as a single process or program. In fact, many services that run in a foreground mode still spawn processes in the background and act as a process supervisor – NGINX and Postfix both operate like that. Also, using a process supervisor lets you abstract the concept of a service – for example, I think of GitLab as a service, and I have a container that runs GitLab, SSH, and NGINX, since I think of that as making up one logical service.

      Even if you stick to “one process per container,” there’s still a few good reasons for using a supervisor:

      • As I mentioned, many programs launch background processes. Some (like NGINX) handle process supervision correctly and clean up after processes exit, but many don’t and assume that PID 1 will handle it. If you’re running a program directly (as PID 1) it may never clean up orphaned processes, so I prefer to use a supervisor that is explicitly meant to handle those duties.
      • When you run docker stop, Docker sends a TERM signal to whatever’s running as PID 1 in the container. Most programs catch the TERM and use it to signal a normal shutdown, but some don’t. I know Consul prefers to use an INT signal by default. If you use a supervisor, you can catch that TERM signal and send a different signal to your supervised process.
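
      To make that second point concrete, here’s a rough sketch of a run script that traps the TERM from s6-supervise and forwards an INT instead – the consul invocation is just illustrative:

      #!/bin/sh
      # consul prefers INT for a graceful leave, so translate TERM into INT
      consul agent -config-dir=/etc/consul.d &
      pid=$!
      trap 'kill -INT "$pid"' TERM
      # the first wait returns when the trap fires; wait again for consul to exit
      wait "$pid"
      wait "$pid"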

      There’s tons of approaches to building images, I like to make my images reasonably self-contained and flexible. If I build an image for running a web service like WordPress, I know some users won’t particularly care about setting up a separate NGINX container, so my image will run NGINX along with PHP-FPM with a reasonable set of default configs. For those users that do prefer to run NGINX in a separate container, I’ll use environment variables to disable NGINX startup.

      Like I mentioned above, even if the container is only running one process, that process may not be designed to run as PID 1, so I think it’s worth using a supervisor that can take care of PID 1 duties.

      Thanks for the feedback!
      -John

      • Alper says:

        That’s very insightful, thank you!

      • cappa22 says:

        Thanks for the posts on S6. I’m trying it out by defining an OpenSSH service in my container, but I haven’t managed to get it working yet. In this message, you mention you have an SSH service running in a container. How did you do that?

        The problem I’m facing when executing my container is this:


        [services.d] starting services
        [services.d] done.
        sshd re-exec requires execution with an absolute path
        sshd exited 255
        [cont-finish.d] executing container finish scripts…
        [cont-finish.d] done.

        The main problem is that I do not know how to write the run script to launch sshd the way s6 requires (without forking, and with exec).

        This does not work:

        #!/bin/bash
        mkdir /var/run/sshd
        chmod 0755 /var/run/sshd
        exec /usr/sbin/sshd -D

        Cheers!

  10. Sameer Naik says:

    I was investigating the use of s6 as a process manager, and in “Handling docker stop” you mention that s6-svscan exhibits the following behaviour when it receives a TERM signal:

    – Send a TERM signal to each instance of s6-supervise (each of your monitored processes has a corresponding s6-supervise process).
    – s6-supervise will send a TERM signal to the monitored process, then execute your service’s finish script
    – After that, s6-svscan will run your .s6-svscan/finish script.

    Since I am not able to see the output of the process when it receives the TERM signal, I decided to check whether the service’s finish script was getting executed by adding a `sleep 10` to it. The test was: if the container takes some time to exit when it receives the TERM signal, I can be certain the finish script is being executed. Unfortunately, I did not find this to be the case – it terminated immediately, like some have mentioned in the comments.

    Regarding the TERM being propagated to the processes: I decided to test it using MariaDB, since on startup MariaDB clearly states whether it was shut down properly the last time. And sure enough, the logs said `InnoDB: Database was not shut down normally!`

  11. xataz says:

    Hi,

    s6 looks good, but I have one question:
    For a container, I need to launch a script before all the daemons/services.
    For example: in a LEMP container, I need to create tables, files, and directories before launching nginx, php, and mysql.
    How do you do this with s6?

    Sorry for my English.

    Thanks

  12. Great post, helped me a lot. Thanks John!

  13. […] can read about s6 primarily here, but also some success container stories here and here. It’s useful to know that the groundwork has been layed and that s6 is indeed viable […]
