What follows is a lightly modified excerpt from chapter 10 of Docker in Action. Chapter 10 covers the Docker Distribution project in depth.
Simple Storage Service (or S3) from AWS offers several features in addition to blob storage. You can configure blobs to be encrypted at rest, versioned, access audited, or made available through AWS’s content delivery network.
Use the “s3” storage property to adopt S3 as your hosted remote blob store. There are four required sub-properties: “accesskey,” “secretkey,” “region,” and “bucket.” These are required to authenticate your account and set the location where blob reads and writes will happen. Other sub-properties specify how the Distribution project should use the blob store. These include “encrypt,” “secure,” “v4auth,” “chunksize,” and “rootdirectory.”
Setting the “encrypt” property to “true” enables encryption at rest for the data your registry saves to S3. This is a free feature that enhances the security of your service.
The “secure” property controls the use of HTTPS for communication with S3. The default is false, and will result in the use of HTTP. If you are storing private image material, you should set this to true.
The “v4auth” property tells the registry to use version 4 of the AWS authentication protocol. It defaults to false, but in general you should set it to true.
Files greater than 5GB must be split into smaller files and reassembled on the service side in S3. However, chunked uploads are also available for files smaller than 5GB and should be considered for files greater than 100MB. File chunks can be uploaded in parallel, and individual chunk upload failures can be retried individually. The Distribution project and its S3 client perform file chunking automatically, but the “chunksize” property sets the size beyond which files should be chunked. The minimum chunk size is 5MB.
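To make the sizes concrete, here is a small shell sketch of the arithmetic. The 5MB figure is the minimum chunk size mentioned above, expressed in bytes. The 100MB file size, the variable names, and the assumption that every chunk is exactly “chunksize” bytes are illustrative only.

```shell
# Minimum chunk size: 5 MB expressed in bytes
chunksize=$((5 * 1024 * 1024))

# Hypothetical 100 MB image layer, in bytes (an assumption for illustration)
filesize=$((100 * 1024 * 1024))

# Number of chunks, rounding up so a partial final chunk is counted
chunks=$(( (filesize + chunksize - 1) / chunksize ))

echo "chunksize=${chunksize} bytes, chunks=${chunks}"
```

Under these assumptions the byte value is the number you would supply for “chunksize.”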
Finally, the “rootdirectory” property sets the directory within your S3 bucket where the registry data should be rooted. This is helpful if you want to run multiple registries from the same bucket.
The following is a fork of the default configuration file and has been configured to use S3 (provided you fill in the blanks for your account).
# Filename: s3-config.yml
# Only the S3 storage section is shown here
storage:
  s3:
    accesskey: <your awsaccesskey>
    secretkey: <your awssecretkey>
    region: <your bucket region>
    bucket: <your bucketname>
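As a rough sketch of how the optional sub-properties described earlier could sit alongside the required ones, the following shell snippet writes an example storage section to a file. The file name s3-optional-example.yml is hypothetical, and the values for the optional properties are examples rather than recommendations for your account.

```shell
# Write an example storage section to a hypothetical file name.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > s3-optional-example.yml <<'EOF'
storage:
  s3:
    accesskey: <your awsaccesskey>
    secretkey: <your awssecretkey>
    region: <your bucket region>
    bucket: <your bucketname>
    encrypt: true
    secure: true
    v4auth: true
    chunksize: 5242880
    rootdirectory: /s3/object/name/prefix
EOF

# Show what was written
cat s3-optional-example.yml
```

The angle-bracket placeholders still need to be replaced with your own account details before the registry can use the file.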
After you’ve provided your account access key, secret, bucket name, and region, you can pack the updated registry configuration into a new image with the following Dockerfile:
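# Filename: s3-config.df
# Assumed: stock registry:2 base image, config copied to /s3-config.yml
FROM registry:2
# Set the default argument to specify the config file to use
# Setting it early will enable layer caching if the
# s3-config.yml changes.
CMD ["/s3-config.yml"]
COPY ["./s3-config.yml", "/s3-config.yml"]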
Build it with the following docker build command:
docker build -t dockerinaction/s3-registry -f s3-config.df .
Launch your new S3-backed registry with a simple “docker run” command:
docker run -d --name s3-registry dockerinaction/s3-registry
An alternative to building a full image would be to use bind-mount volumes to load the configuration file in a new container and set the default command. For example, you could use the following Docker command:
docker run -d --name s3-registry -v "$PWD/s3-config.yml:/s3-config.yml" registry:2 /s3-config.yml
If you wanted to get really fancy, you could inject the new configuration as environment variables and use a stock image. This is a good idea for secret material handling (like your account specifics). Running the container that way would look something like:
docker run -d --name s3-registry \
  -e REGISTRY_STORAGE_S3_ACCESSKEY="$AWS_ACCESS_KEY" \
  -e REGISTRY_STORAGE_S3_SECRETKEY="$AWS_SECRET_KEY" \
  -e REGISTRY_STORAGE_S3_REGION=us-west-2 \
  -e REGISTRY_STORAGE_S3_BUCKET=my-registry \
  -e REGISTRY_STORAGE_S3_ENCRYPT=true \
  -e REGISTRY_STORAGE_S3_SECURE=true \
  -e REGISTRY_STORAGE_S3_V4AUTH=true \
  -e REGISTRY_STORAGE_S3_CHUNKSIZE=5242880 \
  -e REGISTRY_STORAGE_S3_ROOTDIRECTORY=/s3/object/name/prefix \
  registry:2
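The variable names in that command follow a visible pattern: the path to a configuration property, uppercased, joined with underscores, and prefixed with REGISTRY_. The helper below is a small sketch of that mapping. The function name to_registry_env and the dotted-path input format are hypothetical conveniences, not part of Docker or the Distribution project.

```shell
# Convert a dotted configuration path (for example storage.s3.region)
# into the matching override variable name.
# to_registry_env is a hypothetical helper used only for illustration.
to_registry_env() {
  printf 'REGISTRY_%s\n' \
    "$(printf '%s' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

to_registry_env storage.s3.region
to_registry_env storage.s3.chunksize
```

Run as written, this prints REGISTRY_STORAGE_S3_REGION followed by REGISTRY_STORAGE_S3_CHUNKSIZE, matching two of the names used in the command.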
In practice, a blended approach will usually meet your needs: provide environment-agnostic, nonsensitive settings in a configuration file and inject the remaining components through environment variables.
S3 is offered under a usage-based cost model. There is no upfront cost to get started, and many smaller registries will be able to operate within the service’s “free tier.”
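One way the blended approach might look is sketched below: the bucket, region, and other nonsensitive settings stay in s3-config.yml, which is bind-mounted into a stock container, while the two credentials arrive as environment variables. The placeholder credential values are assumptions that let the sketch run without a real account, and the snippet prints the command as a dry run instead of starting a container.

```shell
# Placeholder credentials so the sketch runs without a real account
# (assumptions; export your own values before using this for real)
AWS_ACCESS_KEY="${AWS_ACCESS_KEY:-example-access-key}"
AWS_SECRET_KEY="${AWS_SECRET_KEY:-example-secret-key}"

# Assemble the command as positional parameters so quoting is preserved
set -- docker run -d --name s3-registry \
  -v "$PWD/s3-config.yml:/s3-config.yml" \
  -e "REGISTRY_STORAGE_S3_ACCESSKEY=$AWS_ACCESS_KEY" \
  -e "REGISTRY_STORAGE_S3_SECRETKEY=$AWS_SECRET_KEY" \
  registry:2 /s3-config.yml

# Dry run: print the command instead of executing it.
# Replace the echo with "$@" to actually start the container.
registry_cmd="$*"
echo "$registry_cmd"
```

Keeping the credentials out of the configuration file means the same file can be shared or committed without exposing account secrets. Note that the dry run prints the credential values, so avoid running it where the output is logged.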
If you are not interested in a hosted data service and do not hesitate in the face of some technical complexity, then you might alternatively consider running a Ceph storage cluster and the RADOS blob storage backend.