Efficient development with Docker and docker-compose

Written by Empanado | Published 2017/07/31

We are going to set up a development environment for a project consisting of various services. All of these services will be containerized with Docker, and they will all run simultaneously during development using docker-compose.

Our environment will feature instant code reloading, test-driven development, database connectivity, dependency management, and more™. It will be possible to easily deploy it to production with docker-compose or with Rancher. And as a bonus, we’ll set up continuous integration on Gitlab.

The article is about efficiency, so I’ll get straight to the point.

The goals

We want to:

  • Type code and observe changes as soon as possible in our Docker containers, preferably without manual actions;
  • Have a local environment that is representative of the actual deployment environment;
  • Support a number of workflows.

Let’s make it happen.

The prerequisites

You are going to need to install the following tools:

  • Docker;
  • docker-compose;
  • optionally, make, to use the convenience commands defined later in this article.

The premise

We’ll set up a project consisting of a Python service and a Java service, together with a Postgres database. The Postgres database will run on our own machine in the development environment, but is assumed to be external in production (it might be Amazon RDS, for example).

The Python service contains unit tests supported by Pytest, for which we will set up test-driven development. The Java service uses Maven for its build process.

Finally, we will use Gitlab’s container registry and Gitlab’s CI service. The code described below is also available in a Github or Gitlab repository.

This setup should demonstrate most essential concepts. However, the approach described below should work regardless of technology.

The setup

The file structure:

/myproject
  /python
    /mypackage
      run.py
    /tests
      my_test.py
    Dockerfile
    setup.py
    requirements.txt
  /java
    Dockerfile
    pom.xml
    /src
      /main
        /java
          /com
            /example
              Main.java
  docker-compose.common.yml
  docker-compose.dev.yml
  docker-compose.prod.yml
  Makefile
  python-tests.sh
  .gitlab-ci.yml

The Dockerfile for the Python service is as follows:

FROM python:3.6-slim

COPY . /code
WORKDIR /code

RUN pip install --no-cache-dir -r requirements.txt
RUN pip install -e .

ENTRYPOINT python ./mypackage/run.py

This adds the service code to the container, installs its dependencies from requirements.txt (which in this example contains pytest and watchdog), and installs the Python service itself. It also defines the command to be executed when the container is started.
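
For reference, a minimal requirements.txt matching this example could look as follows (only the two packages the article mentions; version pins are left out here):

pytest
watchdog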

The Dockerfile for the Java service can be found below:

FROM maven:3.5-jdk-8

COPY . /usr/src/app
WORKDIR /usr/src/app

RUN apt-get update && apt-get install entr -y

RUN mvn clean package --batch-mode
ENTRYPOINT java -jar target/docker-compose-java-example-1.0-SNAPSHOT.jar

Like the Python Dockerfile, this also first adds the code to the container. It then proceeds to install the Unix utility entr which we will need later. Maven is used to create a JAR file, after which we define the container command to execute the JAR file.

Finally, the docker-compose.common.yml file forms the basis for our environment and contains all configuration that matters for the application, regardless of the environment in which it runs. It is fairly straightforward:

version: '2'

services:
  python:
    build: ./python
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST

  java:
    build: ./java
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST

That gives us all the ingredients to create a development configuration.

The development configuration

Let’s have a look at the docker-compose.dev.yml file. It might seem daunting but don’t worry, we’ll go through it step by step below.

version: '2'

services:
  python:
    image: registry.gitlab.com/mycompany/myproject/python:dev
    volumes:
      - ./python/:/code
    entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." python mypackage/run.py
    depends_on:
      - postgres
    links:
      - postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
      - POSTGRES_HOST=postgres

  python-tests:
    image: registry.gitlab.com/mycompany/myproject/python:dev
    volumes:
      - ./python/:/code
    entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." pytest
    depends_on:
      - python

  java:
    image: registry.gitlab.com/mycompany/myproject/java:dev
    volumes:
      - ./java/:/usr/src/app
    entrypoint: sh -c 'find src/ | entr mvn clean compile exec:java --batch-mode --quiet'
    depends_on:
      - postgres
    links:
      - postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
      - POSTGRES_HOST=postgres

  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
    volumes:
      - /data/myproject/postgres:/var/lib/postgresql/data

  pgadminer:
    image: clue/adminer
    ports:
      - "99:80"

The development configuration — Python

Let’s start with the Python service. I’ll point out the interesting parts.

volumes:

  • ./python/:/code

What effectively happens here is that the python subdirectory on our host machine, containing the code for our Python service, is now mapped to the /code directory in our container. To see why that is relevant, let’s have a quick look again at the relevant lines in the Python Dockerfile:

COPY . /code
WORKDIR /code

Without the volumes statement in the docker-compose.dev.yml file, the contents of the python subdirectory would simply be added to the container. If a change is made on the host machine, the container has to be rebuilt before we can see those changes within the container.

However, with the volumes statement contained in the docker-compose.dev.yml file, any changes you make will immediately be reflected inside the container. This is because both directories now point to the exact same files.
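
If you want to convince yourself of this, you can peek inside the running container; for example, with the services up:

docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml exec python ls -l /code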

The next line in the docker-compose.dev.yml file makes use of this:

entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." python mypackage/run.py

This line overrides the entrypoint for the container (which is the command that is executed when you start the container). The default entrypoint was defined in the Dockerfile, as follows:

ENTRYPOINT python ./mypackage/run.py

Thanks to the entrypoint statement in our docker compose file, however, this entrypoint will now be replaced by the command starting with watchmedo. The watchmedo command is part of the watchdog package which we included in the requirements.txt file. It monitors files with the provided pattern (in this case, all *.py files) in a given directory, and if any of them is modified, watchmedo will restart the running process and execute the provided command (in this case, python ./mypackage/run.py).

This line, combined with the volume mapping we’ve mentioned earlier, means that every modification of any Python file in the ./python directory on our host machine will restart our application. If you open any Python file and modify it, you will see that every change you make will immediately be reflected in the running container.

It might be just me, but this is one of the coolest things I’ve ever seen.

Note: Keep in mind that you do need to rebuild the image if you add a new dependency.
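
For example, after adding a package to requirements.txt you would rebuild just the Python image:

docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml build python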

The development configuration — Python unit tests and Test-driven development

Let’s have a look at the part of the docker-compose.dev.yml file for the Python unit test service, named python-tests:

python-tests:
  image: registry.gitlab.com/mycompany/myproject/python:dev
  entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." pytest
  depends_on:
    - python

Interesting to note is that the image is the same as the image of the Python service. This means that it will use the exact same environment that the Python service uses; the image will only be built once.

The depends_on statement tells docker-compose to start the python service before the python-tests service.

But the most important line is, once again, the entrypoint. We do something quite similar here to what we do in the regular Python service, except that we now let watchmedo execute pytest on every modification (pytest, if you recall, was also included in the requirements.txt file).

The result of this is that every code change will now automatically execute all tests that pytest can find, giving you instant feedback on the status of your tests.

This makes test-driven development with Docker trivial.
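
As a purely illustrative example (the test body is made up, not taken from the article’s repository), a file like python/tests/my_test.py picked up by pytest could be as simple as:

def test_addition():
    # trivial check; the python-tests container re-runs it on every .py change
    assert 1 + 1 == 2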

The development configuration — Java

Java, being a compiled language, is a little more complicated to get working. Fortunately, Maven helps us most of the way.

The first important thing to note is the following line in the Dockerfile:

RUN apt-get update && apt-get install entr -y

What happens here is that the command line tool [entr](http://entrproject.org/) is installed. It works very similarly to watchdog’s watchmedo command that we used with Python, but it doesn’t need Python to function; it’s just a general-purpose Unix utility. In fact, we could have used it in the Python service as well, but, well, we didn’t.

We can see it in action in the docker-compose.dev.yml file, in the entrypoint of the java service:

entrypoint: sh -c 'find src/ | entr mvn clean compile exec:java --batch-mode --quiet'

This says: whenever any file in the src/ directory changes, ask Maven to clean, compile, and then execute the Java project.

None of this works out of the box; Maven first requires some fairly extensive configuration in the pom.xml file. More specifically, we need a few plugins: the Maven compiler plugin, the Maven JAR plugin so that the container’s default entrypoint (java -jar) has an executable JAR to run, and the Exec Maven plugin (with its java goal) to run the application during development.
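
As a rough sketch of what that plugins section might look like (the plugin choices follow the paragraph above, and com.example.Main matches the file structure; the exact versions and details are assumptions, not the article’s literal pom.xml):

<build>
  <plugins>
    <!-- compile the sources for a fixed Java version -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
    <!-- make the packaged JAR runnable with java -jar (the container's default entrypoint) -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>com.example.Main</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
    <!-- run the application during development via exec:java -->
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <configuration>
        <mainClass>com.example.Main</mainClass>
      </configuration>
    </plugin>
  </plugins>
</build>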

Once this is all configured, together with the other content that makes up a valid pom.xml file, the result is the same as with the Python service (and actually slightly better): every change to a Java source file recompiles the Java sources, installs any new dependencies (thank you, Maven!), and restarts the application.

The development configuration — Postgres

Let’s again look at the relevant lines in the docker-compose.dev.yml file:

postgres:
  image: postgres:9.6
  environment:
    - POSTGRES_USER=user
    - POSTGRES_PASSWORD=password
    - POSTGRES_DB=myproject
  volumes:
    - /data/myproject/postgres:/var/lib/postgresql/data

pgadminer:
  image: clue/adminer
  ports:
    - "99:80"

The postgres service uses a standard Postgres image, which comes with a default configuration. The environment variables configure Postgres by defining a database called “myproject”, with username “user” and password “password”.

The Python and Java services also define these environment variables. They are expected to use these in the application code to connect to the database. In the docker-compose.dev.yml file these values are all hardcoded. When building the production container, however, it is expected that the production values are passed in as environment variables from an external source. This allows for integration with a proper secrets management toolchain.
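
For instance, the Python service might read these variables roughly like this (a sketch only; it assumes a driver such as psycopg2 has been added to requirements.txt, which the article does not list):

import os

import psycopg2  # assumed extra dependency, not part of the example requirements.txt

# connection settings injected as environment variables by docker-compose
conn = psycopg2.connect(
    host=os.environ["POSTGRES_HOST"],
    dbname=os.environ["POSTGRES_DB"],
    user=os.environ["POSTGRES_USER"],
    password=os.environ["POSTGRES_PASSWORD"],
)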

As the last instruction for the postgres service, a volume is defined. This maps the Postgres data directory to a location on your host machine. While not strictly necessary, this is useful to persist the Postgres data if the container is deleted for whatever reason.

Finally, we also define the pgadminer service, which starts adminer, a useful tool for doing database management through a web interface. With this configuration it is accessible on port 99 of your host machine (so http://127.0.0.1:99). As hostname you should use the name of the Postgres service, postgres in this case: because both services are defined in the same docker-compose file they share the same Docker network, and service-name DNS is handled automagically for you.

The development configuration — Building and executing

Now let’s take it for a spin.

First we have to build all containers for development. From your command line:

docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml build

And to start all services:

docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml up

As this is a lot to type, and because I like self-documenting entrypoints, I tend to define these and other essential project-wide commands in a Makefile:

dev-build:
	docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml build --no-cache

dev:
	docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml up
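
Building and starting the development environment then becomes:

make dev-build
make dev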

At some point you will probably want to run your Python unit tests without bringing up all your services. For that reason, we define python-tests.sh as follows:

#!/bin/bash
docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml run --rm --entrypoint pytest python $*

This will execute pytest in the python container, executing all tests. Any arguments provided to the script will be directly passed to the pytest command in the container (thanks to the $*), allowing you to run it like you would normally run pytest. Finally, we extend the Makefile with the following:

test:
	./python-tests.sh
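
You can now run the whole suite with make test, or pass pytest arguments straight through to the script; for example (after making the script executable once):

chmod +x python-tests.sh
./python-tests.sh tests/my_test.py -x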

The production configuration

Almost there. Let’s have a look at docker-compose.prod.yml:

version: '2'

services:
  python:
    image: $IMAGE/python:$TAG
    restart: always

  java:
    image: $IMAGE/java:$TAG
    restart: always

That’s really all there is to it. Most of the configuration should be in docker-compose.common.yml, and the commands and entrypoints are all in the Dockerfiles. You do still need to pass in the environment variables that have no value yet (those defined in docker-compose.common.yml and in this file), but that should be handled by your build script.
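
In its simplest form, such a build or deploy script just exports the missing values before calling docker-compose; a sketch with placeholder values (real secrets would come from your CI variables or a secrets store):

export IMAGE=registry.gitlab.com/mycompany/myproject
export TAG=latest
export POSTGRES_HOST=mydb.example.com   # hypothetical external database host
export POSTGRES_DB=myproject
export POSTGRES_USER=user
export POSTGRES_PASSWORD=password
docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml up -d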

With this, we are ready to build the services for production. So let’s do exactly that, and build them with the CI service from Gitlab. Let’s have a look at the .gitlab-ci.yml file. It’s quite bare-bones and allows for optimization, but it gets the job done.

stages:
  - build
  - test

variables:
  TAG: $CI_BUILD_REF
  IMAGE: $CI_REGISTRY_IMAGE

services:
  - docker:dind

image: docker

before_script:
  - apk add --update py-pip
  - pip install docker-compose
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY

build:
  stage: build
  script:
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml build
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml push

test-python:
  stage: test
  script:
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml pull python
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml run --rm --entrypoint pytest python

There are a few Gitlab CI specific things in here, such as the definition of the docker:dind service and the image within which to run the build, both required to have Docker available. The before_script part is also important, as it installs docker-compose (because I couldn’t find a good image with an up-to-date version). You will also notice that the $TAG and $IMAGE variables are set from the environment variables that Gitlab’s CI runner passes in by default.

Furthermore, Gitlab has a concept of secret variables that are passed as environment variables in your build. Simply set the right values and docker-compose will pick them up. If you use another CI environment, I’m sure it has some mechanism for this as well. And if you prefer to be a little more low-tech, you can of course also write your own deploy script.

Wrapping up

So there you have it: an efficient but not very complicated setup to orchestrate and develop most projects inside Docker.

If you’re interested, I’ve also written an article on setting up simple deployment with docker-compose on Gitlab. That should take you from a development environment that builds on Gitlab CI right to continuous deployment with docker-compose.

