Using Docker For CI Pipelines

August 21, 2014 Johannes Petzold

In a previous post, we provided an overview of the CI pipelines that were recently set up for loggregator.

This post zooms in and discusses both why and how we use Docker in our pipelines.

If you have never used Docker before, check out the official online tutorial!

Why Docker?

When we first started with GoCD, we quickly discovered that the environment on the GoCD build agents (the hosts that run our build scripts) did not reliably provide the tools our builds needed. Some tools were missing entirely, others were present in the wrong version, and yet others were not on the $PATH. Worse, the environment changed over time.

Docker provided an ideal solution to this problem. It allowed us to create a lightweight Linux runtime environment that starts out exactly the same way for every build, and lets us control every detail of its configuration, without affecting other teams sharing the same build agents.

Docker brought an additional benefit: we can now run the build scripts on developer machines and be highly confident that they will behave exactly as they do on the GoCD agents. This sped up the feedback and development cycle for the GoCD pipeline itself, and it continues to be useful whenever we need to troubleshoot non-trivial build problems, since we can easily reproduce them locally.

How We Are Using It

Most of our builds share the same Docker image. We host this image on the public Docker registry and update it, by editing its Dockerfile, whenever our build environment needs to change.
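Although the post doesn't include it, such a Dockerfile typically pins the exact tools and versions the builds depend on. A hypothetical sketch (the base image, packages, download URL, and Go version below are illustrative, not the actual contents of lamb-ci-base):

```dockerfile
# Hypothetical sketch of a CI base image; the real cloudfoundry/lamb-ci-base
# Dockerfile may differ. Everything is pinned so every build starts identically.
FROM ubuntu:14.04

# Install the exact build tools the pipelines expect.
RUN apt-get update && apt-get install -y \
    git \
    curl \
    mercurial

# Install a specific Go version instead of relying on whatever the host has,
# and put it on the PATH inside the container.
RUN curl -sSL https://storage.googleapis.com/golang/go1.3.linux-amd64.tar.gz \
    | tar -C /usr/local -xz
ENV PATH /usr/local/go/bin:$PATH
```

Because the image is rebuilt from this file, any change to the build environment is reviewable and versioned alongside the pipeline code.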

In order to start a container from this image and run a build script within it, we’re using the following Bash helper function:

# Pull the shared build image and run the given command inside a fresh container.
run_in_docker() {
  docker pull cloudfoundry/lamb-ci-base
  docker run --rm \
    -v "$WORKSPACE:$WORKSPACE" \
    --env-file <(env | grep -vE 'PATH|TMP|HOME|LANG|GOROOT') \
    cloudfoundry/lamb-ci-base "$@"
}

With this helper, we can execute build scripts in Docker like this:

run_in_docker $WORKSPACE/lamb-ci-tools/pipelines/loggregatorlib/run-unit-tests-docker.sh

Some of the details include:

  • We use docker pull to ensure the container uses the latest image. If the image is already cached locally, this operation is very fast.
  • $WORKSPACE is a path on the build host containing all relevant sources and build scripts, as well as build output directories. To avoid having to copy potentially large amounts of data into and out of the container, we use the -v option to mount this path directly into the container.
  • We use the --env-file option to copy all environment variables (minus a few blacklisted ones) from the build host into the container. This makes GoCD configuration accessible from within the container.
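The environment-variable filtering can be exercised on its own, outside Docker. This standalone sketch (the variable names are illustrative) shows what the process substitution passed to --env-file produces:

```shell
#!/bin/bash
# Demonstrate the env filter passed to --env-file: most variables are kept,
# while blacklisted ones that would clobber the container's own settings
# are dropped. The variable names here are illustrative.
export GO_PIPELINE_NAME=loggregatorlib   # e.g. a GoCD-provided variable
export GOROOT=/usr/local/go              # blacklisted: container sets its own

env | grep -vE 'PATH|TMP|HOME|LANG|GOROOT' > /tmp/build.env

grep -c '^GO_PIPELINE_NAME=' /tmp/build.env   # prints 1 (kept)
grep -c '^GOROOT=' /tmp/build.env             # prints 0 (dropped)
```

Note that grep matches anywhere in a line, so a variable whose name or value merely contains a blacklisted substring is filtered out as well; the blacklist is deliberately coarse.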

Issues We’ve Seen

The main issues we ran into did not come from Docker itself; they were all related to Boot2docker (the official “Mac version” of Docker). They were therefore not a problem for actual pipeline runs on the GoCD agents, which run Linux, but they did pose an obstacle during pipeline development, which happened on Macs.

First, we realized that mounting host directories into the container simply isn’t supported by Boot2docker. Fortunately, we did find a workaround that involved building a custom Boot2docker image to enable VirtualBox guest additions.

Second, the host directories mounted through VirtualBox (into the container) do not support symlinks, which caused problems for some of our build steps that attempted to create symlinks. Our workaround was to run these steps in a separate, non-shared directory within the container, after initializing this directory with symlinks to the contents of the shared directory they needed to operate on. This allowed the build steps to create additional symlinks within their directory.
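The workaround above can be sketched as follows (directory names are illustrative; mktemp stands in for the real shared and container-local paths):

```shell
#!/bin/bash
# Sketch of the symlink workaround. The VirtualBox-shared mount rejects
# symlink creation, so the build step runs in a container-local directory
# seeded with symlinks to the shared contents. Paths are illustrative.
shared=$(mktemp -d)      # stands in for the VirtualBox-shared workspace
workdir=$(mktemp -d)     # ordinary container filesystem; symlinks work here

echo 'package main' > "$shared/main.go"   # example shared source file

# Seed the working directory with symlinks to the shared contents.
for entry in "$shared"/*; do
  ln -s "$entry" "$workdir/$(basename "$entry")"
done

# Build steps run here and may now create further symlinks freely:
cd "$workdir"
ln -s main.go entrypoint.go
cat entrypoint.go    # prints: package main
```

Reads and writes still go through to the shared directory via the seeded symlinks, so build output remains visible on the host.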

Third, we encountered sporadic network issues in build steps that transfer large files (hundreds of MB and above): the transfers sometimes aborted midway due to connection resets, or even produced corrupted data. We haven’t found a good workaround for this issue yet; however, we have seen related issues in Boot2docker’s backlog, so there is a chance this will be resolved in future versions.
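In the meantime, failing fast beats building against a corrupted artifact. A sketch of a post-transfer integrity check (the helper name and file path are hypothetical, and sha256sum is assumed to be available):

```shell
#!/bin/bash
# Sketch: detect (not prevent) a corrupted large-file transfer by checking
# the file against a known SHA-256 digest. The helper name and file path
# below are hypothetical; sha256sum is assumed to be on the PATH.
verify_transfer() {
  local file=$1 expected=$2
  local actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $file" >&2
    return 1
  fi
}

# Stand-in for a downloaded artifact:
printf 'release tarball contents' > /tmp/artifact.tgz
expected=$(sha256sum /tmp/artifact.tgz | awk '{print $1}')

verify_transfer /tmp/artifact.tgz "$expected" && echo "transfer verified"
```

A build step that retries the download on mismatch turns silent corruption into, at worst, a slow build rather than a broken one.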
