Channel: Tomas Fernandez, Author at Semaphore

Change Management for Containers

Touching a working Dockerfile can feel like playing with fire. We know that an innocent-looking change can have branching, hard-to-debug consequences. It’s easy to get burned. But change is inevitable, and while commits on Dockerfiles are easy to control, the impact of those changes on the resulting image is not. Fortunately, where there’s a need, there’s a tool. So, let’s dig into container-diff in this tutorial.

Introducing container-diff

Available for macOS, Linux, and Windows, container-diff is, as the name suggests, diff for container images.

The project, developed by many of the same faces behind Container Structure Tests, does a lot more than just diffing: it can analyze container images, show installed packages, and reverse-engineer the commands used to generate them.

Testing containers

Container-diff has the following test modes:

  • Size: shows the total filesystem size.
  • Packages: shows a list of OS-installed packages (only for Debian-based distros), as well as those installed with pip and npm.
  • Filesystem: shows all the files in the image and their size.
  • Layer history: prints the commands that generated each of the layers in the image.

The command to analyze an image looks like this:

container-diff analyze [--type=TEST_TYPE] <IMAGE_NAME>

The tool pulls the image from the registry and unpacks the filesystem into $HOME/.container-diff/cache. Then, the contents are scanned, and a report is printed out.
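
Because unpacked images are kept on disk, the cache can grow after analyzing several large images. Below is a minimal sketch for checking how much space it uses, assuming the default cache location mentioned above. The `cache_usage` helper name is made up for illustration.

```shell
#!/bin/sh
# Default cache location used by container-diff, per the paragraph above.
CACHE_DIR="${HOME}/.container-diff/cache"

# cache_usage DIR: print the disk space used by DIR, or "0" if DIR does not exist.
cache_usage() {
  if [ -d "$1" ]; then
    du -sh "$1" | cut -f1
  else
    echo "0"
  fi
}

echo "container-diff cache: $(cache_usage "$CACHE_DIR")"
```

Since this is a cache, removing the directory should only force images to be pulled and unpacked again on the next run.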

So, for instance, we can analyze a PostgreSQL image with:

$ container-diff analyze postgres:14

-----Size-----

Analysis for postgres:14:
IMAGE           DIGEST                                                       SIZE
postgres       sha256:3ee027aeb3c8bc4a5870b21 ... 6e27685ac1eab6d4ada        352.9M

The default test is size. Change it to --type=apt to find out which OS-level packages are installed.

$ container-diff analyze --type=apt postgres:14

-----Apt-----

Packages found in postgres:14:
NAME                             VERSION                             SIZE
-adduser                         3.118                               849K
-apt                             2.2.4                               4.2M
-base-files                      11.1+deb11u1                        340K
-base-passwd                     3.5.51                              243K
-bash                            5.1-2+b3                            6.3M
-bsdutils                        1:2.36.1-8                          394K
-coreutils                       8.32-4+b1                           17.1M

...

-util-linux                      2.36.1-8                            4.5M
-xz-utils                        5.2.5-2                             612K
-zlib1g                          1:1.2.11.dfsg-2                     166K

Similarly, you can get a list of globally-installed packages for Node and Python with --type=node and --type=pip.

$ container-diff analyze --type=pip python:3.10-bullseye

-----Pip-----

Packages found in python:3.10-bullseye:
NAME               VERSION       SIZE         INSTALLATION
-pip               21.2.4         5.1M         /usr/local/lib/python3.10/site-packages
-setuptools        57.5.0         2.4M         /usr/local/lib/python3.10/site-packages
-wheel             0.37.0         94.4K       /usr/local/lib/python3.10/site-packages

You can see every file in the image with --type=file, along with its size.

$ container-diff analyze --type=file postgres:14

-----File-----

Analysis for postgres:14:
FILE             SIZE
/bin              5.1M
/bin/bash         1.2M
/bin/cat          42.9K

...

/var/spool       7B
/var/spool/mail   7B
/var/tmp          0

💡 Use --order to show files ordered by size instead of alphabetically.

Finally, the history test shows the Docker layers, which roughly reflect the Dockerfile. The output of --type=history is hard to read, so we’ll format it with sed.

$ container-diff analyze --type=history postgres:14 | sed 's/  */ /g;s/;/\n\t/g'

-----History-----

Analysis for postgres:14:
-/bin/sh -c #(nop) ADD file:16dc2c6d1932194edec28d730b004fd6deca3d0f0e1a07bc5b8b6e8a1662f7af in /
-/bin/sh -c #(nop) CMD ["bash"]
-/bin/sh -c set -ex
 if ! command -v gpg > /dev/null
 then apt-get update
 apt-get install -y --no-install-recommends gnupg dirmngr
 rm -rf /var/lib/apt/lists/*
 fi
-/bin/sh -c set -eux
 groupadd -r postgres --gid=999
 useradd -r -g postgres --uid=999 --home-dir=/var/lib/postgresql --shell=/bin/bash postgres
 mkdir -p /var/lib/postgresql
 chown -R postgres:postgres /var/lib/postgresql

...
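
The same clean-up can be wrapped in a small reusable filter. This is only a sketch: `format_history` is a made-up helper name, and it uses awk for the line breaks because `\n` in a sed replacement is not portable across sed implementations.

```shell
#!/bin/sh
# format_history: read history output on stdin, collapse runs of spaces
# into one (note the two spaces before the "*"), and start a new
# tab-indented line at every ";".
format_history() {
  sed 's/  */ /g' | awk '{ gsub(/;/, "\n\t"); print }'
}

# Usage (requires container-diff and access to the registry):
#   container-diff analyze --type=history postgres:14 | format_history
```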

Comparing containers

We’re only scratching the surface so far. Container-diff really shines when comparing images. The command for this is:

container-diff diff [--type=TEST_TYPE] <IMAGE1> <IMAGE2>

Let’s see some use cases for image comparison.

Use case 1: generating a changelog

Container-diff works great for generating changelogs. And, as we’ll see in the next section, the output format can be customized using a template.

We can list what changed at the OS level:

$ container-diff diff --type=size --type=apt postgres:13 postgres:14

-----Apt-----

Packages found only in postgres:13:
NAME                         VERSION                 SIZE
-postgresql-13               13.5-1.pgdg110+1        46.9M
-postgresql-client-13        13.5-1.pgdg110+1        6.3M

Packages found only in postgres:14:
NAME                         VERSION                 SIZE
-postgresql-14               14.1-1.pgdg110+1        48.9M
-postgresql-client-14        14.1-1.pgdg110+1        7.1M

Version differences: None

-----Size-----

Image size difference between postgres:13 and postgres:14:
SIZE1         SIZE2
350.2M        352.9M

In the same vein, we can compare globally-installed Node packages:

$ container-diff diff --type=node node:16 node:17

-----Node-----

Packages found only in node:16: None

Packages found only in node:17: None

Version differences:
PACKAGE IMAGE1 (node:16) IMAGE2 (node:17)
-npm 8.1.0, 8M 8.1.2, 8M

Or changes in Python packages:

$ container-diff diff --type=pip python:3.6.15-buster python:3.10-bullseye

-----Pip-----

Packages found only in python:3.6.15-buster:
NAME VERSION SIZE
-argparse 1.2.1 87.1K
-mercurial 4.8.2 9.5M
-wsgiref 0.1.2 98.7K

Packages found only in python:3.10-bullseye: None

Use case 2: troubleshooting containers

Debugging a failing container is easy when we have a healthy image to use as a reference. To see all the file changes, run container-diff with --type=file:

$ container-diff diff --type=file myapp/myservice:v1 myapp/myservice:v2

-----File-----

These entries have been added to myapp/myservice:v1:

FILE SIZE
/app/node_modules/fsevents 186.2K
/app/node_modules/fsevents/LICENSE 1.1K
/app/node_modules/fsevents/README.md 2.9K


These entries have been deleted from myapp/myservice:v1:

FILE SIZE
/app/.npm/_cacache/index-v5/ce/9f/58654f1 310B
/app/.npm/_cacache/index-v5/3d/b7/10f6556 309B
/app/.npm/_cacache/index-v5/7e/eb/c1538ff 308B

These entries have been changed between myapp/myservice:v1: and myapp/myservice:v2:
FILE SIZE1 SIZE2
/app/package-lock.json 554.6K 554.6K
/app/node_modules/.package-lock.json 297.7K 298.1K
/app/node_modules/clean-css/History.md 77.5K 77.8K

Once the problematic file is identified, you can compare the files in both containers to see what changed.

$ container-diff diff <IMAGE1> <IMAGE2> --type=file --filename=PATH/TO/FILE

Use case 3: test-driving new containers

You can run container-diff to preview the impact of your changes on a build, for instance, to quickly try out different base images or experiment with the Dockerfile. You can iterate until you’re sure you’ve got it right.

Container-diff is not limited to images in remote repositories. You can analyze any local image by prefixing its name with daemon://.

container-diff diff --type=TEST_TYPE daemon://IMAGE_NAME:TAG daemon://IMAGE_NAME:TAG
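
A before-and-after comparison of a local build might look like the sketch below. The image name `myapp`, the tags `before` and `after`, and both helper function names are placeholders, and the script assumes Docker and container-diff are installed.

```shell
#!/bin/sh
# local_ref IMAGE: turn a local image name into a container-diff daemon reference.
local_ref() {
  printf 'daemon://%s\n' "$1"
}

# compare_local OLD NEW: diff two images that exist only in the local Docker daemon.
compare_local() {
  container-diff diff --type=size --type=apt "$(local_ref "$1")" "$(local_ref "$2")"
}

# Typical flow (placeholder names):
#   docker build -t myapp:before .
#   ...edit the Dockerfile...
#   docker build -t myapp:after .
#   compare_local myapp:before myapp:after
```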

Imagine that you’re building a container for a Ruby app and want to try upgrading from Ruby 2.7 to 3.0. As a Ruby developer, you know what to expect from the language side, but can you say the same about the container?

To answer the question, let’s compare the respective Ruby images:

$ container-diff diff --type=size --type=apt --type=history ruby:2.7.4-bullseye ruby:3.0.2-bullseye

-----Apt-----

Packages found only in ruby:2.7.4-bullseye: None

Packages found only in ruby:3.0.2-bullseye: None

Version differences: None

-----History-----

Docker history lines found only in ruby:2.7.4-bullseye:
-/bin/sh -c #(nop) ENV RUBY_MAJOR=2.7
-/bin/sh -c #(nop) ENV RUBY_VERSION=2.7.4
-/bin/sh -c #(nop) ENV RUBY_DOWNLOAD_SHA256=2a80824e0ad6100826b69b9890bf55cfc4cf2b61a1e1330fccbcb30c46cef8d7


Docker history lines found only in ruby:3.0.2-bullseye:
-/bin/sh -c #(nop) ENV RUBY_MAJOR=3.0
-/bin/sh -c #(nop) ENV RUBY_VERSION=3.0.2
-/bin/sh -c #(nop) ENV RUBY_DOWNLOAD_SHA256=570e7773100f625599575f363831166d91d49a1ab97d3ab6495af44774155c40

-----Size-----

Image size difference between ruby:2.7.4-bullseye and ruby:3.0.2-bullseye:
SIZE1 SIZE2
819.2M 835.8M

Compare that with changing the OS flavor in the Node image. What happens if you want to swap out Bullseye for Bullseye Slim?

$ container-diff diff --type=size --type=apt --type=node node:17-bullseye node:17-bullseye-slim

-----Apt-----

Packages found only in node:17-bullseye:
NAME VERSION SIZE
-autoconf 2.69-14 1.8M
-automake 1:1.16.3-2 1.8M
-autotools-dev 20180224.1+nmu1 157K
...

-----Node-----

Packages found only in node:17-bullseye: None

Packages found only in node:17-bullseye-slim: None

Version differences: None

-----Size-----

Image size difference between node:17-bullseye and node:17-bullseye-slim:
SIZE1 SIZE2
942.9M 230.7M

Comparing regular Bullseye vs. Slim shows that:

  • Node stays the same.
  • The Slim image is about 712 MB smaller.
  • The smaller image has a long list of missing packages.

This information will help you decide which is the best version for you. It makes sense to pick Slim in order to reduce the attack surface if you don’t need the extra packages.
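
To put a number on the savings, the two sizes from the report can be compared directly. A small sketch follows; `size_savings` is a made-up helper, and the values are the sizes in MB printed above.

```shell
#!/bin/sh
# Sizes in MB as reported by container-diff for the two Node images above.
full=942.9
slim=230.7

# size_savings FULL SLIM: print the absolute and relative size reduction.
size_savings() {
  awk -v a="$1" -v b="$2" \
    'BEGIN { printf "%.1f MB smaller (%.1f%% reduction)\n", a - b, 100 * (a - b) / a }'
}

size_savings "$full" "$slim"
# prints: 712.2 MB smaller (75.5% reduction)
```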

Extending and customizing container-diff

When the default text output is not enough, we can write an output template. You can see the examples in the built-in template file.

The --format option lets us customize how information is printed out, giving us a way to export the data to other formats, such as CSV:

$ container-diff diff python:3.9-bullseye python:3.10-bullseye --type=pip --format='
package,{{.Image1}},{{.Image2}}
{{range .Diff.InfoDiff}}{{.Package}},{{range .Info1}}{{.Version}}{{end}},{{range .Info2}}{{.Version}}{{"\n"}}{{end}}{{end}}
'

package,python:3.9-bullseye,python:3.10-bullseye
pip,21.2.4,21.2.4
setuptools,57.5.0,57.5.0
wheel,0.37.0,0.37.0

When custom formats are not enough, container-diff can be extended by writing your own differ. You’ll need solid knowledge of Go for that, though.

Automated container testing with CI/CD

How does container-diff help us deploy safely? Well, if you’re doing continuous integration, you’re probably deploying several times a day, which means each new container is only a little bit different from the previous one.

Following that logic, we can assume that if too many things change at once, it may be a signal that further analysis is needed before deployment. Maybe some unexpected file snuck into the build and the image size doubled, or the base image was updated in the registry and unexpectedly shipped with different libraries.

container-diff comparing two images

We have to strike the right balance between stability and mutability. Every team will have different thresholds but, as a starting point, let’s say that we’ll reject images that:

  • Grow more than 10% in size.
  • Have different OS libraries.
  • Have different globally-installed Node packages.
  • Were built from a different Dockerfile.
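
Expressed as settings, those starting thresholds could look like this. The variable names are the ones read by the comparison script shown later in this article; the values are just a strict starting point to adapt to your team.

```shell
#!/bin/sh
# Strict starting thresholds for the change-control script.
export MAX_GROWTH_RATIO=10         # reject images that grow more than 10%
export ALLOWED_APT_CHANGES=0       # reject any added or removed OS package
export ALLOWED_NPM_CHANGES=0       # reject any added or removed global Node package
export ALLOWED_HISTORY_CHANGES=0   # reject any change in the layer history
```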

Gauging change rate between images

We can evaluate the changes by running container-diff with --json and processing the output. The format is:

{
"Image1": "foo",
"Image2": "bar",
"DiffType": "Test_Type",
"Diff": {
// Differences Object
}
}

We can process the report with a combination of shell scripts and jq, the JSON Query CLI tool. First, run all the tests at once and save the output in a file:

$ container-diff diff --type=size --type=apt --type=node --type=history --json <IMAGE1> <IMAGE2> > diff.json

Then, query the file with jq. You can filter the results per test by selecting DiffType. Use the following command to see the APT changes:

$ jq '.[] | select(.DiffType=="Apt")' diff.json

You can get the total number of changed packages by appending .Diff.Packages1 + .Diff.Packages2 | length to the query.

$ jq '.[] | select(.DiffType=="Apt") | .Diff.Packages1 + .Diff.Packages2 | length' diff.json
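
Note that this query counts packages that are present in only one of the images. Packages whose version changed are reported separately. The sketch below assumes they appear under `.Diff.InfoDiff`, the field that the `--format` template shown earlier iterates over, and counts them as well. The sample report and the `count_changes` helper are made up to keep the example self-contained, and jq must be installed.

```shell
#!/bin/sh
# Hypothetical, heavily trimmed report in the shape described above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[
  {
    "Image1": "foo",
    "Image2": "bar",
    "DiffType": "Apt",
    "Diff": {
      "Packages1": [{"Name": "old-pkg"}],
      "Packages2": [{"Name": "new-pkg"}],
      "InfoDiff": [{"Package": "changed-pkg"}]
    }
  }
]
EOF

# count_changes FILE TYPE: removed + added + version-changed packages.
# "// []" treats a missing or null list as empty.
count_changes() {
  jq --arg t "$2" '.[] | select(.DiffType == $t)
    | (.Diff.Packages1 // []) + (.Diff.Packages2 // []) + (.Diff.InfoDiff // [])
    | length' "$1"
}

count_changes "$sample" Apt
rm -f "$sample"
```

With the sample report, the count is 3: one package per list.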

💡 You can try jq online at jq play.

Once we have all the jq queries ready, we can write a script that runs the differ, filters the results, and fails if the changes exceed certain thresholds.

#!/bin/bash
# Compare two container images and stop the pipeline when changes exceed the control parameters
# Usage: container-diff-test.sh <IMAGE1> <IMAGE2>
# Parameters expected (environment variables):
# $ALLOWED_APT_CHANGES - max number of allowed APT packages changed
# $ALLOWED_HISTORY_CHANGES - max number of Dockerfile commands changed
# $ALLOWED_NPM_CHANGES - max number of NPM packages changed
# $MAX_GROWTH_RATIO - percentage size growth allowed (0 is no growth, 100 is double size)

set -ex

image1=$1
image2=$2

# Temporary file that holds the JSON report
diffile=$(mktemp XXXXXX.json)

container-diff diff \
    --type=history --type=node --type=size --type=apt --json \
    "$image1" \
    "$image2" \
    > "${diffile}"

# Count the packages and history lines that appear in only one of the images
changes_apt=$(jq '.[] | select(.DiffType=="Apt") | .Diff.Packages1 + .Diff.Packages2 | length' "${diffile}")

changes_history=$(jq '.[] | select(.DiffType=="History") | .Diff.Adds + .Diff.Dels | length' "${diffile}")

changes_npm=$(jq '.[] | select(.DiffType=="Node") | .Diff.Packages1 + .Diff.Packages2 | length' "${diffile}")

# When sizes are equal jq returns a string "null"
size1=$(jq '.[] | select(.DiffType=="Size") | .Diff[0].Size1 ' "${diffile}")
if [ "$size1" = "null" ]
then
    size_ratio=0
else
    # Growth in percent, rounded down: 100 * Size2 / Size1 - 100
    size_ratio=$(jq '.[] | select(.DiffType=="Size") | 100 * .Diff[0].Size2 / .Diff[0].Size1 - 100 | floor' "${diffile}")
fi

# Evaluate thresholds
if [ "$changes_apt" -gt "$ALLOWED_APT_CHANGES" ] \
    || [ "$changes_history" -gt "$ALLOWED_HISTORY_CHANGES" ] \
    || [ "$changes_npm" -gt "$ALLOWED_NPM_CHANGES" ] \
    || [ "$size_ratio" -gt "$MAX_GROWTH_RATIO" ]
then
    exit 1
else
    echo OK
fi

Adding a change-control job to CI/CD

Where were we? Let’s see, we have two images and a script to compare them. What we need now is a CI/CD pipeline that builds the image. Semaphore has the capabilities that we want for this task. If you’ve never used Semaphore before, I recommend checking out the getting started guide.

Open the workflow editor and add a block after the container image build step. Then, add the following commands in the job:

curl -LO https://storage.googleapis.com/container-diff/latest/container-diff-linux-amd64
sudo install container-diff-linux-amd64 /usr/local/bin/container-diff
echo "${DOCKER_PASSWORD}" | docker login -u "${DOCKER_USERNAME}" --password-stdin
checkout
chmod a+x container-diff-test.sh && ./container-diff-test.sh "${DOCKER_USERNAME}"/mycontainer:latest "${DOCKER_USERNAME}"/mycontainer:$SEMAPHORE_WORKFLOW_ID

This job installs container-diff on the CI machine, logs in to the Docker Hub registry (you’ll need to activate a secret), clones the repository, and runs the comparison script. Change the parameters in container-diff-test.sh as needed. In this case, we’re comparing the latest image against the one tagged with the unique ID $SEMAPHORE_WORKFLOW_ID.

Setting up a container control job

That’s it! You can complete the pipeline with the deployment method of your choice.

An example pipeline.

Wrapping up

Container-diff is yet another quality tool to keep containers in check. Remember, when using containers, you’re responsible for the whole mini OS that comes with them, not just the code.

Thank you for reading!

The post Change Management for Containers appeared first on Semaphore.