5 easy-to-implement tricks to trim down your Docker image size
Minimize your Docker image (plus one bonus)
In this short article we’ll go through 5 tricks (and one bonus) for trimming down the size of Docker images. In doing so we’ll learn more about how Docker builds images and how to use base images. Let’s code!
Before we start: We’re using a lot of terminal commands. Check out this article if you are unfamiliar with the terminal.
1. Bundling layers
The size of a Docker image is the sum of its layers. Since each layer has a little bit of overhead, we can make a small but very easy improvement by reducing the number of layers.
Just change:
FROM python:3.9-slim
RUN apt-get update -y
RUN apt-get install -y package_one
RUN apt-get install -y package_two
To:
FROM python:3.9-slim
RUN apt-get update -y && apt-get install -y \
package_one \
package_two
I’ve tested this with some packages used for building and listed them below. As you can see we save about 10MB by bundling the layers. This is not a huge size reduction, but it’s pretty impressive for such a small effort.
# packages:
git, cmake, build-essential, jq, liblua5.2-dev, libboost-all-dev, libprotobuf-dev, libtbb-dev, libstxxl-dev, libbz2-d
2. Avoid installation of unnecessary packages
If you’re installing packages with apt-get, add the --no-install-recommends flag. This avoids the installation of packages that are recommended but not required alongside the package you are installing. This is what it looks like:
FROM python:3.9-slim
RUN apt-get update -y && apt-get install -y --no-install-recommends \
package_one \
package_two
By adding this single flag we save about 40MB. Again, this is not a lot in the grand scheme of things, but quite significant given that we merely add a single flag.
3. Clean up after install
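Installing packages has some overhead as well. Running apt-get update downloads package index files into /var/lib/apt/lists, and these are no longer needed once the install is finished. We can clean them up by adding rm -rf /var/lib/apt/lists/* to the same RUN instruction as the apt-get install, like this:
FROM python:3.9-slim
RUN apt-get update -y && apt-get install -y --no-install-recommends \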
package_one \
package_two \
&& rm -rf /var/lib/apt/lists/*
Saved another 40MB with a simple line of code!
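Why the clean-up has to share a RUN instruction with the install can be seen from the counter-example sketched below (package_one is the same placeholder name used above). Each RUN produces its own layer, and a later layer can only hide files from an earlier one; it cannot remove them from the image.

```dockerfile
FROM python:3.9-slim
# Layer 1: the package index files are downloaded and stored in this layer.
RUN apt-get update -y && apt-get install -y --no-install-recommends package_one
# Layer 2: the index files are deleted here, but layer 1 still contains them,
# so the total image size does not go down.
RUN rm -rf /var/lib/apt/lists/*
```

Chaining the rm onto the install with && keeps everything in one layer, so the index files never end up in the image.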
4. How about a smaller image?
This one is very obvious; just use a smaller image! As you may have noticed in the previous snippets we’re already using python:3.9-slim, which, with all of the previous optimizations applied, results in a total image size of almost 1GB. This already is a nice improvement compared to python:3.9 (which would result in a total size of 1.3GB).
One improvement would be to use python:3.9-alpine. Alpine images are specifically built to run in containers and are very small, weighing in at around 49MB. This is over 20x smaller than if we’d use python:3.9. So why not use Alpine all the time? There are some downsides:
- Compatibility issues: many prebuilt Python wheels target glibc-based distributions like Debian, while Alpine uses musl, so these packages need to be rebuilt to be compatible with Alpine. This can lead to strange bugs that are very hard to debug.
- Dependencies: Alpine installs dependencies with apk add instead of apt-get install. Also, not all packages may be available or, again, compatible with Alpine.
I would only advise using Alpine if disk space is a major concern. Also check out the bonus section at the end for how to mitigate the downsides of Alpine by combining it with a build stage.
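To give an idea of what the switch looks like in practice, here is a minimal sketch of the earlier install step on Alpine. The two packages (git and build-base, Alpine’s rough counterpart to build-essential) are picked purely for illustration and are not the ones measured above.

```dockerfile
# Sketch only: the package selection is an assumption for illustration.
FROM python:3.9-alpine
# apk replaces apt-get on Alpine. The --no-cache flag avoids storing the
# package index in the image, so there is nothing to delete afterwards.
RUN apk add --no-cache \
    git \
    build-base
```

Note that package names often differ between Debian and Alpine, so an existing apt-get list usually cannot be copied over one-to-one.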
5. Use a .dockerignore
If you create a new Docker image then Docker needs access to the files you want to create the image from. These files are known as the build context. They get sent to the Docker daemon every time you build an image, so it’s a good idea to keep the build context as small as possible. A .dockerignore file lists the files and folders that Docker should leave out of the build context.
In addition, adding a .dockerignore is a very good idea to prevent the exposure of secrets. If you copy files into your image with ADD or COPY you might unintentionally copy over files or folders (like .git) that you don’t want to bake into your image.
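As a rough sketch of how this plays out, the Dockerfile below uses a broad COPY and relies on a .dockerignore for protection. The ignore patterns in the comments and the /app directory are assumptions for illustration; adjust them to whatever your project actually contains.

```dockerfile
# Hypothetical .dockerignore next to this Dockerfile (one pattern per line):
#   .git
#   .env
#   __pycache__/
#   *.pyc
#
# With those patterns in place, the broad COPY below no longer picks up
# the repository history, local secrets or bytecode caches, because they
# are never part of the build context in the first place.
FROM python:3.9-slim
WORKDIR /app
COPY . .
```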
Bonus: Multi stage build
Sometimes you need tools like curl, git or make to build your image. After this is done you don’t need these packages anymore. All they do now is make our image bigger. One fine way to solve this problem is to use a multi-stage build. Below we’ll go through a simple example that uses curl as one of these packages that we need for building our image, but is redundant once our image is built:
- Take an image like ubuntu
- apt-get install curl
- curl a data file into the image (for example a classifier)
- Now take another image like python:3.9-slim
- Copy the curled file from the ubuntu stage into the python:3.9-slim stage
- Do the rest (e.g. install packages with pip)
When done this way we have our classifier file in our final image without our image being bloated by curl. Another advantage is that we can use richer images to perform our installs in (e.g. python:3.9) and then copy over all of our resulting files to a very small image (e.g. python:3.9-alpine).
This solution is simple in its idea but a bit trickier to execute. Check out this article for a nice walkthrough.
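The steps listed above could be sketched roughly as follows. The stage name downloader, the URL, the file names and the requirements.txt are all placeholders assumed for illustration; this is one possible layout, not a definitive implementation.

```dockerfile
# Stage 1: a throwaway stage that only exists to download the file.
FROM ubuntu AS downloader
RUN apt-get update -y && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    && rm -rf /var/lib/apt/lists/*
# Placeholder URL and file name, assumed for illustration.
RUN curl -fsSL -o /tmp/classifier.bin https://example.com/classifier.bin

# Stage 2: the final image. Nothing from stage 1 is kept unless copied.
FROM python:3.9-slim
WORKDIR /app
COPY --from=downloader /tmp/classifier.bin /app/classifier.bin
# The rest: install Python dependencies (requirements.txt is assumed to exist).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

Only the last stage ends up in the final image, so curl and its dependencies never add to its size. The ca-certificates package is included because --no-install-recommends would otherwise leave curl unable to verify HTTPS connections.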
Conclusion
In this article we’ve gone through 5 quick and easy to implement tips on how to reduce your Docker image size. Along the way we’ve learnt more about the way Docker builds images. This article has focused mostly on cleaning up but the best way to trim down image size is to leave behind the build layers altogether. This way we can even use a smaller image since we’ve already performed all complex tasks. Check out this article for more information on these so-called multi-stage builds.
I hope this article was clear but if you have suggestions/clarifications please comment so I can make improvements. In the meantime, check out my other articles on all kinds of programming-related topics like these:
- Docker for absolute beginners
- Docker Compose for absolute beginners
- Turn Your Code into a Real Program: Packaging, Running and Distributing Scripts using Docker
- Why Python is slow and how to speed it up
- Advanced multi-tasking in Python: applying and benchmarking threadpools and processpools
- Write you own C extension to speed up Python x100
- Getting started with Cython: how to perform >1.7 billion calculations per second in Python
- Create a fast auto-documented, maintainable and easy-to-use Python API in 5 lines of code with FastAPI
Happy coding!
— Mike