The Mental Model Of Docker Container Shipping
How container shipping explains the inner workings of Docker
Programming is a marvellous activity. By typing on a keyboard and feeding code to a computer, a programmer can create entirely new (virtual) worlds out of nothing. There are few constraints on what a program could be - the only limitations are memory and a CPU's processing power. Whatever you can think of, you could, in theory, bring into existence on a computer.
But since virtual worlds are created in the mind of a developer and not based on physical reality, it can be difficult to talk about what they are or how they work. There is no immediate language to refer to. To overcome this lack of terminology, engineers often use analogies from the real world to describe what software does.
Docker is no exception. At its centre stands the "container", the main building block of what Docker calls a "standard of shipping software". One can't help but picture container ships hauling heavy loads across oceans. Even the logo, a whale carrying a stack of containers, depicts the main idea.
Containers and shipping software
When we learn about a new technology with strong ties to a familiar one, we immediately apply what we know about the familiar technology to fill gaps in our understanding. This "replacement reasoning" can be helpful or misleading, depending on how accurate the analogy turns out to be. To avoid misconceptions, it is therefore worth evaluating existing mental models first, as they are the ground on which we build our own.
In the case of Docker, we want to know how helpful the idea of "container shipping" is and how far we can take the analogy.
It turns out that there are many parallels between real and virtual containers and that the properties of container shipping cover quite a lot of concepts of Docker.
Containers (those made from steel)
Shipping containers revolutionized international trade by making it possible to move freight between different modes of transport without the need to re-package goods. Due to standardization and a complex system of docking ports, the cost and time it takes to ship goods worldwide have been significantly reduced.
What makes containers so valuable is that they can be stacked neatly on top of each other, thanks to their standardized dimensions. They can also carry nearly any kind of goods and can be used interchangeably.
Docker Containers
Like shipping containers, Docker containers also increased efficiency. Most applications run on servers. In the past, it was not practical to safely run more than one application per server. Not only was managing a fleet of individual servers difficult, but it also led to a waste of resources: applications vary significantly in their requirements, and those requirements change over time, so each server had to keep a capacity reserve.
Then a company called VMware came along and popularized the virtual machine on commodity servers (the idea itself goes back to mainframes). I mentioned in the intro that computers create virtual worlds. They can also create worlds inside worlds. A virtual machine is an emulation of hardware. Each emulated set of hardware is a new virtual computer (creating computers inside a computer), making it possible to run multiple applications side by side on the same server.
One OS per VM (Source: "Docker Deep Dive" 1)
This makes deploying applications much more efficient. Applications run inside their own virtual machine (VM), and hardware resources can be individually distributed to meet the requirements of each application.
But VMs are big. Virtualizing the hardware requires each virtual machine to include an entire operating system.
Docker takes a different approach. Instead of virtualizing the hardware, it aims higher and abstracts away the operating system. Docker containers on the same host share the host's OS kernel instead of each bringing their own, which makes them more lightweight 2.
Docker containers are smaller (Source: Docker Deep Dive)
We talked about why Docker exists, but we still haven't said what a container actually is.
How do Docker containers compare to physical containers?
The physical container is an enclosure made of steel that carries goods. Similarly, a Docker container provides applications with an isolated environment (the virtual equivalent of an enclosure).
Actual containers are filled with goods. A Docker container runs an application. To understand how the application ends up in a Docker container (and how that is done efficiently), we need to understand three basic concepts:
Docker engine
Docker image
Containerizing an app
Docker Engine
The word "Docker" can mean two things.
First, there is Docker, the company. They develop the tools and libraries required to run Docker and maintain Docker Hub, a central registry for images.
Then there is Docker, the technology. There are three main components 3:
The runtime provides low-level libraries and components to run containers.
The engine (or daemon) provides higher-level functionality, above all the interface for Docker commands.
A client is an application that talks to the Docker daemon (e.g. the docker or docker-compose commands).
When someone learns Docker, they will spend most of their time understanding the interactions of a Docker client with the Docker daemon.
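To make the division of labour between the three components a bit more tangible, here is a toy model in Python. It is not real Docker code: the class names Client, Daemon and Runtime and the image name my-app are made up for illustration, and the real components talk over an API rather than through method calls.

```python
class Runtime:
    """Stands in for the low-level runtime that actually starts containers."""

    def start(self, image: str) -> str:
        return f"container started from image '{image}'"


class Daemon:
    """Stands in for the engine: accepts requests and delegates to the runtime."""

    def __init__(self, runtime: Runtime) -> None:
        self.runtime = runtime

    def handle(self, command: str, argument: str) -> str:
        if command == "run":
            return self.runtime.start(argument)
        return f"unsupported command: {command}"


class Client:
    """Stands in for a client such as the docker command."""

    def __init__(self, daemon: Daemon) -> None:
        self.daemon = daemon

    def run(self, image: str) -> str:
        # The client does no container work itself; it only forwards the request.
        return self.daemon.handle("run", image)


client = Client(Daemon(Runtime()))
result = client.run("my-app")  # "my-app" is a hypothetical image name
print(result)
```

The point of the sketch is only the direction of the calls: the client asks, the daemon decides, the runtime does the work.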
Actual containers are made out of steel. Similarly, the Docker engine + runtime are "the fabric of Docker containers".
Docker image
The Docker glossary describes containers as a "runtime instance of an image" 4. An image is a read-only collection of files (plus some metadata) from which a container is started. They are, we could say, the "content" of a container - in other words, the application.
Images are constructed in layers, where each layer corresponds to a set of files of a container. More precisely, each layer is a set of file changes. When an image is built, the layers are applied sequentially, and a later layer can override files from an earlier one.
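One way to picture "runtime instance of an image" is the relationship between a template and the things created from it. The following is a rough sketch of that idea, not how Docker represents images internally; the names Image, Container and create_container, as well as the file names, are invented for this example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Image:
    """A read-only template: a name plus the files it contains."""
    name: str
    files: tuple


@dataclass
class Container:
    """A runtime instance created from an image."""
    image: Image
    container_id: int
    running: bool = False


_ids = count(1)  # hands out 1, 2, 3, ... as container ids


def create_container(image: Image) -> Container:
    return Container(image=image, container_id=next(_ids))


image = Image(name="my-app", files=("app.py", "requirements.txt"))
first = create_container(image)
second = create_container(image)
first.running = True  # starting one container does not affect the other

print(first.image is second.image)                 # True: one image, shared
print(first.container_id != second.container_id)   # True: two separate instances
```

One image can back any number of containers, and each container has its own state.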
Image layers, overriding changes (source: "Docker Deep Dive" 1)
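The layering idea can be sketched in a few lines of Python. This is a simplified model under the assumption that a layer is just a mapping from file path to content; the paths and contents below are made up, and real layers are stored and combined by the storage driver, not by a dictionary merge.

```python
# Each layer is a set of file changes: path -> new content.
layers = [
    {"/etc/os-release": "base OS", "/app/config.txt": "default config"},
    {"/app/main.py": "print('v1')"},
    {"/app/config.txt": "custom config", "/app/main.py": "print('v2')"},
]


def flatten(layers):
    """Apply the layers in order; later layers override earlier ones."""
    files = {}
    for layer in layers:
        files.update(layer)
    return files


filesystem = flatten(layers)
for path in sorted(filesystem):
    print(path, "->", filesystem[path])
```

The container ends up seeing a single file system in which /app/config.txt and /app/main.py come from the last layer, while /etc/os-release still comes from the first.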
Layers can be cached. That makes building images efficient because every time an image changes, we only need to rebuild the layer that changed and the layers that come after it; everything before it is reused.
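A minimal sketch of why caching works this way, assuming that a layer's cache key depends on its own build instruction and on the key of the layer before it. The functions layer_keys and build and the file names are hypothetical. Docker's real cache is more involved (for copied files it also looks at the file contents, for instance), so treat this as an illustration of the principle only.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction; each key depends on all earlier ones."""
    keys = []
    parent = ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def build(instructions, cache):
    """Return the instructions that had to be rebuilt and remember their keys."""
    rebuilt = []
    for instruction, key in zip(instructions, layer_keys(instructions)):
        if key not in cache:
            cache.add(key)
            rebuilt.append(instruction)
    return rebuilt


cache = set()
v1 = [
    "FROM python:3.12-slim",
    "COPY requirements.txt .",
    "RUN pip install -r requirements.txt",
    "COPY app.py .",
]
v2 = v1[:3] + ["COPY app_v2.py ."]                 # only the last step changes
v3 = [v1[0], "COPY requirements-dev.txt ."] + v1[2:]  # the second step changes

first_build = build(v1, cache)    # nothing cached yet: all four steps
second_build = build(v1, cache)   # unchanged: nothing to rebuild
third_build = build(v2, cache)    # only the changed last step
fourth_build = build(v3, cache)   # the changed step and everything after it

print(len(first_build), len(second_build), len(third_build), len(fourth_build))
```

Changing a late step is cheap, changing an early step is expensive, which is why frequently changing files are usually added as late as possible.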
Containerizing an application
The final piece of the Docker puzzle is the process of "containerizing an application". In other words, how do we turn an application into an image that we can then run as a container?
This is done via a Dockerfile. In the container shipping analogy, a Dockerfile is similar to a list of items that describes the content of a container. Except that it not only lists the files but also describes the order in which they are added or modified.
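To show what such an ordered list looks like, the sketch below embeds a small Dockerfile as a Python string and prints its steps in order. FROM, WORKDIR, COPY, RUN and CMD are regular Dockerfile instructions, but the application (app.py, requirements.txt) and the base image chosen here are assumptions for the sake of the example, and parse_instructions is a made-up helper that ignores details such as line continuations.

```python
DOCKERFILE = """\
# Hypothetical Dockerfile for a small Python app
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
CMD ["python", "app.py"]
"""


def parse_instructions(text):
    """Return (instruction, arguments) pairs in the order they appear."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        instruction, _, arguments = line.partition(" ")
        steps.append((instruction.upper(), arguments))
    return steps


steps = parse_instructions(DOCKERFILE)
for number, (instruction, arguments) in enumerate(steps, start=1):
    print(f"step {number}: {instruction:<8} {arguments}")
```

Read from top to bottom, the file is the packing list: start from a base, add files, run a command that changes files, add more files, and finally say what to run when the container starts.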
Side-by-side comparison of container shipping and Docker
Finally, we get to the fun part! How far can we take the analogy of container shipping to create a mental model of Docker?
Container shipping is straightforward to visualize - starting with containers loaded with goods to the cranes lifting them onto ships and so forth until they reach their destination. For every step in the journey of a container, we can find a similar action in the Docker world.
Let's compare them side-by-side.
We can even create a little diagram that compares both worlds:
Think of a Docker container as a vessel for applications.
The Docker engine (and runtime) provides the enclosure for goods, similar to the steel used to fabricate a physical container.
The Dockerfile is like a description of the contents of a container.
The image provides the content of a container, each file (change) is like a single good.
Building and uploading an image prepares the content of a container.
Downloading an image and creating a container is like a crane lifting a container onto a ship.
Running the container can be compared to a ship transporting containers over the ocean.
Stopping a container, destroying it and deleting an image is similar to a crane lifting a container off a ship, taking out the goods and scrapping the empty container.
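For readers who want to connect the journey above to the command line, here is one possible mapping from the shipping steps to docker subcommands, written as a small Python table. The grouping is our own reading of the analogy, not something Docker defines.

```python
# (step in the shipping analogy, docker subcommands that roughly correspond to it)
LIFECYCLE = [
    ("Prepare the content of a container", ["docker build", "docker push"]),
    ("Crane lifts the container onto a ship", ["docker pull", "docker create"]),
    ("Ship transports the container over the ocean", ["docker start"]),
    ("Crane lifts it off, goods are taken out, container is scrapped",
     ["docker stop", "docker rm", "docker rmi"]),
]

all_commands = [command for _, commands in LIFECYCLE for command in commands]

for step, commands in LIFECYCLE:
    print(f"{step}: {', '.join(commands)}")
```

In day-to-day use, docker run covers several of these steps at once: it pulls the image if it is missing, creates a container and starts it.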
The disadvantage of Docker is that all containers on a host share that host's OS kernel, so all applications need to run on the same OS - but since Linux is the de-facto standard for most applications, this is usually not a problem.
A fourth component, Swarm, is often mentioned as well; it coordinates multiple container instances.