A Short Explanation of What Docker and Containers Are
An explanation of the core concepts in Docker
I will explain by analogy with ordinary software development. Here we create a simple C++ project.
$ mkdir my_project
$ cd my_project
$ touch Makefile
$ touch foobar.cpp
We have some source code files and a Makefile which explains how to combine this source code with other libraries to create an executable.
We create the executable with:
$ make
Once we have the executable we can run it, which creates a process that we can interact with and that actually does things. An executable can instantiate any number of processes.
Let's do something analogous in docker. We create a project folder, which is used to create a container image (not an executable).
$ mkdir my_image
$ cd my_image
$ touch Dockerfile
$ touch foobar.json
Like a Makefile, the Dockerfile contains a set of instructions for how to build a docker image. Typically many of these instructions involve downloading files and packages from the internet and running configuration commands, so there will not be that many files like foobar.json. Unlike our app development example, these files will mainly be data and configuration files, not source code.
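As an illustration, a minimal Dockerfile might look like this (the base image, the installed package, and the destination path are assumptions for the sketch, not part of the example above):

```dockerfile
# Hypothetical sketch: start from a Linux base image,
# install a package from the internet, and copy in our config file.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nginx
COPY foobar.json /etc/myapp/foobar.json
```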
In the project directory we can create a docker image with:
$ docker build -t my_image:2.0 .
The analogy is that building source code with a Makefile gives you an executable, while building with a Dockerfile gives you a docker image. Like an executable, an image is just dead data on disk. It doesn't run or do anything.
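You can see this inert data directly by asking docker to list the images it knows about; nothing needs to be running for this:

```shell
$ docker images    # lists images by repository, tag, image ID and size
```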
How is an image different from an executable?
First of all, an image does not represent a single file. However, when we build applications we also don't just create single files: multiple libraries and resource files are created as well. On e.g. a Mac this all gets collected into a directory we call a bundle. So an application on the Mac such as
Mail.app is actually just a directory containing the executables, libraries and resource files needed to run the Mail application.
However an application is never fully self contained. Even on macOS, a bundle will utilize libraries and resources provided by the operating system in fixed locations.
An image, in contrast, can contain any number of applications, and all the libraries and resources they need are contained within the image.
The name "container" makes it easy to think of it primarily as storage, like a database. However, containers are instances of images, just as a process is an instance of an executable.
Just because a container exists as a data structure doesn't mean it is not an active thing. A process also exists as data structures in computer memory, and processes are not always running: a process can be suspended and reactivated.
Likewise a container can be stopped and restarted. I can create a container by instantiating it from a named image.
$ docker run my_image:2.0
Like a process, this gives our container a random name by which we can refer to it. But just as working with raw process IDs would be cumbersome, so are random names. So we can give our docker containers names of our own.
$ docker run --name my_container my_image:2.0
So keep this in mind, both images and containers can have names. For every image there may be many named containers created from it.
A process maintains and stores state about itself. Likewise, each container has its own state, independent of every other container, stored on your hard disk.
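Putting naming and state together, a sketch of the container lifecycle looks like this (assuming the image my_image:2.0 from earlier exists):

```shell
$ docker run --name my_container my_image:2.0   # instantiate and start
$ docker stop my_container                      # stop it; its state stays on disk
$ docker start my_container                     # reactivate the same container
$ docker ps -a                                  # list containers, running and stopped
```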
One way to think about it: it is just like interacting with a server over an ssh connection. In fact docker runs as a sort of server, so you connect to it and issue commands just as if you had made an ssh connection to another machine.
The difference is that the files and directories you see when connected to the "docker server" aren't the files and directories of another computer, but the files and directories contained within a docker container.
You can create, remove or modify files inside the container, but you are never able to see anything outside the container, unless it has been explicitly mounted.
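As a sketch, such a session could look like this (assuming my_container is running, and using a hypothetical ./data directory on the host):

```shell
$ docker exec -it my_container /bin/sh          # a shell "inside" the container
# The files you see there belong to the container. To make a host
# directory visible inside a container, mount it explicitly at creation:
$ docker run --name my_container2 -v "$PWD/data":/data my_image:2.0
```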
How is a container different from a process?
You can think of a container as potentially a collection of several processes. They exist in a namespace, so they can't see all the other processes on your computer. They also can't see files or folders outside the container. This is what tricks the processes into thinking they are on a different computer.
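You can observe this isolation directly: listing processes from inside a container shows only that container's own processes, not everything on your machine (assuming my_container is running and its image provides a ps command):

```shell
$ docker exec my_container ps aux   # only the container's processes appear
```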
The Purpose of Container Technology like Docker
We already have virtual machines, which can simulate whole computers on which you can install any operating system. That has a noticeable performance overhead, as a whole computer needs to be simulated.
With containers we are using the same kernel for all containers. We aren't trying to pretend we are on different hardware or a different operating system.
We are just faking the files and processes you see. That means inside a container you can pretend to be any Linux distribution, because they all use the same kernel. The only difference is the files, folders and locations, which differ between Linux distributions.
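For example, you can start containers from different distributions on the same machine and ask each which OS it thinks it is (the image names here are the official ones on Docker Hub):

```shell
$ docker run --rm ubuntu:22.04 cat /etc/os-release   # reports Ubuntu
$ docker run --rm alpine:3.19 cat /etc/os-release    # reports Alpine Linux
$ docker run --rm alpine:3.19 uname -r               # same kernel as the host
```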
A benefit of this is that each container is basically a Linux distribution in a nutshell. If, for example, you prefer to develop on a Mac, you can do that and just create a mini Linux world inside a container on your Mac. When you are done developing, you can just copy the image over to your company's Linux server.
The benefit of this is that you don't have to set up the right environment for, say, the server software you are building twice. Get it right once, in your container image, and then run it on any number of machines.
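One hedged sketch of moving an image to another machine without a registry, using docker save and docker load (the server address is made up):

```shell
$ docker save my_image:2.0 | gzip > my_image.tar.gz   # serialize the image
$ scp my_image.tar.gz user@server:                    # copy it to the server
# then, on the server:
$ gunzip -c my_image.tar.gz | docker load             # import the image
$ docker run --name my_container my_image:2.0         # instantiate it there
```

In practice, pushing to a registry with docker push and pulling with docker pull is the more common workflow, but save/load makes the "image is just data on disk" point vivid.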