Friday, December 5, 2014

Docker: Containers != Images

So, you're learning Docker, but every video you watch and blog you read uses the terms containers and images seemingly interchangeably. They say things like, "Docker allows you to share your container throughout your environments as an immutable application." However, this isn't accurate.

You can't actually share a container at all. A container only exists as an instance on a node. That instance can be running or stopped, but it is an instance nonetheless. You can "export" a container to a tar file, but when you "import" it back in, you get an image, because the tar file holds only a snapshot of the container's filesystem, not the instance itself. A container is an instance of an image. The image is what you share.

An image is actually one or more layers, each of which is itself an image, with the final image on top. So, you can have a base image that consists of a single layer, or an application image built from dozens of layers. Each instruction in a Dockerfile creates a new image layer beneath the final image. These layers always remain with the final image unless you "export" a container, which flattens all of the layers into one.
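To make the layering and flattening concrete, here is a minimal sketch in Python. It is a toy model, not how Docker stores anything: the layer contents, the file paths, and the use of None as a deletion marker are all assumptions made for illustration.

```python
# Toy model: a layer maps file paths to contents; None marks a deletion.
# All paths and contents below are made up for illustration.

def merged_view(layers):
    """Apply layers bottom-to-top and return the visible filesystem."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is None:
                view.pop(path, None)   # a later layer hides the file
            else:
                view[path] = content   # a later layer adds or overrides
    return view

def flatten(layers):
    """Collapse all layers into one, roughly what an export/import round trip leaves you with."""
    return [merged_view(layers)]

image = [
    {"/bin/sh": "shell", "/etc/os-release": "base"},  # base image layer
    {"/app/server.py": "print('hi')"},                # layer from one Dockerfile instruction
    {"/etc/os-release": "patched"},                   # layer from a later instruction
]

flat = flatten(image)
print(len(image), "layers flattened to", len(flat))
print("same visible files:", merged_view(image) == merged_view(flat))
```

The visible filesystem is identical before and after flattening; what is lost is the layer history, and with it the ability to share individual layers between images.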

The benefit of this layering is that you can base all of your apps on the same base image, and that base image only exists once on the system, which saves storage. Images are also pushed to Docker Hub layer by layer, and the base image layers can be skipped if the Hub already holds them. The same applies when pulling, which speeds up deployments, as only the changed layers need to be pulled down.
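The storage and transfer savings can be sketched with a little arithmetic. The layer names and sizes (in MB) below are hypothetical, and the functions are a simplified model rather than anything Docker exposes.

```python
# Hypothetical layer sizes in MB, keyed by made-up layer names.
layer_sizes = {"base": 200, "app-a": 15, "app-b": 30}

# Two application images built on the same base layer.
images = {
    "app-a:latest": ["base", "app-a"],
    "app-b:latest": ["base", "app-b"],
}

def disk_usage(images, layer_sizes):
    """Each distinct layer is stored once, however many images use it."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(layer_sizes[layer] for layer in unique_layers)

def layers_to_pull(image_layers, already_present):
    """Only layers that are missing locally need to be transferred."""
    return [layer for layer in image_layers if layer not in already_present]

shared_total = disk_usage(images, layer_sizes)
unshared_total = sum(layer_sizes[l] for layers in images.values() for l in layers)
print("with sharing:", shared_total, "MB; without:", unshared_total, "MB")
print("pull for app-b:", layers_to_pull(images["app-b:latest"], {"base", "app-a"}))
```

In this model a node that already runs app-a only needs the 30 MB app-b layer to deploy app-b, since the 200 MB base is already present.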

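Containers are the instantiated state of these images, and they consume very little space because they are not full copies of the image. Each container shares the image's read-only layers and adds a thin writable layer on top. The only space a container consumes is what it writes during operation.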
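As a rough illustration of why a container costs so little, the sketch below models a container as a private writable layer over image layers that are shared rather than duplicated. The Container class, its methods, and the file contents are hypothetical, and string length stands in for bytes on disk.

```python
class Container:
    """Toy container: shared read-only image layers plus a private writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared with other containers, never modified
        self.writable = {}                # holds only this container's changes

    def write(self, path, content):
        self.writable[path] = content

    def read(self, path):
        # The container's own changes win; otherwise search the image top-down.
        if path in self.writable:
            return self.writable[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def added_size(self):
        # String length stands in for bytes written during operation.
        return sum(len(content) for content in self.writable.values())

image = [{"/etc/motd": "hello"}, {"/app/run.sh": "echo hi"}]
a = Container(image)
b = Container(image)
a.write("/etc/motd", "changed")

print(a.read("/etc/motd"), b.read("/etc/motd"))
print(a.added_size(), b.added_size())
```

Container a sees its own change while container b still sees the image's original file, and b takes up no extra space at all until it writes something.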

Images are the saved state of a filesystem, composed of layers of filesystem states. If a file is added in one image layer, then that file will always exist as part of that layer, and the final image will be that much larger. So, if you add a large tar and unpack it in one command and then delete it in a different command, the deletion saves nothing: the file still exists in an earlier layer and still consumes that amount of space. It is best to combine such commands into a single RUN instruction, so that the temporary file never lands in a layer, and to make similar optimization decisions whenever you import data that won't need persisting.
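The tar example can be worked through with a small size calculation. This is a simplified model with made-up paths and sizes in MB, where None marks a deletion; it assumes a layer records only the net result of its instruction.

```python
def image_size(layers):
    """Sum every file recorded in every layer; deletions (None) free nothing."""
    return sum(size for layer in layers
               for size in layer.values() if size is not None)

# Three separate instructions: add the tar, unpack it, then delete the tar.
separate = [
    {"/tmp/data.tar": 500},   # the tar is now permanently part of this layer
    {"/opt/data": 500},       # the unpacked files
    {"/tmp/data.tar": None},  # the deletion only hides the tar in the final view
]

# One combined instruction: the tar is fetched, unpacked, and removed
# before the layer is saved, so only the unpacked files are recorded.
combined = [
    {"/opt/data": 500},
]

print("separate:", image_size(separate), "MB")
print("combined:", image_size(combined), "MB")
```

Both images present the same files to a container, but in this model the version built from separate instructions is twice the size.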

I hope this helps differentiate between containers and images as I found them a bit confusing when I first started learning Docker in the early days.