Sunday, November 30, 2014

Docker: The basics (with Mac OS X and boot2docker)



Docker is essentially an abstraction of Linux Containers, although over time it has strayed from that definition. A Docker container abstracts away whatever hardware it happens to be running on: a laptop, a bare-bones server, a Raspberry Pi, a high-end server, or anything else that supports Docker. The container doesn't concern itself with where it is running, and therefore neither does the app inside. The app is concerned with its local environment, and that is something Docker currently solves. I have talked with some of the developers, and they have led me to believe that they intend to add support for Docker to understand a distributed environment, allowing links between containers on different nodes. There are other tools that solve this already, which I will talk about in later posts.

This post will cover getting Mac OS X set up with Docker so you can develop locally. It should be short. The major step is just getting a VM installed that runs Docker natively, because OS X has no native support for Docker. Luckily, there's already one made which is pretty cool.

Basically, just go to the Docker website and follow the directions. Make sure that you either run the export commands that are printed after `boot2docker up` in each terminal window where you access Docker, or add them to one of your bash configuration files. Keep in mind that these values can change if you bring the VM down and back up again.
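As a sketch, the terminal session looks something like this (the exact IP address and certificate path will differ on your machine):

```
# Start the VM; it prints the environment variables the docker client needs.
boot2docker up

# Example output (your values will vary):
#   export DOCKER_HOST=tcp://192.168.59.103:2376
#   export DOCKER_CERT_PATH=/Users/you/.boot2docker/certs/boot2docker-vm
#   export DOCKER_TLS_VERIFY=1

# A convenient shortcut that evaluates them for the current shell:
eval "$(boot2docker shellinit)"
```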

Another point to remember is that your VM's disk can grow quite large if you use it for a long time without cleaning up old containers and images. It may be worthwhile to map some host storage for the Docker images to keep the virtual disk from growing too large. I've had it crash my computer as it contended for resources with other processes.

Another aspect to realize is that all of the Docker functions are occurring on the VM. That is where the daemon is hosted, so that is where your containers are running. This requires you to map your VM ports to your Mac ports in order to access your container endpoints (assuming it's something with ports and an IP address). You can still interact with your containers through the Docker commands, but you won't be able to, say, curl a website from outside boot2docker without the port mappings from the VM to the Mac. These are not the same as the port mappings Docker makes from the container to the VM.
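A minimal sketch of that second mapping, assuming the default VM name `boot2docker-vm` and an arbitrary port 8080, uses VirtualBox's NAT port forwarding:

```
# Forward Mac port 8080 to VM port 8080 ("tcp8080" is just a rule name).
# Run modifyvm while the VM is stopped, or "controlvm ... natpf1" while it runs.
VBoxManage modifyvm boot2docker-vm --natpf1 "tcp8080,tcp,127.0.0.1,8080,,8080"

# A container whose port is published to the VM with -p is then reachable
# from the Mac itself:
docker run -d -p 8080:80 nginx
curl http://127.0.0.1:8080
```

Note the two layers here: `-p 8080:80` maps the container to the VM, and the VBoxManage rule maps the VM to the Mac.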

With all that said, it is pretty simple to access your containers, set up a local registry in case you want to share images with co-workers, and push your images to an external registry for deployment. I have had a lot of problems using Packer from outside the VM; I was using Chef, which just caused more issues. I ultimately bailed on Packer, as I don't think it's the right solution for building Docker containers.
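A local registry is just a container itself. As a sketch (the image name `my-app` is hypothetical):

```
# Run the registry image inside the VM, exposed on port 5000.
docker run -d -p 5000:5000 registry

# Tag a local image against that registry and push it.
docker tag my-app localhost:5000/my-app
docker push localhost:5000/my-app
```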

Docker images are built from layers, which allows for optimizations that are part of the reason to use Docker. With these layers, you can have a common base image, like ubuntu, that resides on the host system only once no matter how many images use that layer. This is even better if you have several different apps that share other layers as well. Perhaps you have several websites sharing a server that all need an httpd setup, but the final layer of actual code is different. This allows the layers up to the website code to be downloaded once. This can save a lot of space, and it also makes building your images during development easier and faster.
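As a sketch of what that sharing looks like, two hypothetical sites could use Dockerfiles that are identical except for the last instruction, so the base and httpd layers are built and stored only once:

```
# Dockerfile for site-a (site-b's is identical except for the ADD line)
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y apache2
ADD ./site-a /var/www/html
```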

I have used the terms image and container, and they may have seemed synonymous to some extent. This is one area where I wish there were a better delineation in discussion. A container is not actually what is shipped out; what ships is an image, or rather the many layers of images that make up a conceptual container. The actual container is the running instance of the image that is downloaded to each host. Many times people say they are shipping a container, but they are actually shipping a set of images that constitute a conceptual container. I suppose the container analogy breaks down if this distinction is made, but it can be confusing if there is no delineation between the two concepts.
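The CLI itself keeps the two concepts apart, which is a handy way to internalize the distinction:

```
docker images   # the shipped artifacts: image layers and tags stored on the host
docker ps       # the running containers, i.e. live instances of those images

# One image, many containers:
docker run -d ubuntu sleep 60
docker run -d ubuntu sleep 60
```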

A final note: containers should not be tarred for shipping. My original confusion with images and containers was that I thought a container had to be exported in order to ship it. As the project developed over the last year, it became clear to me that isn't the case (originally, there was no private registry available). If you tar a container, you lose all of the benefit of layers. That could actually be good if you didn't manage your layers well: for example, downloading a tar, unzipping it, and removing it all in separate RUN commands. Each of those commands creates a new image layer, so removing the tar in a separate RUN command does nothing to the final size; those steps should be chained together in one RUN command. So if you have to tar, go ahead, but it should be avoided.
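The RUN-chaining point above can be sketched in Dockerfile form (the URL is hypothetical):

```
# Bad: the archive still lives in the first layer, so deleting it later
# does nothing to the final image size.
RUN wget http://example.com/app.tar.gz
RUN tar -xzf app.tar.gz
RUN rm app.tar.gz

# Good: download, extract, and clean up within a single layer.
RUN wget http://example.com/app.tar.gz && \
    tar -xzf app.tar.gz && \
    rm app.tar.gz
```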