Welcome to Daniel Barker's blog for dbdevs.com. This blog mostly covers small tutorials, predictions about the future of technology, and anything else technology-related. It aims to offer insight into that future through collaboration and analysis of current technologies and research, and it also serves as a practical home for small tutorials and examples that I have nowhere else to present.
Friday, December 5, 2014
Docker: Containers != Images
So, you're learning Docker, but every video you watch and blog you read uses the terms containers and images seemingly interchangeably. They say things like, "Docker allows you to share your container throughout your environments as an immutable application." However, this isn't accurate.
You can't actually share a container at all. A container exists only as an instance on a node. That instance can be running or stopped, but it is an instance nonetheless. You can "export" a container to a tar file, but when you "import" it back in, you get an image, not a container, because the tar file holds only a filesystem and not a live instance. A container is an instance of an image. The image is what you share.
An image is actually a stack of one or more layers, each of which is itself an image, with the final image on top. So, a base image may consist of just a single layer, while an application image may be built from dozens. Each instruction in a Dockerfile creates a new intermediate image beneath the final one. These layers always stay with the final image unless you "export" a container, which flattens all of them into one.
The benefit of this layering is that you can base all of your apps on the same base image, and that base image exists only once on the system, which saves storage. When images are pushed to Docker Hub, layers the Hub already holds are skipped. The same applies to pulls, which speeds up deployments because only the changed layers need to be downloaded.
Containers are the running (or stopped) state of these images. They consume very little space because they are not copies of the image: a container shares the image's read-only layers and adds a thin writable layer on top. The only extra space a container consumes is what it writes during operation.
Images are the saved state of a filesystem, composed of layers of filesystem states. If a file is added in one image layer, it will always exist as part of that layer, and the final image will be that much larger. So, if you add a large tar file, unpack it, and then delete it in a later instruction, the deletion saves nothing: the file still exists in the earlier layer and still consumes that space. It is best to combine such commands into a single RUN instruction and to make similar optimizations whenever you bring in data that doesn't need to persist.
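As a rough sketch of that round trip, the export/import pair can be wrapped in a small shell function. The names web and web:flat are placeholders I made up for illustration, and the function assumes a container by that name already exists on the node:

```shell
#!/bin/sh
# Sketch only: "web" and "web:flat" are hypothetical names.

# Export a container's filesystem and import it back as an image.
# What comes out is an image, not a container.
flatten_container() {
    container="$1"   # existing container name or ID
    image="$2"       # repository:tag for the resulting image
    docker export "$container" | docker import - "$image"
}

# Usage (requires a running Docker daemon):
#   flatten_container web web:flat
#   docker history web:flat   # the imported image should show one layer
```

Piping export straight into import avoids leaving a tar file on disk, but writing the tar out first works just as well if you want to move it to another node.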
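To make the layering point concrete, here is a sketch that writes out a hypothetical Dockerfile in which the download, unpack, and cleanup all happen in one RUN instruction, so the tarball never ends up in a committed layer. The base image, URL, and paths are placeholders of mine, not something from a real project:

```shell
#!/bin/sh
# Sketch only: the base image, URL, and /opt/app path are hypothetical.
workdir="$(mktemp -d)"

cat > "$workdir/Dockerfile" <<'EOF'
FROM debian:stable

# One RUN instruction means one layer: the tarball is fetched, unpacked,
# and deleted before the layer is committed, so it adds no lasting size.
RUN apt-get update && apt-get install -y curl \
 && mkdir -p /opt/app \
 && curl -fsSL https://example.com/app.tar.gz -o /tmp/app.tar.gz \
 && tar -xzf /tmp/app.tar.gz -C /opt/app \
 && rm /tmp/app.tar.gz
EOF

# Count the RUN instructions; each one would become its own layer.
run_count="$(grep -c '^RUN ' "$workdir/Dockerfile")"
echo "RUN instructions: $run_count"
```

Had the rm been a separate RUN line, the tarball would still sit in the layer created by the download step.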
I hope this helps differentiate between containers and images as I found them a bit confusing when I first started learning Docker in the early days.
Monday, December 1, 2014
Docker: Adding HAProxy and Fig to my docker websites
In a previous post, I discussed moving my websites into Docker containers with their own separate httpd servers from their previous setup as virtual hosts on a single httpd server. This post will discuss integrating HAProxy and Fig into my installation. This will allow for load balancing, proper routing, and easy deployments.
To add HAProxy, I simply used the library haproxy image. You just need to create your own Dockerfile and copy in your haproxy.cfg, as the image's instructions describe. The first part of the config file is fairly standard:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    chroot /var/lib/haproxy
    user haproxy
    group haproxy

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option forwardfor
    option http-server-close
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
    stats enable
    stats uri /haproxy?stats
It is the rest of the file that does the work for us. I have multiple websites served from the same server, and each website is also reachable through multiple domains (e.g. cafezvous.com is also cafezvous.net). There may be a less verbose way of doing what I've done, but I haven't found it. First, we declare our frontend and bind it to port 80 on any incoming IP address. We then define ACLs (Access Control Lists) for each domain and associate them with the appropriate generic host. Next, we associate each host with the backend cluster that will serve requests for that website. Finally, we use cafezvous as the default cluster if no ACL is matched.
frontend http-in
    bind *:80

    # Define hosts
    acl host_cafezvous hdr(host) -i cafezvous.com
    acl host_cafezvous hdr(host) -i cafezvous.co
    acl host_cafezvous hdr(host) -i cafezvous.info
    acl host_cafezvous hdr(host) -i cafezvous.org
    acl host_cafezvous hdr(host) -i cafezvous.net
    acl host_cafezvous hdr(host) -i www.cafezvous.com
    acl host_cafezvous hdr(host) -i www.cafezvous.co
    acl host_cafezvous hdr(host) -i www.cafezvous.info
    acl host_cafezvous hdr(host) -i www.cafezvous.org
    acl host_cafezvous hdr(host) -i www.cafezvous.net
    acl host_dbdevs hdr(host) -i dbdevs.com
    acl host_dbdevs hdr(host) -i dbdevs.co
    acl host_dbdevs hdr(host) -i dbdevs.info
    acl host_dbdevs hdr(host) -i dbdevs.org
    acl host_dbdevs hdr(host) -i dbdevs.net
    acl host_dbdevs hdr(host) -i www.dbdevs.com
    acl host_dbdevs hdr(host) -i www.dbdevs.co
    acl host_dbdevs hdr(host) -i www.dbdevs.info
    acl host_dbdevs hdr(host) -i www.dbdevs.org
    acl host_dbdevs hdr(host) -i www.dbdevs.net
    acl host_danpluslaura hdr(host) -i danpluslaura.com
    acl host_danpluslaura hdr(host) -i danpluslaura.co
    acl host_danpluslaura hdr(host) -i danpluslaura.info
    acl host_danpluslaura hdr(host) -i danpluslaura.org
    acl host_danpluslaura hdr(host) -i danpluslaura.net
    acl host_danpluslaura hdr(host) -i www.danpluslaura.com
    acl host_danpluslaura hdr(host) -i www.danpluslaura.co
    acl host_danpluslaura hdr(host) -i www.danpluslaura.info
    acl host_danpluslaura hdr(host) -i www.danpluslaura.org
    acl host_danpluslaura hdr(host) -i www.danpluslaura.net

    ## figure out which one to use
    use_backend cafezvous_cluster if host_cafezvous
    use_backend dbdevs_cluster if host_dbdevs
    use_backend danpluslaura_cluster if host_danpluslaura

    default_backend cafezvous_cluster
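Since the ACL lines above are just every site crossed with every TLD, with and without www, one way to avoid typing them by hand is to generate them. This is only a sketch of that idea, not how I built the file; the generate_acls function name is made up, and the output is meant to be pasted into the frontend section:

```shell
#!/bin/sh
# Sketch only: prints the acl lines for every site/TLD combination so they
# can be pasted (or templated) into the frontend section of haproxy.cfg.
sites="cafezvous dbdevs danpluslaura"
tlds="com co info org net"

generate_acls() {
    for site in $sites; do
        # Bare domains first, then the www. variants, matching the order above.
        for prefix in "" "www."; do
            for tld in $tlds; do
                printf '    acl host_%s hdr(host) -i %s%s.%s\n' \
                    "$site" "$prefix" "$site" "$tld"
            done
        done
    done
}

generate_acls
```

Adding a fourth site or another TLD then becomes a one-word change to the sites or tlds list.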
Normally our backends would point at specific IP addresses, but we don't know a container's IP address until it is created. We could wait until all of the servers have started and then add each IP address and port by hand, but that is neither reasonable nor easy, and we'd be creating a new HAProxy container on every restart to pick up the new addresses. We could also map each website to a particular port on the host and point the backends at the host IP address (or localhost) and that port. That is less painful, but it won't scale well. Plus, the smart guys at Docker have thought of a clever way to associate a name with a container's IP: when the --link option is used to link containers together, Docker creates an entry in the linking container's /etc/hosts. So, the cafezvous container can be referenced as cafezvous followed by a port number (cafezvous:80), and dbdevs is referenced the same way. All of them can keep using the common port 80, and this can be deployed anywhere without checking whether a port is already mapped on the host.
backend cafezvous_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 cafezvous:80 cookie A check

backend dbdevs_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 dbdevs:80 cookie A check

backend danpluslaura_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 danpluslaura:80 cookie A check
Now we can start our containers:
$ docker run -dtP --name dbdevs barkerd427/dbdevs
ed2ac1134324a7dda48d20567efb52e051d57f844c89b2a98a220a1c8b297a74
$ docker run -dtP --name cafezvous barkerd427/cafezvous
ec7bfaf06985ace3f617ffc466c93eabfe4cfca916ff8c71c4a125e5d0a7dfae
$ docker run -dtP --name danpluslaura barkerd427/danpluslaura
8ea8b1261d2ec1e5268e55f7504e4fab308cdaea1a5301e382ccb5de0e0718ee
$ docker run -dtP --name haproxy --link cafezvous:cafezvous --link dbdevs:dbdevs --link danpluslaura:danpluslaura barkerd427/haproxy
7971754d55d10972370b401c8061218bf53b495aafdc82d366f9232aaa192a03
$ docker ps -a -s
CONTAINER ID  IMAGE                         COMMAND               CREATED         STATUS        PORTS                                          NAMES         SIZE
7971754d55d1  barkerd427/haproxy:0.1        "bash /haproxy-start  18 seconds ago  Up 6 seconds  0.0.0.0:49161->443/tcp, 0.0.0.0:49162->80/tcp  haproxy       0 B
8ea8b1261d2e  barkerd427/danpluslaura:0.20  "httpd -DFOREGROUND"  2 minutes ago   Up 2 minutes  0.0.0.0:49158->80/tcp                          danpluslaura  2 B
ec7bfaf06985  barkerd427/cafezvous:0.4      "httpd -DFOREGROUND"  2 minutes ago   Up 2 minutes  0.0.0.0:49157->80/tcp                          cafezvous     2 B
ed2ac1134324  barkerd427/dbdevs:0.4         "httpd -DFOREGROUND"  3 minutes ago   Up 2 minutes  0.0.0.0:49156->80/tcp                          dbdevs        2 B
It is a bit cumbersome to add all of these links individually, especially as the infrastructure becomes more complex. So, we'll go ahead and add a simple Fig configuration file to the mix and condense all of these commands into one. Here's the fig.yml configuration file:
haproxy:
  image: barkerd427/haproxy
  ports:
    - "80:80"
  links:
    - cafezvous
    - dbdevs
    - danpluslaura
cafezvous:
  image: barkerd427/cafezvous
  ports:
    - "80"
dbdevs:
  image: barkerd427/dbdevs
  ports:
    - "80"
danpluslaura:
  image: barkerd427/danpluslaura
  ports:
    - "80"
We explicitly map HAProxy's port to the host, but we leave the others to be assigned by Docker. We don't really care what the host ports are for this purpose, but we do need port 80 of each container to be exposed. We also explicitly list the links that HAProxy needs to the other containers by name. Now when I run fig up -d, everything starts in the correct order and runs in the background. If you don't add the -d, then everything will shut down when that session ends.
To run fig up, you need to be in the directory with the fig.yml file or reference that file with the -f or --file option. The project name defaults to the directory name, but this can be overridden with -p or --project-name. You can also attach to fig to see the output of the containers by using fig logs.
$ fig up -d
Creating websitefig_danpluslaura_1...
Creating websitefig_dbdevs_1...
Creating websitefig_cafezvous_1...
Creating websitefig_haproxy_1...
$ fig ps
Name                        Command               State   Ports
-------------------------------------------------------------------------------------
websitefig_cafezvous_1      httpd -DFOREGROUND    Up      0.0.0.0:49171->80/tcp
websitefig_danpluslaura_1   httpd -DFOREGROUND    Up      0.0.0.0:49169->80/tcp
websitefig_dbdevs_1         httpd -DFOREGROUND    Up      0.0.0.0:49170->80/tcp
websitefig_haproxy_1        bash /haproxy-start   Up      443/tcp, 0.0.0.0:80->80/tcp
$ docker ps -a -s
CONTAINER ID  IMAGE                         COMMAND               CREATED             STATUS             PORTS                        NAMES                      SIZE
81ee5adf1aa3  barkerd427/haproxy:0.1        "bash /haproxy-start  About a minute ago  Up 59 seconds      443/tcp, 0.0.0.0:80->80/tcp  websitefig_haproxy_1       0 B
7df01566911e  barkerd427/cafezvous:0.4      "httpd -DFOREGROUND"  About a minute ago  Up 59 seconds      0.0.0.0:49171->80/tcp        websitefig_cafezvous_1     2 B
0f7c971a41aa  barkerd427/dbdevs:0.4         "httpd -DFOREGROUND"  About a minute ago  Up About a minute  0.0.0.0:49170->80/tcp        websitefig_dbdevs_1        2 B
e02f0cdc37b1  barkerd427/danpluslaura:0.20  "httpd -DFOREGROUND"  About a minute ago  Up About a minute  0.0.0.0:49169->80/tcp        websitefig_danpluslaura_1  2 B
$ fig logs
Attaching to websitefig_cafezvous_1, websitefig_dbdevs_1, websitefig_danpluslaura_1
danpluslaura_1 | [Mon Dec 01 14:30:59.367252 2014] [mpm_event:notice] [pid 1:tid 140438754260864] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations
danpluslaura_1 | [Mon Dec 01 14:30:59.367462 2014] [core:notice] [pid 1:tid 140438754260864] AH00094: Command line: 'httpd -D FOREGROUND'
dbdevs_1 | [Mon Dec 01 14:30:59.708416 2014] [mpm_event:notice] [pid 1:tid 140529524463488] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations
dbdevs_1 | [Mon Dec 01 14:30:59.708595 2014] [core:notice] [pid 1:tid 140529524463488] AH00094: Command line: 'httpd -D FOREGROUND'
cafezvous_1 | [Mon Dec 01 14:31:00.034189 2014] [mpm_event:notice] [pid 1:tid 140501628454784] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations
cafezvous_1 | [Mon Dec 01 14:31:00.034919 2014] [core:notice] [pid 1:tid 140501628454784] AH00094: Command line: 'httpd -D FOREGROUND'
cafezvous_1 | 172.17.0.25 - - [01/Dec/2014:14:31:15 +0000] "GET / HTTP/1.1" 200 18045
danpluslaura_1 | 172.17.0.25 - - [01/Dec/2014:14:31:19 +0000] "GET / HTTP/1.1" 200 24280
dbdevs_1 | 172.17.0.25 - - [01/Dec/2014:14:31:26 +0000] "GET / HTTP/1.1" 200 11154
We can then bring everything down by using the command fig stop. We can restart them all by using fig up again, which will actually reuse the containers from the previous run unless there has been an update. In fact, you don't even have to call fig stop to make an update. Simply run fig up and containers with an update will be updated.
Currently this is a good solution for my often unvisited websites, but if I made a serious push to make a business out of these sites, then this solution would not suffice. Currently, I cannot add another website container for cafezvous to scale its ability to handle additional workloads. This is something that is definitely needed, and I will post my experiences in the future with Serf and hopefully Kubernetes. These systems will allow for dynamically growing my services if needed.
Docker: How I dockerized my websites
The Problem:
My current system for deploying my pretty awful websites uses Chef. I'm not a web developer, and I haven't spent much time developing these websites, so don't judge me too much on them. I dabble with them every couple of months to get them closer to something good. They don't get much traffic, so I can run all three on one very tiny AWS instance. I'm still using the free tier, so who knows where I'll go once that's done. Actually, this brings me to how I arrived at my current setup. I originally hosted all three sites on my server at home, but power outages had become a problem.
However, I had no reliable, consistent way of deploying these three sites to another server without a rather significant amount of time and effort. I had been using Chef at my day job, and I thought this would be a great opportunity to learn more while completing a task that needed to be finished quickly: one of the websites was for my wedding, and people would soon need to access it for RSVPs and other information. I couldn't have it going down for half the day whenever the power went out. I had already gotten complaints when my ISP randomly changed my IP address, which I had thought was static.
So, I endeavored to set up a deployment method that would allow for fast redeployments and code updates. I wanted this to be a cross-platform implementation, but it never quite got there. I was running Debian at home and RHEL at work, so I went with RHEL, as it would be more beneficial for my knowledge base at work. Debian uses apt-get, so I had originally developed my Chef cookbook and roles with apt-get in mind. It didn't take much to switch to yum and the EPEL repo, but it did take extra effort that I wish hadn't been needed.
This Chef setup did allow multiple websites to be deployed at once using a single httpd server with multiple virtual hosts configured. Updates could be made whenever one of the websites' code had changed. Everything could be initiated with a simple chef-solo run, but this wasn't portable enough. I still had to manage the dependent cookbooks, though Berkshelf made this much easier. I also had to restart the single running httpd server whenever any website was updated, which risks taking every site down if an update breaks the httpd server. I didn't have a sandbox environment where I could ensure everything would work on an identical AMI. In fact, I didn't even have another RHEL or CentOS box that closely resembled the production AMI. This is not the way to do it.
The Solution (the part you actually care about):
That is when I discovered Docker. We had been discussing it at work for a while before it was officially released as a 1.0 version. I had even created Jenkins slave containers to more efficiently utilize some of our resources. I had also set up a private registry at work so we could share our images when the time came to ramp up our Docker development. However, it hadn't occurred to me that I could run my own servers in Docker containers. I had originally thought about just using Chef to provision a container with the exact setup I was already running. That's lazy and really doesn't fit the purpose of Docker.
Chef can be utilized to manage your Docker configuration on a node, but containers should be immutable once created. Containers should also be used in composition of an app or system similar to what developers already do with classes and libraries. If a container is running syslog, then you're probably doing it wrong (for the record, I haven't gotten to the point where I have a container for logging). My goal with Docker was to separate the concerns of my current setup so that I could update the code in one website without needing to restart all the servers. I also wanted an easier way to manage my deployment and upgrades.
I chose to separate each website into its own container running its own httpd server. This allowed a simpler configuration without virtual hosts and a single directory within the container where the code is stored. I also chose to use HAProxy to route my website traffic to the appropriate container. To do this, I had to use links between the containers. I'll show how to do this manually below, along with the better way using Fig.
First, we need to create our website containers. With Chef, I had to install git or curl so that I could download the entire website repo during the Chef run. That's not a huge deal, but I didn't need either on my production machine. This isn't needed on the container, but it can be installed if you really want to. I chose to clone locally and then COPY the contents into the container.
The Dockerfile for each website is the same except the website name:
danpluslaura.com
    FROM httpd:2.4

    MAINTAINER Daniel Barker (barkerd@dbdevs.com)

    COPY ./danpluslaura/ /usr/local/apache2/htdocs
    COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf
dbdevs.com
    FROM httpd:2.4

    MAINTAINER Daniel Barker (barkerd@dbdevs.com)

    COPY ./dbdevs/ /usr/local/apache2/htdocs
    COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf
cafezvous.com
    FROM httpd:2.4

    MAINTAINER Daniel Barker (barkerd@dbdevs.com)

    COPY ./cafezvous/ /usr/local/apache2/htdocs
    COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf
Note that I have also included a custom httpd.conf file, but this is only needed so I can remove the www from the website address and set the ServerName:
ServerName cafezvous.com:80
...
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://cafezvous.com$1 [R=301,L]
With this setup, the only pieces needed for each website are the same as in the original implementation (the website files and the special configuration), except I can now run three separate httpd servers. This setup also lets me develop locally by using Docker's -v option to mount the files from my host system into the container, so the httpd server serves them directly and immediately sees changes to my code during development. I could also use this method with a data container for production deployments, but that really wouldn't be very useful in this instance.
Now you may be asking how I plan to route each website to the appropriate container, since they can't all bind to port 80 on the host, and that's the port visitors will use to access them. I could probably run an httpd server on the host system or in another container to route each site to a particular port, but that isn't really a good use of httpd and is better suited to an actual proxy. I chose to use HAProxy to route my website traffic, but I'll leave that for another blog post. The current setup will let you test your websites locally with hardly any code. If you want to update to a newer version of httpd, you simply change the FROM instruction, then rebuild and push your images. I won't cover this process here, as the Docker docs cover the basics very well and are updated regularly.
Note: I could have installed curl or git or some other program to get my code into the container, but I thought that would just create more image layers, with more space taken up by tools that would never be needed again. The only problem with copying from a local clone is ensuring you have the correct version locally. Therefore, I might have done this differently in a corporate environment, where I could download a tar and unpack it in a single RUN command to eliminate most of the overhead. This would also allow for versioning to be maintained in source control.
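For the local development case, a minimal sketch of the -v approach might look like the function below. The container name dev-site, the host port 8080, and the ./cafezvous checkout location are assumptions for illustration; the htdocs path is the one the httpd image uses in the Dockerfiles above:

```shell
#!/bin/sh
# Sketch only: the container name, host port, and checkout location are
# hypothetical; adjust them to your own layout.

# Serve a local checkout through the stock httpd image so edits on the
# host show up in the container immediately.
run_dev_site() {
    site_dir="$1"    # directory holding the site's files
    host_port="$2"   # port to expose on the host
    # Resolve to an absolute path, which -v expects for a host directory.
    abs_dir="$(cd "$site_dir" && pwd)" || return 1
    docker run -d --name dev-site \
        -p "$host_port:80" \
        -v "$abs_dir:/usr/local/apache2/htdocs" \
        httpd:2.4
}

# Usage (requires a running Docker daemon):
#   run_dev_site ./cafezvous 8080
```

Because this runs the stock httpd:2.4 image rather than the built website image, the custom httpd.conf is not applied; mount it with a second -v if the rewrite rules matter during development.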
Labels: apache, containers, Docker, docker registry, haproxy, httpd, LXC, RHEL, websites