Friday, February 6, 2015

Docker: Fundamental Change


Friday, December 5, 2014

Docker: Containers != Images


So, you're learning Docker, but every video you watch and blog you read uses the terms containers and images seemingly interchangeably. They say things like, "Docker allows you to share your container throughout your environments as an immutable application." However, this isn't accurate.

You can't actually share a container at all. A container exists only as an instance on a node. The instance can be running or stopped, but it is an instance nonetheless. You can "export" a container to a tar file, but when you "import" it back in, it becomes an image, because it is no longer an instance. A container is an instance of an image. The image is what you share.
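As a sketch (the container and image names here are made up for illustration), the export/import round trip looks like this:

```shell
# Export a container's filesystem to a tar file...
docker export my_container > my_container.tar

# ...and what you get back on import is an image, with all layers flattened.
cat my_container.tar | docker import - myapp:flattened
```

Note that the imported image has a single flattened layer; the layer history of the original image is lost.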

An image is actually made up of one or more layers, each of which is itself an image. So, you can have a base image consisting of a single layer, or an application image composed of dozens of layers. Each command in a Dockerfile creates a new image layer underlying the final image. These layers always remain with the final image unless you "export" a container, which flattens them all into one.

The benefit of this layering is that you can base all of your apps on the same base image, and that base image exists only once on the system, which saves storage. The layers are also uploaded to Docker Hub, but any layer the Hub already holds can be skipped. This speeds deployments, as only the changes need to be pulled down.

Containers are instantiations of these images, and they consume very little space because they share the underlying image layers rather than copying them. The only space they consume is what is added during operation.

Images are the saved state of a filesystem, composed of layers of filesystem states. If a file is added in one image layer, that file will always exist as part of that layer, and the final image will be that much larger. So, if you add a large tar in one command, unpack it, and then delete it in a different command, the deletion gains you nothing: the file still exists in the earlier layer and still consumes that space. It is best to combine commands on a single line and make similar optimization decisions when importing data that doesn't need to persist.
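As a sketch (the URL and filenames are made up for illustration), the difference looks like this in a Dockerfile:

```dockerfile
# Wasteful: the tarball lives on in the layer created by the first RUN,
# so deleting it in a later RUN saves nothing in the final image.
RUN curl -O http://example.com/data.tar.gz && tar xzf data.tar.gz
RUN rm data.tar.gz

# Better: download, unpack, and delete within a single layer.
RUN curl -O http://example.com/data.tar.gz \
    && tar xzf data.tar.gz \
    && rm data.tar.gz
```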

I hope this helps differentiate between containers and images as I found them a bit confusing when I first started learning Docker in the early days.

Monday, December 1, 2014

Docker: Adding HAProxy and Fig to my docker websites


In a previous post, I discussed moving my websites into Docker containers with their own separate httpd servers from their previous setup as virtual hosts on a single httpd server. This post will discuss integrating HAProxy and Fig into my installation. This will allow for load balancing, proper routing, and easy deployments.

To add HAProxy, I simply used the official haproxy library image. You just need to create your own Dockerfile and COPY in your haproxy.cfg, as the instructions at the link state. The first part of the config file is fairly standard:

global
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  chroot /var/lib/haproxy
  user haproxy
  group haproxy

defaults
  log global
  mode http
  option httplog
  option dontlognull
  option forwardfor
  option http-server-close
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  errorfile 400 /etc/haproxy/errors/400.http
  errorfile 403 /etc/haproxy/errors/403.http
  errorfile 408 /etc/haproxy/errors/408.http
  errorfile 500 /etc/haproxy/errors/500.http
  errorfile 502 /etc/haproxy/errors/502.http
  errorfile 503 /etc/haproxy/errors/503.http
  errorfile 504 /etc/haproxy/errors/504.http
  stats enable
  stats uri /haproxy?stats

It is the rest of the file that does the work for us. I have multiple websites served from the same server, and each website is also reachable via multiple domains (e.g., cafezvous.com is also cafezvous.net). There may be a less verbose way of doing this, but I haven't found it. We declare our frontend and bind it to port 80 on any incoming IP address. We then define ACLs (Access Control Lists) for each domain and associate them with the appropriate generic host. We then associate each host with the backend cluster that will serve calls to that website. Finally, we use cafezvous_cluster as the default if no ACL matches.

frontend http-in
    bind *:80

    # Define hosts
    acl host_cafezvous hdr(host) -i cafezvous.com
    acl host_cafezvous hdr(host) -i cafezvous.co
    acl host_cafezvous hdr(host) -i cafezvous.info
    acl host_cafezvous hdr(host) -i cafezvous.org
    acl host_cafezvous hdr(host) -i cafezvous.net
    acl host_cafezvous hdr(host) -i www.cafezvous.com
    acl host_cafezvous hdr(host) -i www.cafezvous.co
    acl host_cafezvous hdr(host) -i www.cafezvous.info
    acl host_cafezvous hdr(host) -i www.cafezvous.org
    acl host_cafezvous hdr(host) -i www.cafezvous.net
    acl host_dbdevs hdr(host) -i dbdevs.com
    acl host_dbdevs hdr(host) -i dbdevs.co
    acl host_dbdevs hdr(host) -i dbdevs.info
    acl host_dbdevs hdr(host) -i dbdevs.org
    acl host_dbdevs hdr(host) -i dbdevs.net
    acl host_dbdevs hdr(host) -i www.dbdevs.com
    acl host_dbdevs hdr(host) -i www.dbdevs.co
    acl host_dbdevs hdr(host) -i www.dbdevs.info
    acl host_dbdevs hdr(host) -i www.dbdevs.org
    acl host_dbdevs hdr(host) -i www.dbdevs.net
    acl host_danpluslaura hdr(host) -i danpluslaura.com
    acl host_danpluslaura hdr(host) -i danpluslaura.co
    acl host_danpluslaura hdr(host) -i danpluslaura.info
    acl host_danpluslaura hdr(host) -i danpluslaura.org
    acl host_danpluslaura hdr(host) -i danpluslaura.net
    acl host_danpluslaura hdr(host) -i www.danpluslaura.com
    acl host_danpluslaura hdr(host) -i www.danpluslaura.co
    acl host_danpluslaura hdr(host) -i www.danpluslaura.info
    acl host_danpluslaura hdr(host) -i www.danpluslaura.org
    acl host_danpluslaura hdr(host) -i www.danpluslaura.net

    ## figure out which one to use
    use_backend cafezvous_cluster if host_cafezvous
    use_backend dbdevs_cluster if host_dbdevs
    use_backend danpluslaura_cluster if host_danpluslaura

    default_backend cafezvous_cluster

Normally our backends would point at known IP addresses, but we don't know a container's IP address until it is created. We could wait until all of the servers have started and then add each IP address and port individually, but that isn't reasonable or easy, and we'd be creating a new haproxy container for each restart. We could also map each website to a particular port on the host and point the config at the host IP address (or localhost) and port; that's less painful, but it won't scale well. Fortunately, the smart folks at Docker thought of a clever way to associate an IP with a container: when containers are linked with the --link flag, Docker creates an entry in /etc/hosts. So the cafezvous container can be referenced by the hostname cafezvous plus a port number, and dbdevs is referenced the same way. All of them can keep using the common port 80, and this can be deployed anywhere without first checking whether a port is already mapped on the host.
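Inside the haproxy container, the entries that --link writes to /etc/hosts look something like this (the addresses are illustrative; Docker assigns them when the containers are created):

```
172.17.0.22    cafezvous
172.17.0.23    dbdevs
172.17.0.24    danpluslaura
```

These hostnames are what the backend definitions below rely on in place of hard-coded IP addresses.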

backend cafezvous_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 cafezvous:80 cookie A check

backend dbdevs_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 dbdevs:80 cookie A check

backend danpluslaura_cluster
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server node1 danpluslaura:80 cookie A check

Now we can start our containers:

 $ docker run -dtP --name dbdevs barkerd427/dbdevs  
 ed2ac1134324a7dda48d20567efb52e051d57f844c89b2a98a220a1c8b297a74  
 $ docker run -dtP --name cafezvous barkerd427/cafezvous  
 ec7bfaf06985ace3f617ffc466c93eabfe4cfca916ff8c71c4a125e5d0a7dfae  
 $ docker run -dtP --name danpluslaura barkerd427/danpluslaura  
 8ea8b1261d2ec1e5268e55f7504e4fab308cdaea1a5301e382ccb5de0e0718ee  
 $ docker run -dtP --name haproxy --link cafezvous:cafezvous --link dbdevs:dbdevs --link danpluslaura:danpluslaura barkerd427/haproxy  
 7971754d55d10972370b401c8061218bf53b495aafdc82d366f9232aaa192a03  
 $ docker ps -a -s  
 CONTAINER ID    IMAGE                         COMMAND               CREATED          STATUS          PORTS                                                NAMES         SIZE  
 7971754d55d1    barkerd427/haproxy:0.1        "bash /haproxy-start  18 seconds ago   Up 6 seconds    0.0.0.0:49161->443/tcp, 0.0.0.0:49162->80/tcp  haproxy       0 B  
 8ea8b1261d2e    barkerd427/danpluslaura:0.20  "httpd -DFOREGROUND"  2 minutes ago    Up 2 minutes    0.0.0.0:49158->80/tcp                             danpluslaura  2 B  
 ec7bfaf06985    barkerd427/cafezvous:0.4      "httpd -DFOREGROUND"  2 minutes ago    Up 2 minutes    0.0.0.0:49157->80/tcp                             cafezvous     2 B  
 ed2ac1134324    barkerd427/dbdevs:0.4         "httpd -DFOREGROUND"  3 minutes ago    Up 2 minutes    0.0.0.0:49156->80/tcp                             dbdevs        2 B  

Having to add all of these links individually is a bit cumbersome, especially as the infrastructure becomes more complex. So we'll add a simple Fig configuration file to the mix and condense all of these commands into one. Here's the fig.yml configuration file:

haproxy:
  image: barkerd427/haproxy
  ports:
    - "80:80"
  links:
    - cafezvous
    - dbdevs
    - danpluslaura
cafezvous:
  image: barkerd427/cafezvous
  ports:
    - "80"
dbdevs:
  image: barkerd427/dbdevs
  ports:
    - "80"
danpluslaura:
  image: barkerd427/danpluslaura
  ports:
    - "80"

We explicitly map HAProxy's port to the host, but we leave the others to be assigned by Docker. We don't really care what the host ports are for this purpose; we only need container port 80 to be mapped generically. We also explicitly declare the links HAProxy needs to the other container names. Now when I run fig up -d, everything starts in the correct order and runs in the background. If you don't add -d, everything will shut down when that session ends.

To run fig up, you need to be in the directory with the fig.yml file or reference that file with the -f or --file option. The project name defaults to the directory name, but this can be overridden with -p or --project-name. You can also attach to fig to see the output of the containers by using fig logs.

 $ fig up -d  
 Creating websitefig_danpluslaura_1...  
 Creating websitefig_dbdevs_1...  
 Creating websitefig_cafezvous_1...  
 Creating websitefig_haproxy_1...  
 $ fig ps  
      Name                  Command              State   Ports  
 -------------------------------------------------------------------------------------  
 websitefig_cafezvous_1     httpd -DFOREGROUND   Up      0.0.0.0:49171->80/tcp  
 websitefig_danpluslaura_1  httpd -DFOREGROUND   Up      0.0.0.0:49169->80/tcp  
 websitefig_dbdevs_1        httpd -DFOREGROUND   Up      0.0.0.0:49170->80/tcp  
 websitefig_haproxy_1       bash /haproxy-start  Up      443/tcp, 0.0.0.0:80->80/tcp  
 $ docker ps -a -s  
 CONTAINER ID    IMAGE                         COMMAND               CREATED             STATUS             PORTS                           NAMES                      SIZE  
 81ee5adf1aa3    barkerd427/haproxy:0.1        "bash /haproxy-start  About a minute ago  Up 59 seconds      443/tcp, 0.0.0.0:80->80/tcp  websitefig_haproxy_1       0 B  
 7df01566911e    barkerd427/cafezvous:0.4      "httpd -DFOREGROUND"  About a minute ago  Up 59 seconds      0.0.0.0:49171->80/tcp        websitefig_cafezvous_1     2 B  
 0f7c971a41aa    barkerd427/dbdevs:0.4         "httpd -DFOREGROUND"  About a minute ago  Up About a minute  0.0.0.0:49170->80/tcp        websitefig_dbdevs_1        2 B  
 e02f0cdc37b1    barkerd427/danpluslaura:0.20  "httpd -DFOREGROUND"  About a minute ago  Up About a minute  0.0.0.0:49169->80/tcp        websitefig_danpluslaura_1  2 B  
 $ fig logs  
 Attaching to websitefig_cafezvous_1, websitefig_dbdevs_1, websitefig_danpluslaura_1  
 danpluslaura_1 | [Mon Dec 01 14:30:59.367252 2014] [mpm_event:notice] [pid 1:tid 140438754260864] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations  
 danpluslaura_1 | [Mon Dec 01 14:30:59.367462 2014] [core:notice] [pid 1:tid 140438754260864] AH00094: Command line: 'httpd -D FOREGROUND'  
 dbdevs_1    | [Mon Dec 01 14:30:59.708416 2014] [mpm_event:notice] [pid 1:tid 140529524463488] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations  
 dbdevs_1    | [Mon Dec 01 14:30:59.708595 2014] [core:notice] [pid 1:tid 140529524463488] AH00094: Command line: 'httpd -D FOREGROUND'  
 cafezvous_1  | [Mon Dec 01 14:31:00.034189 2014] [mpm_event:notice] [pid 1:tid 140501628454784] AH00489: Apache/2.4.10 (Unix) configured -- resuming normal operations  
 cafezvous_1  | [Mon Dec 01 14:31:00.034919 2014] [core:notice] [pid 1:tid 140501628454784] AH00094: Command line: 'httpd -D FOREGROUND'  
 cafezvous_1  | 172.17.0.25 - - [01/Dec/2014:14:31:15 +0000] "GET / HTTP/1.1" 200 18045  
 danpluslaura_1 | 172.17.0.25 - - [01/Dec/2014:14:31:19 +0000] "GET / HTTP/1.1" 200 24280  
 dbdevs_1    | 172.17.0.25 - - [01/Dec/2014:14:31:26 +0000] "GET / HTTP/1.1" 200 11154  

We can then bring everything down with fig stop. We can restart it all with fig up, which will reuse the containers from the previous run unless there has been an update. In fact, you don't even have to call fig stop to push an update: simply run fig up again, and any container with an update will be recreated.

This is a good solution for my rarely visited websites for now, but if I made a serious push to turn these sites into a business, it would not suffice. Currently, I cannot add another website container for cafezvous to scale out and handle additional load. That capability is definitely needed, and I will post about my experiences with Serf and hopefully Kubernetes in the future. Those systems will allow my services to grow dynamically if needed.

Docker: How I dockerized my websites


The Problem:


I currently deploy my pretty awful websites using Chef. I'm not a web developer, and I haven't spent much time developing these websites, so don't judge me too harshly. I dabble with them every couple of months to get them closer to something good. They don't get much traffic, so I can run all three on one very tiny AWS instance. I'm still using the free tier, so who knows where I'll go once that's done. Actually, that brings me to the reasoning behind my current setup: I originally hosted all three sites on my server at home, but power outages had become a problem.

However, I had no reliably consistent way of redeploying these three sites to another server without a rather significant amount of time and effort. I had been using Chef at my day job, and this seemed like a great opportunity to learn more while completing a task that needed to be finished quickly, as one of the websites was for my wedding and people would soon need it for RSVPs and other information. I couldn't have it going down for half a day whenever the power went out. I had already gotten complaints when my ISP randomly changed my IP address, which I had thought was static.

So, I endeavored to set up a deployment method that would allow fast redeployments and code updates. I wanted a cross-platform implementation, but it never quite got there. I was running Debian at home and RHEL at work, so I went with RHEL, as it would add more to my knowledge base for work. Debian uses apt-get, so I had originally developed my Chef cookbook and roles with apt-get in mind. Switching to yum and the EPEL repo didn't take much, but it did take extra effort that I wish hadn't been needed.

This Chef setup did allow multiple websites to be deployed at once using a single httpd server with multiple virtual hosts configured, and updates could be made when one website's code changed. Everything could be initiated with a simple chef-solo run, but this wasn't portable enough. I still had to manage the dependent cookbooks, though Berkshelf made that much easier. I also had to restart the single running httpd server whenever any website was updated, which creates some risk that my updates bring the httpd server down. I didn't have a sandbox environment I could use to ensure everything would work on an identical AMI; in fact, I didn't even have another RHEL or CentOS box that closely resembled the production AMI. This is not the way to do it.

The Solution (the part you actually care about):


That is when I discovered Docker. We had been discussing it at work for a while before it was officially released as 1.0. I had even created Jenkins slave containers to more efficiently utilize some of our resources, and I had set up a private registry at work so we could share images when the time came to ramp up our Docker development. However, it hadn't occurred to me that I could run my servers in Docker containers. I had originally thought about just using Chef to provision a container with the exact setup I was already running. That's lazy and doesn't really fit the purpose of Docker.

Chef can be used to manage your Docker configuration on a node, but containers should be immutable once created. Containers should also be used to compose an app or system, similar to what developers already do with classes and libraries. If a container is running syslog, then you're probably doing it wrong (for the record, I haven't gotten to the point where I have a container for logging). My goal with Docker was to separate the concerns of my current setup so that I could update the code in one website without needing to restart all the servers. I also wanted an easier way to manage my deployments and upgrades.

I chose to separate each website into its own container running its own httpd server. This allowed a simpler configuration without virtual hosts and a single directory within the container where the code is stored. I also chose to use haproxy to route my website traffic to the appropriate container. To do this, I had to use links between the containers. I'll show how to manually do this below along with the better way using fig.

First, we need to create our website containers. With Chef, I had to install git or curl so that I could download the entire website repo during the Chef run. That's not a huge deal, but I didn't need either on my production machine. Neither is needed in the container either, though you can install them if you really want to. I chose to clone the repo locally and then COPY the contents into the image.

The Dockerfile for each website is the same except the website name:

danpluslaura.com
FROM httpd:2.4

MAINTAINER Daniel Barker (barkerd@dbdevs.com)

COPY ./danpluslaura/ /usr/local/apache2/htdocs
COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf

dbdevs.com
FROM httpd:2.4

MAINTAINER Daniel Barker (barkerd@dbdevs.com)

COPY ./dbdevs/ /usr/local/apache2/htdocs
COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf

cafezvous.com
FROM httpd:2.4

MAINTAINER Daniel Barker (barkerd@dbdevs.com)

COPY ./cafezvous/ /usr/local/apache2/htdocs
COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf

Note that I have also included a custom httpd.conf file, but this is only needed so I can remove the www from the website address and set the ServerName:

ServerName cafezvous.com:80
...
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://cafezvous.com$1 [R=301,L]

With this setup, the only layers that differ between websites are the same pieces that differed in the original implementation (the website files and the custom configuration), except I can now run three separate httpd servers. This setup also lets me develop locally using Docker's -v option to mount files from my host system and have the httpd server serve them, so the container immediately sees changes to my code during development. I could also use this method with a data container for production deployments, but that wouldn't be very useful in this case.
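For local development, a run command along these lines mounts the working copy over the container's htdocs (the host port and directory here are assumptions for illustration):

```shell
docker run -d -p 8080:80 \
    -v "$PWD/cafezvous":/usr/local/apache2/htdocs \
    barkerd427/cafezvous
```

Edits to files under ./cafezvous on the host are then visible to httpd immediately, with no rebuild.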

Now you may be asking how I plan to route each website to the appropriate container since they can't all run on port 80, and that's probably how I'll want visitors to access them. I could probably run an httpd server on the host system or another container to route each site to a particular port, but that isn't really a good use of httpd and is more suited to an actual proxy. I chose to use haproxy to route my website traffic, but I'll leave that for another blog post. The current setup will allow you to test your websites locally with hardly any code. If you want to update to a newer version of httpd, then you simply change the FROM command and rebuild and push your images. I won't cover this process here, as the Docker docs cover the basics very well and are updated regularly.

Note: I could have installed curl or git or some other program to get my code into the container, but that would just create more layers taking up space for code that would never be needed again. The only problem with my approach is ensuring you have the correct version locally. I might have done this differently in a corporate environment, where I could download a tar and unpack it in a single RUN command to eliminate most of the overhead. That would also allow versioning to be maintained in source control.

Sunday, November 30, 2014

Docker: The basics (with Mac OS X and boot2docker)



Docker is essentially an abstraction of Linux Containers, although over time it has strayed from that definition. A Docker container is an abstraction away from whatever the container happens to be running on; it abstracts away the underlying architecture. The container could be running on a laptop, a bare-bones server, a Raspberry Pi, a high-end server, or anything else that supports Docker. It doesn't concern itself with where it is running, and therefore neither does the app inside. The app is concerned only with its local environment, and that is something Docker currently solves. I have talked with some of the developers, and they have led me to believe they intend to add support for Docker to understand a distributed environment, allowing links between containers on different nodes. There are other tools that solve this already, which I will talk about in later posts.

This post will particularly cover getting Mac OS X set up with Docker so you can develop locally. It should be short. The major step is just getting a VM installed that runs Docker natively, because OS X has no native support for Docker. Luckily, there's already one made, which is pretty cool.

Basically, just go to the Docker website and follow the directions. Make sure that you either run the export commands that are printed after boot2docker up in each terminal window where you access Docker, or add them to one of your bash configuration files. Keep in mind, however, that these values can change when you bring the VM down and back up again.
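The exported variables look something like this (the address and the path are examples from a typical boot2docker install, not your exact values):

```shell
export DOCKER_HOST=tcp://192.168.59.103:2376
export DOCKER_CERT_PATH=/Users/you/.boot2docker/certs/boot2docker-vm
export DOCKER_TLS_VERIFY=1
```

These point the docker client at the daemon inside the VM rather than at the (nonexistent) local daemon.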

Another point to remember is that your VM can grow pretty large if you use it for a long time without cleaning it up. It may be wise to map some host storage for the Docker images to keep your virtual environment from growing too large. I've had it crash my computer as it contended for resources with other processes.

Another thing to realize is that all of the Docker functions occur on the VM. That is where the daemon is hosted, so that is where your containers run. This requires you to map your VM ports to your Mac ports in order to access your container endpoints (assuming they expose ports on an IP address). You can still access your containers through the Docker commands, but you won't be able to, say, curl a website from outside boot2docker without port mappings from the VM to the Mac. These are not the same as the port mappings Docker makes to the VM.

With all that said, it is pretty simple to access your containers, set up a local registry in case you want to share with co-workers, and push your containers to an external registry for deployment. I have had a lot of problems using Packer from outside the VM; I was using Chef, which just caused more issues. I ultimately bailed on Packer, as I don't think it's the right solution for building Docker containers.

Docker images use layers, which allow for optimizations that are part of the reason to use Docker. With layers, you can have a common base image, like Ubuntu, that resides on the host system only once no matter how many images use it. This is even better if several different apps share other identical layers. Perhaps you have several websites sharing a server that all need an httpd setup, but the final layer of actual code differs; then all of the layers up to the website code are downloaded only once. This can save a lot of space, and it also makes building your images during development easier and faster.

I have used the terms images and containers in ways that may have seemed somewhat synonymous. This is one area where I wish discussions drew a clearer line. A container is not actually what is shipped; what ships is an image, or rather the many layers of images that make up a conceptual container. The actual container is the running instance of the image that is downloaded to each host. People often say they are shipping a container when they are actually shipping the set of images that constitutes one. I suppose the container analogy breaks down if this distinction is made, but it is confusing when the two concepts are not delineated.

A final note: containers should not be tarred for shipping. My original confusion between images and containers came from thinking that to ship a container it had to be exported. As the project developed over the last year, it became clear to me that isn't the case; originally, there was no private registry available. If you tar a container, you lose all of the benefit of layers. That could be acceptable if you didn't manage your layers well (e.g., downloading and unpacking a tar and then removing it all in separate RUN commands; each RUN creates a new image layer, so removing the tar in a separate command does nothing to the final size, and such commands should be chained into one RUN). So if you have to tar, go ahead, but it should be avoided.

Monday, August 18, 2014

Create a base docker image (RHEL) from vagrant

We'll be creating a RHEL 6.5 base docker image using vagrant and VirtualBox. This is a private image, but the process will be the same with any image you have. It will also work with any other yum-oriented Linux distro.

Note: I had trouble creating this image from a VM based on a similar image hosted on VMWare, so that is why I decided to build this in vagrant. Also, vagrant can be run from anywhere, which makes this a much more portable solution.

I won't go through the installation process for vagrant and VirtualBox as it's different for each platform, but both of these need to be installed prior to the next step.

We need to create a Vagrantfile to start our instance. This will be a very simple instance, but feel free to use any that you already have available.

Vagrant.configure("2") do |c|
  c.vm.box = "rhel65-1.0.0"
  c.vm.box_url = "http://example.com/rhel65/1.0.0/rhel65-1.0.0.box"
  c.vm.hostname = "default-rhel65-100.vagrantup.com"
  c.vm.synced_folder ".", "/vagrant", disabled: true
  c.vm.provider :virtualbox do |p|
  end
end

Make sure you change the url to point to your .box file.

Now, open a terminal to the directory where you saved the Vagrantfile we just created and type:

vagrant up



This will create the VM and start it up based on the settings from the Vagrantfile. Please refer to the vagrant docs for additional configuration options, but none are needed for creating the docker image.

Next, we'll need to log in to the vagrant image by issuing the command:

vagrant ssh

My instance logs me in as the vagrant user with sudoer powers. We will need sudo or root access to create the base image. This is required to access and read all of the system files that need to be copied to the image.

We need to install docker so our base image can actually be stored and pushed to a repo after it's created. You'll notice later that the script calls docker commands to import the image under the name provided. We'll install the EPEL repo and then docker-io.

sudo rpm -ivh http://mirror.pnl.gov/epel/6/i386/epel-release-6-8.noarch.rpm
sudo yum install -y docker-io

Inside the vagrant image, we need to clone the docker repo and cd to the docker/contrib directory:

sudo yum install -y git # I had to install git first as it wasn't in the image I'm using
git clone https://github.com/docker/docker.git
cd docker/contrib

This is where the magic script is stored. I have copied it here in case it breaks in the future, but always try the version in the repo as long as it exists. The file we need is mkimage-yum.sh:

    #!/usr/bin/env bash
    #
    # Create a base CentOS Docker image.
    #
    # This script is useful on systems with yum installed (e.g., building
    # a CentOS image on CentOS).  See contrib/mkimage-rinse.sh for a way
    # to build CentOS images on other systems.

    usage() {
        cat <<EOOPTS
    $(basename $0) [OPTIONS] <name>
    OPTIONS:
      -y <yumconf>  The path to the yum config to install packages from. The
                    default is /etc/yum.conf.
    EOOPTS
        exit 1
    }

    # option defaults
    yum_config=/etc/yum.conf
    while getopts ":y:h" opt; do
        case $opt in
            y)
                yum_config=$OPTARG
                ;;
            h)
                usage
                ;;
            \?)
                echo "Invalid option: -$OPTARG"
                usage
                ;;
        esac
    done
    shift $((OPTIND - 1))
    name=$1

    if [[ -z $name ]]; then
        usage
    fi

    #--------------------

    target=$(mktemp -d --tmpdir $(basename $0).XXXXXX)

    set -x

    mkdir -m 755 "$target"/dev
    mknod -m 600 "$target"/dev/console c 5 1
    mknod -m 600 "$target"/dev/initctl p
    mknod -m 666 "$target"/dev/full c 1 7
    mknod -m 666 "$target"/dev/null c 1 3
    mknod -m 666 "$target"/dev/ptmx c 5 2
    mknod -m 666 "$target"/dev/random c 1 8
    mknod -m 666 "$target"/dev/tty c 5 0
    mknod -m 666 "$target"/dev/tty0 c 4 0
    mknod -m 666 "$target"/dev/urandom c 1 9
    mknod -m 666 "$target"/dev/zero c 1 5

    yum -c "$yum_config" --installroot="$target" --setopt=tsflags=nodocs \
        --setopt=group_package_types=mandatory -y groupinstall Core
    yum -c "$yum_config" --installroot="$target" -y clean all

    cat > "$target"/etc/sysconfig/network <<EOF
    NETWORKING=yes
    HOSTNAME=localhost.localdomain
    EOF
    
    # effectively: febootstrap-minimize --keep-zoneinfo --keep-rpmdb
    # --keep-services "$target".  Stolen from mkimage-rinse.sh
    #  locales
    rm -rf "$target"/usr/{{lib,share}/locale,{lib,lib64}/gconv,bin/localedef,sbin/build-locale-archive}
    #  docs
    rm -rf "$target"/usr/share/{man,doc,info,gnome/help}
    #  cracklib
    rm -rf "$target"/usr/share/cracklib
    #  i18n
    rm -rf "$target"/usr/share/i18n
    #  sln
    rm -rf "$target"/sbin/sln
    #  ldconfig
    rm -rf "$target"/etc/ld.so.cache
    rm -rf "$target"/var/cache/ldconfig/*

    version=
    if [ -r "$target"/etc/redhat-release ]; then
        version="$(sed 's/^[^0-9\]*\([0-9.]\+\).*$/\1/' "$target"/etc/redhat-release)"
    fi

    if [ -z "$version" ]; then
        echo >&2 "warning: cannot autodetect OS version, using '$name' as tag"
        version=$name
    fi

    tar --numeric-owner -c -C "$target" . | docker import - $name:$version
    docker run -i -t $name:$version echo success

    rm -rf "$target"

We'll now execute this script, which will create a Docker image and commit it to our local Docker instance. First, though, we need to start the Docker daemon.

sudo service docker start
sudo ./mkimage-yum.sh rhel

The script copies everything it needs into a temp directory, creates the image, commits it using the name provided (in this case rhel) and the version read from /etc/redhat-release, and then cleans up the temp directory. Once this completes, we'll see our Docker image in our local Docker instance. The last few lines of output show the image ID and "name":


Then we can see it in our local list of images; the "name" is made up of the REPOSITORY and TAG:


We can also see this in our containers list:
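If the screenshots don't render, the same information comes straight from the CLI. This is only a sketch: the tag 6.5 assumes that's what the script detected in /etc/redhat-release, which may differ on your box.

```shell
# On the vagrant box (daemon already started with: sudo service docker start):
#
#   sudo docker images    # lists the new image; its "name" is REPOSITORY:TAG
#   sudo docker ps -a     # lists the container left behind by the "echo success" test run
#
# Docker composes the full image name from REPOSITORY and TAG joined by a colon:
REPOSITORY=rhel
TAG=6.5                               # assumed version from /etc/redhat-release
IMAGE_NAME="${REPOSITORY}:${TAG}"
echo "$IMAGE_NAME"
```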


We'll now push it to a private registry to show the basics of that interaction. We'll use our server at ipfacecobld26 on port 5000, but first we need to commit the container with our registry as part of the REPOSITORY:


We used the CONTAINER ID from the previous step to commit the image with the REPOSITORY set as ipfacecobld26:5000/rhel and the tag as 6.5. This returns the IMAGE ID of the new image. We can now push this image to our server:
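Spelled out as commands, the commit-and-push step looks roughly like this. The registry host ipfacecobld26:5000 comes from above; the CONTAINER ID is a placeholder you'd replace with the real one:

```shell
# A registry-qualified image name is <registry-host:port>/<repository>:<tag>
REGISTRY=ipfacecobld26:5000
FULL_NAME="${REGISTRY}/rhel:6.5"
echo "$FULL_NAME"

# Commit the container's filesystem as a new image under that name
# (substitute the real ID from `sudo docker ps -a`):
#   sudo docker commit <CONTAINER_ID> "$FULL_NAME"
#
# Push the image, and every layer beneath it, to the private registry:
#   sudo docker push "${REGISTRY}/rhel"
```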


Notice that it pushed two images: the second is the one we just tagged, and the first is the original base image. This shows how Docker "stacks" images on top of each other; every image in the stack has to be pushed to the registry for the top image to work. We now have our own RHEL 6.5 image that others can pull down and use to run long-running VMs, short-lived test-kitchen tests, or to build new images with additional functionality.
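On any other machine that can reach the registry, consuming the image is a sketch along these lines (standard docker commands; the hostname is from our setup above):

```shell
REGISTRY=ipfacecobld26:5000
IMAGE="${REGISTRY}/rhel:6.5"
echo "$IMAGE"

# Pull the image down from the private registry:
#   sudo docker pull "$IMAGE"
#
# Start a long-running container from it in the background:
#   sudo docker run -d "$IMAGE" /bin/sh -c 'while true; do sleep 60; done'
#
# Or use it as the base of a new image, with a Dockerfile that starts:
#   FROM ipfacecobld26:5000/rhel:6.5
```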

Wednesday, August 6, 2014

The Right To Be Forgotten

When I wrote a journal article a couple of years ago about the loss of the "right to be forgotten," this was not my desired outcome. I doubt my article was ever consulted in this endeavor, as many others were written before it. Had it been read, the court would have put time limits and notoriety standards in place for the information. Actually, the court would have done no such thing, as I advocated that companies implement these features themselves. I also argued that items should degrade on the search results page over time, and not be completely eliminated until a significant amount of time had passed, based on the notoriety of the topic. It would then be the responsibility of private industry to back these items up in searchable databases, outside of normal search methods. I suggested it could become like old newspapers at the library.

The theory my coauthors and I posited was that, prior to the Internet, we could move somewhere new and start again, the memory of an event would fade from the minds of those around us over time, or we could perform some offsetting action that would overshadow the previous negative events. This is simply no longer true. One misstep in life can now permanently damage a reputation. This is a theory worthy of discussion within our society, but not worthy of government involvement.

This post is in reference to this article: http://www.theguardian.com/technology/2014/aug/02/wikipedia-page-google-link-hidden-right-to-be-forgotten