Getting started with configuration management: Ansible

Lately I’ve been working on configuration management for my infrastructure. Since I started freelancing, the number of clients has grown, and so has my infrastructure. Managing one server isn’t a daunting task, but from the moment you need to expand, things get more serious.

At this point, my infrastructure consists of 1 database server, 1 git server and 4 web servers. Managing the one-off servers (the database server and the git server) is still something you might consider doing manually. But running 4 web servers with mostly the same configuration by hand is far more tedious.

For instance, let’s say we run nginx on all 4 servers. One day, a security update gets released and we need to upgrade all nginx installations. Without an automated way of doing this, we’ll need to upgrade 4 servers manually. Think about the time you are wasting.

Actually, even when you have only one server, you should consider automating the process of installation and configuration. Having everything automated will save you time the moment your server crashes. Having your whole server layout defined and pre-configured should be part of your disaster recovery plan.

It doesn’t matter how big or small you are. Even in a worst-case scenario, being able to give your clients a realistic estimate of when a problem will be fixed (the less time it takes, the better) will keep them happy.

Configuration management

There are a lot of tools out there that help you with this. The best known ones are definitely Chef and Puppet. I experimented a bit with both of them, but neither seemed to be what I was looking for. Both felt bloated to me.

One day, during an evening chat session with Serge van Ginderachter, he pointed me to a newcomer in the configuration management ring: Ansible.

Ansible

Ansible is, just like Chef and Puppet, an open-source software configuration management tool. It combines multi-node software deployment, ad-hoc task execution and configuration management.

In contrast with Chef or Puppet, Ansible doesn’t require any additional software or daemons on the nodes you wish to manage. Instead, it manages nodes over SSH. Modules communicate through JSON on standard output and can be written in any programming language, although the main language is Python.

Ansible uses YAML to express reusable descriptions of systems, so no real programming skills are required. Once you know how YAML formatting works, you’re good to go.

Requirements

The steps described here are tested on OS X Mountain Lion, although they should work on other systems as well. I’m using the most current stable release at this time (1.2).

Since Ansible is written in Python, you’ll need Python 2.6 or greater in order to run it. You will also want the following Python modules installed:

  • paramiko: Python module that implements the SSH2 protocol for secure (encrypted and authenticated) connections to remote machines.
  • PyYAML: Python YAML parser
  • jinja2: modern and designer-friendly templating language for Python

The easiest way is to use easy_install, a Python module bundled with setuptools that lets you automatically download, build, install and manage Python packages.

So to install them, just open your terminal and run:

$ easy_install paramiko PyYAML jinja2
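
To confirm the three modules are actually importable afterwards, a quick check along these lines can help. This is only a sketch: the PYTHON variable and the function name are my own, and it assumes the interpreter you use for Ansible is the one called python on your path. Note that the PyYAML package is imported under the name yaml.

```shell
# Report whether each required Python module can be imported.
# PYTHON is an assumed override; it defaults to whatever "python" is on PATH.
PYTHON="${PYTHON:-python}"

check_python_modules() {
    # PyYAML is imported as "yaml", not "PyYAML".
    for module in paramiko yaml jinja2; do
        if "$PYTHON" -c "import $module" >/dev/null 2>&1; then
            echo "ok: $module"
        else
            echo "missing: $module"
        fi
    done
}

check_python_modules
```

Any module reported as missing can be installed again with easy_install before moving on.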

Getting Ansible

Ansible has a very active community. The release cycle is very short, with a new release approximately every two months. My preferred way is to install it from source.

Ansible is hosted on GitHub, so a simple clone does the trick:

$ git clone https://github.com/ansible/ansible.git

The only thing left to do is to modify your environment by adding the Ansible binaries to your path. This can be done by sourcing a bash script provided in the Ansible source. So go into the Ansible source directory and source the env-setup bash script:

$ cd ./ansible
$ source ./hacking/env-setup

Keep in mind that these environment settings only apply to the current shell session, so they are lost once you close it or reboot your system. To make them permanent, add the following line to your bash configuration so it gets loaded every time you start a new shell:

source /path/to/ansible/hacking/env-setup > /dev/null 2>&1

Note the output redirection I’ve added at the end. This way, you won’t get bothered with the output the env-setup script shows.
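
Because the output is silenced, it is worth checking once that the ansible command really ends up on your path. A minimal sketch (the function name is just something I made up for illustration):

```shell
# Check that the "ansible" command resolves after sourcing env-setup.
check_ansible_on_path() {
    if command -v ansible >/dev/null 2>&1; then
        echo "ansible found: $(command -v ansible)"
    else
        echo "ansible not found on PATH; try sourcing hacking/env-setup again"
        return 1
    fi
}

check_ansible_on_path || true
```

If it reports that ansible is not found, open a new shell or double check the path to env-setup in your bash configuration.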

You may also want to specify an inventory file. An inventory file is a listing of all systems in your infrastructure. A system might be part of a group (more on that later).

By default, the inventory file can be found at /etc/ansible/hosts. It might, however, be considered best practice to keep the inventory file in a git repository. Most likely, it will reside in the same repository as your playbooks. You have two options to specify your inventory location:

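Providing it every time you use the ansible command by using the -i parameter, together with a host pattern such as all and a module to run (here the ping module used later in this post):

$ ansible all -i /path/to/inventory/hosts -m ping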

Or by exporting the ANSIBLE_HOSTS environment variable. You can simply add it to your bash configuration as well:

…
export ANSIBLE_HOSTS=/path/to/inventory/hosts
…
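
To make the lookup order concrete, here is a small sketch of how I picture it: an explicit -i value wins, then the ANSIBLE_HOSTS variable, then the default location. The resolve_inventory function is purely illustrative and not part of Ansible, and it ignores any setting that might come from Ansible’s configuration file.

```shell
# Illustration only: mimic the order in which the inventory location is chosen.
resolve_inventory() {
    # $1 stands for the value that would be passed to -i (may be empty).
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif [ -n "${ANSIBLE_HOSTS:-}" ]; then
        echo "$ANSIBLE_HOSTS"
    else
        echo "/etc/ansible/hosts"
    fi
}

# No -i given: falls back to ANSIBLE_HOSTS, or to the default if it is unset.
resolve_inventory ""
# Explicit -i value: always used as-is.
resolve_inventory "/path/to/inventory/hosts"
```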

As the last part of the basic configuration, you get to choose between Paramiko and native SSH to connect to your nodes. By default, Ansible uses Paramiko to talk to nodes over SSH. Paramiko is fast and works very transparently. However, it does not support some advanced SSH features, such as SSH agent forwarding. But I can assure you that in most cases, Paramiko will do just fine.

If you do wish to use native SSH, you can pass the ‘--connection=ssh’ option to any Ansible command or, again, set the ANSIBLE_TRANSPORT environment variable to ‘ssh’:

export ANSIBLE_TRANSPORT=ssh

Testing your installation

Now that everything should be installed properly, it is time to test your installation. But before you can start automating, you’ll need 2 more things.

First of all, you need to populate your hosts file. We’ll start off simple by just adding some hosts, without the use of groups. So create a hosts file in the proper location (or open the existing one) and add your nodes, one per line:

192.168.1.150
node1.example.org
node2.example.org
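
If you want to sanity-check what you just wrote, something like the sketch below lists the host lines of an inventory file. The list_inventory_hosts helper is hypothetical and deliberately naive: it only skips blank lines, lines starting with #, and [group] headers, and it works on a temporary copy of the three example hosts rather than on your real file.

```shell
# Print the host lines of a simple inventory file.
list_inventory_hosts() {
    # Drop blank lines, "#" comment lines and "[group]" header lines.
    grep -Ev '^[[:space:]]*($|#|\[)' "$1"
}

# Demo on a temporary file holding the three example hosts.
inventory=$(mktemp)
printf '%s\n' 192.168.1.150 node1.example.org node2.example.org > "$inventory"
list_inventory_hosts "$inventory"
rm -f "$inventory"
```

Pointing the function at your actual hosts file should print exactly the nodes you expect Ansible to talk to.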

The second thing you’ll need to take care of is making sure your SSH public key is present in the authorized_keys file on each node. By default, Ansible will attempt to connect to the remote machine using your current user name, just like SSH would.

What you need to do is add your SSH key to the user you want Ansible to connect with. In most cases, that will be the root user. I know this might sound weird, since we’ve all learned never to log in to a remote machine as root, but in this case it makes sense, since we don’t log in manually. Secondly, we’ll be installing and configuring different types of software and tools on our nodes, which will require root access most of the time. So it’s a bit silly to create a dedicated user for that, one that would need to sudo 99% of the time anyway.
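
Adding the key boils down to appending your public key to the authorized_keys file of that user on the node, with strict permissions on the file and its directory. Below is a minimal sketch of that step. The add_authorized_key function is my own illustration, the key in the demo is a placeholder rather than a real key, and the demo runs against a temporary directory so nothing on your system is touched. Where it is available, a tool like ssh-copy-id does the equivalent from your workstation.

```shell
# Append a public key to an authorized_keys file, once, with safe permissions.
add_authorized_key() {
    pubkey_file=$1   # e.g. your id_rsa.pub
    auth_file=$2     # e.g. the target user's .ssh/authorized_keys
    mkdir -p "$(dirname "$auth_file")"
    chmod 700 "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    key=$(cat "$pubkey_file")
    # Only append when the exact key line is not present yet.
    grep -qxF -e "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Demo in a throwaway directory with a placeholder key.
demo_dir=$(mktemp -d)
echo "ssh-rsa AAAAplaceholder user@workstation" > "$demo_dir/id_rsa.pub"
add_authorized_key "$demo_dir/id_rsa.pub" "$demo_dir/.ssh/authorized_keys"
add_authorized_key "$demo_dir/id_rsa.pub" "$demo_dir/.ssh/authorized_keys"
cat "$demo_dir/.ssh/authorized_keys"
rm -rf "$demo_dir"
```

Running it twice in the demo shows that the key is only added once, so it is safe to repeat on a node that already has your key.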

Now let’s ping our nodes:

$ ansible all -m ping

Ansible will now attempt to connect to all the nodes specified in your hosts file using your current user name. To override the remote user name, just use the ‘-u’ parameter.

If you would like to use sudo, there are also flags for that:

# as michael
$ ansible all -m ping -u michael
# as michael, sudoing to root
$ ansible all -m ping -u michael --sudo
# as michael, sudoing to batman
$ ansible all -m ping -u michael --sudo --sudo-user batman

The sudo implementation is changeable in Ansible’s configuration file if you happen to want to use a sudo replacement. Flags passed to sudo can also be set.

Now run a live command on all of your nodes:

$ ansible all -a "/bin/echo hello"

As the ‘command’ module is the default, “-m command” can be left out here.

At this point, Ansible should be working without any problems. You are now ready to create some real-world playbooks. But due to the length of this post, I’ll dedicate a new post to that shortly, covering more details about playbooks and using some modules, and showing you how to automate certain parts and deploy your configuration.