Multistage environments with Ansible


Ansible has excellent documentation, but one thing I was confused about was the best way to store the configuration for multistage projects: say, different passwords for dev, staging and production. This isn’t really covered in the ansible-examples repo because it’s specific to your project, and while the documentation has recommendations, it doesn’t spell them out completely (which I need since I’m an idiot).

In the old days, the easiest place to keep the configuration was in your inventory file. Since these store the server IPs you run your playbooks against, they’re inherently stage-specific. However, storing everything in the inventory can be limiting and is officially discouraged.

Instead, the best practices guide suggests a different structure, one that’s based on keeping your config in the group_vars and host_vars directories. At first glance, the linked example confused me because it seemed to be mixing a lot of things together in one file: IP addresses, role assignments, location, datacenter, etc. However, after some trial & error, talking to some smart folks and a lot of googling, I’ve hit on a structure that’s worked well for my last couple of projects, so I’d like to write about it here.

So, let’s take the example above and pare it down to something simpler:

We’ll create an inventories directory and place a “production_servers” inventory file in there.

; inventories/production_servers
[web]
4.2.2.1
4.2.2.2

[database]
8.8.8.8

This file does one thing and does it well: it sorts our server IPs into different groups. Now, “Group” is the magic word here. Whenever we run a playbook with this inventory, Ansible isn’t just loading the inventory. It’s also looking at the group names we set (the header sections of the INI) and then trying to match those to a file with the same name in the group_vars directory. This isn’t explicitly configured, it’s just something Ansible does by default.

So, since we mentioned a “web” group and a “database” group, Ansible will try to load the files “group_vars/web” and “group_vars/database”. These are expected to be YAML key/value lists and we can use them to define all the Ansible variables that you likely have sprinkled throughout your roles. For example, the two vars files might look like this:

# group_vars/database
---
db_port: 3306
db_user: app_user
db_password: SuperSecureSecretPassword1

# group_vars/web
---
domain_name: myapp.com
csrf_secret: foobarbaz

Here we’ve defined a few variables you’d use in a role like {{ db_password }} or {{ domain_name }}.
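As a rough sketch of where those variables end up, a task file in a role might reference them like this. The role name, template name and destination path are made up for illustration; only the variable names come from the files above.

```yaml
# roles/web/tasks/main.yml  (hypothetical role and paths)
---
- name: Print the domain this host will serve
  debug:
    msg: "Serving {{ domain_name }}"

- name: Render the app config for this host
  template:
    src: app_config.ini.j2        # hypothetical template that reads domain_name and csrf_secret
    dest: /etc/myapp/config.ini   # hypothetical destination
    mode: "0600"                  # keep secrets readable by the owner only
```

Neither task says where the values come from. Ansible resolves them from whichever group_vars files match the host's groups.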

So far, so good. By now, our ansible directory probably looks something like the example below. Keep in mind the group_vars file names are based entirely on the header names inside the inventory file, NOT the name of the inventory file itself.

    .
    ├── group_vars
    │   ├── database
    │   └── web
    │
    ├── inventories
    │   └── production_servers
    │
    ├── roles
    │   └── ...
    │
    └── my_playbook.yml
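The tree mentions my_playbook.yml without showing it. A minimal version could map each inventory group to a role, assuming roles named web and database exist under roles/ (those role names are an assumption, not something from the setup above):

```yaml
# my_playbook.yml  (sketch; role names are assumed)
---
- hosts: web          # matches the [web] header in the inventory
  roles:
    - web

- hosts: database     # matches the [database] header in the inventory
  roles:
    - database
```

You would then pick the inventory at run time with the -i flag, e.g. ansible-playbook -i inventories/production_servers my_playbook.yml.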

Now comes the multistage part. We don’t want to use the same db_password for dev, staging and production; that’d be terrible security. And we probably want to change the domain name. And the SSL certificates. And all sorts of other things, which we’d prefer to maintain in just one place. How can we group the configuration together per stage?

Remember, Ansible will try to load a group_vars file for any group it encounters in your inventory. All it takes to define a group is adding a section for it in the inventory’s INI file. So, why don’t we create a “production” group?

; inventories/production_servers
[web]
4.2.2.1
4.2.2.2

[database]
8.8.8.8

[production]
4.2.2.1
4.2.2.2
8.8.8.8

We’ve now created a production group and assigned all the production servers to live underneath it so they all get the exact same configuration. I haven’t tested it completely but this is really important if your configuration overlaps between roles, such as using the db_password on the web servers.

However, duplicating all of the IP addresses is a real pain and it would be super easy to add another web server and forget to update the list at the bottom of the file. Luckily, Ansible has an inheritance syntax to make this easier.

; inventories/production_servers
[web]
4.2.2.1
4.2.2.2

[database]
8.8.8.8

[production:children]
web
database

This example does the exact same thing as the previous version: it creates a group called “production” but now it’s defined as a group of groups. Any IP address added to the “web” or “database” groups is automatically part of the “production” group (at least, when running a playbook with this inventory).
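If you want to double-check that the inheritance did what you expect, a throwaway play like the one below prints each host's groups. It's only a sketch: the file name is made up, while group_names is Ansible's built-in variable listing the groups the current host belongs to.

```yaml
# check_groups.yml  (hypothetical helper playbook)
---
- hosts: all
  gather_facts: false   # no need to collect facts just to print group membership
  tasks:
    - name: Show which groups this host belongs to
      debug:
        msg: "{{ inventory_hostname }} is in: {{ group_names | join(', ') }}"
```

Run against the production inventory, 4.2.2.1 should report both web and production, even though it is only listed under [web].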

That means we can now create a group_vars/production file where we can group the parts that are specific to this stage:

# group_vars/production
---
domain_name: myapp.com
db_password: SuperSecureSecretPassword1
csrf_secret: foobarbaz

These are the things we’re interested in changing per stage. Other stuff that’s the same 99% of the time like port numbers or standard users, we can leave in group_vars/database.
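With the stage-specific values moved out, the type file shrinks to the boring defaults. Roughly, reusing the values from earlier:

```yaml
# group_vars/database
---
db_port: 3306
db_user: app_user
# db_password is no longer defined here; it now lives in group_vars/production
```

In this example group_vars/web ends up with nothing left in it, since both of its values were stage-specific.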

Now, if we wanted to add a staging setup, we only need to add two files: a new inventory…

; inventories/staging_servers
[web]
8.8.4.4

[database]
4.2.2.3

[staging:children]
web
database

and a group_vars/staging config.

# group_vars/staging
---
domain_name: staging.myapp.com
db_password: LessSecretButStillSecurePassword
csrf_secret: St4gingCSRFToken

Notice that the basic format is the same, and we can use this to add any number of stages we like:

    .
    ├── group_vars
    │   ├── all
    │   ├── database
    │   ├── dev
    │   ├── production
    │   ├── staging
    │   └── web
    │
    ├── inventories
    │   ├── dev_servers
    │   ├── production_servers
    │   └── staging_servers
    │
    ├── roles
    │   └── ...
    │
    └── my_playbook.yml

In the above example, we’ve now added a dev stage which probably lists our Vagrant IP as both the web server and db server. You might also notice a group_vars/all file. This is a special file that Ansible loads in every time, no matter what groups you use, making it an excellent place to stash your default config.
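For illustration, a group_vars/all file could hold fallbacks like the ones below. Every key except domain_name is invented for this sketch, so swap in whatever your roles actually use.

```yaml
# group_vars/all
---
# Hypothetical defaults; any other group file that defines the same key replaces the value here.
app_name: myapp         # assumed variable, not from the setup above
deploy_user: deploy     # assumed variable, not from the setup above
domain_name: localhost  # fallback if a stage file forgets to set it
```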

So, using this setup we have a working and reasonably well centralized multistage setup in Ansible. We’ve also got the config split out nicely so we can use ansible-vault to encrypt our staging and production settings. This works really well and I’ve used it successfully in a couple projects now.

However, there are a couple gotchas to notice. The big one is inheritance order. If you define a config value in multiple groups (say, “db_port” in both “database” and “all”), then Ansible follows particular rules to determine which one wins. For this setup, the priority from highest to lowest is:

  • type (web, database)
  • stage (dev, staging)
  • the “all” file
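To make that order concrete, imagine the same key defined in three files for the production inventory. The port numbers are made up purely to tell the files apart.

```yaml
# group_vars/all
---
db_port: 3306   # lowest priority, used only if nothing else sets it

# group_vars/production
---
db_port: 3307   # beats "all"

# group_vars/database
---
db_port: 3308   # beats "production" for hosts in the database group
```

Following the order above, the database server ends up with 3308, while the web servers (which have no type-level value here) get 3307.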

This is kind of bad, because we probably want the stage files to take precedence but the type files are overriding. It turns out, this is because we used the “:children” style to define the stage in our inventory. This marks the “web” and “database” servers as children of the “production” group and as the Ansible documentation says “Child groups override parent groups”. We could try to work around it by making more specific groups and controlling the hierarchy more tightly:

[production-web]
4.2.2.1
4.2.2.2

[production-database]
8.8.8.8

[production:children]
production-web
production-database

[web:children]
production-web

[database:children]
production-database

But this hasn’t worked in testing for me, because when groups sit at the same level Ansible merges them in alphabetical order, so “web” still overrides “production”. Also, while more specific, this is quite a bit more boilerplate for each inventory file.
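One knob that may help here, on Ansible 2.4 or later, is the ansible_group_priority variable. As I understand the documentation, it changes the merge order for groups at the same level (higher numbers are merged later and so win), and it has to be set in the inventory itself rather than in group_vars. The sketch below is an untested assumption for this layout. It is written in Ansible's YAML inventory format rather than INI, the file name is made up, and it places production and web at the same level so that the priority can take effect.

```yaml
# inventories/production_servers.yml  (hypothetical YAML-format version of the inventory)
all:
  children:
    web:
      children:
        production-web:
          hosts:
            4.2.2.1:
            4.2.2.2:
    database:
      children:
        production-database:
          hosts:
            8.8.8.8:
    production:
      vars:
        # Assumption: the default priority is 1, so 10 is merged after web and database
        ansible_group_priority: 10
      children:
        production-web:
        production-database:
```

If it behaves as documented, values in group_vars/production would then override the same keys in group_vars/web and group_vars/database. It does nothing for the simpler [production:children] layout, where the parent/child relationship is resolved first.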

In practice, I haven’t used the type specific group_vars files much, instead relying on role defaults or the “all” file. The end result has been much simpler and I don’t have to worry as much about where something is defined.

This brings us to the second gotcha: this setup is reasonably simple on a small scale, but it can get more complex as it grows. Super nice guy Ramon de la Fuente told me he’s been running this setup (or one very similar) for a while and has found it a bit awkward to manage at a larger size. I haven’t tried it on a very large installation yet but I’m inclined to believe him. You should check out his latest Ansible article for more tips.

Still, for a small to mid-size project, this is a straightforward, practical setup. If you do need very fine-grained control and you’re not running several servers per stage, consider looking into the host_vars directory which is the same thing as group_vars but per server instead of per group. And finally, remember, Ansible’s group system is essentially a free-form tagging system: it’s a simple, powerful way to build any setup you want.

Like, you know, Ansible itself.

Update: Erika Heidi wrote a great add-on post to this that talks about integrating this setup with Vagrant and how to configure your remote servers.

The setup presented here is essentially that from the (excellent) Ansible Docs and Mailing List, I’m just explaining it a bit more. Many thanks to Ramon de la Fuente and Rafael Dohms for Ansible feedback, as well as Erika Heidi and Rick Kuipers for proofreading this article.
