An Umbrella for Infrastructure Automation
by Armin Deliomini, Head of Infrastructure
The infrastructure team at Runtastic needs to cover a lot of different technologies, from the basic network and physical servers to operating systems to application and database administration. That, plus the fact that we started out as quite a small team, forced us to use automation simply to keep up with daily business operations.
We started out with Chef Server (an infrastructure automation tool) five years ago and, over time, built up a large number of Chef cookbooks for all our configuration needs. We were (and still are) very proud of what we achieved here, but we have realized that operations, the way we live it, is much more than just server and application configuration. You may know the situation: You have a tool you are familiar with, and no doubt it is very powerful, but it is not actually the perfect fit for every situation. For quite some time, we looked at Chef as the answer to every automation problem that popped up. The fact is, though, it simply isn’t.
Chef is very good at making sure that a single server is configured exactly the way you want it to be. Unfortunately, there is a network underneath, and normally you can’t just run Chef on your switch (though it is possible on some specific devices, e.g. with NX-OS). Then you have a virtual environment that needs to be tied to that network, disk images that need to be cloned and VM templates that need to be configured. All these tasks are preconditions for automatically installing and configuring software on the machine. After that come clusters, where several machines have interdependencies and configuration tasks need to be done in a specific order on different machines. It is possible to do that with Chef, but it becomes a lot more complex.
So, at one point, we decided to give up our homogeneity and try to use the right tool for the different jobs (see the picture above):
- Cisco ACI (Application Centric Infrastructure, software-defined networking) to use separate VLANs and subnets for our different microservices and infrastructure services on the network layer. We want to define which services are allowed to talk to which other services, and we don’t want to need dedicated network engineers to achieve that. The network layer should be able to identify the machine and allow access according to its role.
- OpenNebula to orchestrate our KVM-based virtual machines. We need to be able to link OpenNebula’s virtual networks to the VLANs and subnets in ACI. We want to create VM templates, clone disk images and spawn VMs automatically.
- Chef to install and configure software on an individual machine, no matter whether it is physical or virtual. We just assume the machine has an IP address and the Chef server knows what should run on it.
- Any other config management tool if it makes it easier to automate a specific part of the configuration.
- Scripts/commands to finalize cluster configurations if the order of commands on the different nodes matters.
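The role-based access idea from the network bullet above can be illustrated with a minimal sketch. This is not Cisco ACI's API or data model; the role names, machine names and the `may_connect` helper are all hypothetical and only show the principle of identifying a machine by its role and allowing traffic according to a declared list of permitted flows.

```python
# Hypothetical allow-list: consumer role -> provider roles it may talk to.
ALLOWED_FLOWS = {
    "web-frontend": {"user-service"},
    "user-service": {"user-db"},
    "user-db": set(),
}

# Hypothetical inventory: machine name -> role.
MACHINE_ROLES = {
    "web-01": "web-frontend",
    "users-01": "user-service",
    "db-01": "user-db",
}


def may_connect(source: str, destination: str) -> bool:
    """Return True if the source machine's role may reach the destination's role."""
    src_role = MACHINE_ROLES.get(source)
    dst_role = MACHINE_ROLES.get(destination)
    if src_role is None or dst_role is None:
        # Unknown machines get no access by default.
        return False
    return dst_role in ALLOWED_FLOWS.get(src_role, set())
```

In this sketch, access is denied unless a flow is explicitly listed, so adding a new service means declaring its role and its permitted peers rather than editing per-machine rules.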
To get a working environment, you need to glue all the different layers together. What we needed was a kind of meta-automation tool, an umbrella, to cover all the pieces.
We decided to use Terraform by HashiCorp as our umbrella. With Terraform, you can describe your infrastructure as code and then let Terraform carry out the necessary steps on different systems or tools while taking the dependencies of multi-tier applications into account. Terraform providers support different systems like Chef, AWS, Docker and many more. Additionally, we wrote some code to be able to access OpenNebula and Cisco ACI from Terraform.
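To make the idea of dependency-aware ordering more concrete, here is a minimal Python sketch. It is not how Terraform is implemented and it is not our actual configuration; the step names and the `provisioning_order` helper are hypothetical. It only shows the general principle of deriving a valid execution order from declared dependencies between layers, using the standard library's `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical provisioning steps; each maps to the set of steps it depends on.
DEPENDENCIES = {
    "network (VLAN/subnet)": set(),
    "virtual network": {"network (VLAN/subnet)"},
    "disk image clone": set(),
    "virtual machine": {"virtual network", "disk image clone"},
    "software configuration": {"virtual machine"},
    "cluster finalization": {"software configuration"},
}


def provisioning_order(dependencies):
    """Return the steps in an order where every step follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())


order = provisioning_order(DEPENDENCIES)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Steps without a dependency between them, such as the network setup and the disk image clone in this sketch, may appear in either relative order; only the declared dependencies are guaranteed to be respected.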
Previously, the manual steps between the automated parts took us between half a day and a full day and were a constant source of human error. Now we are able to set up a complete, reproducible microservice environment, including everything from the network to database clusters, within minutes.
Which automation umbrella do you use? Share your experiences and learnings in the comments below. You will find follow-up posts on that topic here. Stay tuned on our Tech blog.