In his post “3 Reasons Why Monitoring Has Failed Operations,” our own Floyd Strimling explains that operations teams must use automation if they want to build the sort of environment that can support a DevOps model:
Every major aspect of a datacenter is under unprecedented change including comput[ing], storage and networking … Monitoring, orchestration, provisioning, service catalogs, development, testing and more must march in perfect cadence, a tough task for any operations team. New models such as IAAS, PAAS, and SAAS offer unique challenges that demand decisive actions to reduce MTTR (Mean Time To Restore) … [S]omeone must understand how services are constructed, the underlying infrastructure, and the impact of issues across the datacenter.
Puppetmaster Luke Kanies told me that without automation, you can’t trust your developers to do their work correctly:
It’s critical that the applications team not have to make decisions about, for example, security – not because they’re incapable of understanding security but because their job isn’t security. The operations team needs to automate everything so that the things the applications team do will not lead to an insecure or noncompliant state.
Automation gives you those constraints. If you’ve got automation that maintains both your development and production environments, you have confidence that your development, test and production environments are all configured the same. Therefore, a tool or application that works in one [environment] works in all three. The confidence provided by automation delivers low-friction, high-throughput application deployments.
The great thing about automating your infrastructure is that you can give your developers more agency over the applications they’re building, while limiting much of the damage that often occurs under the old “wall of confusion” paradigm, in which development and operations teams sit in separate silos.
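To make the “configured the same” idea concrete, here is a minimal sketch in plain Python of the kind of check an automation tool runs for you. It is not Puppet code and not anything Kanies described; the settings, values and environment names are all hypothetical, chosen only to show how a single baseline defined by operations can be compared against every environment.

```python
# Hypothetical desired state, defined once by the operations team.
BASELINE = {
    "ssh_root_login": "no",
    "firewall_enabled": "yes",
    "app_runtime": "1.2",
}

# Hypothetical snapshots of what each environment actually looks like.
ENVIRONMENTS = {
    "development": dict(BASELINE),
    "test": dict(BASELINE),
    "production": {**BASELINE, "firewall_enabled": "no"},  # drifted by hand
}


def find_drift(baseline, environments):
    """Return, per environment, the settings that differ from the baseline."""
    drift = {}
    for name, config in environments.items():
        diffs = [
            f"{key}: expected {expected!r}, found {config.get(key)!r}"
            for key, expected in sorted(baseline.items())
            if config.get(key) != expected
        ]
        if diffs:
            drift[name] = diffs
    return drift


if __name__ == "__main__":
    for env, problems in find_drift(BASELINE, ENVIRONMENTS).items():
        for problem in problems:
            print(f"{env}: {problem}")
```

A real configuration management tool goes a step further and corrects the drift instead of only reporting it, but the principle is the same: the rules live in one place, so the applications team never has to remember them.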
And with automation, developers can compress, say, six months of testing into just a few days, Kanies said:
Now the applications team can say, “I’m going to do an upgrade, and because all the pieces of the puzzle I care about most are automated, I can have confidence that my upgrade won’t break any internal rules, put us in insecure state or accidentally delete my data.”
Image from Dev2Ops.org