Originally posted on the Zenoss Google+ page.
We’re fully virtualized on VMware, which seems to be the norm for earlier-stage software companies like us. We use it primarily to run IT systems, build servers and labs. Because we’re also in the business of comprehensive monitoring software, we’re always looking for new technologies to incorporate into our monitoring portfolio. Unsurprisingly, this combination led us to develop extensive support for VMware technologies as our first foray into virtualization monitoring.
Of course, VMware isn’t the only game in town when it comes to virtualization, which means Zenoss also needs to support environments built on other technologies. Through 2011 we added support for many of them, including vCloud, OpenStack Nova and Swift, CloudStack and CloudFoundry. Getting familiar with these technologies requires using them, so we decided to deploy our own OpenStack cluster to learn, to have a reference implementation, and to see how it compares with VMware for our uses.
Simon Jakesch and I set about building the compute cluster across three servers. Because one of our objectives is to have a representative system to develop monitoring against, we decided to build on Ubuntu 10.04 LTS using the 2011.3 PPAs, that is, the latest official Diablo packages. This method of installation turned out to be a breeze, at least for getting the Nova (compute) and Glance (image) services running.
It quickly became apparent that with a multi-node cluster, the first challenge would be configuring networking properly so that VMs running within the cluster could reach the Internet, and so that we could reach them. Diablo offers at least three networking modes: flat, flat DHCP and VLAN. We ruled out VLAN immediately because we weren’t likely to get control over the switch for this project. That left flat DHCP as the obvious choice, because we didn’t want to worry about injecting static IP addresses into the guest operating systems.
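For those following along at home, flat DHCP mode is selected with a handful of flags in nova.conf. Here’s a rough sketch using the Diablo-era flag syntax; the interface names and address range below are illustrative examples, not our actual settings:

    # /etc/nova/nova.conf (excerpt); values here are illustrative only
    --network_manager=nova.network.manager.FlatDHCPManager
    --flat_interface=eth1
    --flat_network_bridge=br100
    --public_interface=eth0
    --fixed_range=10.0.0.0/24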
The biggest networking misstep we made was erroneously starting the nova-network service on the two alternate compute nodes. This caused nova-network to set up iptables NAT rules on those nodes that broke networking for any VMs deployed to them; stopping the service everywhere but the primary node resolved it. The episode also illuminated the fact that flat DHCP mode makes the network a single point of failure: if the one nova-network node (in our case the primary compute node) goes down, all VMs across the cluster become unreachable. It’s an understatement to say that we’re looking forward to Quantum bringing more serious networking to OpenStack.
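For anyone building a similar cluster, the service layout we ended up with is roughly:

    primary compute node:     nova-compute + nova-network
    alternate compute nodes:  nova-compute only (no nova-network!)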
After getting the networking sorted out, we made a foolhardy attempt to set up Keystone (the authentication service) and Horizon (the dashboard/web interface). We plan to have sales engineers use the cluster for hosting demo and lab systems, so we wanted to make the barrier to entry as low as possible. In the end we ran into too many bugs between the latest releases of Keystone and Horizon to get them working reliably, and we put the endeavor off until things stabilize.
Without the dashboard, we were left to decide how our users would boot servers in the cluster. Simon came up with the idea of simply creating a bastion host with an account for each user, with their certificates and environment set up so they can run the nova client (python-novaclient) immediately upon logging in. The login message also lists the servers they’re already running and explains how to perform common operations. It’s a simple solution, but it works very well.
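To give a flavor of what that environment enables, here’s a minimal sketch using the python-novaclient Python bindings of the era. The credentials, endpoint and project name are hypothetical placeholders, not our real ones:

    # Minimal sketch using the Diablo-era python-novaclient bindings
    # (Python 2, as the client required at the time). All credentials
    # and names below are hypothetical placeholders.
    from novaclient.v1_1 import client

    nova = client.Client(
        "jdoe",                                 # username (hypothetical)
        "secret",                               # password (hypothetical)
        "sales-demos",                          # project (hypothetical)
        "http://cloud.example.com:8774/v1.1/")  # endpoint (hypothetical)

    # The login message answers "what am I already running?"; the same
    # information is available programmatically:
    for server in nova.servers.list():
        print server.name, server.status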
We already have a new continuous integration environment running in the cluster that allows us to spin up build slaves of various Linux distributions, architectures and Zenoss versions to build, test and package ZenPacks.
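Booting a build slave programmatically looks something like the sketch below. Again, the image, flavor and server names are made up for illustration:

    # Sketch: boot a build slave and wait for it to finish building.
    # Image, flavor and server names here are hypothetical.
    import time
    from novaclient.v1_1 import client

    nova = client.Client("buildbot", "secret", "zenoss-labs",
                         "http://cloud.example.com:8774/v1.1/")

    image = nova.images.find(name="centos5-x86_64-build")  # hypothetical
    flavor = nova.flavors.find(name="m1.large")
    slave = nova.servers.create("build-slave-01", image, flavor)

    # Poll until the scheduler has placed the VM and it has booted.
    while slave.status == "BUILD":
        time.sleep(10)
        slave = nova.servers.get(slave.id)
    print "build slave is now", slave.status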
While OpenStack may be new and have some rough edges, if you stay on the well-worn path it can be straightforward to set up and use. We’re eagerly looking forward to Essex and the “F” release and all of the goodness they promise!
As this is my first Zenoss post, I should probably introduce myself. I’ve been working at Zenoss in various technical capacities for the last 4+ years, ranging from tech and sales support to services work and development. Prior to joining Zenoss, I used early versions of the software to monitor a cable ISP’s national network after transitioning from a Frankenstein of a Nagios system. These days I head the Zenoss Labs group, which is primarily responsible for ZenPack development.