From what I can tell, the first fix for practically every data center problem is to reboot something. It doesn’t matter whether it’s a Linux OS or a remote branch office network node. Restart it and hope the problem goes away.
That’s brilliant when it works. But when it doesn’t?
Then we have to determine where the root cause actually is. It might be a storage array disk drive, or a set of failed fans causing a processor slowdown, or a port channel running under capacity because of a yanked network cable, or any of a thousand other issues. In a large data center, there might be a dozen concurrent issues, and thousands of sympathetic failures downstream from the real failures.
What do we know? Usually which device someone is complaining about. What don’t we know? In a big data center, we often haven’t worked on this particular problem before. If the failure is in a Linux OS, we might not even know if it is running at AWS or on a server in one of our racks, or whether a hypervisor is involved, or what switch the traffic runs through.
We’re going to need to talk to lots of other people, and that takes time.
Wouldn’t it be nice if we were smarter?
What if we could look at any piece of our infrastructure, and know everything it depended on, and everything that depended on it? And what if we could instantly spot where there were issues up and down the list of dependencies? We’d have data center omniscience!
Then we wouldn’t have to wait so long. And there wouldn’t be a single thing we couldn’t do. And then we’d be happy!
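To make the idea concrete, here’s a minimal sketch of what “everything it depended on, and everything that depended on it” looks like as a graph walk. This is not how Zenoss implements anything; the component names and the `DEPENDS_ON` list are made up for illustration, and a real inventory would be discovered rather than typed in.

```python
from collections import defaultdict, deque

# Hypothetical inventory: each pair means "left depends on right".
DEPENDS_ON = [
    ("web-os", "hypervisor"),
    ("db-os", "hypervisor"),
    ("hypervisor", "server"),
    ("server", "switch"),
]

def build_graph(edges):
    """Index the edges in both directions so we can pivot up or down."""
    down = defaultdict(set)  # component -> things it depends on
    up = defaultdict(set)    # component -> things that depend on it
    for component, dependency in edges:
        down[component].add(dependency)
        up[dependency].add(component)
    return down, up

def reachable(graph, start):
    """Breadth-first walk: everything reachable from start in one direction."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

down, up = build_graph(DEPENDS_ON)
depends_on = reachable(down, "hypervisor")  # what the hypervisor needs
dependents = reachable(up, "hypervisor")    # what needs the hypervisor
```

Keeping two indexes, one per direction, is the design choice that matters here: the same walk answers both “what could be breaking this?” and “what will this break?”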
Secret Feature Fixes All
Zenoss delivered the “Dependency View” feature about a year ago. I’m amazed how many people don’t know about it. It’s automatically enabled and available everywhere, and it’s designed exactly to help you pivot up and down the technology stack.
Here’s a really simple example. Someone complains that a web server has inconsistent response time. What’s wrong?
Looking at the web server OS, we don’t see anything wrong: everything is green. Let’s use the Dependency View to look down the stack for any warning signs.
Here’s what I see: the hypervisor it’s running on has an issue. Who knew this OS was even virtualized? This smart guy, thanks to the Dependency View. Now I’ll look into the hypervisor and hopefully fix the problem. And look at all the places I don’t have to look: no port group networking issues, no failed fans, no full file systems, no hot power supplies, and all the vNICs are working.
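If you like to see the logic spelled out, the walk I just did by eye can be sketched in a few lines. Everything here is an assumption for illustration: the `RUNS_ON` map, the `STATUS` values, and the component names are invented to mirror the example, and none of it is a Zenoss API.

```python
# Hypothetical stack under the web server: component -> what it runs on or uses.
RUNS_ON = {
    "web-server-os": ["hypervisor", "vnic", "file-system"],
    "hypervisor": ["host-server"],
    "vnic": ["port-group"],
    "host-server": ["fans", "power-supply"],
}

# Hypothetical monitoring state; only the hypervisor is unhappy.
STATUS = {
    "web-server-os": "ok",
    "hypervisor": "warning",
    "vnic": "ok",
    "file-system": "ok",
    "port-group": "ok",
    "host-server": "ok",
    "fans": "ok",
    "power-supply": "ok",
}

def look_down_the_stack(start, runs_on, status):
    """Walk every dependency below start and sort it into issues or healthy."""
    issues, healthy = [], []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        state = status.get(node, "unknown")  # unmonitored counts as an issue
        if state == "ok":
            healthy.append(node)
        else:
            issues.append((node, state))
        stack.extend(runs_on.get(node, []))
    return issues, sorted(healthy)

issues, healthy = look_down_the_stack("web-server-os", RUNS_ON, STATUS)
```

The `issues` list tells me where to dig, and the `healthy` list is just as useful: it’s the set of places I can skip.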
Automatic Dependencies for the Win
Use the Dependency View feature to quickly see what’s working and what isn’t across the infrastructure behind any monitored element. By the way, it works perfectly well for cloud technologies, too.
Want to learn more about how automatic dependencies can help you? Let us know!