One of the many challenges in enterprise IT operations environments is appropriately detecting and responding to event flaps. An event flap occurs when the severity state of an event changes from a normal state to a Warning, Error, or Critical state and then back, over and over again. Event flapping is most commonly an indicator of transient network, configuration, or service problems.
The Event Flap Headache
Event flap scenarios can be a major source of headaches for system and network administrators. Because event flaps are typically caused by transient issues, they are especially hard to detect and troubleshoot.
For example, assume you have a router card in your environment going up and down intermittently. This transient issue generates a lot of events, or “noise”. First it generates an event with a severity state of Warning, but then a short time later, sometimes even within seconds, it generates another event with a severity state of Info.
As the router card generates events that flap between severity states, you may see the initial Warning from the router card. But by the time you drill down in your event management console to take a closer look, the next event generated by the router card is back to Info. Since it looks like everything is now fine, you move on.
However, within a few seconds the card once again generates an event with a severity state of Warning, quickly followed once again by an event with a severity state of Info. This “event flapping” continues on and on, generating a lot of “noise” in the event console. Also, because the event flaps between Warning and Info so frequently, it is hard to determine whether or not you do have a configuration, network, or service problem you need to address.
Detecting and Managing Event Flaps
Both Zenoss Core, the open-source Zenoss product, and Zenoss Service Dynamics Resource Manager, the enterprise version of Zenoss, can help you cut through the noise around event flapping and better identify and manage event flapping in your environment.
With Zenoss, you get a clear, actionable signal out of event flapping noise by specifying event flapping settings as part of your event configuration properties.
Once you specify event flapping configuration properties for events, Zenoss generates a special event flapping event whenever the severity level of an event changes a certain number of times within a certain time range. With this event flapping event, you now have a way to track and remediate transient issues.
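To make the "certain number of changes within a certain time range" idea concrete, here is a minimal Python sketch of a sliding-window flap detector. It is an illustration only, not Zenoss code: the class name FlapDetector, its threshold and window_seconds parameters, and the "router-card-1" key are all hypothetical, and the real Zenoss logic and property names may differ.

```python
from collections import deque


class FlapDetector:
    """Hypothetical sliding-window flap detector (illustrative, not Zenoss code)."""

    def __init__(self, threshold=4, window_seconds=60):
        # Assumed settings: how many severity changes count as a flap,
        # and the time range (in seconds) in which they must occur.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._last_severity = {}  # key -> most recent severity seen
        self._changes = {}        # key -> timestamps of severity changes

    def observe(self, key, severity, timestamp):
        """Record one event and return True if `key` is currently flapping."""
        changes = self._changes.setdefault(key, deque())
        previous = self._last_severity.get(key)
        self._last_severity[key] = severity

        # Only a change in severity counts; repeated identical events do not.
        if previous is not None and previous != severity:
            changes.append(timestamp)

        # Forget changes that have fallen out of the time range.
        while changes and timestamp - changes[0] > self.window_seconds:
            changes.popleft()

        return len(changes) >= self.threshold


# The router card from the example above, flapping every five seconds.
detector = FlapDetector(threshold=4, window_seconds=60)
events = [(0, "Warning"), (5, "Info"), (10, "Warning"), (15, "Info"), (20, "Warning")]
results = [detector.observe("router-card-1", sev, ts) for ts, sev in events]
print(results)  # [False, False, False, False, True]
```

In this sketch the fifth event is the fourth severity change inside the 60-second window, so that is the point at which a monitoring system could raise a single flapping event instead of leaving you to spot the pattern in the console.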
For more information on how you can use Zenoss to detect and manage event flaps, see the step-by-step instructions in the following article on the Zenoss Wiki: Detecting Event Flaps.
New to Zenoss? Learn More!
Want to learn more about how you can use Zenoss to more effectively manage and monitor your environment?
Watch a demo that gives you an overview of Zenoss and how Zenoss can help you more efficiently and cost-effectively monitor your environment.
Read the Zenoss Service Dynamics: 4 Profiles in Unified Monitoring Success white paper to learn more about how some of our customers use Zenoss to improve their monitoring efficiency and productivity and avoid outages.
Read the Zenoss Service Dynamics Architecture Overview to learn more about how Zenoss works.
Request a free trial! See how you can use Zenoss to more effectively and cost-effectively monitor and manage your environment using a single, unified monitoring view and unified monitoring processes.