If you do a Google search for Service Assurance (SA), you’ll get more than 22 million results in under a second. SA began as simple fault and performance management and has since morphed into more complex OSS/BSS functions. As interest in Cloud computing continues to rise and enterprises look to “operate like a service provider,” SA is undergoing a bit of a renaissance of its own.
For most people, the Cloud is not really about technology; rather, it’s about the ability to deliver Services or Applications to your customers. It allows developers to focus on the business and on innovation rather than worry about infrastructure, disaster recovery, high availability, and more. In other words, leave the technology to the Cloud providers and focus on your business. Who cares whether the hypervisor is ESXi, KVM, Xen, or Hyper-V as long as the Cloud remains stable and highly available?
Today, you’ll hear the term Business Service Assurance (BSA). It is SA morphed to include a holistic view of all the different components that make up a Service: compute, storage, security, networking, applications, and more. In theory, IT can become more proactive if it fully understands the relationships between the individual components that make up a complete Service. In practice, this is much harder than it sounds, especially for the legacy providers.
Why? SA and BSA were born of an era when things were relatively static and free from change, when workloads were managed by mainframes and single-purpose servers were all the rage. It was relatively easy to write a rule within your event system to correlate a network issue to a specific application. With the advent of virtualization and the dynamic datacenter, however, this equation has changed entirely.
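To see why this once worked, here is a minimal sketch (in Python) of the kind of static correlation rule an event system of that era might encode. The device and application names are hypothetical examples, not any vendor’s actual rule syntax:

    # A static, rules-based correlation: the mapping from network device to
    # business application is hard-coded, because the topology rarely changed.
    DEVICE_TO_APP = {
        "core-switch-01": "payroll",
        "edge-router-02": "crm",
    }

    def correlate(event):
        """Map a raw network fault to the application it affects."""
        app = DEVICE_TO_APP.get(event["device"])
        if app:
            return f"ALERT: {app} impacted by fault on {event['device']}"
        return f"INFO: unclassified fault on {event['device']}"

    print(correlate({"device": "core-switch-01"}))
    # -> ALERT: payroll impacted by fault on core-switch-01

As long as payroll really did live behind core-switch-01, a lookup table like this was all the correlation you needed.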
In the new dynamic datacenter equation, compute, storage, and network are not only unique variables but also ever-changing ones. Simply dropping all the components of a Service into a “container” or token and applying rules against it does not work. To make matters worse, imagine moving workloads from the private Cloud to the public Cloud. Legacy techniques such as a CMDB, basic monitoring, and rules-based event management cannot keep up with all this constant change.
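A toy sketch of the failure mode, assuming a point-in-time CMDB snapshot while an orchestrator moves workloads underneath it (all host and application names are hypothetical):

    # Point-in-time CMDB snapshot, taken at discovery time.
    cmdb_snapshot = {"payroll": "host-a"}

    # Live placement, after the orchestrator has migrated the workload.
    live_placement = {"payroll": "host-b"}

    def impacted_apps(failed_host, placement):
        """Answer 'which apps does this host failure hit?' from a given view."""
        return [app for app, host in placement.items() if host == failed_host]

    # host-b fails; the stale snapshot reports no impact, the live view does.
    print(impacted_apps("host-b", cmdb_snapshot))   # [] -- a blind spot
    print(impacted_apps("host-b", live_placement))  # ['payroll']

The snapshot isn’t wrong, exactly; it’s simply answering yesterday’s question, and in a dynamic datacenter yesterday’s answer is a blind spot.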
Within Cloud computing, the case for BSA could not be stronger. Operations personnel need to understand the relationships between the different components as they change in real time, while correlating this information against performance and availability data. Meanwhile, they must be able to adapt to the orchestration and automation layers that are attempting to mitigate issues by shifting workloads and capacity dynamically. Only by understanding these dynamic relationships can operations become proactive, turning event storms into impact and risk analysis, reducing mean time to repair, and ultimately turning outages into mere incidents.
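As a sketch of what this could look like, consider a small dependency graph in which a burst of low-level component events is rolled up into per-Service impact. The graph shape, component names, and Service names are hypothetical; in practice the graph would be maintained continuously by automatic discovery rather than hand-written:

    from collections import defaultdict

    # Real-time dependency edges: component -> Services that depend on it.
    DEPENDS_ON = {
        "vm-17": ["checkout"],
        "lun-4": ["checkout", "reporting"],
        "switch-9": ["checkout", "reporting", "crm"],
    }

    def impact_analysis(event_storm):
        """Collapse a flood of component events into per-Service impact."""
        impact = defaultdict(list)
        for event in event_storm:
            for service in DEPENDS_ON.get(event["component"], []):
                impact[service].append(event["component"])
        return dict(impact)

    storm = [{"component": "vm-17"}, {"component": "lun-4"}, {"component": "switch-9"}]
    print(impact_analysis(storm))
    # {'checkout': ['vm-17', 'lun-4', 'switch-9'],
    #  'reporting': ['lun-4', 'switch-9'],
    #  'crm': ['switch-9']}

Instead of chasing three raw events, operations sees at a glance that checkout is the Service most at risk. The correlation lives in the graph, which can be updated as workloads move, rather than in a brittle rule set.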
Don’t be fooled by the legacy providers: Cloud computing demands new and disruptive management, with an innovation and focus not found in legacy software. Making BSA work requires a real-time service model with automatic dependency mapping, automated analysis and actions, and deep analytics capabilities.