Oct. 16, 2006
The IT Infrastructure Library (ITIL) started in the U.K. with the aim of creating a new set
of universal standards for delivering high-quality B2B and IT services. More than 20 years later, growing application
complexity, increasing demands to significantly improve service levels and regulatory pressures are all converging
to drive broad adoption of ITIL, at least in the U.K. and a few other European countries.
The IT Service Management Forum (itSMF) and other IT membership organizations are growing at a furious rate.
Additionally, analyst groups such as Forrester Research estimate that B2B and ITIL adoption among billion-dollar
companies will increase by 40 percent this year, and will probably reach 80 percent within two years.
As more and more B2B players rush to deploy ITIL, new ITIL managers often feel overwhelmed by the
vast process framework, unsure how to get started and make ITIL a reality in their organizations.
Even after some initial adoption in the field, ITIL managers lack effective means to report on ITIL
success and reinforce newly implemented process standards.
The key to addressing both challenges lies in two tightly related ITIL components: incident management and
problem management.
ITIL Incident and Problem Management
In ITIL terms, incident management is focused on restoring normal service levels as quickly as possible with
minimal disruption to the business. It's a highly visible part of the ITIL process, and if done well, can result
in reduced service interruptions, increased efficiency and improved user satisfaction. A Forrester survey of companies
over $1 billion showed that incident management is the number one ITIL priority among IT executives.
In contrast, problem management takes a longer-term view and is tasked with reducing the effect of incidents and errors, and with proactively preventing them. A well-thought-out problem management system will reduce recurring incidents and create permanent solutions instead of just one-time fixes.
Due to the critical nature of these two processes, incident management and problem management are usually selected as logical starting points for any ITIL implementation.
In IT as well as in life, a good starting point is important, but key to the success of a project is sustained execution. Well-trained staff and good process design, though crucial, are not enough to achieve ITIL success. The all-important yet often overlooked third component is having a comprehensive tool set to support the process roll-out.
System Management Products Stop at Incident Detection
Most large enterprise IT organizations have deployed system management products such as HP OpenView, IBM Tivoli,
CA Unicenter, BMC Patrol, Mercury Business Availability Center and Microsoft Operations Manager to help monitor
their infrastructure and applications around the clock. While these products allow for proactive detection of incidents
and contribute significantly to effective incident and problem management, they are far from enough.
Monitoring products stop at detection, and provide minimal utility when it comes to triaging, diagnosing and
resolving the incidents. This critical resolution process still takes place outside of the monitoring tools, and
still requires a human to perform a series of tasks and procedures to determine the root cause and the correct fix.
Another class of software tools that is widely prevalent in the ITIL tool-box is service desk software. Common solutions include BMC Remedy, FrontRange Heat, HP Service Desk and HP Peregrine Service Center. These products track incidents, problems and change processes, but also fall short when it comes to providing resolution capabilities.
Additionally, and despite the built-in reporting capabilities offered by these solutions, many large IT organizations still struggle with gathering factual data to aid in their incident and problem management efforts. This is because service desk solutions still rely on human operators and manual data entry. Names, descriptions and categories are often inconsistent, and the resulting data suffer from the "garbage-in, garbage-out" syndrome.
A major gap remains between the incident detection capabilities provided by system monitoring tools and the incident tracking capabilities provided by ticketing solutions: the triage, diagnosis and resolution process is still repetitive, manual and prone to error.
Alerts are often escalated wholesale, alert floods remain unchecked, and troubleshooting and resolution knowledge tends to be poorly documented. IT professionals are forced to operate in fire-fighting mode rather than proactively addressing the root causes of incidents and problems. Unfortunately, the ITIL goals of minimizing service impact and preventing recurring incidents remain elusive to many IT organizations.
After years of relying solely on home-grown scripts, enterprise IT organizations finally have a viable alternative: software solutions that automate the critical step of incident resolution. Gartner, a leading industry analyst firm, has dubbed this emerging category Run Book Automation and has published several reports in this area since June 2006. A few early movers have brought sophisticated solutions to market that promise to fill this gap and round out the ITIL tool set.
Run Book Automation solutions automate an array of IT operational procedures to help with the traditionally time-consuming process of triage, diagnosis and repairs for business-critical applications.
These range from simple tasks such as checking network connectivity, stopping and restarting services, and changing system configuration to complicated, nested procedures like running a full set of diagnostic tests against a clustered J2EE application environment, gracefully removing a server from a load-balanced cluster, or interrogating backed-up MQ queues and automatically routing the problem messages to the correct location.
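The idea of building complicated, nested procedures out of simple tasks can be illustrated with a minimal Python sketch. It is not taken from any Run Book Automation product; the names `Task`, `Procedure` and `run`, and the step names in the usage example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Task:
    """A single operational step (hypothetical); action returns True on success."""
    name: str
    action: Callable[[], bool]


@dataclass
class Procedure:
    """A named sequence whose steps are tasks or other procedures."""
    name: str
    steps: List[Union["Task", "Procedure"]] = field(default_factory=list)


def run(step: Union[Task, Procedure], log: List[str], prefix: str = "") -> bool:
    """Run a task or nested procedure depth-first, stopping at the first failure."""
    path = f"{prefix}/{step.name}" if prefix else step.name
    if isinstance(step, Task):
        ok = step.action()
        log.append(f"{path}: {'ok' if ok else 'FAILED'}")
        return ok
    for child in step.steps:
        if not run(child, log, path):
            return False
    return True


# Usage: the lambdas stand in for real checks and commands.
remove_server = Procedure("remove_server", [
    Task("stop_new_connections", lambda: True),
    Task("wait_for_drain", lambda: True),
    Task("deregister", lambda: True),
])
repair = Procedure("repair_cluster", [
    Task("check_connectivity", lambda: True),
    remove_server,
])
log: List[str] = []
succeeded = run(repair, log)
```

Because every step appends its full path to the log, a nested procedure leaves the same kind of record as a simple one, which is what makes the result usable as an audit trail.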
Early adopters of Run Book Automation include companies such as Alaska Airlines, Halliburton and NYK Logistics. Combined with existing tool sets, each has realized significant ROI in their ITIL implementation efforts.
Dean DuVall, Alaska Airlines Managing Director of Customer Services, said, "An automated, repeatable problem and incident management solution that scales with our business has allowed us to empower our first level resources and offload work from our second and third level support teams. These teams can now focus more of their energy on non-routine issues and strategic tasks to improve overall service levels."
Together with system monitoring and help desk tools, Run Book Automation products result in rapid resolution and form a fully automated incident management loop.
Monitoring solutions proactively detect any service outage and issue an alert. The alert automatically triggers automation procedures to triage, diagnose and fix the issues. The Run Book Automation solution captures inputs and outputs at each step into the ticketing system to provide a detailed audit trail.
Upon resolution, the ticket is automatically closed with full resolution information.
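The loop described above can be sketched in a few lines of Python. This is an illustrative model only, not the interface of any monitoring or service desk product: the `Ticket` class, the `handle_alert` function and the "escalated" status for steps that fail are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Ticket:
    """Hypothetical stand-in for a service desk record."""
    alert: str
    audit_trail: List[Tuple[str, str, str]] = field(default_factory=list)
    status: str = "open"
    resolution: str = ""


# A step receives the alert text and returns (succeeded, output).
Step = Callable[[str], Tuple[bool, str]]


def handle_alert(alert: str, steps: List[Tuple[str, Step]]) -> Ticket:
    """Run triage, diagnosis and repair steps in order, logging each to the ticket."""
    ticket = Ticket(alert=alert)
    for name, step in steps:
        ok, output = step(alert)
        # Capture the step name, its input and its output for the audit trail.
        ticket.audit_trail.append((name, alert, output))
        if not ok:
            # Assumption: a failed step hands the ticket to a human operator.
            ticket.status = "escalated"
            ticket.resolution = f"stopped at {name}: {output}"
            return ticket
    ticket.status = "closed"
    ticket.resolution = ticket.audit_trail[-1][2] if ticket.audit_trail else "no action needed"
    return ticket


# Usage: the lambdas stand in for real automation procedures.
steps = [
    ("triage", lambda a: (True, "severity: high")),
    ("diagnose", lambda a: (True, "service 'orders' not responding")),
    ("repair", lambda a: (True, "restarted service 'orders'")),
]
ticket = handle_alert("orders-app down", steps)
```

In this sketch the ticket is closed only when every step succeeds, and the output of the final step becomes the resolution text; whether a real deployment escalates or retries on failure is a design choice left open here.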
In addition to significantly reducing MTTR and service impact, the increased visibility helps tier two and tier three support teams with successful problem management.
With automated data capture and enhanced reporting features like MTTR Trending and Incident Analysis drilling down to Configuration Items, ITIL project teams finally have the data they need for detailed incident and problem analysis at their fingertips. They also gain metrics that capture the complete before-and-after picture for key performance indicators (KPIs) like MTTR, service level and operational cost savings.
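As a rough illustration of what an MTTR trend per Configuration Item involves, the following Python sketch averages resolution times by item and month. The record layout, the function name `mttr_by_month` and the sample data are hypothetical, and MTTR is read here as mean time to resolve, measured from ticket open to ticket close.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple

# (configuration item, opened, resolved): a hypothetical record layout
Incident = Tuple[str, datetime, datetime]


def mttr_by_month(incidents: List[Incident]) -> Dict[Tuple[str, str], float]:
    """Mean time to resolve, in minutes, keyed by (configuration item, YYYY-MM)."""
    durations = defaultdict(list)
    for ci, opened, resolved in incidents:
        minutes = (resolved - opened).total_seconds() / 60
        # Group by the month in which the incident was opened.
        durations[(ci, opened.strftime("%Y-%m"))].append(minutes)
    return {key: mean(values) for key, values in durations.items()}


# Usage with made-up sample data.
incidents = [
    ("orders-db", datetime(2006, 9, 1, 10, 0), datetime(2006, 9, 1, 11, 0)),
    ("orders-db", datetime(2006, 9, 15, 8, 0), datetime(2006, 9, 15, 8, 30)),
    ("orders-db", datetime(2006, 10, 2, 9, 0), datetime(2006, 10, 2, 9, 15)),
]
trend = mttr_by_month(incidents)
```

A calculation like this is only as good as its timestamps, which is why capturing open and close times automatically, rather than through manual entry, matters for the before-and-after comparison.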
To summarize, new Run Book Automation solutions drive maximized uptime, lowered support costs, and increased operational efficiency. This comes from empowering frontline support to perform more troubleshooting and repair tasks, freeing up senior support staff for proactive management.
Last but not least, the IT organization will also benefit from better overall process control and a well-documented audit trail for Sarbanes-Oxley compliance.
Once you've paved the way for ITIL success with effective incident and problem management projects, you can branch out to change and configuration management, as well as the other ITIL service management and service delivery modules. There you'll find that the automation solution you've adopted for incident and problem management can also help in other ITIL process areas, and contribute to your overall ITIL success.
Source: Line 56