Incident Remediation


Organizations are drawn to the promise of AIOps: using AI-driven intelligence and automation to make quick, accurate decisions that maintain resiliency. AIOps uses artificial intelligence to simplify IT operations management and to accelerate and automate problem resolution in complex modern IT environments.

A recent blog post by Sanjay Chandru, Best practices for taking a hybrid approach to AIOps, set the stage for this series. We learned that a key capability of AIOps is to act: to resolve issues swiftly with intelligent automation.

In this blog we focus on collaborative incident remediation: meeting demanding availability SLAs in changing hybrid environments through faster incident resolution, enabled by chat-based operations and user-friendly dashboards.

Challenges and customer needs

Today’s application landscapes are increasingly hybrid, complex and often include IBM Z. However, we often see information, team, and data silos in organizations. These silos inhibit SLA attainment, make collaboration more difficult and increase the time to problem resolution. It often takes days to detect and diagnose a complex issue.

In a typical IT operations team, we find people in different job roles, such as administrators, site reliability engineers, sysprogs, operators, developers, and managers. They have different skills and work with different sets of tools. They may be based in different locations and may even live in different time zones.

These operations teams are overwhelmed by the number of different tools they have to use and held back by a lack of skills. They struggle with inconsistent alerts, with workflows that are interrupted every time they switch between tools, and with the difficulty of sharing data across multiple teams.

Furthermore, with the looming retirement of highly skilled mainframe administrators and the continued growth in Z workloads, deep Z skills are becoming scarce, and onboarding the next generation of Z operators and administrators is a priority.

What’s now required and how is this different than what I have today?


To address the challenges outlined above, organizations need strong collaboration tools, ways to integrate their processes and tools while adopting new DevOps practices and agile methods, and modern user interfaces.

ChatOps solutions are emerging as a best practice to collaborate across departmental silos, personas and tools to rapidly pinpoint problems while working on an incident.

ChatOps is a collaboration model that connects people, processes, tools, and automation in a seamless and transparent way through a chat platform and extensive use of chatbots. With ChatOps, all technical team members work together in the same virtual location, and everyone sees the same information in a chat platform they already know and use in their daily work. All the necessary tools are at their fingertips because they can be used from within the chat window through bots and other integrations, instead of opening a dedicated application window or console.

“ChatOps is very helpful and will minimize downtimes”
         – Large Communications Company in North America

The IBM solution

IBM provides ChatOps capabilities for IBM Z environments in addition to modern, customizable dashboard user interfaces that provide consolidated views into IBM Z AIOps tools.

IBM Z ChatOps provides a chatbot that gives users access to information and tasks from Z AIOps tools such as IBM Z System Automation, IBM Z NetView, and IBM OMEGAMON within popular collaboration platforms like Slack, Microsoft Teams, and Mattermost.

IBM Z ChatOps can be used to notify the operations team in the chat tool about IBM Z events, including recommendations powered by machine learning. This fosters collaboration during incident management and helps to isolate and resolve problems quickly. IBM Z ChatOps integrates with IBM Service Management Unite for broad access to IBM Z operations data and allows chat users to drill down to web-based dashboards with more information to help resolve problems fast.
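To make the notification idea concrete, here is a minimal Python sketch of how an operations event could be turned into a chat message. It is not the IBM Z ChatOps implementation or API. The event fields (severity, system, summary, recommendation, dashboard_url), the function names, and the example values are all hypothetical, chosen only for illustration. The sketch assumes a chat platform whose incoming webhooks accept a JSON body with a "text" field, as Slack and Mattermost incoming webhooks do.

```python
import json
import urllib.request


def build_chat_message(event: dict) -> dict:
    """Turn an operations event into a simple text payload for a chat webhook.

    The event keys used here are hypothetical, not an IBM Z ChatOps schema.
    """
    lines = [f"[{event['severity'].upper()}] {event['system']}: {event['summary']}"]
    # Optional fields: a recommended action and a link to a dashboard.
    recommendation = event.get("recommendation")
    if recommendation:
        lines.append(f"Recommended action: {recommendation}")
    dashboard_url = event.get("dashboard_url")
    if dashboard_url:
        lines.append(f"Details: {dashboard_url}")
    return {"text": "\n".join(lines)}


def post_to_chat(webhook_url: str, message: dict) -> int:
    """POST the payload to an incoming-webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical event, for illustration only.
example_event = {
    "severity": "critical",
    "system": "SYS1",
    "summary": "Subsystem response time above threshold",
    "recommendation": "Check workload balance before restarting",
    "dashboard_url": "https://dashboards.example.com/sys1",
}
message = build_chat_message(example_event)
print(message["text"])
```

Calling post_to_chat with a webhook URL from your own chat workspace would deliver the message to a channel. The point of the sketch is that the alert, the recommendation, and the drill-down link all arrive in one place where the whole team can see them.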

Customer value/outcome of doing this

With IBM Z ChatOps, combined with IBM Service Management Unite, IBM provides unique collaborative incident remediation capabilities with benefits such as the following:

  • Improved collaboration within and across teams
  • Faster incident identification and resolution
  • Faster onboarding of the next generation of Z operators
  • Easy sharing of Z data
  • Integration with other DevOps tooling

What are my next steps?

Depending on where you are on your journey to adopting more of these AIOps best practices, we have developed the following resources:

  • To assess your current stage of AIOps maturity and identify action-oriented next steps for adopting more AIOps best practices, inquire about the 15-minute online AIOps Assessment for IBM Z.
  • Join the AIOps on IBM Z Community to follow this blog series about best practices for taking a hybrid approach to AIOps.
  • And finally, to research our IBM Z products that are implementing AIOps technologies to improve operational resiliency, visit our product portfolio page.

Originally published on the IBM Z Community Blog
