Understanding Downtime - Quantifying the Impact of Mainframe Outages and the Value of Observability

The Costs of Mainframe Outages

Mainframes come in a variety of shapes, sizes, and capacities, but at their core they are high-performance computers built to process very large amounts of data. With vast amounts of memory, they can handle billions of transactions in real time. By some estimates, nearly 70% of global IT production workloads run on mainframes.

Of course, this means that the cost of an outage can be significant, especially for a large, complex outage that takes hours or even days to repair. Not only do transactions go unprocessed during this time, but the repairs themselves can be expensive, online data can become more vulnerable, and customer trust can be substantially eroded. Today, many systems are interconnected, which adds real value when everything is working well but spreads the disruption further when something goes down.

Take an example from the West Virginia DMV just this year. The DMV is already typically seen as a slow, tedious destination, and when the mainframe went down the process became even more difficult. Due to a hardware issue, all the interconnected DMV offices and the online system across the entire state were unable to issue driver’s licenses or process vehicle registration renewals. The outage lasted about 24 hours, a significant stretch of downtime and a major inconvenience for many.
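To make the cost of an outage like this more concrete, here is a minimal back-of-the-envelope sketch. It covers only the directly measurable parts (lost transaction revenue plus repair cost) and ignores harder-to-quantify effects such as eroded customer trust. The class, function, and every figure in the usage example other than the 24-hour duration are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class OutageEstimate:
    """Inputs for a rough outage cost estimate (all values are assumptions)."""
    downtime_hours: float
    transactions_per_hour: float
    revenue_per_transaction: float
    repair_cost: float = 0.0


def estimate_outage_cost(outage: OutageEstimate) -> float:
    """Return lost transaction revenue plus repair cost for one outage."""
    lost_revenue = (
        outage.downtime_hours
        * outage.transactions_per_hour
        * outage.revenue_per_transaction
    )
    return lost_revenue + outage.repair_cost


# Hypothetical example: a 24-hour outage on a system that normally handles
# 1,000 transactions per hour worth 2.50 each, with 5,000 in repair costs.
example = OutageEstimate(
    downtime_hours=24,
    transactions_per_hour=1000,
    revenue_per_transaction=2.5,
    repair_cost=5000,
)
print(estimate_outage_cost(example))  # 65000.0
```

Even a simple model like this can help frame how much it is worth spending on prevention, as long as the inputs are replaced with your own measured numbers.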

Incorporating Observability

As systems become more interconnected, it becomes harder to diagnose the root cause when issues arise. This has raised the importance of observability: the ability to measure and understand the internal state of a system by evaluating its log files and other outputs. For many IT professionals, observability is the breadcrumb trail needed to determine the right fix when something breaks.

There are three main components of effective observability in a system:
1. Monitoring – Monitoring services track the system in real time, which helps with prevention by catching issues and irregularities before they spin out of control. SaaS monitoring in particular is useful because it can watch more than one network and works well with multitenant databases. It can also reduce staffing costs, since you can access your network from anywhere and the tools do not require extensive training.

2. Logging – Logging gives IT professionals a record they can backtrack through to understand where and when an issue arose. It is useful for diagnosing just about anything that goes awry.

3. Tracing – Finally, tracing helps professionals understand how an issue has interacted with the other components of the system. It can be used to determine whether other damage has occurred, or whether additional steps are needed to implement a permanent fix on a mainframe system.
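The three components above can be sketched in a few lines of Python using only the standard library. This is an illustration of the ideas, not a mainframe tool: the step names, the `SLOW_THRESHOLD_SECONDS` value, and the `process_transaction` function are all hypothetical. Each step writes log lines (logging), checks its own duration against a threshold (a very basic form of monitoring), and carries a shared trace ID so that related log lines can be tied together afterwards (tracing).

```python
import logging
import time
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("observability-demo")

# Hypothetical threshold: steps slower than this trigger a warning.
SLOW_THRESHOLD_SECONDS = 0.5


def run_step(trace_id, name, work):
    """Run one step, logging its outcome and duration under a trace ID."""
    start = time.perf_counter()
    log.info("trace=%s step=%s status=started", trace_id, name)
    try:
        result = work()
    except Exception:
        # Logging: record the failure with a traceback, then re-raise.
        log.exception("trace=%s step=%s status=failed", trace_id, name)
        raise
    elapsed = time.perf_counter() - start
    log.info("trace=%s step=%s status=ok elapsed=%.3fs", trace_id, name, elapsed)
    # Monitoring: flag steps that exceed the expected duration.
    if elapsed > SLOW_THRESHOLD_SECONDS:
        log.warning("trace=%s step=%s slow elapsed=%.3fs", trace_id, name, elapsed)
    return result


def process_transaction(amount):
    """Process one made-up transaction as two traced steps."""
    # Tracing: one ID follows the transaction through every step.
    trace_id = uuid.uuid4().hex
    validated = run_step(trace_id, "validate", lambda: float(amount))
    posted = run_step(trace_id, "post", lambda: round(validated, 2))
    return trace_id, posted


trace_id, posted = process_transaction("19.99")
```

Searching the log output for a single `trace=` value shows every step a transaction went through, which is the breadcrumb trail described earlier.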

Upgrading Technology

Although mainframes are prized for their serviceability, there will eventually come a day when upgrades are needed to keep everything in working order. If your company moves a lot of data, it might be worth investing in dark fiber. Dark fiber lets a company use as much data as it needs at a flat rate and ensures that data speed isn’t interrupted by other traffic.

Other upgrades include incorporating AI-driven analytics into mainframe observability. This technology can automate many of the tasks associated with monitoring, logging, and tracing mainframe issues. In addition, AI can alert IT professionals to issues immediately and propose a set of suggested fixes. The AI software may even be able to implement some of those fixes itself, with little oversight from IT.
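As a rough illustration of automated alerting, the sketch below flags metric samples that deviate sharply from their recent history. It is a plain rolling z-score check, a much simpler statistical stand-in for the AI-driven analytics described above and not a representation of any particular product. The function name, the window size, the threshold, and the sample data are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples far outside the preceding window.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the `window` samples before it.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is unusual.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization readings (percent), one spike at index 6.
cpu_percent = [50, 52, 49, 51, 50, 51, 95, 50]
print(find_anomalies(cpu_percent))  # [6]
```

In practice the flagged indexes would feed an alert to the operations team; more sophisticated tooling replaces the fixed threshold with learned models, but the alert-on-deviation idea is the same.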

The costs of downtime can be disproportionately large. Beyond the system itself being unavailable, an outage brings fewer transactions, slower services, lost revenue, and even a decrease in customer trust, all of which can be hard to recover from. Incorporating greater observability into your mainframe and investing in upgrades when needed are important ways to keep your business up and running.


Ainsley Lawrence
