You may have started to hear the term “observability” more and more, and maybe you’re thinking, “How is this different from traditional monitoring and log analysis, which we have been doing for many years?” The differences are subtle but important to understand.
Our modern business applications are more complex and distributed than ever before, spanning both cloud and on-prem environments, interconnected through APIs. Understanding behavior and performance end to end becomes a significant challenge, which can impact the reliability and availability of these applications. Traditional approaches often divide the environment into disjointed “silos” where teams and tooling focus on just one part of the puzzle. Without a complete, composite perspective, it takes additional time and effort to detect, isolate, and resolve incidents.
Observability uses an outside-in approach to understanding how applications behave.
Instead of relying on limited, siloed alerts, it looks at a system’s external outputs — its telemetry data like logs, metrics, and traces — to infer what’s happening inside. This makes it possible to see the bigger picture across a hybrid application, not just isolated issues. By analyzing changes in behavior and putting alerts in context, observability helps teams identify and resolve problems in real time, often automatically.
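As a rough illustration of inferring internal behavior from external outputs, the sketch below joins trace spans and log records on a shared trace ID to find where a request spent its time. The `Span` and `LogRecord` shapes, the service names, and the sample values are all hypothetical and heavily simplified; real telemetry carries far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str       # identifies one end-to-end request
    service: str        # component that did the work
    duration_ms: float  # time spent in that component


@dataclass(frozen=True)
class LogRecord:
    trace_id: str
    service: str
    message: str


def slowest_service(spans, trace_id):
    """Return the service that spent the most time in the given trace."""
    totals = {}
    for span in spans:
        if span.trace_id == trace_id:
            totals[span.service] = totals.get(span.service, 0.0) + span.duration_ms
    return max(totals, key=totals.get) if totals else None


def logs_for(logs, trace_id, service):
    """Return log messages from one service that belong to one trace."""
    return [r.message for r in logs
            if r.trace_id == trace_id and r.service == service]


# Hypothetical telemetry for two requests.
SPANS = [
    Span("t1", "mobile-gateway", 12.0),
    Span("t1", "payments-api", 310.0),
    Span("t1", "cics-db2", 48.0),
    Span("t2", "mobile-gateway", 9.0),
]
LOGS = [
    LogRecord("t1", "payments-api", "retrying upstream call"),
    LogRecord("t1", "cics-db2", "update committed"),
    LogRecord("t2", "mobile-gateway", "request served from cache"),
]

suspect = slowest_service(SPANS, "t1")
print(suspect, logs_for(LOGS, "t1", suspect))
```

Because the signals share an identifier, the slow component and its log messages can be found without knowing the system's internals in advance.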

What is OpenTelemetry, and how does it relate to observability?
The emergence of observability strategies over the past decade has also driven the development of the open-source OpenTelemetry standard. OpenTelemetry is administered by the Cloud Native Computing Foundation (CNCF), and its principal aim is to provide a unified, vendor-agnostic, and extensible framework for collecting and managing telemetry (operational) data. This approach appeals to many organizations as a way to address the complexity of the distributed environments that run their core applications.
Previously, you might have depended on a single solution for visibility across everything, limited by whatever that solution covered. If a particular technology wasn’t supported by that solution, you would be blind to key parts of your environment. Similarly, switching from one vendor to another would be costly and time-consuming, because proprietary agents would need to be replaced across the entire environment.
OpenTelemetry promotes standardization across the collection and management of data, making it easier to integrate with tools and solutions, including switching between solutions without re-instrumenting applications or data collection. OpenTelemetry defines standards and conventions for telemetry data formats (such as traces, metrics, and logs).
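The idea of switching solutions without re-instrumenting can be sketched in a few lines. This is a toy illustration of the design principle only, not the OpenTelemetry API: `Exporter`, `InMemoryExporter`, `JsonLinesExporter`, and `handle_payment` are hypothetical names, and the record shape is an assumption made for the example.

```python
import json
from typing import Protocol


class Exporter(Protocol):
    """Anything that can receive a telemetry record in the common shape."""

    def export(self, record: dict) -> None: ...


class InMemoryExporter:
    """Stand-in for one backend: keeps records as Python dicts."""

    def __init__(self):
        self.records = []

    def export(self, record: dict) -> None:
        self.records.append(record)


class JsonLinesExporter:
    """Stand-in for a different backend: serializes records as JSON lines."""

    def __init__(self):
        self.lines = []

    def export(self, record: dict) -> None:
        self.lines.append(json.dumps(record, sort_keys=True))


def handle_payment(exporter: Exporter, amount: int) -> int:
    # The instrumentation is written once, against the common record shape.
    exporter.export({
        "name": "handle_payment",
        "attributes": {"payment.amount": amount},
    })
    return amount


# Changing the backend means changing only the exporter that is passed in.
memory_backend = InMemoryExporter()
json_backend = JsonLinesExporter()
handle_payment(memory_backend, 25)
handle_payment(json_backend, 25)
```

The application code in `handle_payment` is identical in both calls. Only the component that ships the data differs, which is the property that makes changing vendors cheaper than replacing proprietary agents.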
It’s important to note that OpenTelemetry isn’t a product itself; instead, there are many observability products available that claim support for ingesting OpenTelemetry-formatted data, and an increasing number of software solutions are making operational data about their performance available in the OpenTelemetry standard.
Where does the mainframe come in?
So, where does the mainframe fit into the picture? We have access to high-quality, high-fidelity data from the mainframe, from sources such as real-time monitors and SMF. However, this data is often restricted to specialist tools specifically designed for the mainframe and used by those with deep technical expertise. In the realm of complex hybrid applications and environments, it is crucial that mainframe operational data (or at least a subset of it) is accessible to users and teams outside the traditional audiences.
For example, consider a mobile app for banking or travel management. The front-end might be hosted in the cloud. Depending on the functionalities chosen by the user, transactions are initiated that pass through various systems, including the mainframe, where, for instance, a CICS transaction could update a Db2 database.
The operations team, responsible for the overall application’s availability, needs to track the complete customer transaction journey from start to finish. Ignoring the mainframe as a blind spot in this flow is unacceptable and could ultimately lead to blaming the platform when incidents happen—even if the root cause lies elsewhere in the application.
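Following one transaction across systems depends on every hop passing along a shared trace identifier. OpenTelemetry's default propagation format is the W3C Trace Context `traceparent` header, and the sketch below parses and forwards that header using only the Python standard library. It handles version `00` only, the function names are hypothetical, and how each mainframe subsystem actually carries this context is an assumption not covered here.

```python
import re
import secrets

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
_TRACEPARENT_V00 = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(header):
    """Return the header fields as a dict, or None if the header is invalid."""
    match = _TRACEPARENT_V00.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero identifiers are defined as invalid by the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def child_traceparent(header):
    """Build the header a service sends downstream for its own new span."""
    context = parse_traceparent(header)
    if context is None:
        raise ValueError("invalid traceparent header")
    span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    while span_id == "0" * 16:      # regenerate in the rare all-zero case
        span_id = secrets.token_hex(8)
    # Same trace id, new span id: downstream work joins the same trace.
    return f"00-{context['trace_id']}-{span_id}-{context['flags']}"


# Example header value taken from the W3C Trace Context specification.
incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
```

Every hop keeps the same trace ID and supplies a fresh span ID, so a backend can stitch the cloud front end, the intermediate services, and the mainframe work into one journey.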
It is important to bring the mainframe into the observability and OpenTelemetry story. Over the past couple of years, there have been significant developments to address this challenge. Firstly, under the guidance of CNCF, a Special Interest Group (SIG) has been established, with membership spanning stakeholders such as IBM and independent software vendors, to ensure consistent adoption and adherence to established guidelines.
Secondly, established software products are building OpenTelemetry into their offerings. Earlier this year, IBM issued a Statement of Direction indicating the development of OpenTelemetry support in the latest release of key subsystems. At the time of writing, initial trace support has been delivered for CICS, IMS, MQ, and Db2.
What next?
The emergence of OpenTelemetry on the mainframe is a journey that will significantly change how we create, manage, and consume operational data in future years. With better source data feeding a full-featured observability platform, such as IBM Instana, managing complex environments that span multiple technologies becomes simpler.
You can get involved with OpenTelemetry today to understand the concepts in more detail and generate support within existing applications. If this is a topic that interests you, consider joining the OpenTelemetry on Mainframe SIG and contributing to its future direction.