For a long time, disaster recovery (DR) in Db2 z/OS environments was mostly about getting systems back online after a major outage. Whether it was a hardware failure, a data center problem, or a natural disaster, the focus was on full system recovery, bringing the whole environment back using backups and archive logs.
That model worked well for the kind of risks we were dealing with 20 years ago. But today, things have changed. And our approach to recoverability needs to change with it.
The Nature of “Disaster” Has Evolved
We still face traditional risks like power failures or hardware issues. But most of the serious incidents we see now are not physical. They’re logical.
A disaster today might be:
- A cyberattack that alters or deletes part of your data
- A bug that corrupts specific rows in a table
- A script that was accidentally run in production
- An insider misusing access rights
These types of problems don’t crash your whole system. Instead, they corrupt or expose a piece of your data, and that corruption can then propagate to other tables. If your only recovery option is restoring a full tablespace or database from a backup taken hours or days ago, you may lose valuable, unaffected data in the process. That’s no longer acceptable in many businesses.
(Of course, there will be cases where full DR is the quickest option, particularly when the extent of spreading data corruption is hard to assess.)
Disaster Recovery Is Now About Precision
Modern DR strategy for Db2 z/OS needs to go beyond asking, “Can we bring the system back up?” and instead ask these five questions:
- Can we detect when data was changed, and by whom?
- Can we recover only the affected data without wiping out everything else? Or do we have to recover the entire environment?
- Can we do this fast enough to meet business expectations and the recovery time objective (RTO)?
- Can we recover data to a point recent enough to meet the recovery point objective (RPO)?
- Can we prove our capabilities in regular, realistic test scenarios?
RPO (recovery point objective) is the maximum amount of data loss a company can tolerate, usually expressed as a span of time before the disruption. RTO (recovery time objective) is the maximum acceptable time to restore systems and applications after a disruption.
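To make the two objectives concrete, here is a minimal Python sketch that checks the timestamps from a single recovery exercise against both targets. The class name, field names, and example values are all illustrative assumptions, and measuring data loss as the gap between the recovery point and the disruption is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryDrill:
    """Timestamps from one recovery exercise (hypothetical structure)."""
    disruption_at: datetime        # when the outage or corruption began
    recovered_to: datetime         # point in time the data was restored to
    service_restored_at: datetime  # when the application was usable again


def achieved_rpo(drill: RecoveryDrill) -> timedelta:
    # Data-loss window: work done between the recovery point and the disruption.
    return drill.disruption_at - drill.recovered_to


def achieved_rto(drill: RecoveryDrill) -> timedelta:
    # Downtime: from the disruption until service is usable again.
    return drill.service_restored_at - drill.disruption_at


def meets_objectives(drill: RecoveryDrill, rpo: timedelta, rto: timedelta) -> bool:
    # Both targets must hold; missing either one counts as a failed drill.
    return achieved_rpo(drill) <= rpo and achieved_rto(drill) <= rto


# Example values, assumed for illustration only.
drill = RecoveryDrill(
    disruption_at=datetime(2024, 1, 10, 10, 0),
    recovered_to=datetime(2024, 1, 10, 9, 55),
    service_restored_at=datetime(2024, 1, 10, 11, 30),
)
ok = meets_objectives(drill, rpo=timedelta(minutes=15), rto=timedelta(hours=2))
```

In this example the drill loses 5 minutes of data and takes 90 minutes to restore service, so it meets a 15-minute RPO and a 2-hour RTO. Tightening the RTO to one hour would make the same drill fail.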
The good news is that this is all possible today. IBM and other vendors offer solutions that allow more granular recovery, better auditing, and more flexible backup strategies. But it’s up to each organization to understand what’s available and build a plan that fits its needs.
Regulations Raise the Bar
New regulatory frameworks, especially in finance and critical infrastructure, are shifting from “Do you have a backup?” to “Can you show operational resilience?”
For example, regulations like the Digital Operational Resilience Act (DORA) in the EU or the updated guidance from the Federal Financial Institutions Examination Council (FFIEC) in the US now expect organizations to:
- Test their recovery plans more frequently, and in realistic conditions
- Prepare for data integrity issues, not just system outages
- Ensure they can recover quickly and correctly, not just eventually
This is not just about passing audits anymore. It’s about proving that your business can survive and keep trust, even under attack.
What You Can Do Now
Here’s a short list of actions to take:
- Reassess your DR plans. Are they focused only on system recovery, or do they cover logical data issues too?
- Make sure your team understands what modern recovery capabilities are available for Db2 z/OS.
- Get familiar with the new regulatory requirements; they may already apply to your environment.
- Start small if needed: test more often, simulate a corrupted table, and measure how fast you can fix it.
- Don’t wait for a real incident to discover if your recovery process is ready.
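As one way to start small, the "simulate a corrupted table and measure how fast you can fix it" step could be wrapped in a simple timing harness. The Python sketch below is only an illustration: the function name and the `recover` and `verify` callbacks are hypothetical, and an in-memory dictionary stands in for a test copy of a table. In a real drill, the callbacks would invoke whatever recovery and validation procedures your site actually uses.

```python
import time
from typing import Callable


def time_recovery_drill(
    recover: Callable[[], None],
    verify: Callable[[], bool],
    rto_seconds: float,
) -> dict:
    """Run one drill: time the recovery step, then confirm the data is correct."""
    start = time.monotonic()
    recover()           # hypothetical hook: run the recovery procedure
    data_ok = verify()  # hypothetical hook: check the recovered data
    elapsed = time.monotonic() - start
    return {
        "elapsed_seconds": elapsed,
        "data_verified": data_ok,
        # A fast recovery of wrong data does not count as success.
        "within_rto": data_ok and elapsed <= rto_seconds,
    }


# Stand-in "table" for illustration; not a real Db2 object.
good_copy = {1: "alice", 2: "bob"}
table = dict(good_copy)
table[2] = "###corrupted###"  # simulate the logical corruption


def restore_table() -> None:
    table.clear()
    table.update(good_copy)


result = time_recovery_drill(
    restore_table,
    lambda: table == good_copy,
    rto_seconds=60.0,
)
print(result)
```

Recording these results after every drill gives you a trend to show auditors and a baseline for spotting regressions in your recovery process.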
Disaster recovery for mainframe systems is no longer just a technical responsibility. It’s part of your operational resilience. The world has changed, and our risks are different. It’s time our recovery strategies reflect that.