IBM Db2 Analytics Accelerator for Christmas

It’s Db2 month at Planet Mainframe, so what better time to talk about my absolute favorite mainframe technology, the IBM Db2 Analytics Accelerator (we users like to call it IDAA). 

What’s Not in a Name

If you’ve ever heard me talk about IDAA at a conference, the first thing I always like to point out is that I do not like the name, because it implies limitations that simply are not there. For instance:

  • It specifically says Db2, but does that mean it is only for Db2 data? Absolutely not. We almost had more IMS data in our IDAA than we had Db2 data, along with data from SYSLOG, SMF, and so on.
  • The next word, of course, is Analytics, and while it can certainly speed up analytics, it is really about query acceleration, whether the query is analytic in nature or not. So, maybe a more appropriate name would be zData Query Accelerator?

At the end of the day, though, the questions for most companies in these days of tight budgets are, ‘How can it benefit me?’ and ‘What is the cost/benefit?’ Of course, the answer will still be “it depends”, because every company is different, but this might help you spot use cases you had not considered. So let me give you a little background on why we started the journey, and how it was a game changer for us.

Internet-Enabling 10 Years Ago

Jump in your “time machine” and travel back about a decade. All companies were internet-enabling everything they possibly could, so the availability of systems was a critical focus. While the mainframe infrastructure could provide amazing availability, you still had application limitations and workloads that were not as well-behaved, which could impact the customer-facing applications. One challenge that many companies ran into, if they were anything like us, was ad-hoc queries, and those were a big part of what was causing us issues.

The first problem was that we could not just tell the users to STOP! The results they were getting were important.

As anyone in the financial industry knows, there are lots of regulatory requirements, every state can be different, and they do not tend to be patient. So, we needed to find a way to control those queries while limiting the impact on our customer-facing systems. Here’s how we tackled it:

  1. We used the Resource Limit Facility (RLF) to control resource usage (see the sketch after this list).
  2. We used Workload Manager (WLM) to appropriately prioritize what could be considered discretionary workloads.
  3. We put SQL review requirements in place.
  4. We even created a tool specifically for ad-hoc queries that users had to go through, and it forced some queries to run off-shift.
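
For anyone who has not worked with RLF, here is a minimal sketch of what that first item can look like. The table owner, AUTHID, and ASUTIME value are purely illustrative assumptions on my part, and depending on how you govern you may need additional columns such as RLFFUNC, so check the RLF documentation before copying anything.

    -- Illustrative RLF sketch (not our actual settings).
    -- DSNRLST01 is a resource limit specification table; the owner,
    -- AUTHID, and ASUTIME (CPU service units) values are made up.
    INSERT INTO SYSADM.DSNRLST01 (AUTHID, PLANNAME, ASUTIME)
    VALUES ('ADHOCUSR', ' ', 500000);

    -- Then start (or refresh) the resource limit facility with that table:
    -- -START RLIMIT ID=01

With reactive governing, a query that blows past the limit gets stopped with a -905, while well-behaved work is untouched.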

At this point, many companies were moving data off to Warehouses/Marts/Swamps, and around that time, IBM purchased a company called Netezza, which conveniently had an appliance solution that could help. So began our IDAA adventure.

Managing Ad-hoc Queries

Most of these ad-hoc queries were not analytic; they were more reporting in nature. Because of the volume of data involved, they were very resource-intensive, and in many cases, they did not have indexes to fully support them.

Running natively in Db2, they caused lots of CPU and I/O, capping at times, MLC spikes, and all those unpleasant experiences that DBAs want to avoid. Once we had our Netezza installed in test, we started replicating data over to it, and our users were immediately amazed at how fast the responses were. They were even happier that, with this technology, they could once again run their queries during the day.

Since they were still going through Db2, there was no retraining required, as there would have been if the data had moved to something like Hadoop. The data owners were happy because all the existing Db2 security still applied, and there were no new unmanaged copies of the data to worry about. Our capacity planning people were happy because we were able to identify workloads that benefited from IDAA, which eliminated many of the spikes and actually lowered our MLC.

Customer Success Stories

The most interesting thing that happened was that when we started advertising the capability, we asked our customers (internal application areas) to share their success stories with other business areas. They started coming to us instead of us having to find them, and use cases started popping up everywhere.  

This was when we really started getting into the IMS space, because our largest business area still had most of its data in IMS. They had an entire team built around ad-hoc and regulatory reporting, and at that time, with IMS, that meant writing COBOL programs that had to go through the regular deployment lifecycle and could take weeks or months to write, test, and deploy.

They also had some Db2 data, which they could simply query; that was much quicker and easier. Using the Accelerator Loader, we were able to load their IMS data directly into IDAA using accelerator-only tables (AOTs). Suddenly they could say goodbye to writing COBOL programs for those reports. Their timeframe for producing a report went from months to, in some cases, hours, just by having that IMS data copied to IDAA.
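
To make that concrete, here is a minimal sketch of what an accelerator-only table looks like. The table, column, and accelerator names below are hypothetical; the part that matters is the IN ACCELERATOR clause, which is what tells Db2 that the data lives only in IDAA.

    -- Hypothetical accelerator-only table (AOT); all names are made up.
    -- The data exists only on the accelerator, but the table is defined
    -- to, secured by, and queried through Db2 like any other table.
    CREATE TABLE CLAIMS.IMS_POLICY_EXTRACT
          (POLICY_NO    CHAR(10)  NOT NULL,
           STATUS_CD    CHAR(2)   NOT NULL,
           LAST_UPDT_TS TIMESTAMP NOT NULL)
      IN ACCELERATOR PRODACC1;

    -- Once the Accelerator Loader has populated it from IMS, a report is
    -- just SQL, with the heavy lifting happening on the accelerator.
    SELECT POLICY_NO, STATUS_CD
      FROM CLAIMS.IMS_POLICY_EXTRACT
     WHERE STATUS_CD = 'PN';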

Today’s Cost/Benefit

Let’s return to modern day and talk about cost/benefit. The cost is fairly obvious; you are paying for hardware and software.  With the initial implementation, it was an appliance solution, so you would have a separate footprint in your Data Center, and all the costs that go with that.  

The current IDAA solution is no longer implemented on a separate appliance; it uses IFLs on a regular IBM Z box. It can also be implemented with a separate LinuxONE box (my preference), or it can use your standard mainframe storage.

When we talk about benefits, it gets a little fuzzy, and this is where each company really needs to think about use cases and how they view the value of technology.  Consider:

  • Do ad-hoc queries ever cause outages? 
  • How expensive is an outage at your company? 
  • Is the data currently being replicated to other platforms for reporting?
  • How much is that costing when you count hardware, software, additional security/audit costs, monthly license charges (MLC), and impact of the replication?  
  • How current is your replicated data?  
  • Is that current enough for the users?  
  • What training is required for the other platforms compared to just using Db2 as the interface?  

With IDAA, using our performance data, we were able to quickly identify workloads that were impacting the four-hour rolling average (4HRA) or causing availability issues, and move them to IDAA.

The IBM Integrated Synchronization solution for replicating Db2 data to IDAA is included and is zIIP-enabled. It installs with Db2, so there are no additional products to license, install, or support. Also, people are a major expense, so making them radically more efficient is a huge win.
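
One small habit that helped us keep track of all this: Db2 has SYSACCEL pseudo-catalog tables that describe what is defined to each accelerator, so you can sanity-check things with ordinary SQL. The query below is a sketch from memory rather than from a specific system, so verify the exact column names at your Db2 level before relying on it.

    -- Sketch: list the tables defined to each accelerator and whether
    -- they are enabled for acceleration. Column names are from memory;
    -- check SYSACCEL.SYSACCELERATEDTABLES at your Db2 level.
    SELECT ACCELERATORNAME, CREATOR, NAME, ENABLE
      FROM SYSACCEL.SYSACCELERATEDTABLES
     ORDER BY ACCELERATORNAME, CREATOR, NAME;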

Real-world Examples

I want to share a couple of very specific examples that surprised me.  

Detecting Sensitive Information

The first one speaks to the value of using IDAA for unstructured data like VARCHAR. We had an auditing area with a weekly job that scanned a VARCHAR field for a business area to detect sensitive information. Everyone probably sees the warnings about not entering sensitive information in a VARCHAR field, but how do you actually stop or detect it?

Their job ran across a very large database, and as you can imagine, scanning a VARCHAR field meant a tablespace scan. Running directly against the Db2 database, they would submit the job at the end of the day, and at some point the next day it would eventually complete, typically after about 20 hours of runtime.

One of my good friends who happened to work in that area knew the data was already in IDAA, so he had them use that magic SET CURRENT QUERY ACCELERATION statement, and BOOM! The job completed in about 10 minutes and almost completely eliminated its General Processor CPU usage. 
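
If you have never seen it, here is roughly what that change looks like. The table name, column names, and search patterns are hypothetical stand-ins; the special register is the only thing the job actually needed to add.

    -- Route eligible dynamic SQL to the accelerator for this connection.
    -- ENABLE WITH FAILBACK falls back to native Db2 execution if the
    -- accelerator cannot handle a particular query.
    SET CURRENT QUERY ACCELERATION = ENABLE WITH FAILBACK;

    -- Hypothetical scan of a free-form VARCHAR column for sensitive data.
    -- Natively this was a roughly 20-hour tablespace scan; accelerated,
    -- it finished in minutes with almost no General Processor CPU.
    SELECT CLAIM_ID, NOTE_TEXT
      FROM CLAIMS.CLAIM_NOTES
     WHERE UPPER(NOTE_TEXT) LIKE '%SSN%'
        OR UPPER(NOTE_TEXT) LIKE '%SOCIAL SECURITY%';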

Batch Job Every 15 Minutes

The other example was a batch job that ran every 15 minutes. In this case, it was a BMC unload job with DIRECT NO that ran a complex query against a series of Db2 tables to detect changes; it was a pseudo-replication solution. It usually ran for about 10 minutes, but, due to parallelism, it actually used more than 10 minutes of CPU. Because it ran every 15 minutes, we could guarantee it was running whenever the 4HRA peak hit. By having the unload run against the IDAA copies of the tables, the runtime was reduced somewhat, but more importantly the CPU was now almost all on IDAA and off the General Processor, so we were able to monitor over the next couple of months and directly see the cost benefit of moving this workload to IDAA.

Need more examples? Take a look at IBM Db2 Analytics Accelerator for z/OS for case studies from other customers. One that is particularly interesting comes from another huge IDAA consumer, Banco do Brasil, which has some very similar stories.

The Right Team for Success

I can’t end without talking about the number one priority of making it a success: picking the right people to support it and advocate for it.  We were lucky to have several data people who were very passionate about the technology, and it showed when they were talking with application areas and looking at how to best utilize it.  So, find yourself a Vaidya, a Judy, a Tammy, a Derek, and a Mark.  

We also had a Db2 Performance team that partnered closely with us in identifying opportunities, and many contacts in the application areas who were willing to tell their success stories both internally and at conferences.

So, is IDAA on your Christmas wish list now? It’s never too early to consider whether you could benefit from this gift like we did.

Principal Consultant @ TeamSWAMI, IBM Champion, IBM Gold Consultant

Greg DeBo is a Principal Consultant and IBM Gold Consultant at TeamSWAMI, having previously spent 34 years at a large Insurance Company with a focus on Mainframe Data. TeamSWAMI focuses on helping small and midsize clients understand the possibilities that the modern mainframe provides, as well as engaging universities to address the mainframe skills pipeline.
