The benefits of machine learning are well documented by this point, yet many organizations still fail to take advantage of the predictive technology, and others fail to derive the value that vendors have promised.[1] This is especially true in the mainframe environment, where many leading enterprise analytics engines do not natively support z/OS, a gap underscored by BMC’s 2020 Mainframe Survey, which showed a significant increase in the priority of implementing AI/ML strategies. The hardest part is often the first step: how do I begin to take advantage of all the data on my mainframe? Before you can run your first learning algorithm, you need to get the “right” data into the “right” spot. Below are three steps that will help get you to the starting line and prepared for success.
1 – Enrich Your Logs
The first thing you need to do is decide what intelligence you are looking to derive and work backwards to where the data is being accumulated. Too often, organizations simply gather all of their logs and pipe them into an analytics engine, only to find that the correlations they need are missing. This is especially true in the mainframe world, which uses the System Management Facilities (SMF) for logging events. While SMF does log a tremendous amount of mainframe activity, it was initially designed for accounting, which means that many use cases simply cannot be accomplished with the information in the log. For example, the SMF type 80 record will tell you if someone failed to access a resource, but wouldn’t it be useful to know what actions they were taking on the other subsystems that did not have logs cut by RACF? In general, the other subsystems fail to include the user ID, port of entry, or job type, all of which are critical if you want to do User and Entity Behavior Analytics (UEBA) and track all user activity with a single query. This is where your data forwarder needs to provide the extra level of clarity, or enrichment, so the analytics engine has the data it needs to build correlation threads.
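To make the idea of enrichment concrete, here is a minimal sketch of what a forwarder might do before sending a subsystem record onward. It assumes the forwarder keeps a lookup of active jobs to user context; the field names (job_name, user_id, port_of_entry, job_type) and the SessionContext type are hypothetical and do not reflect any actual SMF layout or product API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionContext:
    """Identity details a subsystem record may lack (hypothetical shape)."""
    user_id: str
    port_of_entry: str
    job_type: str


def enrich(event: dict, sessions: dict) -> dict:
    """Return a copy of `event` with user context attached when it is known."""
    enriched = dict(event)  # never mutate the caller's record
    context: Optional[SessionContext] = sessions.get(event.get("job_name"))
    if context is None:
        enriched["enriched"] = False  # make the gap visible downstream
        return enriched
    # setdefault keeps any value the subsystem already logged
    enriched.setdefault("user_id", context.user_id)
    enriched.setdefault("port_of_entry", context.port_of_entry)
    enriched.setdefault("job_type", context.job_type)
    enriched["enriched"] = True
    return enriched


# Illustrative values only
sessions = {"PAYROLL1": SessionContext("JSMITH", "TSO", "batch")}
event = {"subsystem": "DB2", "job_name": "PAYROLL1", "action": "SELECT"}
print(enrich(event, sessions))
```

Flagging records that could not be enriched, rather than dropping them silently, lets your domain experts see where correlation data is still missing.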
2 – Filter For Relevance
The second step is making sure your data forwarder can filter in real time. Mainframes produce a tremendous quantity of logs, even in medium-sized companies, and many mainframe shops report that data and transaction volumes are growing. In the same BMC survey, more than half of respondents (54 percent) reported an increase in transaction volume and 47 percent reported an increase in data volumes.
This necessitates the ability to limit what is forwarded to only relevant information. The first reason is processing load: filtering only at the analytics engine significantly increases the processing overhead on the server, which may not be able to handle it. Once you factor in high availability and failover requirements, that overhead carries through your whole architecture, whereas filtering on each agent running on a mainframe Logical Partition (LPAR) will not significantly increase the mainframe’s processing requirements.
The second reason is cost. Even if you aren’t on a pricey ingestion model for your analytics engine, the storage requirements on the analytics server can still drastically increase the overall price of the program. If the forwarder can filter the enriched logs in real time and send only the necessary information, you will be able to run significantly more machine learning models while reducing overall costs.
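Forwarder-side filtering can be sketched as a set of small rules applied to each record as it streams through. The rules, field names, and the PRIVILEGED_USERS set below are assumptions chosen for illustration; your domain experts would define the real ones per use case.

```python
from typing import Callable, Iterable, Iterator

Rule = Callable[[dict], bool]

# Hypothetical watch list, for illustration only
PRIVILEGED_USERS = {"SYSADM1", "SECADM1"}


def is_access_failure(event: dict) -> bool:
    """Keep type 80 records that report a failed access (assumed field names)."""
    return event.get("record_type") == 80 and event.get("outcome") == "failure"


def is_privileged_user(event: dict) -> bool:
    """Keep any activity by a user on the watch list."""
    return event.get("user_id") in PRIVILEGED_USERS


def filter_events(events: Iterable[dict], rules: Iterable[Rule]) -> Iterator[dict]:
    """Yield only events matched by at least one rule, one record at a time."""
    rules = list(rules)  # allow the rules to be reused for every event
    for event in events:
        if any(rule(event) for rule in rules):
            yield event


stream = [
    {"record_type": 80, "outcome": "failure", "user_id": "JSMITH"},
    {"record_type": 30, "outcome": "success", "user_id": "JSMITH"},
    {"record_type": 30, "outcome": "success", "user_id": "SYSADM1"},
]
for kept in filter_events(stream, [is_access_failure, is_privileged_user]):
    print(kept)
```

Because filter_events is a generator, records are evaluated as they arrive instead of being buffered, which is the property that matters for a real-time forwarder.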
3 – Format for Usability
The last step is ensuring your data forwarder can put the data into the appropriate form before it reaches the analytics engine. Enterprise analytics engines may use several different formats, from Splunk’s Common Information Model (CIM)[2], to RFC 3164 syslog, or even the more common JSON. If you fail to do this on the forwarder side, your information will be ingested but cannot be correlated with other data across the enterprise. This makes it functionally unusable for any enterprise use case.
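As a rough illustration of formatting on the forwarder side, the sketch below renders the same event either as compact JSON or as an RFC 3164 style syslog line with the JSON as the message. The helper names, the tag, the host name, and the choice of facility 13 with severity 5 are assumptions for the example, not recommendations from any vendor.

```python
import json
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def to_json(event: dict) -> str:
    """Serialize an event as compact JSON with a stable key order."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"))


def to_rfc3164(event: dict, when: datetime, hostname: str, tag: str = "smf",
               facility: int = 13, severity: int = 5) -> str:
    """Build '<PRI>Mmm dd hh:mm:ss HOSTNAME TAG: MSG' in the RFC 3164 layout."""
    pri = facility * 8 + severity  # priority value as defined by the RFC
    # RFC 3164 pads a single-digit day with a space rather than a zero
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {to_json(event)}"


event = {"user_id": "JSMITH", "outcome": "failure"}
print(to_json(event))
print(to_rfc3164(event, datetime(2020, 3, 5, 7, 8, 9), "LPAR1"))
```

Keeping the serializers as separate small functions means the enrichment and filtering logic does not change when the target engine expects a different format.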
If you are starting with logs, you will need to map the information directly to the semantic model of choice. This is technically simple but can be extremely time consuming when you consider the hundreds of types and subtypes in SMF alone. This seemingly overwhelming task is best handled the same way as recommended above: take a deliberate approach against a few specific use cases, led by your domain experts. This way you can start small and derive value before growing to additional use cases. The other option is to explore vendor-provided solutions, some of which may work out of the box.
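A mapping of this kind can start as a small table per record type that grows one use case at a time. In the sketch below, both the source field names and the target names (user, src, action, object) are illustrative assumptions; the real target names should be taken from the documentation of whichever semantic model you adopt.

```python
from typing import Optional

# One entry per record type you have deliberately chosen to support.
# Source and target field names here are hypothetical.
FIELD_MAPS = {
    80: {
        "user_id": "user",
        "port_of_entry": "src",
        "outcome": "action",
        "resource": "object",
    },
}


def to_semantic(event: dict, field_maps: dict = FIELD_MAPS) -> Optional[dict]:
    """Rename mapped fields; return None for record types not yet mapped."""
    mapping = field_maps.get(event.get("record_type"))
    if mapping is None:
        return None  # caller can skip or count unmapped types
    # Fields without a mapping are dropped so only modeled data is forwarded
    return {target: event[source]
            for source, target in mapping.items()
            if source in event}


print(to_semantic({"record_type": 80, "user_id": "JSMITH", "outcome": "failure"}))
print(to_semantic({"record_type": 30, "user_id": "JSMITH"}))
```

Returning None for unmapped record types keeps the scope explicit: adding a new use case means adding a new entry to the table rather than changing the forwarding code.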
Companies that fail to embrace the predictive capabilities of machine learning will soon find themselves falling behind their peers. To begin integrating the mainframe into your data program, find a way to get your data enriched, filtered, and correctly formatted, then sent to your analytics engine in real time. Let your domain experts guide your initial use cases and determine which data is necessary before you begin; doing so will significantly increase your chances of success. Through incremental, agile wins, you will see your data initiative grow into a foundational part of your monitoring and operations processes.
Christopher Perry is the Lead Product Manager for BMC AMI Security. Chris first got started in cyber security while studying computer science at the United States Military Academy. While assigned to Army Cyber Command, Chris helped define expeditionary cyberspace operations as a company commander and led over 70 soldiers conducting offensive operations. Chris currently focuses on security issues related to the IBM mainframe.