Today’s mainframe systems run mission-critical workloads for the world’s top industries. These include large financial institutions such as banks, as well as insurance companies, healthcare organizations, utilities, government, military, and a multitude of other public and private enterprises.
The beauty of the mainframe lies in its flexibility, stability and security. Mainframe systems provide the ability to divide the resources of a single machine into multiple logical partitions (LPARs), each capable of running its own independent operating system. These LPARs are, in practice, equivalent to separate mainframe systems, and can function independently or work together as a collection, called a sysplex, that cooperates to process computing workloads. This sysplex design can run large commercial mission-critical workloads continuously and very efficiently.
Each mainframe machine has a maximum central processing complex (CPC) capacity, and the LPARs running on a given machine collectively use that capacity. Capacity is the key word: mainframe systems have the equivalent processing power of multiple rows of commodity servers, and a transaction throughput capacity much higher than that of those banks of servers. That is why large organizations continue to use this platform – the mighty and majestic mainframe.
Controlling the Majestic Mainframe
photo credit: Wikimedia Commons
Unlike other computing platforms, a single mainframe system can be configured for a variety of different business needs, with performance and cost balanced against the computing capacity required. Whether a business needs to use the full capacity of the system, to limit the capacity of a test LPAR, or to control the cost of usage-based software, the mainframe can accommodate it.
The mainframe offers “capping” controls that can limit the CPU resource usage of one or more LPARs. Capping is managed either by the Workload Manager (WLM) or by the Processor Resource/Systems Manager (PR/SM). There are 7 effective techniques available to control resource usage – below, we go through each one of them.
Initial Capping (Hard Cap)
The usage of CPC capacity for a given LPAR is based on the weight assigned to it in the Hardware Management Console (HMC) – each LPAR has its share of CPC capacity. However, if an LPAR needs more than its share of CPC capacity, and other LPARs are using less than their shares, the PR/SM can then allocate additional capacity for the capacity-hungry LPAR.
The Initial Capping (IC) setting prevents PR/SM from giving an LPAR more than its share even when there is capacity available in the CPC, meaning the LPAR can never exceed its share. The IC limit is defined in the HMC as a relative weight.
Scope is single LPAR. This capping is managed by the PR/SM.
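To make the weight arithmetic concrete, here is a minimal Python sketch of how a weight-based share and an initial cap could interact. It is a simplified model, not how PR/SM actually dispatches work: the function names, the LPAR names, the use of MSUs as the capacity unit and the rule for borrowing spare capacity are all assumptions made for illustration.

```python
def lpar_shares(weights, cpc_capacity):
    """Split CPC capacity among LPARs in proportion to their relative weights."""
    total_weight = sum(weights.values())
    return {name: cpc_capacity * w / total_weight for name, w in weights.items()}


def granted_capacity(name, demands, weights, cpc_capacity, initial_capped=frozenset()):
    """Capacity one LPAR receives for its current demand (simplified model).

    Assumption: an uncapped LPAR may borrow whatever the other LPARs leave
    unused below their own shares; an initially capped LPAR may not.
    """
    shares = lpar_shares(weights, cpc_capacity)
    if name in initial_capped:
        # Hard cap: never more than the weight-based share.
        return min(demands[name], shares[name])
    spare = sum(max(shares[o] - demands[o], 0) for o in shares if o != name)
    return min(demands[name], shares[name] + spare)


# Hypothetical configuration: weights as set in the HMC, capacity in MSUs.
weights = {"PROD": 600, "TEST": 300, "DEV": 100}
demands = {"PROD": 400, "TEST": 450, "DEV": 50}

uncapped = granted_capacity("TEST", demands, weights, 1000)
capped = granted_capacity("TEST", demands, weights, 1000, initial_capped={"TEST"})
```

In this example TEST has a share of 300 but wants 450. Without initial capping it can absorb capacity that PROD and DEV are not using; with initial capping it is held to 300 regardless of what is idle.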
LPAR Absolute Capping
LPAR absolute capping is a PR/SM-controlled capping limit that applies to a single LPAR. Its limit is defined in the HMC as a fraction of the total number of processors. This capping is enforced independently of the 4-hour rolling average (4HRA), the measure of the LPAR’s resource usage, and is applicable to both z/OS and non-z/OS LPARs.
Scope is a single LPAR.
Group Absolute Capping
Group absolute capping is similar to LPAR absolute capping, but applies to a defined group of LPARs. This cap limit is defined in the HMC as a fraction of the total number of processors, and is controlled by PR/SM. The combined CPU capacity usage of the LPARs in the group can never exceed the group absolute capping limit at any time.
Scope is a group of LPARs.
Defined Capacity (Soft Capping)
An LPAR receives a Defined Capacity (DC), its defined maximum capacity, which is set in the HMC. The WLM tracks the 4HRA of the LPAR and compares it to the LPAR’s DC value. If the LPAR’s 4HRA exceeds its DC value, then the LPAR is capped. The WLM triggers the capping, but it is enforced by the PR/SM. When capping occurs, the workload currently running on the LPAR is delayed. If the WLM policy is set appropriately, the WLM will run the most critical work and delay the low-importance work. Capping will be in effect until the 4HRA drops below the DC value, at which point the LPAR will process workloads without delay. Since the LPAR’s DC value is compared to the 4HRA (an averaged value), the current CPU usage of the LPAR can go over the DC value as long as the 4-hour rolling average doesn’t exceed the DC limit.
DC only applies to LPARs with shared central processors (CPs); LPARs with dedicated CPs cannot be controlled by DC. DC (soft capping) cannot be used with Initial Capping control (and vice-versa).
Scope is a single LPAR. This means that the DC value limits the CPU capacity for a single LPAR.
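The relationship between instantaneous usage and the 4HRA is easier to see in a small simulation. The Python sketch below is a toy model under stated assumptions, not WLM's actual algorithm: the function name is hypothetical, the window is a count of equal-length intervals (48 five-minute intervals would span four hours), the average is taken only over intervals seen so far, and a capped LPAR is assumed to be held to exactly its DC.

```python
from collections import deque


def simulate_soft_cap(demand_msu, defined_capacity, window=48):
    """Return a list of (used_msu, capped) tuples, one per interval.

    Assumptions: the rolling average is computed over the usage actually
    delivered in the previous `window` intervals, and capping limits the
    LPAR to its defined capacity for that interval.
    """
    recent = deque(maxlen=window)  # usage of the most recent intervals
    trace = []
    for demand in demand_msu:
        rolling_avg = sum(recent) / len(recent) if recent else 0.0
        capped = rolling_avg > defined_capacity
        used = min(demand, defined_capacity) if capped else demand
        recent.append(used)
        trace.append((used, capped))
    return trace


# A short window of 4 intervals keeps the example easy to follow by hand.
trace = simulate_soft_cap([80, 80, 200, 200, 200, 200], defined_capacity=100, window=4)
```

In the third interval the LPAR uses 200 MSUs, twice its DC, without being capped, because the rolling average up to that point is only 80. That spike pushes the average to 120, so from the fourth interval on the LPAR is capped and held to 100.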
Group Capacity Limit
Similar to DC, group capacity limit (GCL) controls the CPU usage for a group of LPARs on a single CPC. z/OS LPARs can be grouped together in what are called capacity groups. These LPARs must reside in the same CPC, but not necessarily within the same sysplex. GCL controls all LPARs in the capacity group, and the GCL value is set in HMC. The total 4HRA of all LPARs in a given capacity group cannot exceed the GCL value. If the 4HRA exceeds the capacity group limit, then the capacity group is capped by the PR/SM, and each member of the group gets its share of resources based on its assigned weight. As the group 4HRA drops below the GCL, capping is terminated, and the group will process workloads without delay. While the 4HRA of the group is below the GCL, each LPAR in the group can use the capacity it needs.
Within the capacity group, in addition to its group share, each LPAR in the group can be assigned a DC. In such cases, either the calculated group share or the DC is used to cap, whichever is less.
Scope is a group of LPARs belonging to the same CPC.
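The "whichever is less" rule can be sketched in a few lines of Python. This is an illustrative model only: the function names and LPAR names are hypothetical, and it assumes a member's group share is simply the GCL split in proportion to the members' weights.

```python
def group_is_capped(lpar_4hra, group_limit):
    """The group is capped when the summed 4HRA of its members exceeds the GCL."""
    return sum(lpar_4hra.values()) > group_limit


def member_caps(weights, group_limit, defined_capacity=None):
    """Cap applied to each member while the group is capped (simplified).

    Assumption: the group share is the GCL divided in proportion to weight.
    If a member also has a DC, the lower of the two values applies.
    """
    defined_capacity = defined_capacity or {}
    total_weight = sum(weights.values())
    caps = {}
    for name, weight in weights.items():
        share = group_limit * weight / total_weight
        dc = defined_capacity.get(name)
        caps[name] = share if dc is None else min(share, dc)
    return caps


# Hypothetical capacity group with a GCL of 400 MSUs; only LPAR C has a DC.
weights = {"A": 500, "B": 300, "C": 200}
caps = member_caps(weights, group_limit=400, defined_capacity={"C": 50})
capped_now = group_is_capped({"A": 250, "B": 100, "C": 60}, group_limit=400)
```

Here C's weight would entitle it to 80 MSUs of the group limit, but its DC of 50 is lower, so 50 is the value used.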
Resource Group Capping
Resource Group (RG) capping provides the ability to control the maximum and the minimum CPU capacity given to the service classes (workloads) that are connected to a given RG. If the workloads of an RG exceed the maximum limit, then WLM caps the CPU usage of the RG.
WLM manages the workloads using the goals defined in the service definition. If a workload is not meeting its goal, the WLM assigns more resources to it when needed. In that process, the WLM may remove resources from other workloads that are meeting their goals. The RG minimum and maximum values set the limits to this CPU goal management. A minimum value prevents the WLM from taking away resources, while a maximum value prevents the WLM from providing additional resources even if the workload is not meeting its goal.
Scope is sysplex wide. The service classes that are in a RG should belong to the LPARs that are in the same sysplex. However, LPARs themselves can span multiple CPCs.
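One way to picture the RG minimum and maximum is as bounds on whatever WLM's goal management would otherwise decide. The Python sketch below is only that picture: the function name and the numbers are hypothetical, and real WLM resource adjustment is considerably more involved.

```python
def bound_by_resource_group(proposed, minimum=None, maximum=None):
    """Clamp the CPU capacity goal management would assign to an RG.

    `proposed` is the capacity WLM would give the resource group based on
    goals alone; `minimum` and `maximum` are the optional RG limits.
    """
    if minimum is not None and proposed < minimum:
        return minimum  # WLM may not take capacity away below the minimum
    if maximum is not None and proposed > maximum:
        return maximum  # WLM may not add capacity above the maximum
    return proposed


# WLM would shrink the RG to 30, but the minimum of 50 protects it.
protected = bound_by_resource_group(30, minimum=50, maximum=100)
# The workloads miss their goal and want 120, but the maximum of 100 holds.
limited = bound_by_resource_group(120, minimum=50, maximum=100)
```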
Absolute MSU Capping
WLM-controlled Absolute MSU capping is similar to PR/SM-controlled initial (hard) capping in that it is permanent capping, but it is controlled by the WLM. The difference is that absolute MSU capping is specified in MSUs (the DC of the LPAR), whereas hard capping is specified as a relative weight.
The limit value is derived from that DC and the LPAR group capacity.
Scope is a single LPAR.
Mainframe systems provide tremendous capabilities, and run the world’s largest and most complex commercial workloads. Yet they remain highly flexible, and are able to handle the unique business needs and workloads of a wide range of businesses. To facilitate this high degree of business flexibility, there are 7 different techniques available to control the mighty and majestic mainframe.
Hemanth Rama, a millennial mainframer, is a senior software engineer at BMC Software. Hemanth has 13+ years of IT experience and holds 3 patents. He is a recipient of the IBM z Champion award 2018. Hemanth writes regularly for many popular IT websites and also on his personal blog. He passionately speaks about mainframe technologies at various conferences, symposiums and user groups.