IBM Announces Processor Innovations to Accelerate AI on Next-Generation IBM Z Mainframe Systems
At the semiconductor industry’s Hot Chips 2024 conference, IBM announced key architectural advancements of its new processors—the IBM Telum® II Processor and IBM Spyre™ Accelerator. These advancements will significantly enhance the AI capabilities of IBM’s Z mainframe systems while addressing the increasing power demands and costs associated with deploying AI.
IBM’s announcement highlights the significant technical and economic challenges the industry faces as AI moves from experimentation to production. McKinsey predicts that AI’s rising computational demands could triple semiconductor demand in the coming years, while Morgan Stanley projects that generative AI’s power consumption will increase by 75% annually, potentially matching Spain’s 2022 energy use by 2026.
This surge is driving innovation in system and semiconductor design to scale hardware capabilities as AI becomes increasingly vital for business.
Key innovations unveiled at the conference include:
- IBM Telum II Processor: Designed to power next-generation IBM Z systems. Compared with the first-generation Telum chip, the new processor features higher frequency, greater memory capacity, a 40 percent larger cache, an integrated AI accelerator core, and a coherently attached Data Processing Unit (DPU). It is expected to support enterprise compute solutions for LLMs, servicing the industry’s complex transaction needs.
- I/O acceleration unit: A completely new Data Processing Unit (DPU) on the Telum II processor chip is engineered to accelerate complex I/O protocols for networking and storage on the mainframe. The DPU simplifies system operations and can improve key component performance.
- IBM Spyre Accelerator: Provides additional AI compute capability to complement the Telum II processor. Working together, the Telum II and Spyre chips form a scalable architecture to support ensemble methods of AI modeling, the practice of combining multiple machine learning or deep learning AI models with encoder LLMs. By leveraging the strengths of each model architecture, ensemble AI may provide more accurate and robust results than individual models. The IBM Spyre Accelerator chip, previewed at the Hot Chips 2024 conference, will be delivered as an add-on option. Each accelerator chip is attached via a 75-watt PCIe adapter and is based on technology developed in collaboration with IBM Research. As with other PCIe cards, the Spyre Accelerator is scalable to fit client needs.
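To make the ensemble idea concrete, here is a minimal Python sketch of one way several model scores could be combined for a single transaction. It is an illustration only, not IBM's implementation: `fast_model`, `encoder_model`, the transaction fields, and the weights are all hypothetical stand-ins for real trained models.

```python
from typing import Callable, Sequence

Transaction = dict
Scorer = Callable[[Transaction], float]


def fast_model(tx: Transaction) -> float:
    """Hypothetical lightweight scorer: treats large amounts as risky."""
    return 0.9 if tx["amount"] > 10_000 else 0.1


def encoder_model(tx: Transaction) -> float:
    """Hypothetical stand-in for a larger encoder model: treats
    merchants the customer has not used before as risky."""
    return 0.8 if tx["merchant"] not in tx["known_merchants"] else 0.2


def ensemble_score(
    tx: Transaction, scorers: Sequence[Scorer], weights: Sequence[float]
) -> float:
    """Weighted average of the individual model scores (0 = safe, 1 = fraud)."""
    if len(scorers) != len(weights):
        raise ValueError("need exactly one weight per scorer")
    total_weight = sum(weights)
    return sum(w * s(tx) for s, w in zip(scorers, weights)) / total_weight


if __name__ == "__main__":
    tx = {"amount": 12_000, "merchant": "new-shop", "known_merchants": {"grocer"}}
    score = ensemble_score(tx, [fast_model, encoder_model], [0.5, 0.5])
    print(f"ensemble fraud score: {score:.2f}")
```

A weighted average is only one of several ensemble strategies; others include majority voting or escalating to the larger model only when the smaller one is uncertain.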
The Telum II processor will be the central processor for IBM’s next-generation IBM Z and IBM LinuxONE platforms, expected to be available to clients in 2025. The IBM Spyre Accelerator, currently in tech preview, is also anticipated to launch in 2025.
According to Tina Tarquinio, VP, Product Management, IBM Z and LinuxONE, IBM’s multi-generation roadmap will allow the company to remain ahead of the curve on technology trends, including the escalating demands of AI. “The Telum II Processor and Spyre Accelerator are designed to deliver high-performance, secured, and more power efficient enterprise computing solutions. After years in development, these innovations will be introduced in our next generation IBM Z platform so clients can leverage LLMs and generative AI at scale,” Tarquinio added.
Specifications and Performance Metrics:
Telum II processor: The chip features eight high-performance cores running at 5.5GHz, with 36MB of L2 cache per core. On-chip cache capacity grows 40% to a total of 360MB, and the virtual level-4 cache of 2.88GB per processor drawer is likewise 40% larger than in the previous generation. The integrated AI accelerator allows for low-latency, high-throughput in-transaction AI inferencing, for example enhancing fraud detection during financial transactions, and provides a fourfold increase in compute capacity per chip over the previous generation.
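The cache figures quoted for Telum II can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers from the announcement; the interpretations in the comments (ten L2 caches per chip, eight chips per drawer, decimal megabytes) are inferences from that arithmetic, not statements from IBM's release.

```python
# Figures quoted in the announcement.
L2_CACHE_MB = 36          # per L2 cache
ON_CHIP_CACHE_MB = 360    # total on-chip cache
DRAWER_L4_MB = 2880       # 2.88GB virtual L4 per drawer (assumes 1GB = 1000MB)
GROWTH = 0.40             # stated increase over the previous generation

# 360 / 36 = 10, so the total implies ten 36MB L2 caches on a chip
# with eight cores (inference; the release does not break this down).
l2_caches = ON_CHIP_CACHE_MB // L2_CACHE_MB

# 2880 / 360 = 8, consistent with eight chips sharing one drawer (inference).
chips_per_drawer = DRAWER_L4_MB // ON_CHIP_CACHE_MB

# Capacity implied for the previous generation by a 40% increase.
implied_previous_mb = ON_CHIP_CACHE_MB / (1 + GROWTH)

print(f"L2 caches per chip:      {l2_caches}")
print(f"Chips per drawer:        {chips_per_drawer}")
print(f"Implied prior capacity:  {implied_previous_mb:.0f}MB")
```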
The new I/O Acceleration Unit DPU is integrated into the Telum II chip. It is designed to improve data handling with 50% greater I/O density. This advancement enhances the overall efficiency and scalability of IBM Z, making it well suited to handle the large-scale AI workloads and data-intensive applications of today’s businesses.
Spyre Accelerator: A purpose-built, enterprise-grade accelerator offering scalable capabilities for complex AI models and generative AI use cases. It features up to 1TB of memory, built to work in tandem across the eight cards of a regular I/O drawer, to support AI model workloads across the mainframe, and each card is designed to consume no more than 75W. Each chip will have 32 compute cores supporting int4, int8, fp8, and fp16 datatypes for both low-latency and high-throughput AI applications.
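As a rough back-of-the-envelope check on the Spyre figures, the Python sketch below works out what a fully populated eight-card drawer would imply per card. It assumes the 1TB maximum is split evenly across cards and that 1TB means 1024GB; neither assumption is stated in the announcement.

```python
CARDS_PER_DRAWER = 8
TOTAL_MEMORY_GB = 1024      # assumption: "up to 1TB" taken as 1024GB
MAX_WATTS_PER_CARD = 75     # stated per-card power ceiling

# Assumption: memory is divided evenly among the eight cards.
memory_per_card_gb = TOTAL_MEMORY_GB // CARDS_PER_DRAWER

# Upper bound on accelerator power for one fully populated drawer.
max_drawer_watts = CARDS_PER_DRAWER * MAX_WATTS_PER_CARD

print(f"Memory per card:  {memory_per_card_gb}GB")
print(f"Max drawer power: {max_drawer_watts}W")
```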
Source: IBM
Review of IDUG Db2 Tech Conference and Registration for 2024 EMEA Db2 Tech Conference in Valencia, Spain
For those who weren’t able to attend the IDUG (International Db2 Users Group) Db2 Tech Conference in June, Craig Mullins has contributed an extensive review of the event on the SHARE blog.
The most recent IDUG conference was held in Charlotte, NC, and offered attendees a rich array of educational opportunities in the form of presentations, longer-form education, peer-to-peer communications, special interest groups, and even the chance to take Db2 certification exams at no charge.
For anyone who has considered attending an IDUG Tech Conference, Mullins’ review offers a peek into some of the highlights of the conference. And for those who are ready to sign up, the next IDUG event of 2024 will take place in Valencia, Spain (with early bird rates expiring August 31).
Next year’s IDUG Db2 Tech Conference will take place in Atlanta, GA, from June 9 to 12, 2025, and it’s never too early to sign up!
SHARE Washington, D.C. Call for Presenters
SHARE Washington, D.C. is taking place Feb. 23-27, 2025, and the Call for Presentations is now open. This is your opportunity to demonstrate your expertise and showcase your innovative approaches to the industry. Submit a session proposal for a chance to present—the deadline to submit a proposal is Thursday, Sept. 12.
Go to the SHARE website for a list of potential presentation topics.
Source: SHARE
Sonja Soderlund is an Oregon-based B2B freelance writer. Whether writing about mainframe computers, educational technology, or sustainable retail, she strives to bring clarity to complex issues. Connect with her at sonjasoderlund.com or LinkedIn.