- HPE will ship 72-GPU racks with next-generation AMD Instinct accelerators globally
- Venice CPUs paired with GPUs target exascale-level AI performance per rack
- The Helios relies on liquid cooling and a double-wide chassis for thermal management
HPE has announced plans to integrate AMD’s Helios rack-scale AI architecture into its product line starting in 2026.
The collaboration gives Helios its first major OEM partner and positions HPE to ship full 72-GPU AI racks built around AMD’s next-generation Instinct MI455X accelerators.
These racks will pair with EPYC Venice CPUs and use an Ethernet-based scale-out fabric developed with Broadcom.
Rack layout and performance targets
The move creates a clear commercial route for Helios and puts the architecture in direct competition with Nvidia’s rack-scale platforms already in operation.
The Helios reference design relies on Meta’s Open Rack Wide standard.
It uses a double-wide liquid-cooled chassis to house MI450 series GPUs, Venice CPUs and Pensando networking hardware.
AMD targets up to 2.9 exaFLOPS of FP4 compute per rack with the MI455X generation, along with 31 TB of HBM4 memory.
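Those rack-level numbers can be broken down per accelerator with some quick arithmetic. The sketch below is a back-of-envelope estimate only; it assumes the rack totals divide evenly across all 72 GPUs, which is not something AMD has confirmed at the device level.

```python
# Back-of-envelope estimate: per-GPU figures implied by the quoted
# rack-level numbers (assumes an even split across 72 GPUs; actual
# per-device specifications may differ).
GPUS_PER_RACK = 72
RACK_FP4_EXAFLOPS = 2.9  # quoted peak FP4 compute per rack
RACK_HBM4_TB = 31        # quoted HBM4 capacity per rack

fp4_pflops_per_gpu = RACK_FP4_EXAFLOPS * 1000 / GPUS_PER_RACK
hbm4_gb_per_gpu = RACK_HBM4_TB * 1000 / GPUS_PER_RACK

print(f"Implied FP4 compute per GPU: ~{fp4_pflops_per_gpu:.0f} PFLOPS")
print(f"Implied HBM4 per GPU:        ~{hbm4_gb_per_gpu:.0f} GB")
```

If the even split holds, that works out to roughly 40 petaFLOPS of FP4 compute and around 430 GB of HBM4 per accelerator.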
The system presents each GPU as part of a single pod, which allows workloads to span all accelerators without local bottlenecks.
A custom-built HPE Juniper switch supporting Ultra Accelerator Link over Ethernet provides the high-bandwidth GPU interconnect, offering an alternative to Nvidia’s NVLink-centric approach.
High-Performance Computing Center Stuttgart has chosen HPE’s Cray GX5000 platform for its next flagship system, called Herder.
Herder will use MI430X GPUs and Venice CPUs across direct liquid-cooled blades, replacing the current Hunter system in 2027.
HPE says waste heat from the GX5000 racks will be reused to warm campus buildings, a nod to sustainability goals beyond raw performance.
AMD and HPE plan to make Helios-based systems globally available next year, expanding access to rack-scale AI hardware for research institutions and enterprises.
Helios uses an Ethernet fabric to connect GPUs and CPUs, which contrasts with Nvidia’s NVLink approach.
Ultra Accelerator Link over Ethernet, combined with hardware built to Ultra Ethernet Consortium specifications, supports scale-out designs within an open-standards framework.
While this approach allows GPU counts theoretically comparable to other high-end AI racks, performance under sustained multi-node workloads remains untested, and reliance on a single Ethernet layer can introduce latency or bandwidth limitations in real applications. Headline specifications alone don’t predict real-world performance, which will depend on effective cooling, network traffic handling, and software optimization.
Via Tom’s Hardware