- Nvidia Rubin DGX SuperPOD delivers 28.8 Exaflops with only 576 GPUs
- Each NVL72 system combines 36 Vera CPUs, 72 Rubin GPUs and 18 DPUs
- Total NVLink throughput reaches 260 TB/s per NVL72 rack
At CES 2026, Nvidia unveiled its next-generation DGX SuperPOD powered by the Rubin platform, a system designed to deliver extreme AI computing in dense, integrated racks.
According to the company, the SuperPOD integrates multiple Vera Rubin NVL72 or NVL8 systems into a single coherent AI engine, supporting large workloads with minimal infrastructure complexity.
With liquid-cooled modules, high-speed connections and aggregated memory, the system is targeted at institutions seeking maximum AI throughput and reduced latency.
Rubin-based computing architecture
Each DGX Vera Rubin NVL72 system includes 36 Vera CPUs, 72 Rubin GPUs and 18 BlueField-4 DPUs, with each Rubin GPU delivering 50 petaflops of FP4 performance.
Total NVLink throughput reaches 260 TB/s per rack, allowing the full memory and computing space to act as a single coherent AI engine.
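As a quick sanity check on the quoted rack figure, dividing the 260 TB/s of aggregate NVLink throughput evenly across the 72 GPUs (a simplifying assumption; actual per-link topology may differ) gives the implied per-GPU bandwidth:

```python
# Back-of-the-envelope check: split the quoted rack-level NVLink
# throughput evenly across the 72 Rubin GPUs in an NVL72 system.
# Assumes uniform distribution, which the article does not state.
RACK_NVLINK_TBPS = 260      # TB/s per NVL72 rack (Nvidia's figure)
GPUS_PER_RACK = 72

per_gpu_tbps = RACK_NVLINK_TBPS / GPUS_PER_RACK
print(f"~{per_gpu_tbps:.1f} TB/s of NVLink bandwidth per GPU")
```

That works out to roughly 3.6 TB/s per GPU, double the 1.8 TB/s per GPU of the Blackwell-generation NVLink fabric.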
The Rubin GPU incorporates a third-generation Transformer Engine and hardware-accelerated compression, allowing inference and training workloads to process efficiently at scale.
Connectivity is enhanced by Spectrum-6 Ethernet switches, Quantum-X800 InfiniBand and ConnectX-9 SuperNICs, which support deterministic high-speed AI data transfer.
Nvidia’s SuperPOD design emphasizes end-to-end network performance, ensuring minimal congestion in large AI clusters.
Quantum-X800 InfiniBand delivers low latency and high throughput, while Spectrum-X Ethernet handles east-west AI traffic efficiently.
Each DGX rack incorporates 600 TB of fast memory, NVMe storage and integrated AI context memory to support both training and inference pipelines.
The Rubin platform also integrates advanced software orchestration through Nvidia Mission Control, streamlining cluster operations, automated recovery and infrastructure management for large AI factories.
A DGX SuperPOD with 576 Rubin GPUs can achieve 28.8 exaflops of FP4 performance, while individual NVL8 systems deliver 5.5x higher FP4 throughput than the previous Blackwell architecture.
In comparison, Huawei’s Atlas 950 SuperPod claims 16 Exaflops FP4 per SuperPod, which means that Nvidia achieves higher efficiency per GPU and requires fewer units to achieve extreme levels of computation.
Rubin-based DGX clusters also use fewer nodes and chassis than Huawei’s SuperCluster, which scales to thousands of NPUs and several petabytes of memory.
This performance density allows Nvidia to directly compete with Huawei’s projected computing output while limiting space, power and overhead.
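The headline figures above are internally consistent, which a short calculation makes clear (the per-GPU number is derived from Nvidia's own totals, not stated directly by the company):

```python
# Sanity-check the aggregate FP4 figures quoted for the Rubin SuperPOD
# against Huawei's Atlas 950 SuperPod claim. Per-GPU FP4 is derived
# from the pod-level numbers (28.8 EF across 576 GPUs).
NVIDIA_TOTAL_EF = 28.8      # exaflops FP4 for the full SuperPOD
NVIDIA_GPUS = 576
HUAWEI_TOTAL_EF = 16.0      # Atlas 950 SuperPod claim

per_gpu_pf = NVIDIA_TOTAL_EF / NVIDIA_GPUS * 1000   # exaflops -> petaflops
racks = NVIDIA_GPUS // 72                           # NVL72 racks in the pod

print(f"{per_gpu_pf:.0f} PF FP4 per Rubin GPU across {racks} NVL72 racks")
print(f"Nvidia pod vs Huawei pod: {NVIDIA_TOTAL_EF / HUAWEI_TOTAL_EF:.1f}x FP4")
```

So the pod amounts to eight NVL72 racks at 50 petaflops of FP4 per GPU, roughly 1.8x Huawei's claimed pod-level throughput.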
The Rubin platform unites AI computing, networking and software in a single stack.
Nvidia AI Enterprise software, NIM microservices, and mission-critical orchestration create a cohesive environment for long-context reasoning, agent AI, and multimodal model deployment.
While Huawei scales primarily through hardware count, Nvidia emphasizes rack-level efficiency and tightly integrated software controls, which can reduce operating costs for industrial-scale AI workloads.
TechRadar will extensively cover this year’s CES and will bring you all the big announcements as they happen. Head over to our CES 2026 news page for the latest stories and our hands-on verdicts on everything from wireless TVs and foldable screens to new phones, laptops, smart home gadgets and the latest in artificial intelligence. You can also ask us a question about the show in our CES 2026 live Q&A and we will do our best to answer it.
And don’t forget to follow us on TikTok and WhatsApp for the latest from the CES show floor!



