NVIDIA has officially launched its Rubin platform, a powerful new AI architecture featuring six specialized chips designed to cut costs, boost performance, and accelerate the future of supercomputing.
Quick Summary (TL;DR):
- At CES 2026, NVIDIA launched Rubin, a six-chip AI platform aimed at powering next-gen AI workloads.
- Rubin reduces inference token costs by 10x and slashes GPU requirements for training MoE models by 4x compared to Blackwell.
- Tech giants including Microsoft, Google, AWS, and OpenAI are already integrating Rubin into their cloud and AI infrastructure.
- New capabilities in storage, networking, and power efficiency promise dramatic gains in AI compute scalability and cost-effectiveness.
What Happened?
At CES 2026 in Las Vegas, NVIDIA CEO Jensen Huang unveiled Rubin, calling it the company’s most advanced AI platform yet. Rubin brings together six new chips, including the Vera CPU and Rubin GPU, into a powerful and unified supercomputing system. Designed to handle the growing demands of AI training and inference, Rubin promises significant gains in performance and efficiency.
The platform is now in full production, with broader deployment planned for the second half of 2026.
A Unified Platform With Six Chips
The Rubin platform integrates six components through extreme co-design:
- NVIDIA Vera CPU: Built for power-efficient agentic reasoning.
- NVIDIA Rubin GPU: Third-gen Transformer Engine with adaptive compression and 50 petaflops of compute.
- NVLink 6 Switch: Enables GPU-to-GPU bandwidth of 3.6TB/s per unit and 260TB/s per rack.
- ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch round out the networking and storage capabilities.
The full-stack integration supports demanding workloads like mixture-of-experts (MoE) models, agentic AI, and multistep reasoning, all while cutting GPU requirements by 4x and lowering inference token costs by 10x compared to NVIDIA’s previous Blackwell platform.
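As a rough sanity check on these figures, the per-rack NVLink number follows from the per-unit bandwidth, and the cost and GPU savings can be expressed as simple ratios. The sketch below is back-of-envelope arithmetic using only the numbers quoted in this article; the assumption that per-rack bandwidth is simply the per-unit figure summed across 72 GPUs is ours, not NVIDIA's stated methodology:

```python
# Back-of-envelope check of the Rubin figures quoted above.

NVLINK6_PER_GPU_TBS = 3.6   # quoted GPU-to-GPU bandwidth per unit (TB/s)
GPUS_PER_RACK = 72          # Vera Rubin NVL72 rack (72 Rubin GPUs)

# Aggregate rack bandwidth, assuming the per-rack figure is the
# per-unit bandwidth summed across all 72 GPUs (our assumption).
rack_bandwidth_tbs = NVLINK6_PER_GPU_TBS * GPUS_PER_RACK
print(f"Aggregate rack bandwidth: {rack_bandwidth_tbs:.1f} TB/s")

# Relative resources vs. Blackwell, taking the "10x lower token cost"
# and "4x fewer GPUs for MoE training" claims at face value.
token_cost_ratio = 1 / 10
gpu_count_ratio = 1 / 4
print(f"Token cost vs. Blackwell: {token_cost_ratio:.0%}")
print(f"MoE training GPUs vs. Blackwell: {gpu_count_ratio:.0%}")
```

The 3.6 TB/s × 72 product comes to 259.2 TB/s, consistent with the roughly 260 TB/s per rack quoted above.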
Supercomputing at Scale
One of the standout hardware solutions is the Vera Rubin NVL72, a rack-scale system combining 72 Rubin GPUs and 36 Vera CPUs. These can be interconnected to form the DGX SuperPOD, a massive supercomputing cluster built for trillion-parameter AI models.
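Given the 50 petaflops per Rubin GPU quoted earlier, the rack-level compute of one NVL72 system can be estimated with a straight multiplication. This is a simplification that ignores numeric precision and format (which the article does not specify), so treat it as an order-of-magnitude sketch:

```python
# Rough aggregate compute for one Vera Rubin NVL72 rack,
# multiplying the quoted per-GPU figure by the GPU count.
PFLOPS_PER_RUBIN_GPU = 50   # quoted per-GPU compute (precision/format unspecified)
GPUS_PER_NVL72_RACK = 72

rack_pflops = PFLOPS_PER_RUBIN_GPU * GPUS_PER_NVL72_RACK
rack_exaflops = rack_pflops / 1000
print(f"Per rack: {rack_pflops} PF = {rack_exaflops} EF")  # 3600 PF = 3.6 EF
```

On these assumptions a single rack lands in the multi-exaflop range, which is why a DGX SuperPOD built from many such racks can target trillion-parameter models.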
Rubin’s debut marks NVIDIA’s third-generation rack-scale architecture and sets a foundation for AI environments featuring millions of GPUs. The platform’s modular design also makes assembly and maintenance up to 18x faster than Blackwell.
Powerful Industry Backing
A wide ecosystem of partners is embracing Rubin. Microsoft, Google, AWS, and Oracle will be among the first cloud providers to roll out Rubin-based instances. OpenAI, Meta, Anthropic, and xAI are integrating Rubin into their next-generation AI labs and models.
Top tech leaders have praised the platform:
- Sam Altman (OpenAI) said, “The NVIDIA Rubin platform helps us keep scaling this progress so advanced intelligence benefits everyone.”
- Mark Zuckerberg (Meta) noted, “NVIDIA’s Rubin platform promises to deliver the step-change in performance and efficiency required to deploy the most advanced models to billions of people.”
- Elon Musk (xAI) described Rubin as a “rocket engine for AI.”
- Satya Nadella (Microsoft) called it essential to building “the world’s most powerful AI superfactories.”
Smarter Storage, Faster Networking
To meet growing demands for long-term memory and context in AI models, Rubin introduces Inference Context Memory Storage. Powered by BlueField-4, it allows more efficient caching and key-value data sharing.
Meanwhile, Spectrum-X Ethernet Photonics boosts data center performance with:
- 5x longer uptime.
- 10x greater reliability.
- 5x better power efficiency than traditional networks.
These enhancements allow AI systems to run more smoothly over long distances and scale more predictably.
Performance That Outpaces the Past
Compared to the Blackwell architecture, Rubin offers:
- 3.5x faster model training.
- 5x faster inference processing.
- 8x more compute per watt.
This performance edge positions Rubin as a cornerstone in the global AI race. Analysts estimate AI infrastructure spending will exceed $3 trillion in the next five years, and Rubin aims to be the backbone of that investment.
SQ Magazine Takeaway
I’m genuinely excited about Rubin because it’s not just another chip. It is a platform reshaping the very fabric of AI infrastructure. What NVIDIA has done here is move the goalposts again, making it cheaper and faster to build smarter models. The fact that Rubin slashes costs and training time while scaling easily for next-gen AI makes this a turning point. It’s a clear message to the industry: either level up or fall behind.