Jensen Huang holds Nvidia (NVDA)'s graphics cards at GTC 2025. Photo: dpa/picture alliance via Getty Images
At GTC, its annual developer conference in San Jose, Calif., Nvidia today (Mar. 18) unveiled two upcoming GPU architectures: Blackwell Ultra and Rubin. During his opening keynote, CEO Jensen Huang laid out the chipmaker’s ambitious vision to propel A.I. into an era of industrial-scale computing. Huang called GTC the “Super Bowl of A.I.,” with one key distinction: “At this Super Bowl, everyone wins,” he said.
Blackwell Ultra is expected to launch in the second half of 2025. Its successor, Rubin, is slated for late 2026, followed by the more advanced Rubin Ultra in 2027.
What to know about Blackwell Ultra and Rubin
Nvidia’s Blackwell Ultra series is an advanced iteration of the Blackwell architecture unveiled at last year’s GTC. The Blackwell chips, now in full production, are 40 times more powerful than the previous-generation Hopper chips, Huang said.
Blackwell Ultra will incorporate eight stacks of 12-Hi HBM3E memory, providing 288GB of onboard memory. The architecture will feature NVLink 72, an upgraded high-speed interconnect designed to speed communication between GPUs and CPUs, which is crucial for processing the massive datasets required for A.I. training and inference.
“NVLink connects multiple GPUs, turning them into a single GPU,” Huang explained. “It addresses the scale-up problem by enabling massive parallel computing.”
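To make that scale-up idea concrete: a trillion-parameter model stored in 4-bit precision needs roughly 500GB for its weights alone, more than the 288GB on a single Blackwell Ultra GPU, so the model has to be sharded across many GPUs that constantly combine partial results over an interconnect such as NVLink. The following is a minimal, hypothetical sketch of that pattern using PyTorch’s distributed collectives, not Nvidia software; for portability it runs on CPU with the gloo backend, whereas on an NVLink-connected system the same code would typically use the NCCL backend, which rides on NVLink. The tensor size and process count are made up for illustration.

# Minimal, illustrative sketch of the collective pattern that makes many
# processors behave like one: each worker holds a shard, and an all-reduce
# combines the shards on every worker. Sizes and worker count are arbitrary.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank: int, world_size: int) -> None:
    # Rendezvous settings for this toy local job.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    # "gloo" keeps the sketch runnable on CPU; on NVLink-connected GPUs the
    # equivalent code would normally use the "nccl" backend.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Each rank owns its own shard of a toy gradient tensor.
    local_grad = torch.full((4,), float(rank + 1))

    # all_reduce sums the shards in place on every rank, so afterwards each
    # worker holds the same combined result: the "single GPU" illusion.
    dist.all_reduce(local_grad, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: combined gradient = {local_grad.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = 4  # stand-in for the 72 GPUs in an NVL72 rack
    mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)

The all-reduce at the center of the sketch is exactly the kind of collective operation whose cost NVLink’s bandwidth is meant to hide, which is what Huang means by turning many GPUs into one.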
In addition, the company unveiled the Nvidia RTX PRO 6000 Blackwell Server Edition, engineered for enterprise workloads such as multimodal A.I. inference, immersive content creation and scientific computing. With 96GB of GDDR7 memory and support for multi-instance GPU technology, the RTX PRO 6000 is designed to power advanced A.I. development.
Blackwell’s successor, Rubin, is named after astronomer Vera Rubin, whose measurements of galaxy rotation provided key evidence for the existence of dark matter. The initial version of the Rubin chip is expected to deliver 50 petaflops of performance when running A.I. models. The more powerful Rubin Ultra can deliver up to 100 petaflops, which Huang called a “major step forward” in A.I. processing power.
Early iterations of the Blackwell chips and racks reportedly faced overheating issues, leading some customers to reduce their orders. The liquid-cooled Grace Blackwell GB200 NVL72 system addresses these concerns, offering up to 30 times faster real-time inference for trillion-parameter large language models and up to four times faster training than an equivalent number of Nvidia’s previous-generation H100 GPUs. It can generate up to 12,000 tokens per second (tokens are the basic units of text that A.I. models process), dramatically accelerating both training and inference.
“If you want your A.I. to be smarter, it must generate more tokens. That requires massive bandwidth, floating-point operations and memory,” said Huang. He also explained that reasoning A.I. models such as DeepSeek’s R1 require 20 times more tokens and 105 times more computing power than conventional models.
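A quick back-of-the-envelope calculation shows why that multiplier matters for serving capacity. In the sketch below, only the 12,000 tokens-per-second rate and the 20x token multiplier come from the figures reported above; the assumed answer length is an arbitrary illustration, not an Nvidia number.

# Back-of-the-envelope look at Huang's point: more tokens per answer means
# proportionally more compute, which shrinks how many answers a fixed token
# budget can serve. The answer length below is an assumption for illustration.
RACK_TOKENS_PER_SEC = 12_000        # reported GB200 NVL72 throughput
BASELINE_TOKENS_PER_ANSWER = 500    # assumed length of a one-shot answer
REASONING_MULTIPLIER = 20           # Huang's factor for reasoning models

one_shot_answers = RACK_TOKENS_PER_SEC / BASELINE_TOKENS_PER_ANSWER
reasoning_answers = RACK_TOKENS_PER_SEC / (
    BASELINE_TOKENS_PER_ANSWER * REASONING_MULTIPLIER
)

print(f"one-shot answers served per second:  {one_shot_answers:.1f}")   # 24.0
print(f"reasoning answers served per second: {reasoning_answers:.1f}")  # 1.2

Under those assumptions, the same rack that serves roughly 24 one-shot answers per second serves only about one reasoning answer per second, which is the economics behind Huang’s push for more floating-point throughput, bandwidth and memory.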
Nvidia is also collaborating closely with Taiwan’s TSMC on advanced chip packaging technologies for data centers, a move that could substantially improve computational efficiency and ease thermal demands in future GPU generations.
Nvidia’s roadmap beyond Rubin
Huang also outlined Nvidia’s roadmap beyond Rubin. “The next generation of architectures after Rubin will be named Feynman,” he said, confirming that the Feynman architecture, paired with next-generation HBM memory, is already in development and slated for release in 2028. Named after Richard Feynman, a theoretical physicist known for his contributions to quantum mechanics, the upcoming architecture is expected to push A.I. performance to unprecedented levels.
The announcements came on the heels of Nvidia’s latest better-than-expected quarterly earnings, driven by surging demand for its GPUs. Despite increasing competition from rivals like AMD and geopolitical uncertainties, including export restrictions on semiconductors, Nvidia currently dominates the global GPU market with an estimated 80 percent market share, according to a report by Nasdaq.