Nvidia announces the Blackwell B200 GPU for AI computing
During its GPU Technology Conference, Nvidia announced the Blackwell B200, which it calls the world's most powerful chip for AI computing, alongside the GB200 Superchip that pairs two B200 GPUs with a Grace CPU. The B200 is the successor to the H100 AI chip and offers huge improvements in performance and efficiency.
The new B200 GPU is capable of 20 petaflops of FP4 compute thanks to the 208 billion transistors inside the chip. The GB200, meanwhile, delivers 30 times the performance of the H100 in LLM inference workloads while cutting energy consumption 25-fold. In a GPT-3 LLM benchmark, the GB200 is also seven times faster than the H100.
For instance, training a model with 1.8 trillion parameters would previously have required 8,000 Hopper GPUs drawing about 15 megawatts, while a set of 2,000 Blackwell GPUs can do the same job while drawing just 4 megawatts.
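To put those figures in context, here is a quick back-of-the-envelope calculation. It is a rough sketch using only the numbers Nvidia quoted; the per-GPU power draws are derived, not officially stated, and actual training energy depends on run duration and utilization.

```python
# Comparison of Nvidia's quoted training setups for a
# 1.8-trillion-parameter model. GPU counts and megawatt figures
# come from the keynote; per-GPU numbers are derived here.

hopper_gpus, hopper_mw = 8_000, 15.0
blackwell_gpus, blackwell_mw = 2_000, 4.0

print(f"GPU count reduction: {hopper_gpus / blackwell_gpus:.1f}x")          # 4.0x
print(f"Power reduction:     {hopper_mw / blackwell_mw:.2f}x")              # 3.75x
print(f"Per-GPU draw, Hopper:    {hopper_mw * 1000 / hopper_gpus:.2f} kW")       # 1.88 kW
print(f"Per-GPU draw, Blackwell: {blackwell_mw * 1000 / blackwell_gpus:.2f} kW") # 2.00 kW
```

Interestingly, each Blackwell GPU actually draws slightly more power than a Hopper GPU by this math; the saving comes from needing only a quarter as many chips, which cuts total power for the same job by roughly 3.75 times.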
To further improve efficiency, Nvidia designed a new NVLink Switch chip with 50 billion transistors that can connect up to 576 GPUs and let them talk to each other at 1.8 TB/s of bidirectional bandwidth.
This addresses a long-standing communication bottleneck: previously, a system combining 16 GPUs would spend 60% of its time communicating and only 40% of it computing.
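The impact of that overhead is easy to model. The sketch below is illustrative only: the 60/40 split for a 16-GPU system is the figure Nvidia cited, while the reduced communication share for NVLink-connected Blackwell GPUs is purely an assumption.

```python
# Illustrative model of multi-GPU scaling: if a fixed share of
# wall-clock time goes to inter-GPU communication, only the
# remainder does useful math.

def effective_utilization(comm_fraction: float) -> float:
    """Fraction of wall-clock time spent computing rather than communicating."""
    return 1.0 - comm_fraction

hopper_16gpu = effective_utilization(0.60)   # 0.40 -> 40% of time computing
# Hypothetical: a faster interconnect cuts the communication share to 20%.
blackwell_est = effective_utilization(0.20)  # 0.80 -> 80% of time computing

print(f"Speedup from reduced communication alone: "
      f"{blackwell_est / hopper_16gpu:.1f}x")  # 2.0x
```

Under that assumption, halving communication overhead alone would double effective throughput, before counting any gains from the faster chips themselves.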
Nvidia says it is offering companies complete solutions rather than just chips. The GB200 NVL72, for instance, packs 36 CPUs and 72 GPUs into a single liquid-cooled rack. A DGX SuperPOD built from DGX GB200 systems, meanwhile, combines eight of those racks into one cluster with 288 CPUs, 576 GPUs, and 240TB of memory.
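The rack-level math checks out, as the short sketch below shows. The CPU and GPU counts per rack are Nvidia's; the aggregate FP4 figure is derived here from the 20-petaflop-per-B200 number rather than quoted directly.

```python
# Aggregating GB200 NVL72 racks into a DGX SuperPOD,
# using the per-rack counts Nvidia quoted.

cpus_per_rack, gpus_per_rack, racks = 36, 72, 8
fp4_pflops_per_gpu = 20  # per B200, from Nvidia's spec

print(f"CPUs: {cpus_per_rack * racks}")  # 288
print(f"GPUs: {gpus_per_rack * racks}")  # 576
print(f"Peak FP4: {gpus_per_rack * racks * fp4_pflops_per_gpu / 1000:.2f} exaflops")  # 11.52
```

That works out to roughly 11.5 exaflops of peak FP4 compute for a full SuperPOD.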
Companies like Oracle, Amazon, Google and Microsoft have already shared plans to integrate the NVL72 racks into their cloud services.
The Blackwell architecture behind the B200 will likely also serve as the foundation of the upcoming RTX 5000 series.