Huawei Unveils AI System Claimed to Outpace the GB200 NVL72
Huawei has unveiled its CloudMatrix 384 Supernode, positioning it as a direct competitor to Nvidia’s GB200 NVL72 in the AI infrastructure space. By the figures reported so far, the CloudMatrix 384 delivers roughly 1.7 times the dense BF16 compute of its rival, along with greater aggregate memory capacity and memory bandwidth.
However, the system comes with a trade-off: it consumes almost four times the power of Nvidia’s solution. That said, Huawei suggests that system efficiency is less of a priority in the Chinese market, where performance and scalability often take precedence over energy consumption.
Huawei has long positioned itself as China’s answer to Nvidia, and according to the South China Morning Post (SCMP), its new AI infrastructure architecture is designed to rival Nvidia’s NVL72 system.
Nvidia’s NVL72 connects 72 GPUs using NVLink technology, enabling them to operate as a single, high-performance GPU. The system is designed for training and running trillion-parameter AI models, and Nvidia says it delivers real-time inference up to 30 times faster than its previous-generation systems by reducing data transfer bottlenecks between GPUs.
The SCMP reports that Huawei’s response, the CloudMatrix 384 Supernode, has been described by unnamed Huawei insiders as a “nuclear-level product.” The system uses 384 Ascend 910C chips to deliver 300 petaflops of dense BF16 compute, about 1.7 times the 180 petaflops provided by Nvidia’s NVL72.
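The arithmetic behind that comparison is easy to check. The short Python sketch below uses only the figures quoted above (300 petaflops across 384 chips, 180 petaflops across 72 GPUs); the variable and function names are illustrative, and the per-chip numbers are simple averages rather than official specifications.

```python
# Figures as quoted in the article (dense BF16, in petaflops).
CLOUDMATRIX_PFLOPS = 300.0
CLOUDMATRIX_CHIPS = 384
NVL72_PFLOPS = 180.0
NVL72_GPUS = 72


def per_chip_pflops(system_pflops: float, chips: int) -> float:
    """Average compute per accelerator, assuming an even split across chips."""
    return system_pflops / chips


# System-level comparison: how much more compute the whole rack delivers.
system_ratio = CLOUDMATRIX_PFLOPS / NVL72_PFLOPS

# Chip-level comparison: how each accelerator stacks up on average.
ascend_per_chip = per_chip_pflops(CLOUDMATRIX_PFLOPS, CLOUDMATRIX_CHIPS)
nvidia_per_gpu = per_chip_pflops(NVL72_PFLOPS, NVL72_GPUS)
chip_ratio = ascend_per_chip / nvidia_per_gpu

print(f"System compute ratio: {system_ratio:.2f}x")
print(f"Ascend 910C average:  {ascend_per_chip:.3f} PFLOPS per chip")
print(f"NVL72 average:        {nvidia_per_gpu:.3f} PFLOPS per GPU")
print(f"Per-chip ratio:       {chip_ratio:.2f}x")
```

On these numbers, Huawei’s lead comes from scale rather than from individual chips: the system uses more than five times as many accelerators, each averaging under a third of the per-GPU throughput of the NVL72.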
According to the report, the CloudMatrix 384 Supernode has been deployed at Huawei’s data centers in Wuhu, a city in Anhui province.
In its analysis, SemiAnalysis highlights that this rack-scale solution directly competes with Nvidia’s GB200 NVL72 and, in certain metrics, surpasses it. The site notes that despite ongoing sanctions, China’s domestic semiconductor industry is steadily advancing. Huawei’s competitive edge lies in its system-level engineering capabilities, which encompass networking, optics, and software.
Although the Ascend chips remain heavily reliant on foreign supply chains, including high-bandwidth memory (HBM) from Samsung and wafers from TSMC, Huawei has managed to circumvent export restrictions through intricate sourcing strategies.
The CloudMatrix 384 not only outperforms the NVL72 in raw compute but also provides 3.6 times the aggregate memory capacity and 2.1 times the memory bandwidth. One downside, likely not emphasized by Huawei, is that the system consumes nearly four times the power of its Nvidia counterpart.
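To see what that trade-off means for efficiency, the ratios can be normalized by power draw. The sketch below is a rough estimate only: it takes the compute, memory capacity, and bandwidth ratios from the figures reported here, and it assumes a power ratio of exactly 4.0 as a stand-in for "nearly four times" (the article gives no precise wattage). All names are illustrative.

```python
# Ratios of CloudMatrix 384 to GB200 NVL72, as reported in the article.
COMPUTE_RATIO = 300.0 / 180.0   # dense BF16 petaflops
MEMORY_CAPACITY_RATIO = 3.6     # aggregate memory capacity
MEMORY_BANDWIDTH_RATIO = 2.1    # aggregate memory bandwidth

# ASSUMPTION: "nearly four times the power" is rounded up to 4.0.
# The true figure is somewhat lower, so these results are slightly pessimistic.
POWER_RATIO = 4.0


def relative_per_watt(metric_ratio: float, power_ratio: float) -> float:
    """Metric per unit of power, relative to the NVL72 (1.0 means parity)."""
    return metric_ratio / power_ratio


compute_per_watt = relative_per_watt(COMPUTE_RATIO, POWER_RATIO)
capacity_per_watt = relative_per_watt(MEMORY_CAPACITY_RATIO, POWER_RATIO)
bandwidth_per_watt = relative_per_watt(MEMORY_BANDWIDTH_RATIO, POWER_RATIO)

print(f"Compute per watt:          {compute_per_watt:.2f}x of NVL72")
print(f"Memory capacity per watt:  {capacity_per_watt:.2f}x of NVL72")
print(f"Memory bandwidth per watt: {bandwidth_per_watt:.2f}x of NVL72")
```

Under that assumption, the CloudMatrix 384 delivers well under half the compute per watt of the NVL72, while memory capacity per watt comes out close to parity.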