Huawei just dropped a monster AI chip that claims 2.87x Nvidia H20 performance and massive memory gains under heavy restrictions

Huawei introduces the Atlas 350 with significant FP4 computing capabilities
New accelerator board focuses on inference workloads and multimodal AI processing
Huawei Atlas 350 delivers higher memory capacity and improved bandwidth efficiency

Huawei has officially launched the Atlas 350 accelerator card, built around its new Ascend 950PR processor, at the Huawei China Partner Conference 2026 in Shenzhen.

The company claims that this NPU delivers 1.56 PFLOPS of FP4 compute performance, reportedly 2.87 times that of Nvidia’s H20.

The comparison is difficult to verify because Hopper-era GPUs such as the H20 do not natively support FP4. Even so, the Atlas 350 is the first Chinese accelerator optimized for this low-precision format, which allows larger AI models to run on the same hardware with reduced memory requirements.
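To make the memory argument concrete, here is a rough back-of-the-envelope sketch of how weight storage shrinks as precision drops. The 70-billion-parameter model is a hypothetical example, not something Huawei cited, and the calculation counts weights only: it ignores the KV cache, activations, and any scaling metadata that real FP4 formats carry. The 112 GB capacity is the figure quoted for the Atlas 350 later in this article.

```python
# Rough weight-only memory estimate at different numeric precisions.
# Assumption: each parameter takes exactly `bits` bits, with no overhead.

BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}
CARD_MEMORY_GB = 112  # Atlas 350 capacity as quoted in the article


def weight_footprint_gb(num_params: float, bits: int) -> float:
    """Return the weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def fits_on_card(num_params: float, bits: int,
                 capacity_gb: float = CARD_MEMORY_GB) -> bool:
    """True if the weights alone fit in the given memory capacity."""
    return weight_footprint_gb(num_params, bits) <= capacity_gb


params = 70e9  # hypothetical 70-billion-parameter model
for name, bits in BITS_PER_PARAM.items():
    size = weight_footprint_gb(params, bits)
    print(f"{name}: {size:.0f} GB of weights, "
          f"fits in {CARD_MEMORY_GB} GB: {fits_on_card(params, bits)}")
```

Under these simplified assumptions, the same model needs a quarter of the weight memory at FP4 that it needs at FP16, which is the effect the FP4 claim relies on.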


Technical upgrades and memory performance

The Ascend 950PR improves on the previous Ascend 910 series with a revised microarchitecture, faster memory access and flexible programming modes.

Huawei equips the Atlas 350 with 112 GB of proprietary HBM, known as HiBL 1.0, which current reports say delivers up to 1.4 TB/s of bandwidth with a 128-byte memory access granularity.

This configuration enables efficient multimodal generation and inference tasks and reportedly quadruples memory access efficiency for small operators compared to the previous generation.

Its interconnect bandwidth reaches 2 TB/s using the LingQu protocol, 2.5 times that of the Ascend 910 series.
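A few derived numbers help put the quoted memory and interconnect figures in context. The sketch below uses only the values reported above (112 GB, 1.4 TB/s, 2 TB/s, 2.5x). The decode ceiling is a common rule of thumb for memory-bound inference, not a Huawei figure, and the 35 GB weight size is a hypothetical example.

```python
# Derived figures from the quoted specs. All inputs are reported values,
# not measurements; decimal units are assumed (1 TB/s = 1000 GB/s).

HBM_CAPACITY_GB = 112
HBM_BANDWIDTH_GB_S = 1400      # 1.4 TB/s
INTERCONNECT_TB_S = 2.0
INTERCONNECT_GAIN = 2.5        # claimed ratio over the Ascend 910 series


def full_sweep_seconds(capacity_gb: float, bandwidth_gb_s: float) -> float:
    """Time to read the whole memory once at peak bandwidth."""
    return capacity_gb / bandwidth_gb_s


def implied_baseline(value: float, ratio: float) -> float:
    """Back out the older figure implied by a 'ratio times higher' claim."""
    return value / ratio


def decode_ceiling_tokens_s(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Rule-of-thumb upper bound for single-stream decoding, assuming every
    generated token requires reading all weights from memory once."""
    return bandwidth_gb_s / weights_gb


print(f"Full memory sweep: {full_sweep_seconds(HBM_CAPACITY_GB, HBM_BANDWIDTH_GB_S) * 1000:.0f} ms")
print(f"Implied Ascend 910 interconnect: {implied_baseline(INTERCONNECT_TB_S, INTERCONNECT_GAIN):.1f} TB/s")
print(f"Decode ceiling for 35 GB of weights: {decode_ceiling_tokens_s(35, HBM_BANDWIDTH_GB_S):.0f} tokens/s")
```

These are ceilings at peak bandwidth; sustained throughput on real workloads would be lower.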

Huawei is marketing the Atlas 350 for recommendation inference, LLM processing, and multimodal AI workloads.

Seven key partners – including Kunlun, Huakun Zhenyu, Shenzhou Kuntai and Yangtze Computing – have developed complete system products leveraging the Atlas 350.

These brands have created customized high-performance inference solutions for enterprise customers.

The accelerator is designed to integrate with AI ecosystems, enabling partners to optimize performance for specific workloads while maintaining compatibility with Huawei’s AI software stack.

The Atlas 350 reflects China’s efforts to establish independence in AI computing hardware under US export restrictions.

While Huawei cannot access TSMC’s CoWoS technology, the company has implemented alternative advanced packaging solutions for HBM and memory stacking.

Huawei hasn’t announced exact availability dates – a common practice with AI accelerators – but it launched the Ascend 950PR in Q1 2026 as promised.

The Atlas 350 is reportedly priced at around 111,000 yuan, or about $16,000, compared with Nvidia’s H20, which ranges from $15,000 to $25,000.
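For readers who want to sanity-check the pricing, the short sketch below backs out the exchange rate implied by the two quoted prices and divides the dollar price by Huawei's claimed FP4 throughput. Both inputs are reported or claimed values, so the result is only as reliable as they are.

```python
# Price arithmetic using only figures quoted in the article.

ATLAS_PRICE_YUAN = 111_000
ATLAS_PRICE_USD = 16_000
ATLAS_FP4_PFLOPS = 1.56        # Huawei's claimed figure, unverified


def implied_exchange_rate(price_yuan: float, price_usd: float) -> float:
    """Yuan per US dollar implied by the two quoted prices."""
    return price_yuan / price_usd


def usd_per_pflops(price_usd: float, pflops: float) -> float:
    """Dollars per claimed PFLOPS of FP4 compute."""
    return price_usd / pflops


print(f"Implied rate: {implied_exchange_rate(ATLAS_PRICE_YUAN, ATLAS_PRICE_USD):.2f} yuan per dollar")
print(f"Cost per claimed FP4 PFLOPS: ${usd_per_pflops(ATLAS_PRICE_USD, ATLAS_FP4_PFLOPS):,.0f}")
```

No equivalent per-PFLOPS figure is computed for the H20 here, since it has no native FP4 rating to divide by.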

Via Tom’s Hardware
