- Huawei wants UB-Mesh to unify fragmented interconnect standards across massive AI clusters
- UB-Mesh design mixes a CLOS backbone with multidimensional rack-level meshes for scalability
- Traditional interconnects grow too expensive in large-scale deployments
Huawei has revealed plans to open source its UB-Mesh interconnect, a system aimed at unifying how processors, memory and networking equipment communicate across massive AI data centers.
The UB-Mesh design combines a CLOS-based backbone at the data hall level with multidimensional meshes inside each rack.
By combining these topologies, Huawei claims it can keep costs under control even when systems scale to tens of thousands of nodes. It also hopes to address the problem of scaling AI workloads, where latency and hardware failures act as barriers.
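To give a rough feel for why the wiring pattern matters for cost, the sketch below counts links for two textbook topologies at the same node count. It is an illustration only, not Huawei's actual UB-Mesh layout: the function names, the grid-style mesh without wraparound, and the example dimensions are all assumptions, and the 8,192-node case is chosen simply to match the deployment size the company cites.

```python
from math import prod


def full_mesh_links(nodes: int) -> int:
    """Links needed if every node connects directly to every other node."""
    return nodes * (nodes - 1) // 2


def grid_mesh_links(dims: tuple[int, ...]) -> int:
    """Links in a grid-style mesh (assumed layout, no wraparound) where each
    node connects only to its nearest neighbours along each dimension."""
    total = prod(dims)
    # A dimension of size d has (d - 1) links per line of nodes,
    # and there are total // d such parallel lines.
    return sum((d - 1) * (total // d) for d in dims)


# Hypothetical shapes; (32, 16, 16) gives 8,192 nodes.
for dims in [(8, 8), (16, 16), (32, 16, 16)]:
    nodes = prod(dims)
    print(f"{nodes:>5} nodes: all-to-all={full_mesh_links(nodes):>10,} "
          f"grid mesh={grid_mesh_links(dims):>7,}")
```

All-to-all wiring grows with the square of the node count, while the neighbour-only mesh grows roughly in proportion to it. Real designs sit somewhere between these extremes and trade link count against hop count and latency.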
Replacing fragmented standards with a single framework
The move is pitched as a way to replace multiple overlapping standards with a single framework, potentially transforming how large-scale computing infrastructure is built and operated.
Simply put, Huawei wants to replace today’s mix of different interconnect rules with one universal system so that everything connects more easily and cheaply.
“Next month we have a conference where we will announce that the UB-Mesh protocol will be published and made available to anyone under a free license,” said Heng Liao, chief scientist of HiSilicon, Huawei’s processor arm.
“This is a very new technology; we see competing standardization efforts from different camps. […] Depending on how successful we are in deploying actual systems and on demand from partners and customers, we can talk about turning it into some kind of standard.”
One of the key arguments behind UB-Mesh is that traditional interconnects grow too expensive at scale and eventually cost more than the accelerators they are meant to connect.
Huawei points to its own demonstrations, in which an 8,192-node deployment was used as evidence that costs do not have to rise linearly.
This is framed as important for the future of AI systems built with millions of processors, high-speed network units and massive storage arrays, such as the largest SSD systems used in cloud storage operations.
UB-Mesh is part of a wider idea that Huawei calls the SuperNode. This refers to a data center-scale cluster where CPUs, GPUs, memory, SSDs and switches can all work as if they were inside a single machine.
Bandwidth requirements of more than a terabyte per second per device and sub-microsecond latency are presented as proof that the concept is not only possible but necessary for next-generation computing.
However, standards such as PCIe, NVLink, UALink and Ultra Ethernet already have support from several companies across the semiconductor and networking industries.
The question now is whether the industry will accept a new Huawei-supported protocol or continue to favor standards that are already supported by a wider range of businesses.
Although Huawei’s proposal is ambitious, customers may be reluctant to adopt a protocol owned and controlled by a single supplier.
Even with open source licensing, there are concerns about long-term interoperability, governance and geopolitical risks.
That said, Huawei’s technical pitch sounds impressive, but the move requires a degree of industry trust and adoption that it has not yet secured.
Via Tom's Hardware



