- Meta’s 1700W superchip delivers 30 PFLOPs and 512GB of HBM
- MTIA 450 and 500 prioritize inference over pre-training workloads
- Future MTIA generations will support GenAI inference and ranking workloads
Meta advances its AI infrastructure with a portfolio of custom MTIA chips designed specifically for inference workloads across its apps.
The company is developing a 1700W superchip capable of 30 PFLOPs and 512GB HBM, integrated into the same MTIA infrastructure to handle large-scale inference tasks.
Interestingly, it achieves this feat without leaning on the usual suppliers: no Nvidia, AMD, Intel or ARM.
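For scale, the quoted figures imply a rough compute-per-watt ratio. A quick back-of-envelope sketch follows; note that the article does not state the numeric precision behind the 30 PFLOPs figure (FP8 vs BF16, for instance), so treat this as an order-of-magnitude comparison only:

```python
# Back-of-envelope efficiency from the figures Meta quotes.
# The precision behind the 30 PFLOPs number is not stated,
# so this ratio is indicative, not a benchmark result.
compute_pflops = 30            # 30 PFLOPs = 30,000 TFLOPs
power_watts = 1_700            # the superchip's quoted 1700W power draw
tflops_per_watt = compute_pflops * 1_000 / power_watts
print(f"~{tflops_per_watt:.1f} TFLOPs per watt")  # prints ~17.6
```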
According to Meta, hundreds of thousands of MTIA chips are already deployed in production, supporting ranking, recommendation and ad serving workloads.
These chips are part of a full-stack system optimized for Meta’s specific requirements, achieving higher computational efficiency than mainstream hardware for its intended workloads.
Unlike other big tech firms such as Google, AWS, Microsoft and Apple, Meta pursues a fully customized silicon strategy.
This design prioritizes efficiency over general utility, allowing inference to run more cost-effectively than on regular GPUs or CPUs.
It maintains compatibility with industry-standard software such as PyTorch, vLLM and Triton.
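To illustrate what that compatibility claim means in practice, here is a minimal vector-add kernel in standard Triton, the kind of portable kernel code the article suggests would carry over. This is the canonical Triton tutorial pattern, not Meta's code; whether it runs unmodified on MTIA depends on Meta's compiler backend, which the article does not detail:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance processes one BLOCK_SIZE-wide slice of the inputs.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Because the kernel is expressed at the Triton level rather than against a vendor instruction set, the same source can in principle be retargeted by a backend compiler, which is the portability the article points to.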
Meta’s MTIA roadmap foresees four new generations of chips over the next two years, including the MTIA 300, which is currently in production for ranking and recommendations.
Future generations – MTIA 400, 450 and 500 – will expand support for GenAI inference workloads with designs that can fit into existing rack infrastructure.
Meta emphasizes rapid, iterative development, releasing new chips approximately every six months through modular and reusable designs.
The modular design allows new chips to drop into existing rack systems, reducing deployment friction and accelerating time to production.
The approach enables the company to adopt new AI techniques and hardware improvements faster than competitors, who typically take one to two years per generation.
Unlike most mainstream AI chips, which prioritize large-scale GenAI pretraining and later adapt to inference, Meta’s MTIA 450 and 500 focus on inference workloads first.
The chips can also support other tasks, including ranking and recommendation training or GenAI training, but their design keeps them aligned with the expected growth in demand for inference.
Meta’s system-level design conforms to Open Compute Project standards, enabling frictionless deployment in data centers while maintaining high computational efficiency.
The company acknowledges that no single chip can handle the full range of its AI workloads.
This is why it deploys multiple MTIA generations alongside complementary silicon from other vendors.
The strategy aims to balance flexibility and performance while accelerating innovation towards personal superintelligence.