Tech startup proposes new way to tackle massive LLMs using fastest memory available to mankind


  • GPU-like PCIe cards offer 9.6 PFLOPS of FP4 compute and 2GB of SRAM
  • SRAM is usually used in small amounts as cache in processors (L1 to L3)
  • It also uses LPDDR5 rather than far more expensive HBM memory

Microsoft-backed Silicon Valley startup D-Matrix has developed a chiplet-based solution designed for fast, small-batch inference of LLMs in enterprise environments. Its architecture takes an all-digital compute-in-memory approach, using modified SRAM cells for speed and energy efficiency.

Corsair, D-Matrix’s current product, is described as a “first-of-its-kind AI computing platform” and features two D-Matrix ASICs on a full-height PCIe card, with four chiplets per ASIC. It achieves a total of 9.6 PFLOPS of FP4 compute with 2GB of SRAM-based performance memory. Unlike traditional designs that rely on expensive HBM, Corsair uses LPDDR5 capacity memory, with up to 256GB per card for handling larger models or batch inference workloads.
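
To put those capacity figures in perspective, here is a rough back-of-envelope sketch (in Python, using the spec numbers above and assuming 4 bits per weight with no allowance for activations, KV cache, or framework overhead, so real-world capacity will be lower) of how many FP4 parameters each memory pool could hold.

```python
# Back-of-envelope sketch: how many FP4 (4-bit) weights fit in each memory pool?
# Assumptions: 4 bits per parameter; ignores activations, KV cache,
# and framework bookkeeping, so actual usable capacity is lower.

BITS_PER_FP4_PARAM = 4

def params_that_fit(capacity_bytes: int, bits_per_param: int = BITS_PER_FP4_PARAM) -> float:
    """Approximate number of parameters a memory pool of the given size can hold."""
    return capacity_bytes * 8 / bits_per_param

SRAM_BYTES = 2 * 1024**3        # 2GB of on-chip SRAM ("performance memory")
LPDDR5_BYTES = 256 * 1024**3    # up to 256GB of LPDDR5 ("capacity memory")

print(f"SRAM:   ~{params_that_fit(SRAM_BYTES) / 1e9:.1f}B parameters")
print(f"LPDDR5: ~{params_that_fit(LPDDR5_BYTES) / 1e9:.0f}B parameters")
```

Under those assumptions the SRAM alone holds only a few billion FP4 weights, which is why the much larger LPDDR5 pool matters for bigger models and batched inference.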
