- Skymizer claims that giant AI models no longer need hyperscale GPU infrastructure
- Older 28nm chips drive massive language models at surprisingly low wattage
- The HTX301 squeezes 384GB of memory into a single PCIe accelerator card
A Taiwanese company called Skymizer has unveiled a PCIe AI accelerator that challenges both AMD and Nvidia using surprisingly old technology.
The HTX301 board can run language models with up to 700 billion parameters on a single device while consuming only 240 watts of power.
The card achieves this feat by using older 28-nanometer chips and standard LPDDR4 and LPDDR5 memory instead of expensive HBM or GDDR solutions.
Older-process chip competes with modern AI accelerators
Skymizer claims the card delivers 30 tokens per second using only 0.5 TOPS of compute and 100GB/s of memory bandwidth.
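A quick way to sanity-check a figure like that is the standard memory-bandwidth roofline for autoregressive decoding: each generated token requires streaming the model's weights through the chip once, so bandwidth divided by model footprint gives an upper bound on tokens per second. The sketch below is illustrative only; the 7B-parameter model size and 4-bit quantization are our assumptions, not published Skymizer specifications.

```python
# Roofline estimate for memory-bandwidth-bound LLM decoding:
# tokens/s <= memory bandwidth / model footprint in bytes.
# All figures here are illustrative assumptions, not measured specs.

def decode_tokens_per_second(bandwidth_gbs: float,
                             params_billion: float,
                             bytes_per_param: float) -> float:
    """Upper-bound decode throughput for a bandwidth-bound accelerator."""
    model_size_gb = params_billion * bytes_per_param
    return bandwidth_gbs / model_size_gb

# A 7B-parameter model quantized to ~4 bits (0.5 bytes per parameter)
# occupies about 3.5GB, so 100GB/s of bandwidth caps decoding at
# roughly 28-29 tokens per second.
print(round(decode_tokens_per_second(100, 7, 0.5), 1))
```

Under those assumptions the ceiling lands close to Skymizer's 30 tokens per second claim, which at least makes the number plausible for a bandwidth-bound design.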
The HTX301 is built on Skymizer’s HyperThought platform, which features next-generation LPU IP designed specifically for large language model workloads.
Each PCIe card contains six HTX301 chips working together, and the card offers up to 384GB of total memory capacity.
The design uses efficient compression techniques for both weights and KV cache, outperforming the open-source llama.cpp by 9 to 17.8 percent.
Its power consumption is less than half of what leading PCIe AI accelerators from AMD and NVIDIA typically require.
The board supports agentic AI for coding, automation, and domain-specific workflows without the need for hyperscale GPU clusters.
Running large language models in the cloud introduces privacy issues and unpredictable costs that many organizations find unacceptable.
Upgrading the on-premises infrastructure to support massive GPU accelerator platforms often requires expensive redesigns of data center power and cooling systems.
Skymizer’s HTX301 offers businesses a third option that fits into standard air-cooled servers without infrastructure changes.
The company goes so far as to claim that its technology ends the era of hyperscale GPU clusters for ultra-large LLMs.
The PCIe card form factor lets enterprises scale AI inference on-premises while maintaining data sovereignty and predictable infrastructure costs.
Skymizer HTX301 awaits real-world testing
Skymizer will preview the HTX301 at Computex this year, enabling independent verification of its performance numbers.
The card's specs look impressive on paper, but real-world testing will determine whether it actually delivers 240 tokens per second on Llama2 7B workloads.
AMD recently launched its Instinct MI350P PCIe card with 144GB of HBM3E memory and up to 4,600 peak TFLOPS at MXFP4 precision, yet it uses significantly more power than Skymizer’s offering.
Nvidia’s RTX PRO 6000 Blackwell consumes about 600 watts, more than double what Skymizer’s card requires for comparable inference tasks.
Should the HTX301 perform as advertised, it could dramatically lower the barrier to entry for on-premises AI infrastructure.
Failure to deliver would place Skymizer among the many startups that over-promised and under-delivered.
Via Wccftech