SambaNova hits 198 tokens per second on the full, non-distilled DeepSeek-R1 671B with only 16 SN40L RDU chips


  • SambaNova runs DeepSeek-R1 at 198 tokens/sec using 16 custom chips
  • The SN40L RDU chip is reportedly 3x faster and 5x more efficient than GPUs
  • A 5x speed boost is promised soon, with 100x capacity by the end of the year on the cloud

Chinese AI upstart DeepSeek has very quickly made a name for itself in 2025 with R1, its open source large language model built for advanced reasoning tasks, which performs on par with the industry’s top models while being more cost-effective.

SambaNova Systems, an AI startup founded in 2017 by experts from Sun/Oracle and Stanford University, has now announced what it claims is the world’s fastest deployment of the DeepSeek-R1 671B LLM to date.
