- Tiiny AI Pocket Lab runs large models locally and avoids cloud dependency
- The mini PC performs advanced inference tasks without a discrete GPU
- Models from 10B to 120B parameters run offline within a 65W power envelope
Tiiny, an American startup, has released AI Pocket Lab, a pocket-sized AI supercomputer capable of running large language models locally.
The device is a mini PC designed to run advanced inference workloads without cloud access, remote servers or discrete accelerators.
The company states that all processing remains offline, eliminating network latency and limiting external data exposure.
Built to run large models without the cloud
“Cloud AI has brought remarkable progress, but it has also created challenges of dependency, vulnerability and sustainability,” said Samar Bhoj, GTM Director of Tiiny AI.
“With Tiiny AI Pocket Lab, we believe that intelligence should not belong to data centers, but to people. This is the first step towards making advanced AI truly accessible, private and personal, by bringing the power of large models from the cloud to every single device.”
Pocket Lab targets large personal models designed for complex reasoning and long-context tasks while operating within a 65W power envelope.
Tiiny claims consistent performance for models in the 10B–100B parameter range, with support extending to 120B.
This upper bound approaches the capabilities of leading cloud systems, enabling advanced reasoning and extended context to run locally.
Guinness World Records has reportedly certified the hardware for local 100B class model performance.
The system uses a 12-core ARMv9.2 CPU paired with a custom heterogeneous AI module that delivers around 190 TOPS of AI compute.
The system includes 80GB of LPDDR5X memory along with a 1TB SSD, with total power consumption reportedly staying within a 65W system envelope.
Its physical size is more reminiscent of a large external drive than a workstation, reinforcing its pocket-oriented branding.
Although the specs look similar to a Houmo Manjie M50-style chip, independent real-world performance data is not yet available.
Tiiny also emphasizes an open source ecosystem that supports one-click installation of major models and agent frameworks.
The company states that it will provide continuous updates, including what it describes as OTA hardware upgrades.
Since over-the-air mechanisms traditionally apply to software, the phrasing most likely reflects imprecise wording or marketing license rather than literal hardware changes.
The engineering approach relies on two software-driven optimizations rather than scaling raw silicon performance.
TurboSparse focuses on selective neuron activation to reduce inference cost without changing the model structure.
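TurboSparse's internals are not detailed in the article, but the general idea of selective neuron activation can be sketched in a few lines. In the toy example below (all names and shapes are illustrative assumptions), only the feed-forward neurons flagged as active are computed; here the "predictor" is simply the true pre-activation sign, which a real system would approximate with a small learned model so the mask can be obtained cheaply:

```python
import numpy as np

def sparse_ffn(x, W_up, W_down, threshold=0.0):
    # Illustrative sparse feed-forward pass: compute the down-projection
    # using only neurons whose pre-activation exceeds the threshold.
    pre = x @ W_up                 # pre-activations for all hidden neurons
    active = pre > threshold       # ReLU-style activation mask (the "predictor")
    out = np.maximum(pre[active], 0.0) @ W_down[active]
    return out, active.mean()

rng = np.random.default_rng(0)
d, h = 64, 256
x = rng.standard_normal(d)
W_up = rng.standard_normal((d, h))
W_down = rng.standard_normal((h, d))

dense = np.maximum(x @ W_up, 0.0) @ W_down   # full (dense) FFN pass
sparse, frac = sparse_ffn(x, W_up, W_down)

print(np.allclose(dense, sparse))            # exact here: mask == ReLU support
print(f"computed {frac:.0%} of neurons")
```

Because ReLU zeroes negative pre-activations anyway, skipping those neurons changes nothing in the output while avoiding roughly half the down-projection work in this toy case; the engineering challenge is predicting the mask without computing the full pre-activation first.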
PowerInfer distributes workloads across heterogeneous components and coordinates the CPU with a dedicated NPU to approach server-grade throughput with lower power.
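The article does not describe how PowerInfer's coordination actually works, but the heterogeneous-scheduling idea can be illustrated with a toy cost-model scheduler (all rates and the dispatch overhead below are invented for illustration, not Tiiny's figures): each layer is placed on whichever unit, CPU or NPU, would finish it sooner, accounting for a fixed NPU dispatch overhead that makes tiny workloads cheaper to keep on the CPU:

```python
# Toy greedy scheduler: assign each layer's work to the CPU or NPU
# based on when that unit would finish it, including a fixed per-dispatch
# overhead for the NPU. All numbers are illustrative assumptions.
def partition(layer_work, cpu_rate=0.5e12, npu_rate=190e12, npu_overhead=0.002):
    """layer_work: operations per layer; rates in ops/second."""
    busy = {"cpu": 0.0, "npu": 0.0}   # time each unit is occupied so far
    placement = []
    for work in layer_work:
        t_cpu = busy["cpu"] + work / cpu_rate
        t_npu = busy["npu"] + npu_overhead + work / npu_rate
        if t_cpu <= t_npu:
            placement.append("cpu")
            busy["cpu"] = t_cpu
        else:
            placement.append("npu")
            busy["npu"] = t_npu
    return placement, max(busy.values())

placement, makespan = partition([1e6, 1e9, 5e13])
print(placement)   # → ['cpu', 'cpu', 'npu']: small ops stay on CPU, big ones offload
```

The small workloads land on the CPU because the NPU's dispatch overhead dominates, while the large one offloads to the NPU; the same finish-sooner logic, with a far more detailed cost model, is the essence of coordinating heterogeneous components for throughput at low power.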
The system includes no discrete GPU, and the company claims that careful workload planning across the CPU and NPU eliminates the need for expensive accelerators.
These claims indicate that efficiency gains, rather than brute force hardware, serve as the primary differentiator.
Tiiny AI positions Pocket Lab as an answer to sustainability, privacy and cost pressures affecting centralized AI services.
Running large language models locally can reduce recurring cloud expenses and limit exposure of sensitive data.
However, claims of capacity, server-grade performance and seamless scaling on such limited hardware are still difficult to independently verify.
Via TechPowerUp