- Solidigm's 122.88TB SSD supplied the storage for a test involving Nvidia's Jetson Orin Nano Super
- The system was used to run DeepSeek, and while it worked, it wasn't fast
- The Gen 4 PCIe SSD's speed was limited by the Nano Super's Gen 3 connection
At the end of 2024, Solidigm added a 122.88TB QLC SSD to its product line. The D5-P5336 will be available in a U.2 15mm form factor to start, and then in E1.L later in 2025, which means it doesn't fit into a typical consumer PC. Its price is expected to exceed $10,000, so you'll need deep pockets if you want to buy one.
If you are wondering how such a giant-capacity SSD might perform, we have the answer (sort of), but it doesn't come in the form of a traditional review.
StorageReview tested the Jetson Orin Nano Super, Nvidia's compact AI single-board computer for edge computing, to see how it performed on AI development tasks, specifically LLM inference. The Nano Super comes with a 6-core Arm CPU, a 1,024-core Ampere GPU, and 8GB of LPDDR5 memory. At $249 it is an affordable choice for AI developers, but its limited VRAM poses a challenge for running LLMs.
Not smooth sailing
"We acknowledged that onboard memory restrictions make running models with billions of parameters challenging, so we implemented an innovative approach to bypass these limitations," the site explained. "Typically, the Nano Super's 8GB of graphics memory limits it to smaller models, but we aimed to run a model 45 times larger than what would traditionally fit."
Doing this involved upgrading the Nano Super's storage with Solidigm's new U.2 drive, which has a Gen 4 PCIe x4 interface and promises sequential speeds of up to 7.1 GB/s (read) and 3.3 GB/s (write), along with random read performance of up to 1,269,000 IOPS.
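Spec-sheet numbers like these are normally verified with a dedicated tool such as fio; purely as an illustration of what a sequential-read measurement does, here is a minimal Python sketch, assuming Linux, root privileges, and a hypothetical device node:

```python
import mmap, os, time

DEV = "/dev/nvme0n1"   # hypothetical device node; adjust for your system
BLOCK = 1 << 20        # 1 MiB per read
TOTAL = 4 << 30        # read 4 GiB in total

# O_DIRECT bypasses the page cache so we measure the drive, not RAM;
# it requires an aligned buffer, which an anonymous mmap provides.
buf = mmap.mmap(-1, BLOCK)
fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
try:
    start = time.perf_counter()
    done = 0
    while done < TOTAL:
        n = os.preadv(fd, [buf], done)  # sequential: offset advances each read
        if n <= 0:
            break
        done += n
    elapsed = time.perf_counter() - start
    print(f"{done / elapsed / 1e9:.2f} GB/s sequential read")
finally:
    os.close(fd)
```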
The Nano Super has two M.2 NVMe slots, both of which offer a PCIe Gen3 connection. To get the most bandwidth, the team connected the SSD to the 80mm slot, which supports the full four PCIe lanes, using a breakout cable, and used an ATX power supply to deliver 12V and 3.3V to the SSD.
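A setup like this only pays off if all four lanes actually train, which can be checked from Linux's standard sysfs layout. A small sketch (the nvme0 device name is an assumption):

```python
from pathlib import Path

# /sys/class/nvme/nvme0/device is a symlink to the underlying PCI device,
# which exposes the negotiated link speed and width.
dev = Path("/sys/class/nvme/nvme0/device")
speed = (dev / "current_link_speed").read_text().strip()
width = (dev / "current_link_width").read_text().strip()
print(f"negotiated link: {speed}, x{width}")
# On the Jetson's Gen3 slot this should report 8.0 GT/s at x4,
# even though the D5-P5336 itself is a Gen4 (16 GT/s) part.
```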
While the drive's full potential was limited by the Jetson's interface, it still achieved read speeds of up to 2.5 GB/s. Using AirLLM, which loads model layers dynamically rather than all at once, the site managed to run DeepSeek R1 70B Distill, an AI model 45 times larger than what would traditionally fit on such a device.
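The article doesn't include the test code, but AirLLM's publicly documented usage pattern looks roughly like the sketch below; the exact checkpoint and the cache path on the U.2 drive are assumptions on our part:

```python
from airllm import AutoModel

# AirLLM splits the checkpoint into per-layer shards on disk, then streams
# one transformer layer at a time onto the GPU, so the working set stays
# within the Nano Super's 8GB of memory.
model = AutoModel.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",     # assumed checkpoint
    layer_shards_saving_path="/mnt/solidigm/airllm", # hypothetical path on the U.2 drive
)

input_tokens = model.tokenizer(
    ["Explain what PCIe lane width means."],
    return_tensors="pt", truncation=True, max_length=128,
)

generation_output = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=8,          # each token re-streams the layers from disk
    use_cache=True,
    return_dict_in_generate=True,
)

print(model.tokenizer.decode(generation_output.sequences[0]))
```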
Processing speed proved to be the main bottleneck in the experiment. Running smaller models worked well, but generating a single token from the 70B model took 4.5 minutes. While not practical for real-time AI tasks, the test demonstrated how massive storage solutions like the D5-P5336 can enable larger models in constrained environments.
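A rough back-of-envelope calculation (assuming FP16 weights, which the article doesn't specify) shows why per-token times land in minutes rather than milliseconds: every token requires streaming the full weight set from the drive.

```python
# Back-of-envelope numbers for the experiment; FP16 weights are an assumption.
MODEL_PARAMS = 70e9           # DeepSeek R1 70B distilled
BYTES_PER_PARAM = 2           # assumed FP16
READ_SPEED = 2.5e9            # observed read speed, bytes/s
SECONDS_PER_TOKEN = 4.5 * 60  # observed: 4.5 minutes per token

weights = MODEL_PARAMS * BYTES_PER_PARAM       # ~140 GB of weights
floor = weights / READ_SPEED                   # ~56 s just to stream them once
print(f"weights: {weights / 1e9:.0f} GB")
print(f"streaming floor: {floor:.0f} s/token")
print(f"observed: {SECONDS_PER_TOKEN:.0f} s/token "
      f"({SECONDS_PER_TOKEN / floor:.1f}x the floor)")
```

Even at the drive's full 2.5 GB/s, simply reading the weights once sets a floor of roughly a minute per token; the observed 4.5 minutes suggests additional overhead from layer loading and compute.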
You can see how the test was carried out, and the problems that were encountered and overcome along the way, in this YouTube video.