- Sandook software coordinates many SSDs to avoid slowdowns from garbage collection
- Two-tier control system redirects workloads across pooled drives in real-time
- Performance gains approach theoretical limits but depend on large clustered storage environments
Researchers at MIT and Tufts University have built a storage management system called Sandook that pushes pooled SSDs closer to their theoretical limits. The project targets a long-standing problem in large storage clusters where identical drives rarely work in identical ways.
Solid-state drives slow down for a number of reasons, including internal garbage collection cycles and the slower nature of write operations compared to reads. These slowdowns can ripple across workloads when multiple applications share the same storage pool.
Rather than leaving each SSD to handle performance issues alone, the system divides control tasks across two coordinated layers that manage activity across the entire drive pool.
As Blocks and Files reports, a central controller collects performance telemetry from every SSD in the pool and revises scheduling decisions about five times per second.
Local agents inside storage servers relay performance signals and congestion warnings as workloads change.
When a drive begins housekeeping tasks such as garbage collection, the system lowers its priority and transfers the traffic to healthier drives in the pool. This redirection occurs without requiring changes to applications that access the repository.
The method builds on techniques already used in enterprise storage, including block replication for reads and log-structured writes that can land on any available device.
Trials spanned database processing, neural network training, large-scale image compression, and latency-critical storage services. The system reportedly delivered between 30 and 82 percent higher raw I/O throughput than previous approaches that targeted single bottlenecks.
Across pooled workloads, application performance gains ranged from 12 to 94 percent, with latency reductions of up to 88 percent. In some cases, storage throughput reached about 1.7 times previous levels.
The gains come entirely from software, meaning off-the-shelf SSDs remain unchanged. CPU and memory costs for monitoring dozens of drives per server were described as minimal.
The research paper, titled “Unleashing the Potential of Datacenter SSDs by Taming Performance Variability,” is available to read online.
Despite the headlines, this is not something most consumers could run at home. The design relies on large groups of SSDs working together, along with Linux-based infrastructure and the enterprise network setups common in data centers.
That pooling effect is where most of the performance improvement comes from. Without additional drives to move workloads to, a single drive system would see no benefit.
Blocks and Files notes that the work will be presented at USENIX NSDI 2026 in May, where the researchers plan to show how coordinated scheduling helps resolve unpredictable SSD behavior across large clusters.