Lexar tests SSD offloading for local AI models

19 June 2026, 02:43

Lexar is developing storage technology designed to reduce the DRAM burden of running local AI models on PCs. CTO Daniel Guo said DRAM is about six times more expensive to manufacture than NAND flash, creating an opportunity to use SSDs to support local deployments more efficiently. The Lexar AI Storage Core SSD is intended to offload large language models to storage, allowing larger models to fit into PC builds while reducing the memory footprint by at least 40%.

In internal testing, Lexar ran the Qwen 3.5 122B model on a local PC. The company said its AI suite and the Lexar AI Storage Core SSD can cut the DRAM requirement from 128 GB to 32 GB. It also reported that a model with 35 billion parameters ran at 15.6 tokens per second, versus 5.2 tokens per second with traditional frameworks. When loading the 122B model with 32 GB of DRAM, llama.cpp crashed, while Lexar's SSD offloading reached about 4.4 tokens per second.
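For readers who want to put the quoted figures in proportion, the arithmetic can be sketched as below. This is only an illustration based on the numbers Lexar reported from its own internal testing; the constant and function names are hypothetical and are not part of any Lexar tool. Note that the DRAM figures and the throughput figures come from different model sizes, so the two results should not be combined.

```python
# Figures quoted in the article (Lexar's internal testing, not independently verified).
DRAM_BASELINE_GB = 128         # stated DRAM requirement without offloading
DRAM_OFFLOADED_GB = 32         # stated DRAM requirement with SSD offloading
TOKENS_PER_SEC_LEXAR = 15.6    # 35B-parameter model, Lexar stack
TOKENS_PER_SEC_BASELINE = 5.2  # 35B-parameter model, "traditional frameworks"


def dram_reduction_percent(before_gb: float, after_gb: float) -> float:
    """Percentage of DRAM saved when going from before_gb to after_gb."""
    return (before_gb - after_gb) / before_gb * 100


def speedup(new_rate: float, old_rate: float) -> float:
    """How many times faster new_rate is than old_rate."""
    return new_rate / old_rate


if __name__ == "__main__":
    reduction = dram_reduction_percent(DRAM_BASELINE_GB, DRAM_OFFLOADED_GB)
    ratio = speedup(TOKENS_PER_SEC_LEXAR, TOKENS_PER_SEC_BASELINE)
    print(f"DRAM reduction: {reduction:.0f}%")
    print(f"Throughput speedup: {ratio:.1f}x")
```

On these numbers, the move from 128 GB to 32 GB is a 75% reduction in DRAM, and 15.6 versus 5.2 tokens per second is roughly a threefold speedup.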

Originally reported by techpowerup.com