

Longsys runs 397B model on tiny AMD Ryzen AI Halo PC


Longsys demonstrated a 397B-parameter AI model running locally on its version of AMD’s Ryzen AI Halo, using the same configuration already appearing in systems from other providers: a 16-core Ryzen AI Max+ 395 with 128GB of RAM. The model was not named, but it appears to be a customized derivative of Alibaba’s Qwen 3.5 397B (A17B), a multimodal Mixture-of-Experts model.

The setup is notable because only 96GB of VRAM is available to the GPU in a 128GB unified configuration, while the model is estimated to require 200-250GB of VRAM. Longsys said its custom SPU and iSA configuration compresses data in real time and uses expert offloading, intelligent cache management and predictive prefetch algorithms to reduce DRAM demands and address I/O latency during inference.
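To put the gap in numbers, the sketch below estimates the size of the weights alone at several precisions and compares the result with the 96GB GPU budget. The parameter counts come from the model name (397B total, 17B active per token). The bits-per-weight values are illustrative assumptions, since the quantization used in the demo was not disclosed, and KV cache and activations are left out.

```python
# Rough weight-memory arithmetic. Bits-per-weight values are illustrative
# assumptions; the demo's actual quantization was not disclosed.
TOTAL_PARAMS = 397e9    # total parameters, from the model name
ACTIVE_PARAMS = 17e9    # parameters active per token, per the "A17B" label
GPU_BUDGET_GB = 96      # GPU-addressable memory cited in the article


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights alone in decimal GB (no KV cache, no activations)."""
    return params * bits_per_weight / 8 / 1e9


for bits in (4, 5, 8, 16):
    total = weights_gb(TOTAL_PARAMS, bits)
    active = weights_gb(ACTIVE_PARAMS, bits)
    shortfall = max(0.0, total - GPU_BUDGET_GB)
    print(f"{bits:>2}-bit: total {total:6.1f} GB, "
          f"active per token {active:5.1f} GB, "
          f"shortfall vs GPU budget {shortfall:6.1f} GB")
```

Under these assumptions, the 200-250GB estimate corresponds to roughly 4 to 5 bits per weight. The weights touched for any single token are far smaller than the full model, which is why offloading inactive experts is plausible in principle.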

Longsys did not disclose tokens per second or other compute performance details, leaving open questions about practical speed compared with modern AI GPU systems. Even so, the demonstration suggests fast storage can act as a memory-like layer for some local AI workloads, potentially allowing much larger models to run on compact machines than their onboard memory would normally allow.
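As a generic illustration of expert offloading with a cache, the sketch below keeps a fixed number of experts in fast memory and loads the rest on demand, evicting the least recently used. It is not Longsys's implementation, which was not described in detail. The `ExpertCache` class, the `fake_load` loader, and the expert IDs are hypothetical names introduced for this example.

```python
from collections import OrderedDict


class ExpertCache:
    """Holds at most `capacity` experts in fast memory (LRU eviction).

    Hypothetical illustration only; not Longsys's actual design.
    """

    def __init__(self, capacity, load_from_storage):
        self.capacity = capacity
        self.load_from_storage = load_from_storage  # expert_id -> weights
        self.resident = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)  # mark as most recently used
            self.hits += 1
            return self.resident[expert_id]
        self.misses += 1
        weights = self.load_from_storage(expert_id)  # slow path: storage read
        self.resident[expert_id] = weights
        if len(self.resident) > self.capacity:
            self.resident.popitem(last=False)  # evict least recently used
        return weights


def fake_load(expert_id):
    # Stand-in for reading an expert's weights from fast storage.
    return f"weights-for-expert-{expert_id}"


cache = ExpertCache(capacity=2, load_from_storage=fake_load)
for expert_id in [0, 1, 0, 2, 1]:  # made-up routing sequence
    cache.get(expert_id)
print(cache.hits, cache.misses, list(cache.resident))
```

Every miss in a scheme like this costs a storage read, so throughput depends on how often the router revisits resident experts. That dependence is one reason the undisclosed tokens-per-second figure matters.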

Originally reported by techradar.com.