Intel and AMD define ACE extensions for CPU-based AI

Intel and AMD have released the full specification for ACE, a new set of CPU extensions designed to make AI workloads easier and more power-efficient on x86 processors. The approach targets smaller models, single-user latency-sensitive tasks, and systems where a GPU is unavailable or limited.

ACE builds on existing AVX10 registers while adding silicon dedicated to matrix multiplication, a core operation in AI workloads. The design uses AVX’s 512-bit inputs to simplify integration with current CPU designs. For the same number of input vectors, ACE can perform 16x as many operations as AVX10, though actual speed gains will depend on each implementation.
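As a rough illustration of where a 16x figure could come from, the sketch below assumes 32-bit elements (16 lanes in a 512-bit vector) and compares an element-wise multiply-accumulate with an outer-product accumulate over the same two input vectors. The outer-product form, the FP32 lane width, and the function names are assumptions made for illustration; the article does not say how ACE arranges its matrix operations.

```python
VECTOR_BITS = 512
ELEMENT_BITS = 32                      # assumption: FP32 lanes
LANES = VECTOR_BITS // ELEMENT_BITS    # 16 lanes per 512-bit vector


def elementwise_fma(acc, a, b):
    """Vector-style update: one multiply-accumulate per lane."""
    return [c + x * y for c, x, y in zip(acc, a, b)]


def outer_product_accumulate(acc, a, b):
    """Matrix-style update (hypothetical): one multiply-accumulate
    for every (i, j) pair of lanes, written into a LANES x LANES tile."""
    return [
        [acc[i][j] + a[i] * b[j] for j in range(len(b))]
        for i in range(len(a))
    ]


# Two input vectors, the same for both variants.
a = [float(i) for i in range(LANES)]
b = [1.0] * LANES

vec_result = elementwise_fma([0.0] * LANES, a, b)
tile_result = outer_product_accumulate(
    [[0.0] * LANES for _ in range(LANES)], a, b
)

# Count multiply-accumulates performed from the same two inputs.
vector_macs = len(vec_result)
tile_macs = sum(len(row) for row in tile_result)
ratio = tile_macs // vector_macs
print(f"vector MACs: {vector_macs}, tile MACs: {tile_macs}, ratio: {ratio}x")
```

Under these assumptions the ratio equals the lane count, so it would differ for narrower element types; as the article notes, real speed gains depend on each implementation.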

The extensions are intended to reduce instruction overhead, improve power efficiency, and potentially make better use of RAM bandwidth. ACE is also implementation-agnostic, giving machine learning frameworks and libraries such as PyTorch and TensorFlow a consistent code path instead of requiring multiple variations for different AVX support levels.

ACE natively supports common ML data types including INT8, INT32, FP8, FP16, FP32, and BF16, along with Open Compute Project’s MX block-scaled formats. That could let developers move some NPU-specific workloads back to the CPU when they need a faster, more consistent target across x86 hardware.
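For readers unfamiliar with block-scaled formats, here is a minimal sketch of the general idea: a block of values shares one power-of-two scale, and each value is stored as a small element relative to that scale. It is a simplification that uses plain signed 8-bit integers for the elements and a block of 32 values; it does not reproduce the exact MX encodings and says nothing about how ACE handles them. All names are hypothetical.

```python
import math

BLOCK_SIZE = 32  # assumption: 32 values share one scale


def quantize_block(values):
    """Return (shared_exponent, int8_elements) for one block."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # Smallest power-of-two scale that keeps every element within +/-127.
    exponent = math.ceil(math.log2(max_abs / 127))
    scale = 2.0 ** exponent
    elements = [max(-127, min(127, round(v / scale))) for v in values]
    return exponent, elements


def dequantize_block(exponent, elements):
    """Rebuild approximate values from the shared scale and elements."""
    scale = 2.0 ** exponent
    return [e * scale for e in elements]


block = [0.01 * i for i in range(BLOCK_SIZE)]
exponent, elements = quantize_block(block)
restored = dequantize_block(exponent, elements)
max_error = max(abs(x - y) for x, y in zip(block, restored))
print(f"shared exponent: {exponent}, max error: {max_error:.6f}")
```

The storage cost is one shared exponent per block plus one small element per value, which is why such formats are attractive when memory bandwidth is the bottleneck.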

Originally reported by tomshardware.com