OpenAI unveils Jalapeño inference chip with Broadcom
OpenAI has unveiled Jalapeño, a purpose-built LLM inference accelerator developed with Broadcom. The chip combines fixed-function and programmable compute hardware to accelerate the OpenAI LLM stack behind ChatGPT, Codex, the OpenAI API, and planned agentic AI products.
Jalapeño is positioned as conceptually similar to Google’s TPU, but tailored for OpenAI systems and apparently focused on inference rather than training. OpenAI said the chip went from design to manufacturing tape-out in just nine months, which it described as the fastest ASIC development cycle among advanced semiconductors.
The accelerator is part of a compute platform planned to span multiple generations, with initial deployment targeted for late 2026. Technical details remain limited, though the chip is described as a contemporary multi-chip module with an interposer, a large centrally located logic tile, and eight HBM3E stacks.