NVDA 208.65 ▼0.97%GOOGL 349.68 ▼4.99%MSFT 367.34 ▼3.18%AMD 551.63 ▲2.65%INTC 140.94 ▲5.19%TSMC 467.67 ▲1.20%AMZN 232.79 ▼4.75%META 563.85 ▼2.32%AAPL 297.01 ▼0.34%PLTR 119.50 ▼6.98%
Markets at last close

Models

Moonshot AI launches Kimi K2, a trillion-parameter MoE language model

·1 min read

Moonshot AI´s Kimi K2 stands out as an ambitious initiative in the rapidly evolving landscape of large language models. Engineered as a massive Mixture-of-Experts (MoE) architecture, Kimi K2 features an extraordinary 1 trillion total parameters, activating 32 billion of those during any single forward pass. This scale places it among the most sophisticated Artificial Intelligence models available for public and enterprise use.

Kimi K2 has been specifically optimized for agentic behaviors, which means it is designed not only to process and generate complex language, but also to dynamically use external tools, synthesize reliable code, and produce structured reasoning outputs. The model´s agentic capabilities are apparent in its performance on a diverse set of industry benchmarks, including but not limited to coding (LiveCodeBench, SWE-bench), logical reasoning (ZebraLogic, GPQA), and tool-use tasks (Tau2, AceBench). These results point to a model that is both versatile and competitive within the highly specialized domains it targets.

Long-context understanding is central to Kimi K2´s offering. With support for up to 128,000 tokens in a single prompt, Kimi K2 can process extensive documents, technical manuals, or intricate source codebases, making it especially appealing for developers, researchers, and organizations dealing with large-scale textual data. Its training leveraged a novel stack, prominently featuring the MuonClip optimizer. This innovation plays a crucial role in enabling stable and efficient large-scale training in an MoE setting, which is often prone to instability. Deployment options via OpenRouter and availability of model weights through Hugging Face further underscore Moonshot AI´s commitment to broad accessibility for experimentation and integration with mainstream Artificial Intelligence platforms.

Originally reported by openrouter.aiRead the source →
Related coverage