The Nov Tech

A Chinese Startup Just Beat GPT-5 — With a Model That Costs Nothing

How Moonshot AI’s Kimi K2.5 is proving that algorithmic intelligence beats brute force, and why Silicon Valley is terrified

Novy Baf
Feb 01, 2026

Just a few days ago, an event occurred that could redefine the global AI hierarchy.

A Chinese startup valued at $4.8 billion just published an open-source model that beats GPT-5 on the planet’s hardest benchmarks.

And the most surprising part? This model can create its own army of AI agents and coordinate them in real time.

This is the explosive return of Kimi K2.5. If you’ve been following AI for a while, you know that every time Moonshot AI announces something, everyone trembles. If not, you’re about to understand why.

When Algorithm Beats Brute Force

It’s late January 2026, and Moonshot AI unveils Kimi K2.5. A one-trillion-parameter model. Natively multimodal, meaning it handles text, audio, video, and images. And capable of self-organizing into a swarm of 100 sub-agents working simultaneously.

moonshotai/Kimi-K2.5 · Hugging Face

Hugging Face servers hosting this model are overheating, and the global AI community is holding its breath because what just happened changes the rules of the game.

Over the next few minutes, we’ll look at this model’s technical leap forward, its operational agent swarm, and the growing unease of US tech companies watching China.

Kimi K2.5 isn’t just another new model. It’s proof that algorithmic efficiency can beat raw power. That’s a powerful message when you know that American export restrictions were supposed to slow China’s AI development.

Let’s start with what jumps out when you look at the official benchmarks.

On HLE (Humanity’s Last Exam), a benchmark that tests doctoral-level reasoning, Kimi K2.5 scores 50%. This is the highest score ever achieved by an open-source model. To put this in perspective, the test comprises 2,500 questions covering domains ranging from theoretical physics to advanced mathematics.

Source: HLE

But the most striking performance is on WebVoyager, a benchmark evaluating a model’s ability to navigate the web completely autonomously. Kimi blows past the competition with a 75% score, surpassing even GPT-5.2 and Claude Opus 4.5.

The Architecture That Changes Everything

The technical architecture behind this performance is fascinating. It’s a design called Mixture of Experts (MoE), here with 384 specialized experts.

If you don’t know what that means, here’s a 30-second refresher.
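To make the idea concrete, here is a minimal toy sketch of Mixture of Experts routing in plain Python. It is not Moonshot AI's code. Only the figure of 384 experts comes from the article. The top-8 selection, the tiny hidden size, the random weights, and the `ToyMoELayer` class are all assumptions made for illustration. The point it shows: a small router scores every expert for each token, only the best-scoring few actually run, and their outputs are blended.

```python
import math
import random

NUM_EXPERTS = 384  # figure quoted in the article
TOP_K = 8          # assumption: experts activated per token
DIM = 16           # assumption: toy hidden size, far smaller than any real model


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ToyMoELayer:
    """Hypothetical illustration of top-k expert routing."""

    def __init__(self, num_experts=NUM_EXPERTS, top_k=TOP_K, dim=DIM, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # The router holds one row per expert; each row scores a token.
        self.router = make_matrix(num_experts, dim, rng)
        # Each "expert" is reduced to a single linear map in this sketch.
        self.experts = [make_matrix(dim, dim, rng) for _ in range(num_experts)]

    def route(self, token):
        """Pick the top-k experts for one token and normalize their weights."""
        scores = matvec(self.router, token)
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[: self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, token):
        """Run only the chosen experts and blend their outputs."""
        chosen, weights = self.route(token)
        output = [0.0] * len(token)
        for idx, weight in zip(chosen, weights):
            expert_out = matvec(self.experts[idx], token)
            output = [o + weight * e for o, e in zip(output, expert_out)]
        return output, chosen


layer = ToyMoELayer()
token = [random.Random(1).gauss(0.0, 1.0) for _ in range(DIM)]
output, chosen = layer.forward(token)
print(f"{len(chosen)} of {NUM_EXPERTS} experts ran for this token")
```

In this sketch only 8 of the 384 expert matrices are multiplied per token. That is the general reason an MoE model can carry a very large total parameter count while spending far less compute on each token.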
