# The Invasion of the Middle Kingdom
## The New Standard-Bearer
China had undeniably been strong in OSS throughout 2024, but the moment it fully took over the space didn't arrive until April 2025.
After Mark Zuckerberg promised "the best open model in the world" with Llama 4, then delivered a series of massive models that appeared tuned to game benchmarks, Alibaba's Qwen was left with a completely open field to become the open source standard. Qwen 2.5 was already the most widely used model for fine-tuning and experimentation in the new reasoning paradigm. Its results were solid and inspired confidence.
Expectations for the next version were enormous. And in April, Qwen 3 arrived: a family of dense and MoE models ranging from 0.6B to 235B parameters, hybrid reasoners at first and later split into pure instruct and pure thinking variants. No tricks, no hype campaign: just a lineup of good models that the community adopted immediately and gratefully.
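To make the "hybrid" part concrete, here is a minimal sketch of how the original Qwen 3 checkpoints let you switch between thinking and plain instruct behavior from the same weights, using the Hugging Face transformers chat template. The model name and the `enable_thinking` flag follow Qwen's published usage notes; treat the details as illustrative rather than canonical.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Qwen3-0.6B is the smallest member of the family; the hybrid
# checkpoints all expose the same chat-template switch.
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True makes the model emit a reasoning trace before
# its answer; False yields a direct instruct-style reply instead.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:],
                       skip_special_tokens=True))
```

The later pure instruct and pure thinking releases essentially bake one of these two behaviors into the weights instead of leaving it as a template-level switch.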
With that, Llama had fallen. And with it, the last traces of the West seriously competing in open models during 2025.
Seeing the disadvantage, OpenAI even tried to step into OSS with gpt-oss-120b and gpt-oss-20b. They proved to be strong reasoners in math and code, but outside those two domains they fall short in important areas.
## The Frontier Trident
Few people know that on the very same day DeepSeek R1 was released, another Chinese lab announced it had a reasoning model comparable to o1: Moonshot AI presented Kimi k1.5. It went largely unnoticed because they only published a technical report and offered limited access, but they weren't in a rush; their moment would come months later.
In July 2025, they launched Kimi K2. Although officially described as a "non-reasoning" model, it included a considerable amount of RL and post-training geared toward agentic behavior. A 1T-total-parameter MoE with 32B active, it quickly rose to the top tier. This time it was a proper launch: technical report, open model, and free access via their website. It effectively became the second DeepSeek.
The reasoning version of K2 arrived in November and was every bit as surprising, and even more powerful.
That same July, the third member of the new Chinese trident appeared: GLM 4.5, from Zhipu AI. With 355B total parameters and 32B active, it became a favorite for many in code-related tasks and tool use, competing directly with the undisputed king of that area: Sonnet. With a subsequent update—GLM 4.6—it managed to hold its place on the podium.
By this point, DeepSeek was beginning to fall behind its younger cousins. The community was loudly calling for a DeepSeek R2.
What arrived was something different. In December 2025, DeepSeek resurfaced with DeepSeek V3.2, a model capable of approaching Google's powerful Gemini 3 Pro, but at an absurdly low price: just 42 cents per million output tokens. They also released DeepSeekMath-V2, specialized in mathematics.
In their report, they openly admitted that their models were still inferior to Western closed models in world knowledge, but stated that they planned to solve this through much larger-scale pretraining.
Two other promising labs also released extremely solid models:
- InclusionAI, with Ling-1T;
- Meituan's LongCat, with the 560B-parameter LongCat-Flash-Thinking.
Meanwhile, Western releases were relegated to competing among mid-sized models: Microsoft's Phi series, Mistral Large 3. Neither led even that category.
It’s strange to think that by the end of 2025, the top three open models in the world are entirely Chinese. And that the default family for research, customization, and experimentation is as well.
The Middle Kingdom has not only entered the map—it has taken over the entire map.
