one year on
Alibaba open-sources Qwen2.5-Coder-32B, says it matches GPT-4o's coding capabilities
The 32.5-billion-parameter model tops open-source code leaderboards and is competitive with GPT-4o, a sign that Chinese open-weight models are closing the gap with frontier US systems.
Alibaba’s Qwen team today released Qwen2.5-Coder-32B-Instruct, an open-weight code model that the company says matches the coding capabilities of GPT-4o and leads open-source models on EvalPlus, LiveCodeBench and BigCodeBench. The 32.5-billion-parameter model is licensed under Apache 2.0.
The release is part of a broader family spanning six sizes from 0.5B to 32B parameters, all but the 3B variant released under permissive licenses. The 32B flagship scores 73.7 on the Aider code repair benchmark and 65.9 on McEval, which covers more than 40 programming languages. The team also demonstrates the model in code-assistant and Artifacts-style scenarios, including use inside the Cursor editor.
The release positions Qwen2.5-Coder as the current state-of-the-art open code model. It is another sign that Chinese open-weight models, already strong in general language tasks, are catching up specifically in coding. The question now under debate is whether locally run open models have reached parity with cloud-only services for everyday development work.
The record
One year later
Qwen2.5-Coder-32B became a staple of local coding setups through 2025. GPT-4o remained a strong baseline, and later DeepSeek releases and Llama 4 narrowed the gap further. Still, this release was a key moment in mainstreaming the idea that a home PC could rival cloud APIs for code generation.