Open source progress continues with the release of GLM-4.6 in Cline
The gap between open and closed source AI models is narrowing faster than expected. This week's releases of Sonnet 4.5 and GLM-4.6 exemplify this convergence, with real-world data showing performance differences of roughly a percentage point, not the 5-10 point gaps of just a few months ago.

This week marked a notable moment in AI development. Monday saw Anthropic release Claude Sonnet 4.5, the latest iteration of its flagship model. Tuesday brought GLM-4.6 from Z.ai. Both models were well received, but the community response revealed something deeper: the performance gap between premium and open source models has narrowed to a remarkable degree.
An open source convergence
The Artificial Analysis Coding Index has been tracking this trend for months. Open source models are catching up to closed source models in general intelligence, and even faster in coding ability. This isn't speculation; it's measurable in benchmarks, in real-world usage, and now in our own data from millions of Cline operations.

What the diff edit data reveals
Diff edits are among the hardest tests for AI coding models. They require understanding context, maintaining consistency, and making precise, surgical changes to existing code. Unlike generating new code from scratch, diff edits test whether a model truly understands the code it's modifying.
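To make concrete what a diff edit involves, here's a minimal sketch of a search/replace-style edit, illustrating the strictness that makes these operations easy to fail. This is an illustration of the general technique, not Cline's actual implementation:

```python
# Minimal sketch of a search/replace diff edit. The edit only succeeds if
# the model reproduces the existing code exactly; a single whitespace or
# punctuation mismatch means the edit cannot be applied.

def apply_diff_edit(source: str, search: str, replace: str) -> str | None:
    """Apply one search/replace edit, or return None on failure."""
    if search not in source:
        return None  # hallucinated or stale context: the edit fails
    return source.replace(search, replace, 1)

original = "def greet(name):\n    print('hello', name)\n"
edited = apply_diff_edit(
    original,
    search="    print('hello', name)\n",
    replace="    print(f'hello, {name}!')\n",
)
assert edited is not None
```

A model that merely paraphrases the surrounding code, rather than reproducing it character for character, fails this test, which is why diff edit success rate is such a sensitive measure of real comprehension.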
We analyzed millions of diff edit operations from Cline users over the past four months. The results tell a clear story about this week's releases.

The performance clustering is striking. Claude Sonnet 4 achieves a 95.8% success rate on diff edits. Claude Sonnet 4.5 edges slightly higher at 96.2%. GLM-4.6 comes in at 94.9%. These differences are measurable but marginal; we're talking about a gap of roughly a percentage point, not several.
For context, just three months ago, the gap between premium and open models on these tasks was 5-10 percentage points. The convergence is accelerating.
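For the curious, the aggregation behind numbers like these is straightforward. The sketch below assumes a simple (model, succeeded) event log; the field names are illustrative, not Cline's actual telemetry schema:

```python
# Hedged sketch: compute per-model diff edit success rates and the spread
# across models from a log of (model, succeeded) events.
from collections import defaultdict

def success_rates(events: list[tuple[str, bool]]) -> dict[str, float]:
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [successes, attempts]
    for model, ok in events:
        totals[model][0] += int(ok)
        totals[model][1] += 1
    return {m: 100.0 * s / n for m, (s, n) in totals.items()}

events = [("glm-4.6", True), ("glm-4.6", True), ("claude-sonnet-4.5", True),
          ("glm-4.6", False), ("claude-sonnet-4.5", True)]
rates = success_rates(events)
spread = max(rates.values()) - min(rates.values())
print(rates, f"spread: {spread:.1f} percentage points")
```

Run over the real data, that spread is what has collapsed: from 5-10 percentage points to about one.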
Community sentiment tells the story
The Cline Discord lit up this week with discussions about both releases. Users testing Sonnet 4.5 reported noticeable improvements, particularly noting they "only have to correct it half as much" compared to previous versions. The model follows instructions better and produces cleaner initial outputs.
GLM-4.6 generated particular enthusiasm. Users described it as "close to Sonnet at a fraction of the cost" and noted it "benchmarks alongside Sonnet 4.5" in their testing. The excitement wasn't just about raw performance; it was about accessibility.
The economics can't be ignored
The cost differential between these models is substantial:
- Claude Sonnet 4.5: $3 per million input tokens, $15 per million output tokens
- GLM-4.6: $0.50 per million input tokens, $1.75 per million output tokens
- Qwen3 Coder: $0.22 per million input tokens, $0.95 per million output tokens
Z.ai's GLM Coding Plan takes this further, offering GLM-4.6 access for just $6/month with 120 prompts per 5-hour cycle. For many developers, this transforms AI coding from a luxury to a utility.
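To put those per-token prices in concrete terms, here's the arithmetic for a hypothetical coding session of 200k input tokens and 20k output tokens. The workload is an assumption for illustration, not measured usage:

```python
# Illustrative cost comparison using the per-token prices listed above.
# The session size (200k input, 20k output tokens) is an assumed workload.
PRICES = {  # (input $/M tokens, output $/M tokens)
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GLM-4.6": (0.50, 1.75),
    "Qwen3 Coder": (0.22, 0.95),
}

def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_price, out_price = PRICES[model]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

for model in PRICES:
    print(f"{model}: ${session_cost(model, 200_000, 20_000):.2f} per session")
# Roughly: Claude Sonnet 4.5 $0.90, GLM-4.6 $0.14, Qwen3 Coder $0.06
```

At these assumed volumes, the premium model costs about six times more per session than GLM-4.6 for a roughly one-point difference in diff edit success rate.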
What this convergence means
The trend lines are clear. Open source models continue to improve at a faster rate than closed source models. While that doesn't mean open source will certainly overtake the closed source frontier, each release narrows the gap further. GLM-4.6 achieving a roughly 95% success rate on diff edits would have been unthinkable for an open model six months ago.
The convergence extends beyond cloud models. This week, AMD demonstrated that models like Qwen3 Coder can run effectively on consumer hardware with just 32GB RAM. The gap isn't just closing in the cloud; it's closing on your laptop.
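If you want to experiment locally, most local runtimes expose an OpenAI-compatible API. The sketch below assumes a server like Ollama is already hosting a quantized Qwen3 Coder build; the URL, port, and model name are assumptions about your setup, not a verified configuration:

```python
# Minimal sketch of querying a locally hosted model through an
# OpenAI-compatible endpoint (Ollama's default port is 11434).
import json
import urllib.request

payload = {
    "model": "qwen3-coder",  # whatever name your local server registered
    "messages": [{"role": "user", "content": "Write a Python hello world."}],
}
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])
```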
As models converge in capability, differentiation will come from ecosystem, tooling, and specialized features. The fundamentals of good code generation are becoming commoditized. What matters next is how these models integrate into developer workflows.
Ready to try these models? Download Cline and experiment with both GLM-4.6 and Sonnet 4.5. Join the conversation on Reddit and Discord to share your experiences.