Chinese AI Just Beat Claude Code Opus (For 1/6 the Cost)

Everything you need to know about GLM-5.2, the open model from China everyone in AI is talking about. What's real, what's hype, and what it means for you.

Jul 01, 2026

In case you missed it:

TLDR… GLM-5.2 is a new open-weight AI model from Zhipu AI (Z.ai) in China: a 750-billion-parameter Mixture-of-Experts model with a 1-million-token memory and an MIT license, so anyone can download and run it for free. It matters because the best FREE, open AI is now only about ~6 months behind the best PAID, closed AI from the US labs, and on one Semgrep security test it out-scored Claude Code for about 1/6 the cost. Here’s what’s real, what’s hype, and how to try it on a Mac or Windows.

A model you’ve never heard of, from a company most people can’t pronounce, just out-scored Claude Code on a hard security test.

For ~1/6 the price.

It’s called GLM-5.2.

It’s open, it’s cheap, and AI researchers are raving!

So I read the actual benchmark from Semgrep and the fine print most people skipped past.

Here’s what’s REAL vs what’s HYPE…

1. So what IS GLM-5.2?

GLM-5.2 is the newest AI model from a Chinese company called Zhipu AI, which goes by Z.ai.

They dropped it on a Saturday, June 13th, and published the model itself 3 days later.

3 things make it different from Claude or ChatGPT.

It’s open-weight. The model’s “brain” (the file that holds everything it learned) is published for anyone to download, run, and tweak. Claude and ChatGPT are the opposite: you rent access, you never get the model itself.
It’s free to use commercially. The weights ship under an MIT license, which is about as permissive as it gets. You can build a business on it and owe Z.ai nothing.
It’s big, but cheaper to run than it looks. It has around 750 billion parameters (the dials that store what it knows), but only about 40 billion switch on for each word. So it runs at a fraction of the cost you’d expect from its size.

It can hold ~1 million tokens in mind at once, roughly 750,000 words. That’s up from 200,000 in the last version.

Translation:

You can feed it an entire book series and it keeps track.

However, “open-weight” is NOT the same as “open-source.”

Open-weight means you get the finished brain, but not the recipe (the training data stays private).

2. Why the AI world lost its mind

Open models from China have been creeping up on the expensive American ones for a while.

GLM-5.2 is the moment a lot of researchers say they finally caught up on the thing that matters most right now:

AI agents that DO tasks, not just chat.

The receipts:

It tops the open-model rankings on the coding and agent tests people actually track (Terminal-Bench, SWE-bench Pro, and friends), the best any open model has done.
On several long, multi-step coding tests it even edges out GPT-5.5, for a fraction of the cost (per VentureBeat).
On a public agent leaderboard, it’s the ONE open model mixing it up with the latest from OpenAI and Anthropic.
Nathan Lambert, one of the most respected open-model researchers, called it “the step change for open agents” and compared the moment to DeepSeek R1, the last time an open model rattled everyone this hard.

Lambert points out the gap between the best closed model and the best open one is now about 6.8 months. Not years. Months!

The expensive, closed AI you pay for today…
is roughly what you’ll be able to download for free in ~6 months.
That’s the big deal!

3. About that “beat Claude” headline (read this part)

This is where most AI news got sloppy, so I’ll be careful.

A security company called Semgrep ran a bunch of AI models against the same test: find a common website security flaw called IDOR (the kind of bug where you change a number in a web address and suddenly see someone else’s data).

The result that made headlines:

GLM-5.2 scored 39% and beat Claude Code Opus (which scored about 32%) on the same bare setup, at about $0.17 per bug found. Roughly 1/6 the cost.

True. But here’s what the headline leaves out, straight from Semgrep’s own write-up:

It was ONE task, one dataset, one run. Not a sweeping “open models have caught up.” In their words: one open model, on this task, under these conditions.
The setup around the model mattered more than the model. Semgrep’s own purpose-built security tool still won by a mile (53 to 61%), whether it ran on GPT or Claude underneath.
It was tested against Claude Code, not Claude’s top model Fable/Mythos. GLM-5.2 did NOT beat Fable 5. It beat the Opus-powered coding agent on a narrow security task.

So the honest version is less sexy than “China beats Claude,” but more useful:

A cheap, open model just matched an expensive, closed one on a real task.
Not everywhere. Not always.
But it happened, and that was UNTHINKABLE just 1 year ago.

4. Why “open” is the whole point

This is the part that actually matters…

Right now, the best AI on earth lives inside a few US companies.

You don’t own it. You rent it.

They set the price. They set the rules.

They decide what it’s allowed to say and do.

And they can change ANY of that overnight. A price hike. A model getting retired. A new rule about what you can ask.

You have no say.

An open-weight model is the ONLY real counterweight to that.

When the model’s brain is a file YOU can download, the power tips back to the PEOPLE:

Price stops being their decision. Free, open models force the expensive labs to compete. Their prices drop whether you ever touch GLM-5.2 or not.
Your data can stay home. You can run an open model fully inside your own computer or company, so nothing you type ever leaves the building. For anyone handling private or client work, that’s a big deal.
Nobody can rug-pull you. Once those weights exist, they exist forever. No subscription to cancel on you, no model getting deprecated, no terms-of-service surprise.
Anyone can build on it. The next great AI app doesn’t need one company’s blessing. That’s how progress stays fast and cheap.

Here’s why GLM-5.2 specifically is a turning point.

For YEARS, “open” meant… “fine for hobbyists, not for real work.”

That’s over.

The best open model is now about ~6 months behind the best closed one… and the gap is hopefully closing.

And look at WHO is carrying open AI right now:

CHINESE labs like Z.ai, while the big US labs become more closed every month.

If “open” loses this race, here’s the future we get: 1 or 2 companies own AI that’s 100X better than everything else, and we are stuck with their decisions.

THAT is why this matters.

Not the benchmark.

The balance of power.

5. How to try GLM-5.2 (the easy ways)

You do not need to install anything to test it.

Easiest: open Z.ai’s chat at chat.z.ai and just talk to it like ChatGPT. You get a free tier, then it caps.
Pay-as-you-go: through OpenRouter or the Z.ai API, it runs roughly $1 per million tokens of input and $3 to $4 per million of output, depending on the provider, and the price keeps dropping as hosts compete. For most people that’s a few dollars a month of real use.
Flat monthly plan: Z.ai’s “GLM Coding Plan” starts at $18/month (Lite), $72/month (Pro), or $160/month (Max), and it plugs GLM-5.2 into coding tools like Cursor and Cline.

For 99% of you, the easy browser chat option #1 is all you need to form an opinion.

6. How to set it up on your computer (Mac vs Windows) (ADVANCED)

Skip this if you just want to use it. The browser chat above is the easy path. This section is for people who want it running on their own machine.

But first, let’s be real…

GLM-5.2 is a 750-BILLION-parameter model.

The full version needs roughly 250+ GB of memory to run.

A normal MacBook or Windows laptop CANNOT run the real thing.

There are only 2 reasons to bother putting it on your computer.

Reason 1: you want GLM-5.2 inside your OWN tools

Say you want it powering a code editor, an automation, or an app on your machine. You install one small program called Ollama, and then any tool on your computer can call GLM-5.2 the same way.

On a Mac: go to ollama.com/download, download the Mac app, and open it. Then open the “Terminal” app and type ollama run glm-5.2:cloud. Sign in when it asks.
On Windows: go to the same ollama.com/download page, run the Windows installer, then open “PowerShell” (search for it in the Start menu) and type ollama run glm-5.2:cloud. Sign in when prompted.

Now this trips people up: you’re still running GLM-5.2 on Ollama’s servers, NOT your machine. You get no privacy and no offline use out of this. The ONLY thing you’re buying is 1 local hookup that every app on your computer can point at. If you don’t need that, close this and just use the browser chat.

Reason 2: you want it 100% private and fully offline

THIS is the version that truly runs on your own hardware, where nothing you type ever leaves the room. It’s also the expensive one.

To run a shrunk-down (2-bit) version offline, you need about 240 GB of memory, which in practice means a maxed-out Mac Studio (256 GB) or a workstation with a serious graphics card plus 256 GB of system RAM. No normal computer comes close.

If that’s you, the simplest path is LM Studio (a free app for Mac and Windows): install it, search “GLM-5.2” in its model browser, pick a quantized version that fits your memory, and download it. Expect about 3-9 words per second on the best consumer machines. Slow AF, but it’s yours.

For everyone else, stick to the browser.

Recap

GLM-5.2 is not “the Claude killer” and anyone screaming that didn’t read the study.

What it IS: proof that the gap between the expensive AI you rent and the free AI you can own shrank from years to months.

That’s good for all of us, no matter which tools you use.

Cheaper AI. More choice. Less lock-in.

I still reach for Claude Code for my daily work.

But I’m watching the open models closely now, and you should too.

The big story isn’t “China beat Claude”
It’s that good AI is getting cheap and hard to lock up.

FAQ

Is GLM-5.2 free? The model weights are free under an MIT license if you download and run them yourself. Using it through a hosted service (the browser chat, an API, or a coding plan) costs money, though far less than the big closed models. There’s no permanent free unlimited API.

Can I run GLM-5.2 on my laptop? Not the full model. It needs around 250 GB of memory, which no normal laptop has. You can use it on any computer through Ollama’s glm-5.2:cloud option (which runs on Ollama’s cloud servers, not your machine) or the browser chat. True offline use needs a 256 GB Mac Studio or a heavy-duty workstation.

Is GLM-5.2 better than Claude or ChatGPT? On some coding and agent benchmarks it’s the best open model and competitive with the top closed ones. For everyday writing and content tasks, Claude, ChatGPT, and Gemini are still smoother for beginners. A benchmark win does not always match your day-to-day experience.

Did GLM-5.2 really beat Claude? On one security test from Semgrep, GLM-5.2 out-scored Claude Code running Opus 4.8, at about 1/6 the cost. But it was one task, one run, it did not beat Claude’s top Fable 5 model, and Semgrep’s own security tool still beat both. Read it as “a cheap open model matched an expensive one once,” not “Claude lost.”

What’s the cheapest way to use GLM-5.2? The browser chat at chat.z.ai to try it free, then pay-as-you-go through OpenRouter or Z.ai (a few dollars a month for most people) if you want it in your own tools.

Do I need to be technical to use it? No. The browser chat works like ChatGPT. You only need technical skills for the run-it-on-your-own-hardware path in Section 6, which most people should skip.

P.S. Need More Help? 👋

1/ Free AI courses

2/ Free AI prompts

3/ Free AI automations

4/ Free AI vibe coding

5/ Ask me anything @ Friday livestream

6/ Free private community for Women Building AI

7/ I built Blotato to grow 1M+ followers in 1 year

Sabrina Ramonov 🍄

Discussion about this post

Ready for more?