Z.ai launches GLM-5.2: 1M context, Coding Plan only, API to follow

GLM-5.2 is live on every GLM Coding Plan tier with a usable 1M-token context and High and Max effort levels. No standalone API yet, no open weights yet, and no benchmarks at launch. Here is what is actually confirmed.

3 min readWritten byAgents Directory's profileAgents Directory

Z.ai released GLM-5.2 on June 13, 2026. It is a coding-first iteration of the GLM line with a usable 1M-token context window, and it shipped straight into the GLM Coding Plan rather than the API. If you are on any Coding Plan tier, you already have it. If you were waiting to call it from your own code with an API key, that part is not here yet.

We have added a GLM-5.2 model page and a Z.ai provider page. Both carry the same caveat you will read below: this is a launch on a subscription product, not a public API release, and there are no benchmark numbers to rank it on yet.

What is confirmed

Z.ai kept the launch details narrow. These are the specs the company stated:

  • 1M-token context window. Roughly a 5x jump from GLM-5.1's 200K window, and Z.ai describes it as "usable" across the full length rather than a theoretical maximum.
  • Up to 131,072 output tokens in a single response.
  • Two thinking-effort levels, High and Max. Z.ai recommends Max for complex, multi-step coding. Inside Claude Code the /effort command maps low, medium, and high to High, and xhigh, max, and ultracode to Max.
  • Model id glm-5.2[1m].
  • An Anthropic-compatible endpoint, so agents built for Claude work after a base-URL and key swap.

That is the whole confirmed surface. Z.ai did not publish an architecture breakdown for 5.2. Several outlets report it reuses GLM-5's 744B-parameter Mixture-of-Experts design with 40B active per token, but that is unconfirmed for this specific release, so treat it as reporting rather than spec.

How to use it today

GLM-5.2 is live on every GLM Coding Plan tier: Lite, Pro, Max, and Team. There is no extra charge on top of the subscription. Access is through the plan's Anthropic-compatible endpoint, and Z.ai lists day-one support across eight agentic coding tools: Claude Code, Cline, OpenCode, Roo Code, OpenClaw, Kilo Code, Crush, and Goose.

The plan meters usage in prompts, not tokens, with rolling 5-hour and weekly caps. Rough allowances:

  • Lite: about 400 prompts per week, starting around $18 per month.
  • Pro: about 2,000 prompts per week.
  • Max: about 8,000 prompts per week.
  • Team: seat-based pricing.

One detail worth knowing if you run heavy: the plan normally applies quota multipliers that rise during peak hours, but Z.ai is running a promotion where GLM-5.2 consumes 1x quota during off-peak hours through the end of September.

What is not here yet

This is the part to be clear about, because a lot of the launch-day coverage blurred it.

  • No standalone API. You cannot buy a metered, per-token API key for GLM-5.2 today. Z.ai says the API and chatbot access arrive "next week," with no firm date.
  • No open weights yet. The model is announced as MIT-licensed, and the weights are promised for "next week," but they are not downloadable at launch. Until they drop, this is an open-weight model on paper, not in practice.
  • Not on OpenRouter. As of this writing it has no third-party hosting, so there is no neutral per-token price to quote.
  • No benchmarks. Z.ai published zero scores at launch: no SWE-bench, no Terminal-Bench, no Code Arena. We do not list it on any of our leaderboards yet for that reason. Any "GLM-5.2 beats X" claim circulating right now is extrapolation from older GLM results, not a measured number for 5.2.

We will update the GLM-5.2 model page the moment the API and weights land and the first independent scores publish. Until then, the honest summary is simple: strong on paper, genuinely useful if you live inside a coding tool on the Coding Plan, and unproven everywhere a number would settle it.

Sources

Share: