Instructions to use zai-org/GLM-5.2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use zai-org/GLM-5.2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="zai-org/GLM-5.2")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMultimodalLM

    tokenizer = AutoTokenizer.from_pretrained("zai-org/GLM-5.2")
    model = AutoModelForMultimodalLM.from_pretrained("zai-org/GLM-5.2")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use zai-org/GLM-5.2 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "zai-org/GLM-5.2"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/GLM-5.2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/zai-org/GLM-5.2
- SGLang
How to use zai-org/GLM-5.2 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "zai-org/GLM-5.2" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/GLM-5.2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "zai-org/GLM-5.2" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/GLM-5.2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use zai-org/GLM-5.2 with Docker Model Runner:
    docker model run hf.co/zai-org/GLM-5.2
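The curl calls shown for vLLM (port 8000) and SGLang (port 30000) send the same OpenAI-compatible chat-completions payload, so they can be sketched in Python with the standard library alone. This is a minimal sketch rather than an official client: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and it assumes one of the servers above is already running locally and returns the usual OpenAI-style `choices[0].message.content` response shape.

```python
import json
import urllib.request

MODEL_ID = "zai-org/GLM-5.2"


def build_chat_request(base_url, prompt, model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60.0):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body; requires a running server.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("http://localhost:8000", "What is the capital of France?"))` targets vLLM; swapping the base URL for `http://localhost:30000` targets SGLang. Keeping request construction separate from sending makes the payload easy to inspect without a live server.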
Please FUCK Anthropic
Please FUCK Anthropic
please fuck closeai
Yes
yes do it
yes
There have some way to fuck Anthropic
YES
YES 👍
Yes
scamthropic must be slain by glorious GLM open source, god bless z.ai
yes!
yes
yes
I'm no fan of Anthropic but let's be real. GLM 5.2 notably regressed across the board compared to GLM 5.1 and is generally much worse than Anthropic's, OpenAI's... SOTA models, and they still couldn't compete with Anthropic in coding despite singling it out for gross overfitting.
And benchmark scores like those from Artificial Analysis don't accurately represent the real-world performance of most Chinese models. I've noticed consistent underperformance by Chinese models in the covered domains, as confirmed by organizations like CAISI that use hidden evals. For some reason the Chinese business culture rewards subtle cheating, and their scores are consistently ~5+ points higher than they'd otherwise be, and often much higher.
I'm rooting for OS models like GLM, but every expert in this industry knows this model isn't close to being competitive with Anthropic's. Regardless, if the OS AI community were healthy, and not dominated by toxic, socially inept coders, the discussion "Please Fuck Anthropic" would have been either ignored or roundly criticized. The free-ride stage of AI development is ending. It's now time for proprietary AI companies to make a profit. So of course they were going to transition away from small monthly subscriptions for nearly uncapped use, which didn't even cover the cost of electricity, especially with the rise of token-hungry agents. Anthropic isn't the bad guy.
@phil111 Your claims are factually bankrupt. GLM 5.2 demonstrably improved over 5.1 in capability benchmarks; you're simply parroting FUD without evidence. Meanwhile, Anthropic spent the last year proving it's the industry benchmark for operational incompetence and customer exploitation.
You want to talk about "real-world performance"? Anthropic's "real-world" track record includes: leaking half a million lines of proprietary source code [1]; falsely banning over 60 employees with zero human appeal beyond a Google Form [2]; DMCA-bombing thousands of innocent GitHub repositories in a failed censorship attempt [3]; charging users $200+ because their abuse filter triggered on the filename "HERMES.md" [4]; and silently reducing cache duration from one hour to five minutes, costing power users up to $1,582 before admitting it was a "bug" [5].
You claim Chinese models cheat on benchmarks while Anthropic admitted its own evals completely missed three infrastructure bugs that degraded Claude's quality for weeks [6]. You defend Anthropic's price gouging as "necessary profit" while ignoring they quadrupled API costs for Agent SDK users and tried to wall off Claude Code behind a $100 paywall, only to backtrack when OpenAI used it as marketing ammo [7].
Anthropic isn't just the bad guy; it's a case study in how to alienate developers through arrogance, opacity, and technical failure while gaslighting users that degradation is imaginary.
References
GLM 5.2 is clearly a powerful model, and a great coder, but it did generally regress. For example, on arena.ai it broadly performed worse than 5.1, such as its creative writing ranking dropping from 12 to 29. Anthropic's top models are much stronger across domains (e.g. a #1 ranking in creative writing), while also remaining a little better at coding. Plus GLM 5.2 lost a lot of broad knowledge compared to 5.1. When you grossly overfit a specific domain you inevitably scramble the weights used in other domains.
And I don't exactly take issue with most of the points you raised, but many are overstated or misrepresented, such as why the Pentagon branded Anthropic a supply chain risk. They were just sticking to their stated founding principles. And even if there were infrastructure bugs, failures at complex tasks, etc. that's not Anthropic being bad guys. And yes, the real money is in enterprise so they're turning their backs on the little guy. But this helped them be one of only a few AI companies to actually turn a profit.
Anyways, it's extremely challenging to make a model notably better at a specific task like coding without it becoming worse at everything else, which Anthropic was able to do, but zai couldn't. They basically just made GLM 5.1 Coder and tried to pass it off as their next-generation general-purpose AI model even though 5.2 is generally inferior to 5.1.
