Z AI

Z AI is Zhipu AI (智谱AI), a Chinese AI company that develops the GLM series of large language models and foundation models. It offers generative AI services, APIs, and applications such as ChatGLM.

Region

China

Updated

May 29, 2026

Product coverage

Products from this provider

Coding plan

GLM Coding Plan

GLM Coding Plan is a subscription service by Z AI (Zhipu AI) designed for AI-powered coding. It provides access to GLM models (GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.5-Air) through official integrations with 20+ coding tools including Claude Code, Cline, Kilo Code, Cursor, and VS Code. Plans include dedicated MCP tools for vision understanding, web search, and repository access.

Codex · Supported, OpenHands · Supported, Kimi Code · Supported

Updated

May 13, 2026

Model coverage

Models from this provider

GLM 5V Turbo Reasoning

GLM 5V Turbo (Reasoning)

GLM 5V Turbo (Reasoning) is a multimodal model from Zhipu AI's GLM series, optimized for fast inference and strong reasoning capabilities. It is designed to handle complex tasks that require logical deduction and visual understanding.

Reasoning · Fast · Multimodal

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

42.9

GLM 4.5

GLM-4.5 (Reasoning)

GLM-4.5 (Reasoning) is a model from Zhipu AI's GLM-4 series, specifically optimized for complex reasoning and problem-solving tasks. It likely employs chain-of-thought or similar techniques to enhance logical deduction and step-by-step analysis.

Reasoning · Long context · Coding

Input / 1M tokens

$0.60

Output tokens/s

49.32

First-token seconds

1.27s

Artificial Analysis Intelligence Index

26.4
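The card metrics can be turned into rough per-request estimates. The sketch below is a minimal illustration, assuming a simple linear model (first-token wait plus steady streaming at the listed rate); the function names and the workload size are hypothetical, and only the input price is used because the card does not list an output price. Reasoning models may also emit extra reasoning tokens, which this estimate ignores.

```python
def input_cost_usd(input_tokens: int, price_per_million_usd: float) -> float:
    """Cost of the prompt alone, given a per-1M-token input price."""
    return input_tokens / 1_000_000 * price_per_million_usd


def estimated_response_seconds(output_tokens: int,
                               first_token_seconds: float,
                               output_tokens_per_second: float) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest."""
    return first_token_seconds + output_tokens / output_tokens_per_second


# Figures copied from the GLM-4.5 (Reasoning) card.
PRICE_PER_MILLION = 0.60
FIRST_TOKEN_SECONDS = 1.27
TOKENS_PER_SECOND = 49.32

# Hypothetical workload: a 250,000-token prompt and a 500-token reply.
cost = input_cost_usd(250_000, PRICE_PER_MILLION)
seconds = estimated_response_seconds(500, FIRST_TOKEN_SECONDS, TOKENS_PER_SECOND)
print(f"input cost: ${cost:.2f}")
print(f"estimated response time: {seconds:.1f}s")
```

The same two functions apply to any other card by swapping in its listed price, first-token time, and output rate.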

GLM 4.5

GLM-4.5-Air

GLM-4.5-Air is a lightweight and efficient variant of the GLM-4.5 series, optimized for fast response times and lower computational costs. It is suitable for applications requiring quick inference and high throughput.

Fast · Cheap

Input / 1M tokens

$0.17

Output tokens/s

80.36

First-token seconds

1.63s

Artificial Analysis Intelligence Index

23.2

GLM 4.5

GLM-4.5V (Non-reasoning)

GLM-4.5V is a multimodal model from Z AI optimized for fast, non-reasoning tasks. It excels at processing visual inputs alongside text and is tuned for efficient, low-latency responses, particularly for Chinese language contexts.

Multimodal · Fast · Coding

Input / 1M tokens

$0.60

Output tokens/s

23.67

First-token seconds

54.52s

Artificial Analysis Intelligence Index

12.7

GLM 4.5

GLM-4.5V (Reasoning)

GLM-4.5V (Reasoning) is a multimodal reasoning model from Zhipu AI's GLM series. It is designed to process and reason across both text and visual inputs, for tasks that require integrated understanding and logical deduction.

Reasoning · Multimodal · Coding

Input / 1M tokens

$0.60

Output tokens/s

22.82

First-token seconds

1.13s

Artificial Analysis Intelligence Index

15.1

GLM 4.6

GLM-4.6 (Non-reasoning)

GLM-4.6 (Non-reasoning) is a variant of the GLM-4 series optimized for general-purpose dialogue and content generation tasks, rather than complex reasoning. It offers fast response speeds and is suitable for high-throughput applications.

Coding · Fast · Cheap

Input / 1M tokens

$0.60

Output tokens/s

45.62

First-token seconds

1.92s

Artificial Analysis Intelligence Index

30.2

GLM 4.6

GLM-4.6 (Reasoning)

GLM-4.6 (Reasoning) is a model from the GLM-4 series optimized for complex reasoning tasks. It excels at multi-step logical deduction and problem-solving, often employing chain-of-thought reasoning to enhance accuracy.

Reasoning · Long context

Input / 1M tokens

$0.55

Output tokens/s

31.96

First-token seconds

1.79s

Artificial Analysis Intelligence Index

32.5

GLM 4.6

GLM-4.6V (Non-reasoning)

GLM-4.6V is a multimodal model from Zhipu AI capable of processing both text and images. As a non-reasoning variant, it is optimized for general-purpose tasks, content generation, and multimodal understanding rather than complex chain-of-thought reasoning.

Multimodal · Coding

Input / 1M tokens

$0.30

Output tokens/s

38.71

First-token seconds

1.41s

Artificial Analysis Intelligence Index

17.1

GLM 4.6

GLM-4.6V (Reasoning)

GLM-4.6V (Reasoning) is a multimodal reasoning model from the GLM-4 series, designed for advanced visual understanding and complex logical inference tasks. It combines vision capabilities with reasoning.

Reasoning · Multimodal · Coding

Input / 1M tokens

$0.30

Output tokens/s

42.54

First-token seconds

1.41s

Artificial Analysis Intelligence Index

23.4

GLM 4.7

GLM-4.7 (Non-reasoning)

GLM-4.7 (Non-reasoning) is a variant of the GLM-4 series from Z AI, optimized for general-purpose tasks without an explicit reasoning or chain-of-thought mode. It focuses on providing fast and cost-effective responses for standard conversational, coding, and everyday tasks.

Coding · Fast · Cheap · Long context

Input / 1M tokens

$0.60

Output tokens/s

82.4

First-token seconds

0.88s

Artificial Analysis Intelligence Index

34.2

GLM 4.7

GLM-4.7 (Reasoning)

GLM-4.7 (Reasoning) is a reasoning model from Zhipu AI (Z AI), designed for complex logical and analytical tasks. It supports a 128K-token context window and can process multimodal inputs.

Reasoning · Long context · Multimodal

Input / 1M tokens

$0.60

Output tokens/s

87.89

First-token seconds

0.88s

Artificial Analysis Intelligence Index

42.1

GLM 4.7

GLM-4.7-Flash (Non-reasoning)

GLM-4.7-Flash is a lightweight, high-speed variant of the GLM-4 series optimized for low-latency and cost-effective inference. As a non-reasoning model, it focuses on direct and rapid response generation rather than complex chain-of-thought processes. It is well-suited for applications requiring quick, efficient text generation.

Fast · Cheap · Coding

Input / 1M tokens

$0.07

Output tokens/s

112.85

First-token seconds

1.27s

Artificial Analysis Intelligence Index

22.1

GLM 4.7

GLM-4.7-Flash (Reasoning)

GLM-4.7-Flash (Reasoning) is a lightweight, high-speed model from the GLM series, optimized for fast inference and strong reasoning capabilities. It is designed for applications requiring quick, logical responses and complex problem-solving.

Reasoning · Fast · Cheap

Input / 1M tokens

$0.07

Output tokens/s

79.77

First-token seconds

0.96s

Artificial Analysis Intelligence Index

30.1

GLM 5

GLM-5 (Non-reasoning)

GLM-5 (Non-reasoning) is a variant of the GLM-5 series optimized for high-speed, low-latency responses. It excels in tasks requiring quick turnaround and cost efficiency, while maintaining strong capabilities in coding, multimodal understanding, and long-context processing.

Coding · Fast · Cheap · Long context · Multimodal

Input / 1M tokens

$1.00

Output tokens/s

62.73

First-token seconds

1.19s

Artificial Analysis Intelligence Index

40.6

GLM 5

GLM-5 (Reasoning)

GLM-5 (Reasoning) is a GLM-5 series large language model from Zhipu AI, specifically optimized for complex reasoning tasks. It features enhanced logical deduction and chain-of-thought capabilities, and is part of the multimodal GLM model family.

Reasoning · Multimodal

Input / 1M tokens

$1.00

Output tokens/s

72.63

First-token seconds

0.66s

Artificial Analysis Intelligence Index

49.8

GLM 5

GLM-5 Fast

Z AI develops the GLM series of large language models, including GLM-5 and GLM-5.1, designed for advanced AI applications like coding, reasoning, and multimodal tasks. These models are offered through the Z.ai platform and feature high parameter counts with efficient architectures.

GLM 5

GLM-5-Turbo

Z AI develops and offers AI models such as the GLM-5 series, which support multimodal inputs, complex coding, reasoning, and long-context tasks. The provider makes models available via API and through open-source releases on platforms like Hugging Face.

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

46.8

GLM 5.1

GLM-5.1 (Non-reasoning)

GLM-5.1 (Non-reasoning) is a variant of the GLM-5.1 model optimized for faster response times and cost-efficiency by omitting the dedicated reasoning/thinking process. It is suitable for general-purpose tasks, coding, and multimodal interactions where rapid output is prioritized over complex chain-of-thought reasoning.

Coding · Fast · Cheap · Multimodal

Input / 1M tokens

$1.40

Output tokens/s

47.14

First-token seconds

0.93s

Artificial Analysis Intelligence Index

43.8

GLM 5.1

GLM-5.1 (Reasoning)

GLM-5.1 (Reasoning) is a large language model from Z AI (Zhipu AI) specifically optimized for complex reasoning tasks. It excels at multi-step logical deduction, problem-solving, and analysis, making it suitable for applications requiring deep thought and structured output.

Reasoning · Coding

Input / 1M tokens

$1.40

Output tokens/s

59.29

First-token seconds

0.83s

Artificial Analysis Intelligence Index

51.4
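One way to compare cards side by side is to divide the Artificial Analysis Intelligence Index by the input price. The sketch below does this for four cards from this listing. The ratio is an illustrative assumption, not a metric published by the listing, and the `Card` type and `index_per_dollar` helper are hypothetical names. Cards listed at $0.00 are left out, since the ratio is undefined for them.

```python
from typing import NamedTuple


class Card(NamedTuple):
    name: str
    input_price_usd: float  # per 1M input tokens
    index: float            # Artificial Analysis Intelligence Index


# Values copied from four of the cards in this listing.
cards = [
    Card("GLM-4.5-Air", 0.17, 23.2),
    Card("GLM-4.7 (Reasoning)", 0.60, 42.1),
    Card("GLM-4.7-Flash (Reasoning)", 0.07, 30.1),
    Card("GLM-5.1 (Reasoning)", 1.40, 51.4),
]


def index_per_dollar(card: Card) -> float:
    """Illustrative ratio only; not a metric published by the listing."""
    return card.index / card.input_price_usd


# Highest index-per-dollar first.
ranked = sorted(cards, key=index_per_dollar, reverse=True)
for card in ranked:
    print(f"{card.name}: {index_per_dollar(card):.1f} index points per input dollar")
```

A ratio like this favors the cheapest models, so it is best read alongside the raw index rather than in place of it.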

CodeGeex4 All 9B

codegeex4-all-9b

Z AI develops and provides the CodeGeeX4 series of AI models, such as CodeGeeX4-ALL-9B, which are versatile models for various AI software development scenarios including code completion, code interpreter, web search, function calling, and repository-level Q&A.
