Z AI

Z AI refers to Zhipu AI (智谱AI), a Chinese AI company developing GLM series large language models and foundation models. It provides generative AI services, APIs, and applications such as ChatGLM, positioning itself as a key player in China's LLM ecosystem.

Products: 1
Models: 20
Available: 0
Benchmarks: 15

Region

China

Updated

May 14, 2026

Models from this provider

GLM 5V Turbo Reasoning

GLM 5V Turbo (Reasoning)

GLM 5V Turbo (Reasoning) is a multimodal model from Zhipu AI's GLM series, optimized for fast inference and strong reasoning capabilities. It is designed to handle complex tasks that require logical deduction and visual understanding.

Reasoning, Fast, Multimodal

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

42.9

GLM 4.5

GLM-4.5 (Reasoning)

GLM-4.5 (Reasoning) is a model from Zhipu AI's GLM-4 series, specifically optimized for complex reasoning and problem-solving tasks. It likely employs chain-of-thought or similar techniques to enhance logical deduction and step-by-step analysis.

Reasoning, Long context, Coding

Input / 1M tokens

$0.60

Output tokens/s

47.24

First-token seconds

0.9s

Artificial Analysis Intelligence Index

26.4

GLM 4.5

GLM-4.5-Air

GLM-4.5-Air is a lightweight and efficient variant of the GLM-4.5 series, optimized for fast response times and lower computational costs. It is suitable for applications requiring quick inference and high throughput.

Fast, Cheap

Input / 1M tokens

$0.17

Output tokens/s

71.25

First-token seconds

1.45s

Artificial Analysis Intelligence Index

23.2

GLM 4.5

GLM-4.5V (Non-reasoning)

GLM-4.5V is a multimodal model from Z AI optimized for fast, non-reasoning tasks. It excels at processing visual inputs alongside text and is tuned for efficient, low-latency responses, particularly for Chinese language contexts.

Multimodal, Fast, Coding

Input / 1M tokens

$0.60

Output tokens/s

49.23

First-token seconds

30.02s

Artificial Analysis Intelligence Index

12.7

GLM 4.5

GLM-4.5V (Reasoning)

A multimodal reasoning model from Zhipu AI's GLM series. It is designed to process and reason across both text and visual inputs, excelling at tasks that require integrated understanding and logical deduction.

Reasoning, Multimodal, Coding

Input / 1M tokens

$0.60

Output tokens/s

49.92

First-token seconds

1.1s

Artificial Analysis Intelligence Index

15.1

GLM 4.6

GLM-4.6 (Non-reasoning)

GLM-4.6 (Non-reasoning) is a variant of the GLM-4 series optimized for general-purpose dialogue and content generation tasks, rather than complex reasoning. It offers fast response speeds and is suitable for high-throughput applications.

Coding, Fast, Cheap

Input / 1M tokens

$0.60

Output tokens/s

37.89

First-token seconds

1.01s

Artificial Analysis Intelligence Index

30.2

GLM 4.6

GLM-4.6 (Reasoning)

GLM-4.6 (Reasoning) is a model from the GLM-4 series optimized for complex reasoning tasks. It excels at multi-step logical deduction and problem-solving, often employing chain-of-thought reasoning to enhance accuracy.

Reasoning, Long context

Input / 1M tokens

$0.55

Output tokens/s

35.27

First-token seconds

0.76s

Artificial Analysis Intelligence Index

32.5

GLM 4.6

GLM-4.6V (Non-reasoning)

GLM-4.6V is a multimodal model from Zhipu AI capable of processing both text and images. As a non-reasoning variant, it is optimized for general-purpose tasks, content generation, and multimodal understanding rather than complex chain-of-thought reasoning.

Multimodal, Coding

Input / 1M tokens

$0.30

Output tokens/s

28.76

First-token seconds

9.25s

Artificial Analysis Intelligence Index

17.1

GLM 4.6

GLM-4.6V (Reasoning)

A multimodal reasoning model from the GLM-4 series, designed for advanced visual understanding and complex logical inference tasks. It integrates vision capabilities with strong reasoning performance.

Reasoning, Multimodal, Coding

Input / 1M tokens

$0.30

Output tokens/s

37.2

First-token seconds

1.6s

Artificial Analysis Intelligence Index

23.4

GLM 4.7

GLM-4.7 (Non-reasoning)

GLM-4.7 (Non-reasoning) is a variant of the GLM-4 series from Z AI, optimized for general-purpose tasks without an explicit reasoning or chain-of-thought mode. It focuses on providing fast and cost-effective responses for standard conversational, coding, and everyday tasks.

Coding, Fast, Cheap, Long context

Input / 1M tokens

$0.60

Output tokens/s

105.81

First-token seconds

0.69s

Artificial Analysis Intelligence Index

34.2

GLM 4.7

GLM-4.7 (Reasoning)

GLM-4.7 is a powerful reasoning model from Zhipu AI (Z AI), designed for complex logical and analytical tasks. It supports an ultra-long context window of 128K tokens and is capable of processing multimodal inputs.

Reasoning, Long context, Multimodal

Input / 1M tokens

$0.60

Output tokens/s

107.88

First-token seconds

0.8s

Artificial Analysis Intelligence Index

42.1

GLM 4.7

GLM-4.7-Flash (Non-reasoning)

GLM-4.7-Flash is a lightweight, high-speed variant of the GLM-4 series optimized for low-latency and cost-effective inference. As a non-reasoning model, it focuses on direct and rapid response generation rather than complex chain-of-thought processes. It is well-suited for applications requiring quick, efficient text generation.

Fast, Cheap, Coding

Input / 1M tokens

$0.07

Output tokens/s

122.97

First-token seconds

0.95s

Artificial Analysis Intelligence Index

22.1

GLM 4.7

GLM-4.7-Flash (Reasoning)

GLM-4.7-Flash (Reasoning) is a lightweight, high-speed model from the GLM series, optimized for fast inference and strong reasoning capabilities. It is designed for applications requiring quick, logical responses and complex problem-solving.

Reasoning, Fast, Cheap

Input / 1M tokens

$0.07

Output tokens/s

87.93

First-token seconds

0.87s

Artificial Analysis Intelligence Index

30.1

GLM 5

GLM-5 (Non-reasoning)

GLM-5 (Non-reasoning) is a variant of the GLM-5 series optimized for high-speed, low-latency responses. It excels in tasks requiring quick turnaround and cost efficiency, while maintaining strong capabilities in coding, multimodal understanding, and long-context processing.

Coding, Fast, Cheap, Long context, Multimodal

Input / 1M tokens

$1.00

Output tokens/s

66.6

First-token seconds

1.36s

Artificial Analysis Intelligence Index

40.6

GLM 5

GLM-5 (Reasoning)

GLM-5 (Reasoning) is the latest generation large language model from Zhipu AI, specifically optimized for complex reasoning tasks. It features enhanced logical deduction and chain-of-thought capabilities, and is part of the multimodal GLM model family.

Reasoning, Multimodal

Input / 1M tokens

$1.00

Output tokens/s

84.04

First-token seconds

0.68s

Artificial Analysis Intelligence Index

49.8

GLM 5

GLM-5 Fast

GLM-5 Fast is part of Z AI's GLM series of large language models, which includes GLM-5 and GLM-5.1 and is designed for coding, reasoning, and multimodal tasks. The models are offered through the Z.ai platform.

GLM 5

GLM-5-Turbo

GLM-5-Turbo is part of Z AI's GLM-5 series, which supports multimodal inputs, complex coding, reasoning, and long-context tasks. Z AI makes its models available via API and through open-source releases on platforms like Hugging Face.

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

46.8

GLM 5.1

GLM-5.1 (Non-reasoning)

GLM-5.1 (Non-reasoning) is a variant of the GLM-5.1 model optimized for faster response times and cost-efficiency by omitting the dedicated reasoning/thinking process. It is suitable for general-purpose tasks, coding, and multimodal interactions where rapid output is prioritized over complex chain-of-thought reasoning.

Coding, Fast, Cheap, Multimodal

Input / 1M tokens

$1.40

Output tokens/s

41.88

First-token seconds

1.16s

Artificial Analysis Intelligence Index

43.8

GLM 5.1

GLM-5.1 (Reasoning)

GLM-5.1 (Reasoning) is a large language model from Z AI (Zhipu AI) specifically optimized for complex reasoning tasks. It excels at multi-step logical deduction, problem-solving, and analysis, making it suitable for applications requiring deep thought and structured output.

Reasoning, Coding

Input / 1M tokens

$1.40

Output tokens/s

51.55

First-token seconds

0.88s

Artificial Analysis Intelligence Index

51.4

CodeGeex4 All 9B

codegeex4-all-9b

CodeGeeX4-ALL-9B is part of Z AI's CodeGeeX4 series, a versatile model for AI-assisted software development scenarios including code completion, code interpreter, web search, function calling, and repository-level Q&A.
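
The per-model metrics listed on this page (input price per 1M tokens, output tokens/s, first-token seconds) can be combined into rough per-request estimates. The following is a minimal sketch, not an official calculator: the function names and the example token counts (20,000 input tokens, 500 output tokens) are assumptions made for illustration, and the model figures are the ones listed for GLM-4.5 (Reasoning). It assumes a steady output rate after the first token and ignores output-token pricing, which this page does not list.

```python
def estimate_response_seconds(first_token_seconds, output_tokens_per_second, output_tokens):
    """Rough end-to-end time: wait for the first token, then stream at a steady rate."""
    return first_token_seconds + output_tokens / output_tokens_per_second


def estimate_input_cost_usd(input_price_per_million, input_tokens):
    """Input-side cost only; output pricing is not listed on this page."""
    return input_price_per_million * input_tokens / 1_000_000


# Listed figures for GLM-4.5 (Reasoning): $0.60 per 1M input tokens,
# 47.24 output tokens/s, 0.9 s to first token.
# The token counts below are hypothetical.
seconds = estimate_response_seconds(0.9, 47.24, 500)
cost = estimate_input_cost_usd(0.60, 20_000)

print(f"{seconds:.1f} s, ${cost:.4f}")  # prints "11.5 s, $0.0120"
```

For reasoning variants, any tokens spent on the reasoning process would presumably count toward the output token total, so the time estimate is best treated as a lower bound for those models.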
