Kimi

Kimi is an AI chatbot and assistant platform developed by Moonshot AI (月之暗面), a Chinese AI startup based in Beijing. The platform features the Kimi K2.6 model, with capabilities including long-context processing, deep research, code generation (Kimi Code), and agent-based workflows (Agent Swarm). Kimi became popular in China for its long-context understanding, initially supporting contexts of 200K+ tokens. The platform also offers productivity tools for document creation, slide presentations, spreadsheets, and website building.

Products: 1
Models: 8
Available: 0
Benchmarks: 15

Region

China

Updated

May 14, 2026

Model coverage

Models from this provider

8

Kimi K2

Kimi K2

Kimi K2 is a large language model developed by Moonshot AI (Kimi). It is the original release in the Kimi K2 series, on which the later K2 entries in this list build.

Input / 1M tokens

$0.585

Output tokens/s

33.17

First-token seconds

1.3s

Artificial Analysis Intelligence Index

26.3

Kimi K2

Kimi K2 0905

Kimi K2 0905 is a large language model developed by Moonshot AI (Kimi) as part of the Kimi K2 series. This listing tags it for long-context understanding and reasoning.

Long context, Reasoning

Input / 1M tokens

$0.60

Output tokens/s

16.5

First-token seconds

1.85s

Artificial Analysis Intelligence Index

30.9

Kimi K2

Kimi K2 Thinking

Kimi K2 Thinking is a reasoning-focused model from Moonshot AI, designed to excel in complex problem-solving and chain-of-thought tasks. It is optimized for deep analysis and logical inference.

Reasoning, Long context

Input / 1M tokens

$0.60

Output tokens/s

118.72

First-token seconds

0.96s

Artificial Analysis Intelligence Index

40.9

Kimi K2.5

Kimi K2.5 (Non-reasoning)

Kimi K2.5 (Non-reasoning) is a fast-response variant of the Kimi K2.5 series, optimized for low-latency interactions. It excels in rapid content generation, chat, and multimodal understanding tasks where immediate answers are prioritized over deep, step-by-step reasoning.

Fast, Multimodal, Long context

Input / 1M tokens

$0.60

Output tokens/s

56.41

First-token seconds

1.19s

Artificial Analysis Intelligence Index

37.3

Kimi K2.5

Kimi K2.5 (Reasoning)

Kimi K2.5 (Reasoning) is a large language model from Moonshot AI, specifically optimized for complex reasoning tasks. It excels in multi-step logical deduction, mathematical problem-solving, and code analysis, often employing a chain-of-thought approach to enhance accuracy. The model is designed to handle intricate queries that require deep analytical thinking.

Reasoning, Long context

Input / 1M tokens

$0.54

Output tokens/s

47.81

First-token seconds

1.16s

Artificial Analysis Intelligence Index

46.8

Kimi K2.6

Kimi K2.6

Kimi K2.6 is a large language model developed by Moonshot AI, optimized for long-context understanding and complex reasoning tasks. It builds upon the Kimi K2 series, offering enhanced performance in processing extensive information and generating coherent, logical responses.

Long context, Reasoning, Coding

Input / 1M tokens

$0.95

Output tokens/s

40.66

First-token seconds

1.26s

Artificial Analysis Intelligence Index

53.9

Kimi K2.6

Kimi K2.6 (Non-reasoning)

A non-reasoning variant of the Kimi K2.6 model, optimized for fast response times and cost-efficiency. It is suitable for applications requiring quick, economical replies while maintaining the long-context capabilities of the Kimi K2 series.

Fast, Cheap, Long context

Input / 1M tokens

$0.95

Output tokens/s

36.78

First-token seconds

1.26s

Artificial Analysis Intelligence Index

42.9

Kimi

Kimi Linear 48B A3B Instruct

Kimi Linear 48B A3B Instruct is a large language model optimized for efficiency. Its name follows the usual convention for Mixture-of-Experts (MoE) models: 48 billion total parameters, of which about 3 billion are active per token (A3B). The 'Linear' designation points to linear attention mechanisms, which are intended to improve efficiency on long sequences. As an instruction-tuned model, it is designed for instruction following and conversational use.

Coding, Reasoning, Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

14.4
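
The per-model figures above can be combined into rough estimates. The following is a minimal sketch, not an official calculator: the `ModelCard` class, the helper names, and the token counts are hypothetical, and the latency formula (first-token seconds plus output tokens divided by output tokens/s) is a simplifying assumption that ignores queueing, network time, and any hidden reasoning tokens. Only the input price is listed on this page, so output cost is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelCard:
    """One model entry, using the four figures shown on this page."""
    name: str
    input_usd_per_mtok: float             # "Input / 1M tokens"
    output_tokens_per_s: Optional[float]  # "Output tokens/s" (None if not listed)
    first_token_s: Optional[float]        # "First-token seconds" (None if not listed)
    intelligence_index: float             # "Artificial Analysis Intelligence Index"


# Values copied from the listing above.
CARDS = [
    ModelCard("Kimi K2", 0.585, 33.17, 1.3, 26.3),
    ModelCard("Kimi K2 0905", 0.60, 16.5, 1.85, 30.9),
    ModelCard("Kimi K2 Thinking", 0.60, 118.72, 0.96, 40.9),
    ModelCard("Kimi K2.5 (Non-reasoning)", 0.60, 56.41, 1.19, 37.3),
    ModelCard("Kimi K2.5 (Reasoning)", 0.54, 47.81, 1.16, 46.8),
    ModelCard("Kimi K2.6", 0.95, 40.66, 1.26, 53.9),
    ModelCard("Kimi K2.6 (Non-reasoning)", 0.95, 36.78, 1.26, 42.9),
    ModelCard("Kimi Linear 48B A3B Instruct", 0.00, None, None, 14.4),
]


def input_cost_usd(card: ModelCard, input_tokens: int) -> float:
    """Input-side cost only: tokens / 1M * listed price per 1M tokens."""
    return input_tokens / 1_000_000 * card.input_usd_per_mtok


def response_time_s(card: ModelCard, output_tokens: int) -> Optional[float]:
    """Assumed model: time to first token plus steady-state generation time."""
    if card.output_tokens_per_s is None or card.first_token_s is None:
        return None  # speed figures are not listed for this model
    return card.first_token_s + output_tokens / card.output_tokens_per_s


if __name__ == "__main__":
    # Hypothetical request: 200K input tokens, 1,000 output tokens.
    for card in sorted(CARDS, key=lambda c: c.intelligence_index, reverse=True):
        cost = input_cost_usd(card, 200_000)
        seconds = response_time_s(card, 1_000)
        timing = f"{seconds:.1f}s" if seconds is not None else "n/a"
        print(f"{card.name:32} index={card.intelligence_index:5.1f} "
              f"input=${cost:.3f} est. time={timing}")
```

Sorting by the intelligence index is one possible ordering; swapping the sort key for price or estimated time gives a different view of the same eight entries.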
