Alibaba
Alibaba develops Tongyi Qianwen (Qwen) models and provides generative AI services via Alibaba Cloud. It is a major player in enterprise AI infrastructure and open-source LLMs.
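Qwen models on Alibaba Cloud are typically consumed through an OpenAI-compatible chat-completions API. The sketch below only assembles a request payload; the endpoint URL and the `qwen-plus` model name are illustrative assumptions, not details taken from this page.

```python
import json

# Assumed OpenAI-compatible endpoint for Alibaba Cloud Model Studio
# (illustrative; check the provider's documentation for the current URL).
BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a chat-completions payload for an OpenAI-compatible endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("qwen-plus", "Summarize MoE inference in one line.")
print(json.dumps(payload, indent=2))
```

Sending the payload is then a single authenticated POST to `BASE_URL + "/chat/completions"` with whatever HTTP client you prefer.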
Region
China
Updated
May 14, 2026
Model coverage
Models from this provider
Alibaba
QwQ 32B
QwQ 32B is a 32-billion parameter language model from Alibaba, designed to deliver strong reasoning and coding capabilities. It offers a balanced performance-to-cost ratio, making it suitable for a wide range of general-purpose and specialized tasks.
Input / 1M tokens
$0.66
Output tokens/s
31.39
First-token seconds
0.46s
Artificial Analysis Intelligence Index
19.7
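The per-model metrics listed here (input price per 1M tokens, output tokens/s, first-token seconds) combine into a rough per-request estimate. A minimal sketch using the QwQ 32B figures above; output pricing is not shown on this page, so only input cost is computed, and the latency model (time to first token plus output tokens divided by throughput) is a simplifying assumption.

```python
def estimate_request(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, tokens_per_s: float,
                     first_token_s: float) -> tuple[float, float]:
    """Rough input cost (USD) and end-to-end latency (s) for one request."""
    cost = input_tokens / 1_000_000 * input_price_per_m
    latency = first_token_s + output_tokens / tokens_per_s
    return cost, latency

# QwQ 32B figures from the listing: $0.66 / 1M input, 31.39 tok/s, 0.46 s TTFT
cost, latency = estimate_request(10_000, 500, 0.66, 31.39, 0.46)
print(f"input cost ≈ ${cost:.4f}, latency ≈ {latency:.1f}s")
```

The same function works for any entry on this page; real deployments also pay for output tokens and may see different throughput under load.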
Alibaba
QwQ 32B-Preview
QwQ 32B-Preview is a 32-billion parameter reasoning model developed by Alibaba's Qwen team. It is specifically designed to excel at complex reasoning tasks, particularly in mathematics and coding, utilizing reinforcement learning to enhance its problem-solving capabilities. The model features a 'thinking' mode that allows it to break down problems step-by-step before providing a final answer.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
15.2
Alibaba
Qwen Chat 14B
Qwen Chat 14B is a mid-sized, general-purpose conversational model from Alibaba's Qwen series. It offers a balanced performance between capability and efficiency, optimized for dialogue, reasoning, and code generation tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
7.4
Alibaba
Qwen Chat 72B
Qwen Chat 72B is a large-parameter chat model from Alibaba's Qwen series, optimized for conversational interactions. It features strong reasoning capabilities, supports long contexts, and is proficient in multiple languages including Chinese and English.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
8.8
Qwen Image
Qwen-Image-2.0
Alibaba's Qwen image generation model version 2.0.
Qwen Image
Qwen-Image-2.0-Pro
The Pro version of Alibaba's Qwen image generation model, version 2.0.
Qwen1.5
Qwen1.5 Chat 110B
Qwen1.5 Chat 110B is a large-scale language model from Alibaba's Qwen series, featuring 110 billion parameters. It excels in complex reasoning, code generation, and supports long-context understanding and multi-modal inputs.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
9.5
Qwen2
Qwen2 Instruct 72B
Qwen2 Instruct 72B is a large-scale instruction-tuned language model from Alibaba's Qwen series. It features strong reasoning, code generation, and multilingual capabilities, optimized for complex instruction following and dialogue tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
11.7
Qwen 2.5
Qwen2.5 Coder Instruct 32B
Qwen2.5 Coder Instruct 32B is a 32-billion parameter language model from Alibaba, specifically optimized for coding tasks. It excels at code generation, completion, and understanding across multiple programming languages, following instructions effectively for developer workflows.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.9
Qwen 2.5
Qwen2.5 Coder Instruct 7B
Qwen2.5 Coder Instruct 7B is a specialized code generation model from Alibaba's Qwen series, optimized for tasks like code completion, generation, and debugging. As a 7B parameter model, it offers a balance of strong coding performance and efficient inference speed, making it suitable for deployment in resource-constrained environments.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10
Qwen 2.5
Qwen2.5 Instruct 32B
Qwen2.5 Instruct 32B is a mid-sized, instruction-tuned language model from Alibaba's Qwen series. It excels at following instructions, multilingual tasks, and code generation while maintaining strong reasoning capabilities. The model supports a long context window of up to 128K tokens.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
13.2
Qwen 2.5
Qwen2.5 Instruct 72B
Qwen2.5 Instruct 72B is a large language model developed by Alibaba's Qwen team, optimized for instruction following and dialogue. It features strong multilingual capabilities, particularly in Chinese, and excels at complex reasoning and code generation tasks. The model supports long context windows for processing extensive information.
Input / 1M tokens
$0.36
Output tokens/s
55.29
First-token seconds
1.06s
Artificial Analysis Intelligence Index
15.6
Qwen 2.5
Qwen2.5 Max
Qwen2.5 Max is Alibaba Cloud's flagship large language model, excelling in complex reasoning, code generation, and multimodal understanding. It supports an extremely long context window and is designed for high-performance enterprise and research applications.
Input / 1M tokens
$1.60
Output tokens/s
48.63
First-token seconds
1.14s
Artificial Analysis Intelligence Index
16.3
Qwen 2.5
Qwen2.5 Turbo
Qwen2.5 Turbo is a high-performance, cost-effective large language model optimized for rapid response times. It is part of Alibaba's Qwen series, designed to deliver strong general capabilities, including coding and reasoning, at a competitive price point.
Input / 1M tokens
$0.05
Output tokens/s
70.19
First-token seconds
1.31s
Artificial Analysis Intelligence Index
12
Qwen3
Qwen3 0.6B (Non-reasoning)
Qwen3 0.6B is a lightweight, non-reasoning variant of the Qwen3 series with only 0.6 billion parameters. It is optimized for fast inference, low latency, and minimal resource consumption, making it suitable for edge deployment, simple conversational tasks, and applications requiring rapid response times.
Input / 1M tokens
$0.11
Output tokens/s
224.09
First-token seconds
0.89s
Artificial Analysis Intelligence Index
5.7
Qwen3
Qwen3 0.6B (Reasoning)
A lightweight reasoning model from the Qwen3 series, optimized for fast inference and cost-effective deployment. It excels in logical reasoning tasks with a focus on chain-of-thought capabilities.
Input / 1M tokens
$0.11
Output tokens/s
225.26
First-token seconds
0.92s
Artificial Analysis Intelligence Index
6.5
Qwen3
Qwen3 1.7B (Non-reasoning)
Qwen3 1.7B is a lightweight language model from Alibaba's Qwen series, optimized for fast and efficient inference. It is designed for non-reasoning tasks, providing quick responses with minimal computational resources.
Input / 1M tokens
$0.11
Output tokens/s
140.28
First-token seconds
1.02s
Artificial Analysis Intelligence Index
6.8
Qwen3
Qwen3 1.7B (Reasoning)
A compact 1.7B parameter model from Alibaba's Qwen3 series, optimized for efficient reasoning tasks. It is designed to deliver strong logical and analytical performance in resource-constrained environments, offering a balance of speed and capability.
Input / 1M tokens
$0.11
Output tokens/s
139.17
First-token seconds
0.99s
Artificial Analysis Intelligence Index
8
Qwen3
Qwen3 14B (Non-reasoning)
Qwen3 14B is a 14-billion parameter model from Alibaba's Qwen3 series, optimized for general-purpose dialogue and instruction following. As a non-reasoning variant, it focuses on efficient and responsive text generation, making it suitable for applications requiring quick, cost-effective, and high-quality conversational AI.
Input / 1M tokens
$0.235
Output tokens/s
64.21
First-token seconds
1.16s
Artificial Analysis Intelligence Index
12.8
Qwen3
Qwen3 14B (Reasoning)
Qwen3 14B (Reasoning) is a 14-billion parameter model from Alibaba's Qwen3 series, specifically optimized for complex reasoning tasks. It excels at chain-of-thought and step-by-step logical problem-solving, offering a strong balance between advanced reasoning capabilities and computational efficiency.
Input / 1M tokens
$0.235
Output tokens/s
64.76
First-token seconds
1.14s
Artificial Analysis Intelligence Index
16.2
Qwen3
Qwen3 235B A22B (Non-reasoning)
Qwen3 235B A22B is a large-scale Mixture-of-Experts (MoE) language model from Alibaba's Qwen series, with a total of 235 billion parameters but only 22 billion activated per inference. This non-reasoning variant is optimized for general-purpose tasks, offering strong multilingual capabilities, coding proficiency, and efficient performance due to its MoE architecture.
Input / 1M tokens
$0.45
Output tokens/s
69.11
First-token seconds
1.2s
Artificial Analysis Intelligence Index
17
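The MoE figures quoted for this entry (235B total parameters, 22B activated per inference) translate directly into inference cost: per-token compute scales with the active parameters, not the total. The sketch below applies the common rule of thumb of roughly 2 FLOPs per active parameter per decoded token; that estimate is an assumption, not a figure from this page.

```python
def moe_inference_profile(total_params_b: float,
                          active_params_b: float) -> tuple[float, float]:
    """Active-parameter fraction and rough decode FLOPs per token (~2N active)."""
    active_fraction = active_params_b / total_params_b
    flops_per_token = 2 * active_params_b * 1e9  # ~2 FLOPs per active param
    return active_fraction, flops_per_token

# Qwen3 235B A22B: 235B total, 22B active per token
frac, flops = moe_inference_profile(235, 22)
print(f"{frac:.1%} of weights active, ~{flops:.2e} FLOPs/token")
```

By this estimate the model decodes with under 10% of its weights active per token, which is why MoE entries on this page can price and serve like much smaller dense models.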
Qwen3
Qwen3 235B A22B (Reasoning)
Qwen3 235B A22B (Reasoning) is a large-scale language model from Alibaba's Qwen3 series, optimized for complex reasoning tasks. It utilizes a Mixture-of-Experts (MoE) architecture with 235B total parameters and 22B activated parameters, balancing high performance with computational efficiency. The model excels in instruction following and multi-step logical reasoning.
Input / 1M tokens
$0.70
Output tokens/s
69.04
First-token seconds
1.31s
Artificial Analysis Intelligence Index
19.8
Qwen3
Qwen3 235B A22B 2507 (Reasoning)
This is a reasoning-optimized variant of the Qwen3 235B model from Alibaba Cloud. It is designed to excel in complex logical, mathematical, and coding tasks that require multi-step reasoning. As a large-scale model, it supports long context windows and is part of the advanced Qwen3 series.
Input / 1M tokens
$0.40
Output tokens/s
59
First-token seconds
1.21s
Artificial Analysis Intelligence Index
29.5
Qwen3
Qwen3 235B A22B 2507 Instruct
Qwen3 235B A22B is a large-scale Mixture-of-Experts (MoE) language model from Alibaba's Qwen series. It features 235 billion total parameters with 22 billion activated per token, designed for strong instruction following, complex reasoning, and multilingual tasks.
Input / 1M tokens
$0.20
Output tokens/s
68.67
First-token seconds
1.25s
Artificial Analysis Intelligence Index
25
Qwen3
Qwen3 30B A3B (Non-reasoning)
Qwen3 30B A3B is a 30-billion parameter model from Alibaba's Qwen3 series, optimized for general-purpose instruction following and fast response generation. As a non-reasoning variant, it prioritizes efficiency and speed over complex chain-of-thought tasks, making it suitable for cost-sensitive and latency-critical applications.
Input / 1M tokens
$0.08
Output tokens/s
67.46
First-token seconds
1.35s
Artificial Analysis Intelligence Index
12.5
Qwen3
Qwen3 30B A3B (Reasoning)
Qwen3 30B A3B is a reasoning-optimized language model from Alibaba, designed for enhanced logical inference and problem-solving tasks.
Input / 1M tokens
$0.09
Output tokens/s
67.28
First-token seconds
1.17s
Artificial Analysis Intelligence Index
15.3
Qwen3
Qwen3 30B A3B 2507 (Reasoning)
This is a 30-billion parameter reasoning model from Alibaba's Qwen3 series, optimized for complex logical and analytical tasks. It features enhanced chain-of-thought capabilities to improve accuracy in multi-step problem-solving.
Input / 1M tokens
$0.28
Output tokens/s
148.45
First-token seconds
1.05s
Artificial Analysis Intelligence Index
22.4
Qwen3
Qwen3 30B A3B 2507 Instruct
Qwen3 30B A3B is a 30-billion parameter instruction-tuned model from Alibaba's Qwen3 series, likely utilizing a Mixture-of-Experts architecture with 3 billion active parameters. It is optimized for strong instruction following, reasoning, and multilingual (especially Chinese) performance, balancing capability with inference efficiency.
Input / 1M tokens
$0.15
Output tokens/s
122.46
First-token seconds
1.12s
Artificial Analysis Intelligence Index
15
Qwen3
Qwen3 32B (Non-reasoning)
Qwen3 32B (Non-reasoning) is a 32-billion parameter instruction-tuned model from Alibaba's Qwen series. It is designed for general-purpose dialogue and content generation, balancing performance and efficiency. This model excels at following instructions and handling a wide range of tasks without specialized reasoning modes.
Input / 1M tokens
$0.15
Output tokens/s
104.69
First-token seconds
1.1s
Artificial Analysis Intelligence Index
14.5
Qwen3
Qwen3 32B (Reasoning)
Qwen3 32B (Reasoning) is a 32-billion parameter model from Alibaba's Qwen3 series, specifically optimized for complex reasoning tasks. It excels in chain-of-thought processes, logical deduction, and problem-solving, while also maintaining strong coding and long-context capabilities.
Input / 1M tokens
$0.195
Output tokens/s
103.45
First-token seconds
1.04s
Artificial Analysis Intelligence Index
16.5
Qwen3
Qwen3 4B (Non-reasoning)
Qwen3 4B (Non-reasoning) is a lightweight, 4-billion parameter language model from Alibaba's Qwen3 series, optimized for fast and cost-effective inference. It is designed for general-purpose tasks and edge deployment, offering a balance of performance and efficiency without the overhead of complex reasoning chains.
Input / 1M tokens
$0.11
Output tokens/s
104.23
First-token seconds
0.98s
Artificial Analysis Intelligence Index
12.5
Qwen3
Qwen3 4B (Reasoning)
Qwen3 4B (Reasoning) is a compact 4-billion parameter model from Alibaba's Qwen3 series, optimized for reasoning tasks. It likely incorporates a chain-of-thought or thinking mode to enhance logical problem-solving while maintaining low latency and cost. This model is suitable for deployment in resource-constrained environments requiring efficient reasoning capabilities.
Input / 1M tokens
$0.11
Output tokens/s
103.85
First-token seconds
1s
Artificial Analysis Intelligence Index
14.2
Qwen3
Qwen3 4B 2507 (Reasoning)
A lightweight 4B-parameter reasoning model from Alibaba's Qwen3 series, optimized for instruction following and logical reasoning tasks. It offers a balance of performance and efficiency for resource-constrained deployments.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
18.2
Qwen3
Qwen3 4B 2507 Instruct
Qwen3 4B is a lightweight, efficient instruction-tuned model from Alibaba's Qwen series. It is optimized for fast inference and low-cost deployment while maintaining strong performance in following instructions and general tasks, particularly for Chinese language processing.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.9
Qwen3
Qwen3 8B (Non-reasoning)
Qwen3 8B (Non-reasoning) is an 8-billion parameter instruction-tuned model from Alibaba's Qwen3 series, optimized for general-purpose dialogue and instruction-following tasks. It is designed for fast response speeds and cost-effective deployment, making it suitable for applications requiring efficient and capable language understanding without complex reasoning chains.
Input / 1M tokens
$0.18
Output tokens/s
92.36
First-token seconds
1.03s
Artificial Analysis Intelligence Index
10.6
Qwen3
Qwen3 8B (Reasoning)
Qwen3 8B (Reasoning) is a lightweight, 8-billion parameter model from Alibaba's Qwen3 series, optimized for instruction following and reasoning tasks. It delivers strong logical and analytical performance while maintaining fast inference speeds suitable for real-time applications.
Input / 1M tokens
$0.11
Output tokens/s
87.15
First-token seconds
1.03s
Artificial Analysis Intelligence Index
13.2
Qwen3
Qwen3 Coder 30B A3B Instruct
Qwen3 Coder 30B A3B Instruct is a code-specialized model from Alibaba's Qwen3 series. It features a 30B total parameter size with a 3B active parameter architecture (likely a Mixture-of-Experts design), optimized for code generation, understanding, and instruction following.
Input / 1M tokens
$0.19
Output tokens/s
104.32
First-token seconds
1.5s
Artificial Analysis Intelligence Index
20
Qwen3
Qwen3 Coder 480B A35B Instruct
Qwen3 Coder 480B A35B Instruct is a large-scale, code-specialized language model from Alibaba's Qwen series. It features a Mixture-of-Experts (MoE) architecture with 480 billion total parameters and 35 billion active parameters, designed for high-performance code generation, understanding, and instruction following.
Input / 1M tokens
$0.30
Output tokens/s
68.64
First-token seconds
1.66s
Artificial Analysis Intelligence Index
24.8
Qwen3
Qwen3 Coder Next
Qwen3 Coder Next is a coding-focused model from Alibaba's Qwen3 series with strong agentic capabilities, trained on executable task synthesis and reinforcement learning. It is available through Alibaba Cloud Model Studio.
Input / 1M tokens
$0.35
Output tokens/s
152.98
First-token seconds
0.85s
Artificial Analysis Intelligence Index
28.3
Qwen3
Qwen3 Max
Qwen3 Max is Alibaba Cloud's flagship large language model, designed for high-performance general tasks. It features strong multimodal understanding, a 128K long context window, and excels in complex reasoning and code generation.
Input / 1M tokens
$1.66
Output tokens/s
32.81
First-token seconds
1.93s
Artificial Analysis Intelligence Index
31.4
Qwen3
Qwen3 Max (Preview)
Qwen3 Max (Preview) is the latest flagship model from Alibaba's Qwen series, designed for high-performance enterprise applications. It features enhanced reasoning and coding capabilities, supports an ultra-long context window, and is optimized for complex analytical tasks.
Input / 1M tokens
$1.20
Output tokens/s
44.87
First-token seconds
1.77s
Artificial Analysis Intelligence Index
26.1
Qwen3
Qwen3 Max Thinking
Qwen3 Max Thinking is a high-end model from Alibaba's Qwen3 series, optimized for complex reasoning tasks. It features an enhanced thinking mode for deeper analysis and supports long-context processing.
Input / 1M tokens
$1.20
Output tokens/s
48.94
First-token seconds
1.44s
Artificial Analysis Intelligence Index
39.8
Qwen3
Qwen3 Max Thinking (Preview)
Qwen3 Max Thinking (Preview) is an advanced AI model from Alibaba, designed for enhanced reasoning and chain-of-thought capabilities. It excels in complex problem-solving and logical tasks, with a focus on deep thinking modes.
Input / 1M tokens
$1.20
Output tokens/s
45.61
First-token seconds
1.95s
Artificial Analysis Intelligence Index
32.5
Qwen3
Qwen3 Next 80B A3B (Reasoning)
Qwen3 Next 80B A3B (Reasoning) is a large language model from Alibaba's Qwen series, optimized for complex reasoning tasks. It utilizes a Mixture-of-Experts (MoE) architecture with 80 billion total parameters but only 3 billion active parameters per inference, offering a strong balance between high performance and computational efficiency.
Input / 1M tokens
$0.50
Output tokens/s
169.83
First-token seconds
1.07s
Artificial Analysis Intelligence Index
26.7
Qwen3
Qwen3 Next 80B A3B Instruct
Qwen3 Next 80B A3B Instruct is a large language model from Alibaba's Qwen series, featuring a sparse activation architecture (likely 80B total parameters with ~3B active parameters per token). This design aims to deliver strong reasoning and coding capabilities while significantly improving inference speed and cost-efficiency compared to dense models of similar total size.
Input / 1M tokens
$0.50
Output tokens/s
167.41
First-token seconds
1.06s
Artificial Analysis Intelligence Index
20.1
Qwen3
Qwen3 Omni 30B A3B (Reasoning)
Qwen3 Omni 30B A3B (Reasoning) is a multimodal model from Alibaba's Qwen3 series, optimized for complex reasoning tasks. It processes both text and images, leveraging a 30-billion parameter architecture with 3 billion active parameters for efficient inference.
Input / 1M tokens
$0.25
Output tokens/s
89.62
First-token seconds
1.07s
Artificial Analysis Intelligence Index
15.6
Qwen3
Qwen3 Omni 30B A3B Instruct
Qwen3 Omni 30B A3B Instruct is a multimodal model from Alibaba's Qwen3 series, designed for instruction-following tasks. It features a 30-billion parameter architecture with 3 billion active parameters, balancing performance and efficiency. The model supports both text and image inputs, making it suitable for diverse multimodal applications.
Input / 1M tokens
$0.25
Output tokens/s
110.46
First-token seconds
0.98s
Artificial Analysis Intelligence Index
10.7
Qwen3
Qwen3 VL 235B A22B (Reasoning)
A large vision-language model from Alibaba with 235B total parameters and 22B activated parameters, focused on enhanced reasoning capabilities. It combines visual understanding with language generation, suitable for complex multimodal tasks requiring reasoning.
Input / 1M tokens
$0.84
Output tokens/s
34.36
First-token seconds
1.29s
Artificial Analysis Intelligence Index
27.6
Qwen3
Qwen3 VL 235B A22B Instruct
Qwen3 VL 235B A22B Instruct is a large-scale multimodal model from Alibaba's Qwen series, featuring a 235B total parameter Mixture-of-Experts (MoE) architecture with 22B active parameters. It is designed for advanced visual and language understanding tasks, offering strong reasoning capabilities while maintaining efficiency.
Input / 1M tokens
$0.30
Output tokens/s
46.74
First-token seconds
1.17s
Artificial Analysis Intelligence Index
20.8
Qwen3
Qwen3 VL 30B A3B (Reasoning)
A multimodal vision-language model from Alibaba's Qwen3 series, optimized for reasoning tasks. It features a 30B total parameter architecture with 3B activated parameters, suggesting a Mixture-of-Experts design for efficient inference.
Input / 1M tokens
$0.20
Output tokens/s
125.65
First-token seconds
1.07s
Artificial Analysis Intelligence Index
19.7
Qwen3
Qwen3 VL 30B A3B Instruct
Qwen3 VL 30B A3B Instruct is a multimodal vision-language model from Alibaba's Qwen3 series. It is designed to process both image and text inputs, likely leveraging a Mixture-of-Experts architecture (30B total parameters, 3B active) for efficient inference. The model is instruction-tuned for following user prompts in visual and language tasks.
Input / 1M tokens
$0.20
Output tokens/s
122.57
First-token seconds
1.03s
Artificial Analysis Intelligence Index
16
Qwen3
Qwen3 VL 32B (Reasoning)
Qwen3 VL 32B (Reasoning) is a multimodal vision-language model from Alibaba's Qwen series, optimized for complex reasoning tasks. It integrates visual understanding with strong logical and analytical capabilities, suitable for tasks requiring visual input and step-by-step reasoning.
Input / 1M tokens
$0.70
Output tokens/s
96.7
First-token seconds
1.4s
Artificial Analysis Intelligence Index
24.7
Qwen3
Qwen3 VL 32B Instruct
A 32-billion parameter vision-language model from Alibaba's Qwen3 series, excelling at image understanding, visual question answering, and multimodal reasoning while maintaining good response speed and cost efficiency.
Input / 1M tokens
$0.70
Output tokens/s
54.88
First-token seconds
1.15s
Artificial Analysis Intelligence Index
17.2
Qwen3
Qwen3 VL 4B (Reasoning)
A compact 4-billion parameter multimodal model from the Qwen3 VL series, optimized for visual reasoning tasks. It processes both images and text to perform complex reasoning, making it suitable for applications requiring visual understanding and logical inference.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
13.7
Qwen3
Qwen3 VL 4B Instruct
A lightweight vision-language model from Alibaba's Qwen3 series with 4 billion parameters. It is designed for efficient image understanding and multimodal tasks, offering fast inference and low deployment costs, suitable for edge or resource-constrained scenarios.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
9.6
Qwen3
Qwen3 VL 8B (Reasoning)
Qwen3 VL 8B (Reasoning) is a lightweight, multimodal vision-language model from Alibaba's Qwen series, optimized for enhanced reasoning capabilities. It efficiently processes both text and images, making it suitable for tasks requiring visual understanding and logical inference.
Input / 1M tokens
$0.18
Output tokens/s
134.73
First-token seconds
1.08s
Artificial Analysis Intelligence Index
16.7
Qwen3
Qwen3 VL 8B Instruct
Qwen3 VL 8B Instruct is a lightweight, 8-billion parameter vision-language model from Alibaba's Qwen series. It is designed for efficient multimodal understanding and instruction following, excelling at tasks that require processing both text and images.
Input / 1M tokens
$0.18
Output tokens/s
145.61
First-token seconds
1.09s
Artificial Analysis Intelligence Index
14.3
Qwen3.5
Qwen3.5 0.8B (Non-reasoning)
A lightweight, 0.8 billion parameter model from the Qwen3.5 series, optimized for fast inference and low-cost deployment. It is designed for simple, non-reasoning tasks and is suitable for edge devices or applications requiring rapid response times.
Input / 1M tokens
$0.01
Output tokens/s
105.07
First-token seconds
0.29s
Artificial Analysis Intelligence Index
9.9
Qwen3.5
Qwen3.5 0.8B (Reasoning)
Qwen3.5 0.8B (Reasoning) is a lightweight, 0.8-billion parameter model from Alibaba's Qwen3.5 series, specifically optimized for reasoning tasks. It is designed to deliver strong logical and analytical performance while maintaining a small footprint suitable for edge deployment or low-latency applications.
Input / 1M tokens
$0.01
Artificial Analysis Intelligence Index
10.5
Qwen3.5
Qwen3.5 122B A10B (Non-reasoning)
A large language model from Alibaba's Qwen3.5 series, featuring 122B total parameters with 10B activated (A10B). This non-reasoning variant is optimized for high efficiency and low latency in general-purpose tasks, offering strong multilingual and multimodal capabilities without the overhead of complex reasoning chains.
Input / 1M tokens
$0.40
Output tokens/s
154.57
First-token seconds
1.11s
Artificial Analysis Intelligence Index
35.9
Qwen3.5
Qwen3.5 122B A10B (Reasoning)
Qwen3.5 122B A10B is a large-scale Mixture-of-Experts (MoE) model from Alibaba's Qwen series, optimized for complex reasoning tasks. It features a 122 billion parameter architecture with 10 billion active parameters, balancing high performance with computational efficiency. The model supports an extremely long context window and excels in code generation and logical analysis.
Input / 1M tokens
$0.40
Output tokens/s
153.85
First-token seconds
1.02s
Artificial Analysis Intelligence Index
41.6
Qwen3.5
Qwen3.5 27B (Non-reasoning)
Qwen3.5 27B (Non-reasoning) is a mid-sized language model from Alibaba's Qwen series, optimized for general-purpose tasks without specialized reasoning capabilities. It supports long contexts and is designed for efficient deployment.
Input / 1M tokens
$0.28
Output tokens/s
89.68
First-token seconds
1.39s
Artificial Analysis Intelligence Index
37.2
Qwen3.5
Qwen3.5 27B (Reasoning)
Qwen3.5 27B (Reasoning) is a 27-billion parameter language model from Alibaba, optimized for reasoning tasks with enhanced chain-of-thought capabilities. It is designed for complex problem-solving and logical inference.
Input / 1M tokens
$0.30
Output tokens/s
90.12
First-token seconds
1.36s
Artificial Analysis Intelligence Index
42.1
Qwen3.5
Qwen3.5 2B (Non-reasoning)
A lightweight, 2-billion parameter model from the Qwen3.5 series optimized for fast and cost-effective inference. It is designed for general-purpose conversational tasks and simple applications where low latency and high throughput are prioritized over complex reasoning.
Input / 1M tokens
$0.02
Output tokens/s
324.32
First-token seconds
0.23s
Artificial Analysis Intelligence Index
14.7
Qwen3.5
Qwen3.5 2B (Reasoning)
Qwen3.5 2B (Reasoning) is a lightweight, 2-billion parameter model from Alibaba's Qwen series, specifically optimized for reasoning tasks. It delivers efficient and fast inference while maintaining strong performance on logical and analytical problems.
Input / 1M tokens
$0.02
Artificial Analysis Intelligence Index
16.3
Qwen3.5
Qwen3.5 35B A3B (Non-reasoning)
This is a high-efficiency Mixture-of-Experts (MoE) model from the Qwen3.5 series, featuring 35 billion total parameters with only 3 billion activated per inference. It is optimized for fast response speeds and is the non-reasoning variant, suitable for general-purpose tasks.
Input / 1M tokens
$0.25
Output tokens/s
129.72
First-token seconds
1.17s
Artificial Analysis Intelligence Index
30.7
Qwen3.5
Qwen3.5 35B A3B (Reasoning)
A 35B parameter reasoning model from the Qwen3.5 series, utilizing a Mixture-of-Experts architecture with 3B activated parameters for efficient inference. Optimized for complex reasoning tasks while maintaining strong performance in coding and long-context understanding.
Input / 1M tokens
$0.25
Output tokens/s
122.98
First-token seconds
1.09s
Artificial Analysis Intelligence Index
37.1
Qwen3.5
Qwen3.5 397B A17B (Non-reasoning)
Qwen3.5 397B A17B (Non-reasoning) is a large-scale Mixture-of-Experts (MoE) model from Alibaba's Qwen series. As a non-reasoning variant, it is optimized for general-purpose tasks, offering high performance and efficiency without the overhead of extended chain-of-thought processes. It is well-suited for applications requiring fast and capable responses across coding, instruction following, and general knowledge tasks.
Input / 1M tokens
$0.60
Output tokens/s
52.04
First-token seconds
1.95s
Artificial Analysis Intelligence Index
40.1
Qwen3.5
Qwen3.5 397B A17B (Reasoning)
Qwen3.5 397B A17B is a large-scale reasoning model from Alibaba's Qwen series, featuring 397 billion total parameters with 17 billion active parameters. It is optimized for complex reasoning tasks, multi-step problem solving, and supports a wide range of languages including Chinese and English.
Input / 1M tokens
$0.60
Output tokens/s
51.78
First-token seconds
1.86s
Artificial Analysis Intelligence Index
45
Qwen3.5
Qwen3.5 4B (Non-reasoning)
A lightweight, 4-billion parameter model from the Qwen3.5 series, optimized for efficiency and speed. It is designed for fast inference and low-cost deployment, suitable for edge devices and applications requiring quick responses.
Input / 1M tokens
$0.03
Output tokens/s
203.41
First-token seconds
0.23s
Artificial Analysis Intelligence Index
22.6
Qwen3.5
Qwen3.5 4B (Reasoning)
Qwen3.5 4B (Reasoning) is a lightweight, 4-billion parameter model from Alibaba's Qwen3.5 series, specifically optimized for enhanced reasoning and chain-of-thought capabilities. It offers a strong balance of performance and efficiency, making it suitable for fast inference tasks that require logical deduction.
Input / 1M tokens
$0.03
Output tokens/s
201.95
First-token seconds
0.24s
Artificial Analysis Intelligence Index
27.1
Qwen3.5
Qwen3.5 9B (Non-reasoning)
Qwen3.5 9B (Non-reasoning) is a compact language model from Alibaba's Qwen series, optimized for fast inference and low-cost deployment. It supports coding and multilingual tasks with a 9B parameter size, making it suitable for edge devices and real-time applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
27.3
Qwen3.5
Qwen3.5 9B (Reasoning)
Qwen3.5 9B (Reasoning) is a lightweight, efficient model from Alibaba's Qwen series, specifically optimized for enhanced reasoning and chain-of-thought capabilities. It offers a strong balance of performance, speed, and cost-effectiveness for complex problem-solving tasks.
Input / 1M tokens
$0.10
Output tokens/s
59.31
First-token seconds
0.45s
Artificial Analysis Intelligence Index
32.4
Qwen3.5
Qwen3.5 Omni Flash
Qwen3.5 Omni Flash is a multimodal model from Alibaba's Qwen series, designed for fast and efficient processing of text, images, and potentially other modalities. It is optimized for low-latency applications, making it suitable for real-time interactive scenarios.
Input / 1M tokens
$0.10
Output tokens/s
253.73
First-token seconds
1.08s
Artificial Analysis Intelligence Index
25.9
Qwen3.5
Qwen3.5 Omni Plus
Qwen3.5 Omni Plus is a multimodal large language model from Alibaba's Qwen series, designed for enhanced performance across text, image, and potentially audio inputs. It features strong reasoning and coding capabilities, suitable for complex tasks requiring integrated understanding of different data types.
Input / 1M tokens
$0.40
Output tokens/s
54.68
First-token seconds
1.43s
Artificial Analysis Intelligence Index
38.6
Qwen3.5
Qwen3.5 Plus 2026-02-15
Alibaba's Qwen3.5 Plus multimodal model supporting text, image, and video inputs with 1M-token context window for reasoning and agent workflows.
Input / 1M tokens
$0.40
Qwen3.6
Qwen3.6 27B (Non-reasoning)
Qwen3.6 27B (Non-reasoning) is a mid-sized, general-purpose language model from Alibaba's Qwen3 series. It is optimized for fast response times and cost-efficiency, making it suitable for a wide range of standard tasks without the overhead of complex reasoning chains.
Input / 1M tokens
$0.60
Output tokens/s
63.62
First-token seconds
1.39s
Artificial Analysis Intelligence Index
37.1
Qwen3.6
Qwen3.6 27B (Reasoning)
Qwen3.6 27B (Reasoning) is a mid-sized language model from Alibaba's Qwen series, optimized for complex reasoning tasks. It balances strong logical and analytical capabilities with efficient inference speed, making it suitable for applications requiring step-by-step problem-solving.
Input / 1M tokens
$0.60
Output tokens/s
65.57
First-token seconds
1.54s
Artificial Analysis Intelligence Index
45.8
Qwen3.6
Qwen3.6 35B A3B (Non-reasoning)
This is a 35-billion parameter model from Alibaba's Qwen series, optimized for fast and efficient inference without the overhead of a dedicated reasoning mode. It maintains strong capabilities in coding and multimodal tasks while offering a balance between performance and cost-effectiveness.
Input / 1M tokens
$0.375
Output tokens/s
183.3
First-token seconds
1.4s
Artificial Analysis Intelligence Index
31.5
Qwen3.6
Qwen3.6 35B A3B (Reasoning)
Qwen3.6 35B A3B is a reasoning-focused model from Alibaba's Qwen series. It is designed for complex problem-solving tasks, leveraging a 35B parameter architecture with optimized activation (A3B) for efficient inference.
Input / 1M tokens
$0.248
Output tokens/s
181.98
First-token seconds
1.45s
Artificial Analysis Intelligence Index
43.5
Qwen3.6
Qwen3.6 Max Preview
Qwen3.6 Max Preview is the latest flagship model in Alibaba's Qwen series, designed to deliver top-tier performance in complex reasoning and code generation tasks. As a preview of the Max variant, it likely represents the most capable version within the Qwen3.6 family, optimized for high-quality, long-form outputs.
Input / 1M tokens
$1.30
Output tokens/s
35.75
First-token seconds
2.49s
Artificial Analysis Intelligence Index
51.8
Qwen3.6
Qwen3.6 Plus
Qwen3.6 Plus is a large language model developed by Alibaba as part of the Qwen series. It is designed for general-purpose tasks and likely supports long-context processing and multilingual capabilities.
Input / 1M tokens
$0.50
Output tokens/s
52.42
First-token seconds
1.88s
Artificial Analysis Intelligence Index
50
Wan2.7
Wan2.7-Image
Alibaba's Wan2.7 image generation model.
Wan2.7
Wan2.7-Image-Pro
The Pro version of Alibaba's Wan2.7 image generation model.