TAU2
Measured May 14, 2026
Score: 0.25
A lightweight 4B-parameter reasoning model from Alibaba's Qwen3 series, optimized for instruction following and logical reasoning tasks. It offers a balance of performance and efficiency for resource-constrained deployments.
Benchmark history
Score: 0.25
Score: 0.02
Score: 0.38
Score: 0.5
Score: 0.83
Score: 0.26
Score: 0.64
Score: 0.06
Score: 0.67
Score: 0.74
Score: 82.7
Score: 9.5
Score: 18.2
Score: 0.94
Score: 0.98
Plan availability
