TAU2
Measured May 29, 2026
Score: 0.25
A lightweight 4B-parameter reasoning model from Alibaba's Qwen3 series, optimized for instruction following and logical reasoning tasks. It offers a balance of performance and efficiency for resource-constrained deployments.
Benchmark history
Score: 0.25
Score: 0.02
Score: 0.38
Score: 0.5
Score: 0.83
Score: 0.26
Score: 0.64
Score: 0.06
Score: 0.67
Score: 0.74
Score: 82.7
Score: 9.5
Score: 18.2
Score: 0.94
Score: 0.98