TAU2
Measured May 14, 2026
Score: 0.28
This is a 30-billion-parameter reasoning model from Alibaba's Qwen3 series, optimized for complex logical and analytical tasks. It features enhanced chain-of-thought capabilities intended to improve accuracy in multi-step problem-solving.
Benchmark history
Score: 0.28
Score: 0.05
Score: 0.59
Score: 0.51
Score: 0.56
Score: 0.91
Score: 0.98
Score: 0.33
Score: 0.71
Score: 0.1
Score: 0.71
Score: 0.81
Score: 56.3
Score: 14.6
Score: 22.4