TAU2
Measured May 14, 2026
Score: 0.78
Claude 4.5 Sonnet (Reasoning) is a model optimized for complex reasoning tasks. It uses an extended thinking chain to break down and solve multi-step problems, and is aimed at analysis, planning, and logical deduction.
Benchmark history
Score: 0.78
Score: 0.36
Score: 0.66
Score: 0.57
Score: 0.88
Score: 0.45
Score: 0.71
Score: 0.17
Score: 0.83
Score: 0.88
Score: 88
Score: 38.6
Score: 43
Plan availability
