GPQA
Measured May 14, 2026
Score
0.85
An enhanced reasoning model from OpenAI's o-series, designed for complex problem-solving with deep chain-of-thought capabilities. It excels in tasks requiring multi-step logical inference and analysis.
Benchmark history
Score
0.85
Score
40.7
Score
0.81
Score
0.37
Score
0.9
Score
0.99
Score
0.41
Score
0.81
Score
0.2
Score
0.85
Score
38.4
Score
0.69
Score
0.71
Score
0.88
Score
88.3
Plan availability
