TAU2
Measured May 14, 2026
Score: 0.22
Nous Research
Hermes 4 is Nous Research's latest flagship model, fine-tuned from Meta's Llama-3.1 405B. It is specifically optimized for complex reasoning tasks, offering strong instruction-following and conversational capabilities.
Benchmark history
Score: 0.22
Score: 0.11
Score: 0.21
Score: 0.33
Score: 0.7
Score: 0.25
Score: 0.69
Score: 0.1
Score: 0.73
Score: 0.83
Score: 69.7
Score: 16
Score: 18.6