TAU2
Measured May 29, 2026
Score: 0.22
Hermes 4 is Nous Research's latest flagship model, fine-tuned from Meta's Llama-3.1 405B. It is specifically optimized for complex reasoning tasks, offering strong instruction-following and conversational capabilities.
Benchmark history
Score: 0.22
Score: 0.11
Score: 0.21
Score: 0.33
Score: 0.7
Score: 0.25
Score: 0.69
Score: 0.1
Score: 0.73
Score: 0.83
Score: 69.7
Score: 16
Score: 18.6