TAU2
Measured May 14, 2026
Score: 0.23
Nous Research
Hermes 4 is a fine-tuned version of Llama-3.1 70B, specifically optimized for enhanced reasoning and chain-of-thought capabilities. It excels at complex problem-solving, logical deduction, and following intricate instructions, making it suitable for tasks requiring deep analysis. As part of the Hermes series, it maintains strong tool-use and coding proficiency.
Benchmark history
Score: 0.23
Score: 0.05
Score: 0.07
Score: 0.31
Score: 0.69
Score: 0.34
Score: 0.65
Score: 0.08
Score: 0.7
Score: 0.81
Score: 68.7
Score: 14.4
Score: 16