TAU2
Measured May 29, 2026
Score
0.16
Step3 VL 10B is a 10-billion-parameter multimodal vision-language model developed by StepFun. It is designed to process both visual and textual input.
Benchmark history
Score
0.16
Score
0.05
Score
0
Score
0.5
Score
0.31
Score
0.1
Score
0.69
Score
13.9
Score
15.5
Plan availability