AIME
Measured May 29, 2026
Score: 0.05
This is the largest multimodal instruction-tuned model in the Meta Llama 3.2 series, featuring 90 billion parameters and support for both image and text inputs. It excels at visual understanding and complex reasoning tasks, making it suitable for sophisticated applications requiring the processing of both images and text.
Benchmark history
Score: 0.05
Score: 0.63
Score: 0.24
Score: 0.21
Score: 0.05
Score: 0.43
Score: 0.67
Score: 11.9
Score: 0.21
Score: 0.01
Score: 0.12
Score: 0.3
Score: 0.03
Score: 3.3
Score: 4.2