TAU2
Measured May 14, 2026
Score
0.19
Qwen3 VL 30B A3B Instruct is a multimodal vision-language model from Alibaba's Qwen3 series. It accepts both image and text inputs and uses a Mixture-of-Experts architecture (about 30B total parameters, of which about 3B are active per token) for efficient inference. The model is instruction-tuned to follow user prompts in visual and language tasks.
Benchmark history
Score
0.19
Score
0.06
Score
0.24
Score
0.33
Score
0.72
Score
0.31
Score
0.48
Score
0.06
Score
0.7
Score
0.76
Score
72.3
Score
14.3
Score
16
Score
0.94
Score
0.98