InternVL3-78B VS VILA1.5-13B

Model Name | Developer | Release date | Parameter count | Comprehensive score
InternVL3-78B | Shanghai AI Laboratory & Tsinghua University | June 1, 2025 | 78.4B | 4.5
VILA1.5-13B | NVIDIA | March 1, 2025 | 13B | 2.4

Brief Comparison of InternVL3-78B VS VILA1.5-13B AI Models

Comprehensive Evaluation

Both models perform poorly overall: in multimodal reasoning they severely misinterpret visual details and reason illogically.

Multimodal Reasoning

Both InternVL3-78B and VILA1.5-13B are weak at multimodal reasoning: they severely misinterpret visual information, and their cross-modal reasoning is shallow and disorganized.

Multimodal Creation

Both InternVL3-78B and VILA1.5-13B are weak at multimodal creation: their language is severely disconnected from the visual content, and their creative output is shallow and disorganized.
