Model name | Overall (val) | Overall (test) | AR | BVR | B | CR | C | DD | IQG | MR | M | NT | OR-A | OR-HN | OR-P | OR-T | SG | SAR | SIR | SWR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Random | 25.70 | 25.94 | 38.20 | 22.73 | 22.92 | 22.72 | 24.06 | 26.66 | 27.13 | 27.00 | 20.00 | 24.75 | 21.37 | 22.93 | 22.33 | 21.18 | 32.43 | 24.23 | 21.39 | 23.71 |
Medical Special Models | ||||||||||||||||||||
MedVInT | 2.29 | 1.96 | 5.75 | 0.00 | 0.00 | 0.00 | 2.56 | 2.11 | 4.05 | 0.00 | 0.00 | 0.00 | 0.11 | 0.00 | 0.00 | 0.12 | 7.36 | 0.00 | 1.88 | 0.00 |
Med-Flamingo | 12.74 | 11.64 | 6.67 | 10.14 | 9.23 | 11.27 | 6.62 | 13.43 | 12.15 | 6.38 | 8.00 | 18.18 | 9.26 | 18.27 | 11.00 | 11.53 | 12.16 | 5.19 | 8.47 | 11.43 |
LLaVA-Med | 20.54 | 19.60 | 24.51 | 17.83 | 17.08 | 19.86 | 15.04 | 19.81 | 20.24 | 21.51 | 13.20 | 15.15 | 20.42 | 23.73 | 17.67 | 19.65 | 21.70 | 19.81 | 14.11 | 20.86 |
Qilin-Med-VL-Chat | 22.34 | 22.06 | 29.57 | 19.41 | 16.46 | 23.79 | 15.79 | 24.19 | 21.86 | 16.62 | 7.20 | 13.64 | 24.00 | 14.67 | 12.67 | 15.53 | 26.13 | 24.42 | 17.37 | 25.71 |
RadFM | 22.95 | 22.93 | 27.16 | 20.63 | 13.23 | 19.14 | 20.45 | 24.51 | 23.48 | 22.85 | 15.60 | 16.16 | 14.32 | 24.93 | 17.33 | 21.53 | 29.73 | 17.12 | 19.59 | 31.14 |
MedDr | 41.95 | 43.69 | 41.20 | 50.70 | 37.85 | 29.87 | 28.27 | 52.53 | 36.03 | 31.45 | 29.60 | 47.47 | 33.37 | 51.33 | 32.67 | 44.47 | 35.14 | 25.19 | 25.58 | 32.29 |
Open-Source LVLMs | ||||||||||||||||||||
CogVLM-grounding-generalist | 5.20 | 5.66 | 3.11 | 4.02 | 2.92 | 3.22 | 10.83 | 7.98 | 9.72 | 0.15 | 0.00 | 11.11 | 8.32 | 1.87 | 1.67 | 2.00 | 1.65 | 0.00 | 4.02 | 0.57 |
XComposer | 8.92 | 7.67 | 1.38 | 7.69 | 8.31 | 12.34 | 22.86 | 7.31 | 6.07 | 5.49 | 2.80 | 16.16 | 5.05 | 8.67 | 2.00 | 9.76 | 11.94 | 7.31 | 3.17 | 4.00 |
PandaGPT 13B | 16.69 | 16.27 | 24.51 | 23.60 | 22.15 | 23.61 | 14.29 | 14.95 | 13.36 | 12.17 | 18.40 | 28.79 | 18.63 | 27.33 | 18.67 | 16.71 | 11.04 | 9.23 | 13.43 | 9.71 |
Flamingo v2 | 25.58 | 26.34 | 37.74 | 21.50 | 20.62 | 22.00 | 22.41 | 27.29 | 25.91 | 27.45 | 18.00 | 28.79 | 25.16 | 22.13 | 22.00 | 22.00 | 34.61 | 22.88 | 20.44 | 27.43 |
VisualGLM-6B | 29.58 | 30.45 | 40.16 | 33.92 | 24.92 | 25.22 | 24.21 | 32.99 | 29.96 | 29.53 | 21.20 | 37.88 | 30.32 | 24.80 | 13.33 | 29.88 | 33.11 | 19.62 | 19.16 | 37.43 |
Idefics-9B-Instruct | 29.74 | 31.13 | 40.39 | 30.59 | 26.46 | 33.63 | 22.56 | 34.38 | 25.51 | 26.71 | 21.60 | 27.78 | 27.47 | 32.80 | 24.67 | 23.41 | 32.66 | 23.08 | 21.39 | 30.57 |
InstructBLIP-7B | 31.80 | 30.95 | 42.12 | 26.92 | 24.92 | 28.09 | 21.65 | 34.58 | 31.58 | 29.23 | 22.40 | 30.30 | 28.95 | 27.47 | 23.00 | 24.82 | 32.88 | 19.81 | 21.64 | 26.57 |
Mini-Gemini-7B | 32.17 | 31.09 | 29.69 | 39.16 | 31.85 | 28.26 | 10.38 | 35.58 | 29.96 | 28.78 | 20.80 | 34.34 | 29.58 | 36.53 | 24.00 | 31.76 | 22.45 | 25.96 | 18.56 | 29.43 |
MMAlaya | 32.19 | 32.30 | 41.20 | 35.14 | 32.15 | 34.17 | 27.82 | 35.09 | 28.34 | 30.27 | 18.00 | 46.97 | 20.21 | 31.20 | 16.00 | 34.59 | 32.28 | 23.65 | 22.93 | 30.29 |
Qwen-VL | 34.80 | 36.05 | 37.05 | 37.24 | 35.85 | 28.98 | 24.81 | 43.60 | 24.70 | 30.12 | 19.20 | 44.44 | 29.68 | 31.87 | 25.00 | 31.18 | 30.26 | 21.54 | 20.10 | 26.86 |
Yi-VL-6B | 34.82 | 34.31 | 41.66 | 39.16 | 26.62 | 30.23 | 31.88 | 38.01 | 26.72 | 24.93 | 25.20 | 37.37 | 29.58 | 31.20 | 32.33 | 30.59 | 36.71 | 24.81 | 23.18 | 31.43 |
LLaVA-NeXT-vicuna-7B | 34.86 | 35.42 | 40.62 | 38.64 | 21.08 | 35.42 | 23.91 | 41.22 | 32.39 | 28.04 | 20.53 | 44.95 | 27.92 | 34.98 | 20.22 | 32.82 | 33.63 | 23.08 | 25.06 | 34.86 |
Qwen-VL-Chat | 35.07 | 36.96 | 38.09 | 40.56 | 38.00 | 32.20 | 25.71 | 44.07 | 24.70 | 30.56 | 24.00 | 40.91 | 29.37 | 36.53 | 26.00 | 27.29 | 35.14 | 16.54 | 20.10 | 34.00 |
CogVLM-Chat | 35.23 | 36.08 | 40.97 | 30.77 | 27.69 | 32.74 | 19.40 | 41.10 | 36.84 | 34.72 | 24.00 | 40.91 | 36.74 | 37.33 | 26.00 | 33.65 | 36.56 | 20.19 | 23.95 | 26.57 |
Monkey | 35.48 | 36.39 | 38.32 | 35.31 | 35.54 | 34.53 | 23.16 | 43.40 | 31.98 | 30.12 | 19.20 | 33.33 | 30.00 | 32.53 | 25.33 | 31.65 | 34.46 | 20.00 | 20.27 | 30.29 |
mPLUG-Owl2 | 35.62 | 36.21 | 37.51 | 41.08 | 30.92 | 38.10 | 27.82 | 41.59 | 28.34 | 32.79 | 22.40 | 40.91 | 24.74 | 38.27 | 23.33 | 36.59 | 33.48 | 20.58 | 23.01 | 32.86 |
ShareCaptioner | 36.37 | 36.19 | 42.35 | 32.69 | 31.08 | 27.19 | 30.83 | 41.19 | 30.36 | 33.23 | 28.40 | 42.93 | 27.79 | 33.73 | 28.33 | 40.71 | 29.58 | 20.96 | 28.83 | 30.00 |
Emu2-Chat | 36.50 | 37.59 | 43.27 | 47.73 | 26.31 | 40.07 | 28.12 | 44.00 | 36.44 | 28.49 | 20.40 | 31.82 | 26.74 | 37.60 | 26.67 | 29.76 | 33.63 | 23.27 | 26.43 | 29.43 |
XComposer2-4KHD | 36.66 | 38.54 | 41.89 | 39.86 | 28.77 | 40.43 | 20.60 | 44.25 | 35.22 | 33.53 | 22.80 | 42.42 | 34.84 | 29.60 | 44.00 | 39.53 | 35.21 | 21.54 | 27.20 | 38.00 |
ShareGPT4V-7B | 36.71 | 36.70 | 43.96 | 37.59 | 21.54 | 37.57 | 18.80 | 43.26 | 32.39 | 27.30 | 22.80 | 43.43 | 29.47 | 37.33 | 22.00 | 31.76 | 34.98 | 24.42 | 25.06 | 30.00 |
LLaVA-NeXT-mistral-7B | 37.20 | 37.16 | 38.43 | 27.98 | 20.31 | 29.16 | 20.60 | 47.19 | 30.36 | 32.64 | 22.40 | 55.56 | 32.75 | 25.58 | 17.56 | 34.04 | 28.38 | 23.27 | 24.12 | 37.43 |
LLAVA-V1.5-13b-xtuner | 37.82 | 38.74 | 44.65 | 29.02 | 27.08 | 38.28 | 28.87 | 45.32 | 32.79 | 30.12 | 20.40 | 45.96 | 33.47 | 42.53 | 44.33 | 37.53 | 33.48 | 19.62 | 22.58 | 35.43 |
OmniLMM-12B | 37.89 | 39.30 | 39.82 | 40.56 | 32.62 | 37.57 | 24.81 | 46.68 | 35.63 | 35.01 | 27.60 | 57.58 | 28.42 | 34.00 | 25.00 | 29.18 | 34.46 | 24.42 | 27.54 | 40.29 |
InternVL-Chat-V1.1 | 38.16 | 39.41 | 42.46 | 43.88 | 35.23 | 45.08 | 23.31 | 45.96 | 38.87 | 29.23 | 29.60 | 40.40 | 31.68 | 41.87 | 26.67 | 38.82 | 32.13 | 19.42 | 25.58 | 30.29 |
LLAVA-V1.5-7B | 38.23 | 37.96 | 45.45 | 34.27 | 30.92 | 41.32 | 21.65 | 44.68 | 34.01 | 27.74 | 23.60 | 43.43 | 28.00 | 42.13 | 29.00 | 35.06 | 33.41 | 22.12 | 23.61 | 29.14 |
Monkey-Chat | 38.39 | 39.50 | 40.62 | 41.43 | 37.08 | 35.24 | 23.76 | 47.73 | 29.96 | 32.94 | 26.00 | 37.88 | 34.84 | 32.67 | 24.67 | 33.18 | 34.91 | 21.73 | 22.24 | 34.00 |
LLAVA-V1.5-7B-xtuner | 38.68 | 38.22 | 38.90 | 40.03 | 28.00 | 40.25 | 30.08 | 44.08 | 33.60 | 32.49 | 21.20 | 40.91 | 29.47 | 40.40 | 30.33 | 38.59 | 31.46 | 23.85 | 26.95 | 36.86 |
XComposer2 | 38.68 | 39.20 | 41.89 | 37.59 | 33.69 | 40.79 | 22.26 | 45.87 | 36.44 | 32.94 | 27.20 | 58.59 | 26.11 | 36.40 | 43.67 | 37.29 | 32.06 | 23.46 | 27.80 | 32.86 |
LLAVA-InternLM-7b | 38.71 | 39.11 | 36.36 | 36.54 | 32.62 | 38.10 | 30.68 | 46.53 | 34.82 | 28.19 | 25.20 | 48.99 | 28.11 | 40.53 | 33.33 | 36.00 | 34.08 | 26.73 | 24.12 | 29.71 |
TransCore-M | 38.86 | 38.70 | 40.74 | 41.78 | 20.77 | 35.06 | 34.74 | 45.69 | 32.39 | 32.94 | 24.40 | 44.95 | 31.05 | 38.93 | 27.00 | 33.76 | 33.86 | 23.46 | 25.49 | 31.14 |
InternVL-Chat-V1.5 | 38.86 | 39.73 | 43.84 | 44.58 | 34.00 | 33.99 | 31.28 | 45.59 | 33.20 | 38.28 | 32.40 | 42.42 | 31.89 | 42.80 | 27.00 | 36.82 | 34.76 | 23.27 | 24.72 | 32.57 |
InternVL-Chat-V1.2-Plus | 39.41 | 40.79 | 42.58 | 42.31 | 32.46 | 37.03 | 31.43 | 47.49 | 42.51 | 35.01 | 21.20 | 50.51 | 34.95 | 42.93 | 22.67 | 42.47 | 35.74 | 22.31 | 24.98 | 28.29 |
InternVL-Chat-V1.2 | 39.52 | 40.01 | 41.66 | 44.06 | 27.38 | 38.46 | 34.29 | 46.99 | 33.60 | 34.42 | 21.20 | 47.98 | 30.63 | 42.80 | 27.67 | 35.88 | 35.59 | 23.85 | 24.98 | 28.00 |
LLAVA-InternLM2-7b | 40.07 | 40.45 | 39.82 | 37.94 | 30.62 | 35.24 | 29.77 | 48.97 | 34.01 | 25.96 | 20.80 | 53.03 | 30.95 | 42.67 | 32.00 | 39.88 | 32.43 | 21.73 | 24.38 | 38.00 |
DeepSeek-VL-1.3B | 40.25 | 40.77 | 38.55 | 35.14 | 38.92 | 40.07 | 27.97 | 48.12 | 35.63 | 31.75 | 22.80 | 46.97 | 40.74 | 44.93 | 31.00 | 40.47 | 33.33 | 22.31 | 21.39 | 31.71 |
MiniCPM-V | 40.95 | 41.05 | 39.70 | 46.50 | 36.31 | 39.36 | 22.26 | 48.09 | 34.82 | 35.76 | 24.00 | 45.45 | 34.11 | 44.80 | 23.00 | 44.47 | 36.19 | 21.15 | 23.95 | 35.14 |
DeepSeek-VL-7B | 41.73 | 43.43 | 38.43 | 47.03 | 42.31 | 37.03 | 26.47 | 51.11 | 33.20 | 31.16 | 26.00 | 44.95 | 36.00 | 58.13 | 36.33 | 47.29 | 34.91 | 18.08 | 25.49 | 39.43 |
MiniCPM-V2 | 41.79 | 42.54 | 40.74 | 43.01 | 36.46 | 37.57 | 27.82 | 51.08 | 28.74 | 29.08 | 26.80 | 47.47 | 37.05 | 46.40 | 25.33 | 46.59 | 35.89 | 22.31 | 23.44 | 31.71 |
Proprietary LVLMs | ||||||||||||||||||||
Claude3-Opus | 32.37 | 32.44 | 1.61 | 39.51 | 34.31 | 31.66 | 12.63 | 39.26 | 28.74 | 30.86 | 22.40 | 37.37 | 25.79 | 41.07 | 29.33 | 33.18 | 31.31 | 21.35 | 23.87 | 4.00 |
Qwen-VL-Max | 41.34 | 42.16 | 32.68 | 44.58 | 31.38 | 40.79 | 10.68 | 50.53 | 32.79 | 44.36 | 29.20 | 51.52 | 41.37 | 58.00 | 30.67 | 41.65 | 26.95 | 25.00 | 24.64 | 39.14 |
GPT-4V | 42.50 | 44.08 | 29.92 | 48.95 | 44.00 | 37.39 | 12.93 | 52.88 | 32.79 | 44.21 | 32.80 | 63.64 | 39.89 | 54.13 | 37.00 | 50.59 | 27.55 | 23.08 | 25.75 | 37.43 |
Gemini 1.0 | 44.38 | 44.93 | 42.12 | 45.10 | 46.46 | 37.57 | 20.45 | 53.29 | 35.22 | 36.94 | 25.20 | 51.01 | 34.74 | 59.60 | 34.00 | 50.00 | 36.64 | 23.65 | 23.87 | 35.43 |
Gemini 1.5 | 47.42 | 48.36 | 43.50 | 56.12 | 51.23 | 47.58 | 2.26 | 55.33 | 38.87 | 48.07 | 30.00 | 76.26 | 51.05 | 75.87 | 46.33 | 62.24 | 20.57 | 27.69 | 30.54 | 40.57 |
GPT-4o | 53.53 | 53.96 | 38.32 | 61.01 | 57.08 | 49.02 | 46.62 | 61.45 | 46.56 | 56.38 | 34.00 | 75.25 | 53.79 | 69.47 | 48.67 | 65.88 | 33.93 | 22.88 | 29.51 | 39.43 |