News
November 2025 Paper
MedQ-Deg released — benchmarking MLLM robustness under medical image degradations
We release MedQ-Deg, a multidimensional benchmark evaluating 40 mainstream MLLMs across 18 degradation types, 30 capability dimensions, and 7 imaging modalities (24,894 QA pairs). We reveal the AI Dunning-Kruger Effect — models maintain high confidence despite severe accuracy collapse under degraded inputs.
October 2025 Paper
MedQ-Bench: New benchmark for medical image quality assessment in MLLMs
We release MedQ-Bench, a novel perception-reasoning benchmark for medical image quality assessment covering 5 modalities and 40+ quality attributes (3,308 samples). Zero-shot evaluation of 14 leading MLLMs shows GPT-4 achieves 68.97% accuracy, still 13.5% below expert performance.
September 2025 Paper
UniMedVL: First unified model for medical image understanding and generation
We introduce UniMedVL, the first medical multimodal model that unifies image understanding and generation in a single architecture. Built on 5.6M multimodal samples and Progressive Curriculum Learning, UniMedVL achieves strong results on 5 understanding benchmarks and 8 generation modalities.
September 2025 Join
Welcoming new visiting researchers to GMAI Lab
We welcome Wenhao Tang (Nankai University), Shujian Gao (Fudan University), and Jiashi Lin (Northwestern Polytechnical University) as new visiting researchers, bringing expertise in computational pathology, multimodal learning, and LLM-based agents.
August 2025 Paper
Survey of Scientific LLMs released — collaboration with 20+ global institutions
We release a comprehensive survey of Scientific Large Language Models (Sci-LLMs), produced in collaboration with 20+ leading global institutions. The survey covers 1,000+ papers and 600+ key datasets, and proposes a roadmap for AI-assisted scientific discovery ecosystems.
July 2025 Paper
OphCLIP accepted at ICCV 2025 — hierarchical surgical video-language pre-training
OphCLIP, our hierarchical retrieval-augmented framework for ophthalmic surgical video understanding, was accepted at ICCV 2025. Built on OphVL (375K+ video-text pairs), OphCLIP sets new records on 11 benchmarks for phase recognition and multi-instrument detection.
June 2025 Paper
Ophora accepted at MICCAI 2025 as Oral Presentation
Our text-guided ophthalmic surgical video generation model Ophora was accepted at MICCAI 2025 as an oral presentation. Ophora is trained on 160K video-text pairs and outperforms existing methods on FID, FVD, and CLIPScore metrics.
June 2025 Paper
RetinaLogos, ProgEmu, and MRI Translation accepted at MICCAI 2025
Three papers accepted at MICCAI 2025: RetinaLogos (language-driven high-resolution fundus image generation), ProgEmu (interpretable counterfactual medical image generation), and Multi-modal MRI Translation via Evidential Regression and Distribution Calibration.
May 2025 Paper
MedITok: First unified visual tokenizer for medical image synthesis and interpretation
We release MedITok, the first unified visual tokenizer designed for medical images. Pre-trained on 30M+ images, MedITok achieves SOTA across reconstruction, classification, generation, and VQA tasks spanning 9 imaging modalities and 30+ datasets.
May 2025 Paper
MedSegAgent accepted at IEEE Journal of Biomedical and Health Informatics
MedSegAgent, our multi-agent system for instructive medical image segmentation, was accepted at the IEEE Journal of Biomedical and Health Informatics (JBHI). The system supports 343 segmentation targets across CT, MRI, PET/CT, and ultrasound without training a single, monolithic universal model.
February 2025 Paper
SlideChat accepted at CVPR 2025
Our whole-slide pathology image understanding assistant SlideChat was accepted at CVPR 2025. SlideChat achieves 81.17% accuracy on SlideBench-VQA and surpasses the state of the art on 18 of 22 benchmark tasks.
December 2024 Paper
GMAI-MMBench presented at NeurIPS 2024 — evaluating 50 large vision-language models
GMAI-MMBench, the most comprehensive general medical AI evaluation platform to date, was presented at NeurIPS 2024. It covers 284 datasets, 38 imaging modalities, and 18 clinical tasks. Even the top-performing model, GPT-4o, achieves only 53.96% accuracy.
November 2024 Paper
GMAI-VL released — general medical VLM trained on 5.5M image-text pairs
We release GMAI-VL, a general-purpose medical vision-language model trained on GMAI-VL-5.5M, a dataset of 5.5M high-quality image-text pairs spanning 18 clinical specialties and 10+ imaging modalities. GMAI-VL matches or surpasses SOTA on multiple medical multimodal VQA and diagnostic reasoning benchmarks.
October 2024 Award
SAM-Med3D selected as Oral at ECCV 2024 Biomedical Image Computing Workshop
SAM-Med3D was selected as an oral presentation at the ECCV 2024 Biomedical Image Computing (BIC) Workshop. SAM-Med3D adapts SAM to 3D volumetric medical images, covering 247 segmentation categories across 21K medical volumes.
June 2024 Paper
OmniMedVQA accepted at CVPR 2024 — large-scale medical VQA benchmark
OmniMedVQA was accepted at CVPR 2024. The benchmark integrates 73 datasets across 12 imaging modalities and 20+ anatomical regions, revealing that many medical-specific models surprisingly underperform general-purpose LVLMs on medical VQA tasks.