Junjun He
Principal Investigator

Junjun He leads the GMAI (General Medical AI) research group. He received his PhD from Shanghai Jiao Tong University (SJTU), advised by Prof. Lixu Gu, and conducted research at the Multimedia Lab (MMLAB) of Shenzhen Institute of Advanced Technology (SIAT), CAS, with Prof. Yu Qiao. His research interests span dense prediction (medical image segmentation, object detection, instance segmentation), efficient deep learning (model compression, NAS, quantization), and general medical AI — including multimodal large language models, segmentation foundation models, clinical AI systems, and biomedical data infrastructure.

About Me

Dr. Junjun He is a researcher at Shanghai AI Laboratory, where he leads the GMAI (General Medical AI) Research Group — building general-purpose AI systems for medicine. He received his PhD from Shanghai Jiao Tong University (advised by Prof. Lixu Gu) and conducted research at the Multimedia Lab (MMLAB) of SIAT, Chinese Academy of Sciences, with Prof. Yu Qiao.

Research Interests

  • Medical Multimodal LLMs: GMAI-VL series, UniMedVL, SlideChat
  • Medical Segmentation Foundation Models: SAM-Med2D, SAM-Med3D, STU-Net (14M–1.4B params)
  • Clinical AI Systems: MedSegAgent multi-agent segmentation, surgical video understanding (OphCLIP, Ophora)
  • Medical Data Infrastructure: Project Imaging-X (1,000+ open medical imaging datasets)
  • Efficient Deep Learning: model compression, NAS, quantization

Highlights

The GMAI group has achieved systematic breakthroughs in general medical AI:

Medical Multimodal LLMs: GMAI-VL, trained on 5.5M image-text pairs spanning 18 clinical specialties and 38 imaging modalities, is a world-leading medical vision-language model. SlideChat is the first vision-language assistant able to understand gigapixel whole-slide pathology images (CVPR 2025). GMAI-VL-R1 uses reinforcement learning to achieve a ~30% average accuracy improvement across eight modalities, surpassing models 36× larger.
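
To make the training data concrete, the sketch below shows what a single image-text instruction sample might look like. The field names and example values are assumptions for illustration only, not the released GMAI-VL data schema.

    # Hypothetical structure of one medical image-text instruction pair.
    # Field names and example values are illustrative, not the released schema.
    from dataclasses import dataclass

    @dataclass
    class MedicalVQASample:
        image_path: str   # e.g. a chest X-ray or a pathology tile
        modality: str     # one of the 38 imaging modalities
        specialty: str    # one of the 18 clinical specialties
        question: str
        answer: str

    sample = MedicalVQASample(
        image_path="cxr_000123.png",
        modality="X-ray",
        specialty="Radiology",
        question="Is there evidence of pleural effusion?",
        answer="Yes, a small left-sided pleural effusion is visible.",
    )
    print(sample.specialty, "|", sample.question)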

Medical Segmentation Foundation Models: SAM-Med3D extends the SAM architecture to 3D medical imaging and is among the most widely adopted open-source models in the field. The STU-Net family (14M–1.4B parameters) includes the largest medical segmentation models to date, achieving 90.06% mean DSC on TotalSegmentator and winning the MICCAI 2023 ATLAS and SPPIN challenges.
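
For readers unfamiliar with promptable segmentation, the toy sketch below illustrates the point-prompt workflow that SAM-style 3D models expose: a voxel click is encoded as a prompt and decoded into a mask. The class and its interface are placeholders, not the actual SAM-Med3D or STU-Net APIs.

    # Toy illustration of a point-promptable 3D segmentation step.
    # The model below is a stand-in; it is NOT the SAM-Med3D or STU-Net code.
    import torch
    import torch.nn as nn

    class ToyPromptable3DSegmenter(nn.Module):
        def __init__(self, in_channels: int = 1):
            super().__init__()
            # One 3D conv stands in for the image encoder + mask decoder stack.
            self.backbone = nn.Conv3d(in_channels + 1, 1, kernel_size=3, padding=1)

        def forward(self, volume, point):
            # Encode the user's click as an extra channel: 1 at the clicked voxel.
            prompt = torch.zeros_like(volume)
            z, y, x = point
            prompt[..., z, y, x] = 1.0
            logits = self.backbone(torch.cat([volume, prompt], dim=1))
            return torch.sigmoid(logits)  # per-voxel foreground probability

    ct_patch = torch.randn(1, 1, 32, 64, 64)              # (batch, channel, D, H, W)
    mask = ToyPromptable3DSegmenter()(ct_patch, (16, 32, 32))
    print(mask.shape)                                      # torch.Size([1, 1, 32, 64, 64])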

Clinical AI Systems: OphCLIP (ICCV 2025) introduces a 375K video-text pair ophthalmic surgical dataset and achieves state-of-the-art zero-shot performance on 11 benchmarks. MedSegAgent orchestrates multi-agent collaboration for universal medical image segmentation across 23 datasets and 343 targets (IEEE JBHI 2026).
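
As a rough intuition for the multi-agent design, the sketch below routes a segmentation request to a registered specialist by modality and target. Names and routing logic are hypothetical and do not reflect the MedSegAgent implementation.

    # Hypothetical routing step for agent-based universal segmentation.
    # Not the MedSegAgent code; each "specialist" would wrap a real model.
    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class SegRequest:
        modality: str    # e.g. "CT", "MRI", "pathology"
        target: str      # e.g. "liver", "tumor"
        image_path: str

    SPECIALISTS: Dict[str, Callable[[SegRequest], str]] = {
        "CT/liver": lambda r: f"liver mask for {r.image_path}",
        "MRI/tumor": lambda r: f"tumor mask for {r.image_path}",
    }

    def orchestrate(request: SegRequest) -> str:
        # Planner step: dispatch to the specialist that matches modality/target.
        key = f"{request.modality}/{request.target}"
        agent = SPECIALISTS.get(key)
        return agent(request) if agent else f"no specialist registered for {key}"

    print(orchestrate(SegRequest("CT", "liver", "case_001.nii.gz")))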

Academic Overview

  • Published at top venues including CVPR, ICCV, ECCV, NeurIPS, MICCAI, AAAI
  • Open-source projects with thousands of GitHub stars
  • Attracted collaboration interest from Stanford University and other leading institutions
  • Led the team to first-place finishes in multiple MICCAI 2023 challenges

Selected Early Works

  • APCNet: Adaptive Pyramid Context Network for Semantic Segmentation (CVPR 2019)
  • Dynamic Multi-scale Filters for Semantic Segmentation (ICCV 2019)
  • EfficientFCN: Holistically-guided Decoding for Semantic Segmentation (ECCV 2020; one of four ECCV papers that year)
  • ODIR-2019 Competition: 1st place in Ocular Disease Intelligent Recognition (rank 1 of 1,500+)

Professional Service

  • Reviewer for CVPR, MICCAI, ICME