Medical visual question answering: A survey

Zhihong Lin, Donghao Zhang, Qingyi Tao, Danli Shi, Gholamreza Haffari, Qi Wu, Mingguang He, Zongyuan Ge (Corresponding Author)

Research output: Journal article publication, Review article, Academic research, peer-review

14 Citations (Scopus)


Medical Visual Question Answering (VQA) combines medical artificial intelligence with the popular VQA challenges. Given a medical image and a clinically relevant question in natural language, a medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been extensively studied, medical VQA still needs specific investigation and exploration because of its task-specific features. In the first part of this survey, we collect and discuss the publicly available medical VQA datasets to date in terms of data source, data quantity, and task features. In the second part, we review the approaches used in medical VQA tasks, summarizing and discussing their techniques, innovations, and potential improvements. In the last part, we analyze some medical-specific challenges for the field and discuss future research directions. Our goal is to provide comprehensive and helpful information for researchers interested in medical visual question answering and to encourage them to conduct further research in this field.
Original language: English
Article number: 102611
Pages (from-to): 1-16
Number of pages: 16
Journal: Artificial Intelligence in Medicine
Publication status: Published - Sept 2023


  • Computer vision
  • Medical image interpretation
  • Natural language processing
  • Visual question answering

ASJC Scopus subject areas

  • Medicine (miscellaneous)
  • Artificial Intelligence

