Multimodal Relation Extraction with Efficient Graph Alignment

Changmeng Zheng, Junhao Feng, Ze Fu, Yi Cai, Qing Li, Tao Wang

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

43 Citations (Scopus)


Relation extraction (RE) is a fundamental process in constructing knowledge graphs. However, previous relation extraction methods suffer a sharp performance decline on short and noisy social media texts due to a lack of context. Fortunately, the related visual content (objects and their relations) in social media posts can supplement the missing semantics and help to extract relations precisely. We introduce multimodal relation extraction (MRE), a task that identifies textual relations with visual clues. To tackle this problem, we present a large-scale dataset which contains 15000+ sentences with 23 pre-defined relation categories. Considering that the visual relations among objects correspond to textual relations, we develop a dual graph alignment method to capture this correlation for better performance. Experimental results demonstrate that visual content helps to identify relations more precisely than text-only baselines. In addition, our alignment method can find the correlations between vision and language, resulting in better performance. Our dataset and code are available at

Original language: English
Title of host publication: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Number of pages: 9
ISBN (Electronic): 9781450386517
Publication status: Published - 17 Oct 2021
Event: 29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China
Duration: 20 Oct 2021 to 24 Oct 2021

Publication series

Name: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia


Conference: 29th ACM International Conference on Multimedia, MM 2021
City: Virtual, Online


  • graph alignment
  • multimodal dataset
  • multimodal relation extraction

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Software
  • Computer Graphics and Computer-Aided Design

