Skip to main navigation Skip to search Skip to main content

Learning multimodal word representation via dynamic fusion methods

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

Abstract

Multimodal models have been proven to outperform text-based models on learning semantic word representations. Almost all previous multimodal models typically treat the representations from different modalities equally. However, it is obvious that information from different modalities contributes differently to the meaning of words. This motivates us to build a multimodal model that can dynamically fuse the semantic representations from different modalities according to different types of words. To that end, we propose three novel dynamic fusion methods to assign importance weights to each modality, in which weights are learned under the weak supervision of word association pairs. The extensive experiments have demonstrated that the proposed methods outperform strong unimodal baselines and state-of-the-art multimodal models.

Original languageEnglish
Title of host publicationProceedings of the AAAI Conference on Artificial Intelligence
PublisherAAAI press
Pages5973-5980
Number of pages8
Volume32
ISBN (Electronic)9781577358008
DOIs
Publication statusPublished - 26 Apr 2018
Externally publishedYes
Event32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
Duration: 2 Feb 20187 Feb 2018

Publication series

Name32nd AAAI Conference on Artificial Intelligence, AAAI 2018

Conference

Conference32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Country/TerritoryUnited States
CityNew Orleans
Period2/02/187/02/18

ASJC Scopus subject areas

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Learning multimodal word representation via dynamic fusion methods'. Together they form a unique fingerprint.

Cite this