Noise-Robust Semi-supervised Multi-modal Machine Translation

Lin Li, Kaixi Hu, Turghun Tayir, Jianquan Liu, Kong Aik Lee

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

2 Citations (Scopus)

Abstract

Recent unsupervised multi-modal machine translation methods have shown promising performance in capturing semantic relationships in unannotated monolingual corpora through large-scale pretraining. Empirical studies show that, in the unsupervised setting, a small accessible parallel corpus can yield performance gains comparable to those of large pretraining corpora. Inspired by this observation, we argue that semi-supervised learning can greatly reduce the demand for pretraining corpora without performance degradation in low-cost scenarios. However, the images in parallel corpora typically contain much irrelevant information, i.e., visual noise. Such noise has a negative impact on the semantic alignment between source and target languages in semi-supervised learning, thus weakening the contribution of the parallel corpora. To make effective use of the valuable and expensive parallel corpora, we propose a Noise-robust Semi-supervised Multi-modal Machine Translation method (Semi-MMT). In particular, a visual cross-attention sublayer is introduced into both the source and target language decoders, and the text representations are used as a guide to filter visual noise. Based on the visual cross-attention, we further devise a hybrid training strategy that employs four unsupervised and two supervised tasks to reduce the mismatch between the semantic representation spaces of the source and target languages. Extensive experiments on the Multi30k dataset show that our method outperforms state-of-the-art unsupervised methods that use large-scale extra corpora for pretraining in terms of the METEOR metric, while requiring only 7% of the parallel corpora.
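The abstract describes text representations guiding attention over image features so that irrelevant visual content is down-weighted. The following is a minimal sketch of that general idea using plain scaled dot-product cross-attention, not the authors' implementation. It assumes that text and image features already share one dimension (so no learned projections are shown), and the function names `matmul`, `softmax`, and `visual_cross_attention` as well as the toy inputs are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def visual_cross_attention(text_states, image_regions):
    """Text-guided attention over image regions (illustrative assumption).

    text_states:   list of token vectors, used as queries
    image_regions: list of region vectors, used as keys and values
    Returns (context, weights): one visual context vector per token and
    the attention weight of every region for every token.
    """
    dim = len(text_states[0])
    keys_transposed = [list(col) for col in zip(*image_regions)]
    scores = matmul(text_states, keys_transposed)
    # Regions that match a token poorly receive small weights, which is
    # how text-guided attention can suppress irrelevant visual content.
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    context = matmul(weights, image_regions)
    return context, weights


# Toy input: two tokens and three image regions, the last one uninformative.
text_states = [[1.0, 0.0], [0.0, 1.0]]
image_regions = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
context, weights = visual_cross_attention(text_states, image_regions)
```

In this toy input each token puts most of its attention on the region aligned with it, while the all-zero region receives a small weight. A trained model would add learned query, key, and value projections and multiple heads; those are omitted here to keep the sketch self-contained.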

Original language: English
Title of host publication: PRICAI 2022
Subtitle of host publication: Trends in Artificial Intelligence - 19th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2022, Proceedings
Editors: Sankalp Khanna, Jian Cao, Quan Bai, Guandong Xu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 155-168
Number of pages: 14
ISBN (Print): 9783031208645
Publication status: Published - Nov 2022
Externally published: Yes
Event: 19th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2022 - Shanghai, China
Duration: 10 Nov 2022 to 13 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13630 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2022
Country/Territory: China
City: Shanghai
Period: 10/11/22 to 13/11/22

Keywords

  • Multimodal data
  • Neural machine translation
  • Noise
  • Semi-supervised learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
