Dilated Residual Aggregation Network for Text-Guided Image Manipulation

Siwei Lu, Di Luo, Zhenguo Yang, Tianyong Hao, Qing Li, Wenyin Liu

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

Abstract

Text-guided image manipulation aims to modify the visual attributes of images according to textual descriptions. Existing works either mismatch between generated images and textual descriptions or may pollute the text-irrelevant image regions. In this paper, we propose a dilated residual aggregation network (denoted as DRA) for text-guided image manipulation, which exploits a long-distance residual with dilated convolutions (RD) to aggregate the encoded visual content and style features and the textual features of the guiding descriptions. In particular, the dilated convolutions increase the receptive field without sacrificing spatial resolutions of intermediate features, benefiting to reconstructing the texture details matching with the textual descriptions. Furthermore, we propose an attention-guided injection module (AIM) to inject textual semantics into feature maps of DRA without polluting the text-irrelevant image regions by combining triplet attention mechanism and central biasing instance normalization. Quantitative and qualitative experiments conducted on the CUB-200-2011 and Oxford-102 datasets demonstrate the superior performance of the proposed DRA.

Original languageEnglish
Title of host publicationArtificial Neural Networks and Machine Learning – ICANN 2021 - 30th International Conference on Artificial Neural Networks, Proceedings
EditorsIgor Farkaš, Paolo Masulli, Sebastian Otte, Stefan Wermter
PublisherSpringer Science and Business Media Deutschland GmbH
Pages28-40
Number of pages13
ISBN (Print)9783030863647
DOIs
Publication statusPublished - 2021
Event30th International Conference on Artificial Neural Networks, ICANN 2021 - Virtual, Online
Duration: 14 Sept 202117 Sept 2021

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12893 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference30th International Conference on Artificial Neural Networks, ICANN 2021
CityVirtual, Online
Period14/09/2117/09/21

Keywords

  • Attention-guide injection
  • Long-distance residual
  • Text-guided image manipulation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Dilated Residual Aggregation Network for Text-Guided Image Manipulation'. Together they form a unique fingerprint.

Cite this