Adversarial Learning with Mask Reconstruction for Text-Guided Image Inpainting

Xingcai Wu, Yucheng Xie, Jiaqi Zeng, Zhenguo Yang, Yi Yu, Qing Li, Wenyin Liu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

5 Citations (Scopus)

Abstract

Text-guided image inpainting aims to fill in corrupted patches so that they are coherent with both the visual and the textual context. On one hand, existing works focus on the pixels surrounding the corrupted patches without considering the objects in the image, so the characteristics of objects described in the text end up painted on non-object regions. On the other hand, redundant information in the text may distract the generation of the objects of interest in the restored image. In this paper, we propose an adversarial learning framework with mask reconstruction (ALMR) for image inpainting with textual guidance, which consists of a two-stage generator and dual discriminators. The two stages of the generator restore coarse-grained and fine-grained images, respectively. In particular, we devise a dual-attention module (DAM) to incorporate the word-level and sentence-level textual features as guidance for generating the coarse-grained and fine-grained details in the two stages. Furthermore, we design a mask reconstruction module (MRM) to penalize the restoration of the objects of interest using the given textual descriptions of those objects. For adversarial training, we exploit global and local discriminators for the whole image and the corrupted patches, respectively. Extensive experiments conducted on CUB-200-2011, Oxford-102 and CelebA-HQ show that the proposed ALMR outperforms prior methods (e.g., the FID value is reduced from 29.69 to 14.69 compared with the state-of-the-art approach on CUB-200-2011). Code is available at https://github.com/GaranWu/ALMR
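The split between the global and local discriminators described in the abstract can be illustrated with a minimal sketch. It is not taken from the ALMR code: the helper names (`mask_bbox`, `composite`, `discriminator_inputs`), the compositing of generated pixels into the original image, and the bounding-box crop are all assumptions made for illustration. Images are plain nested lists, and a mask value of 1 marks a corrupted pixel.

```python
# Illustrative sketch only; every function name here is hypothetical
# and not part of the ALMR repository.

def mask_bbox(mask):
    """Return (top, left, bottom, right) of the masked region.

    bottom and right are exclusive. Assumes the mask is a non-empty
    rectangular grid of 0/1 values.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask has no corrupted pixels")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def composite(original, generated, mask):
    """Keep known pixels from the original and take corrupted pixels
    from the generator output (an assumed, common inpainting step)."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


def discriminator_inputs(original, generated, mask):
    """Build the two inputs: the whole image for a global discriminator
    and the crop around the corrupted patch for a local one."""
    full = composite(original, generated, mask)
    top, left, bottom, right = mask_bbox(mask)
    patch = [row[left:right] for row in full[top:bottom]]
    return full, patch


if __name__ == "__main__":
    original = [[0] * 4 for _ in range(4)]
    generated = [[9] * 4 for _ in range(4)]
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    full, patch = discriminator_inputs(original, generated, mask)
    print(full)
    print(patch)
```

In this sketch the global input keeps the full spatial context, while the local input isolates the restored region, which mirrors the "whole image" versus "corrupted patches" division stated in the abstract.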

Original language: English
Title of host publication: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 3464-3472
Number of pages: 9
ISBN (Electronic): 9781450386517
Publication status: Published - 17 Oct 2021
Event: 29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China
Duration: 20 Oct 2021 to 24 Oct 2021

Publication series

Name: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Conference

Conference: 29th ACM International Conference on Multimedia, MM 2021
Country/Territory: China
City: Virtual, Online
Period: 20/10/21 to 24/10/21

Keywords

  • object mask
  • text-guided image inpainting
  • textual and visual semantics

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Software
  • Computer Graphics and Computer-Aided Design
