Towards Unbiased Multi-label Zero-Shot Learning with Pyramid and Semantic Attention

Ziming Liu, Song Guo, Jingcai Guo, Yuanyuan Xu, Fushuo Huo

Research output: Journal article publication > Journal article > Academic research > peer-review

2 Citations (Scopus)

Abstract

Multi-label zero-shot learning extends conventional single-label zero-shot learning to a more realistic scenario that aims to recognize multiple unseen class labels for each input sample. Existing works usually exploit attention mechanisms to model the correlation among different labels. However, most of them are biased toward several major classes while neglecting most of the minor classes, which are equally important in input samples, and may thus produce overly diffused attention maps that cannot sufficiently cover minor classes. We argue that disregarding the connection between major and minor classes, which correspond to global and local information, respectively, is the cause of the problem. In this paper, we propose a novel framework for unbiased multi-label zero-shot learning that considers various class-specific regions to calibrate the training process of the classifier. Specifically, Pyramid Feature Attention (PFA) is proposed to build the correlation between global and local information of samples to balance the presence of each class. Meanwhile, for the generated semantic representations of input samples, we propose Semantic Attention (SA) to strengthen the element-wise correlation among these vectors, which encourages a coordinated representation of them. Extensive experiments on the large-scale multi-label benchmarks MS-COCO, NUS-WIDE and Open-Images demonstrate that the proposed method surpasses other representative methods by significant margins.

Original language: English
Pages (from-to): 1-15
Number of pages: 15
Journal: IEEE Transactions on Multimedia
Publication status: Accepted/In press - 2022

Keywords

  • Attention Mechanism
  • Classification
  • Computational modeling
  • Correlation
  • Feature extraction
  • Image recognition
  • Multi-label Zero-shot learning
  • Pattern Recognition
  • Semantic Feature Space
  • Semantics
  • Task analysis
  • Training

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering
