Enhanced Attention Tracking With Multi-Branch Network for Egocentric Activity Recognition

Tianshan Liu, Kin Man Lam, Rui Zhao, Jun Kong

Research output: Journal article publication › Journal article › Academic research › peer-review

9 Citations (Scopus)


The emergence of wearable devices has opened up new possibilities for egocentric activity recognition. Although some methods integrate attention mechanisms into deep neural networks to capture fine-grained human-object interactions in a weakly supervised manner, they either fail to exploit temporal consistency or generate attention from appearance cues only. To address these limitations, we propose an enhanced attention-tracking method combined with a multi-branch network (EAT-MBNet) for egocentric activity recognition. Specifically, we propose class-aware attention maps (CAAMs) by employing a self-attention-based module to refine the class activation maps (CAMs). Our proposed method can enhance the semantic dependency between the activity categories and the feature maps. To highlight the discriminative features from the regions of interest across frames, we propose a flow-guided attention-tracking (F-AT) module that simultaneously leverages historical attention and motion patterns. Furthermore, we propose a cross-modality modeling branch based on an interactive GRU module, which captures the time-synchronized long-term relationships between the appearance and motion branches. Experimental results on four egocentric activity benchmarks demonstrate that the proposed method achieves state-of-the-art performance.
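To make the CAM refinement idea concrete, the following is a minimal sketch in plain Python. The first function is the standard class activation map, a weighted sum of feature channels using one class's classifier weights. The second function is only an illustrative stand-in for a self-attention refinement: it propagates CAM scores between spatial positions in proportion to feature similarity. All function names, the toy feature values, and the similarity-based propagation rule are assumptions made for illustration; the abstract does not specify how the CAAM module is built.

```python
import math


def class_activation_map(features, class_weights):
    """Standard CAM: cam[y][x] = sum_k class_weights[k] * features[k][y][x].

    features: list of K channels, each an H x W nested list.
    class_weights: K classifier weights for a single activity class.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def spatial_softmax(cam):
    """Turn a CAM into an attention map whose entries sum to 1."""
    peak = max(v for row in cam for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in cam]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def refine_with_self_attention(cam, features):
    """Hypothetical refinement (not the paper's module): each position's
    score becomes a softmax-weighted average of all CAM scores, where the
    weights come from scaled dot products between feature vectors."""
    channels = len(features)
    height, width = len(cam), len(cam[0])
    positions = [(y, x) for y in range(height) for x in range(width)]
    vectors = [[features[k][y][x] for k in range(channels)] for y, x in positions]
    scores = [cam[y][x] for y, x in positions]
    scale = math.sqrt(channels)
    refined = [[0.0] * width for _ in range(height)]
    for (y, x), query in zip(positions, vectors):
        logits = [sum(q * k for q, k in zip(query, key)) / scale for key in vectors]
        peak = max(logits)
        weights = [math.exp(value - peak) for value in logits]
        refined[y][x] = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return refined


# Toy input: two 2 x 2 feature channels and weights for one class (made up).
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.5]

cam = class_activation_map(features, class_weights)  # [[0.5, 0.0], [0.0, 3.0]]
attention = spatial_softmax(cam)
refined = refine_with_self_attention(cam, features)
```

Because the refinement is a convex combination of the original scores, every refined value stays within the range of the input CAM; a learned module would instead use trained query, key, and value projections.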

Original language: English
Article number: 9513243
Pages (from-to): 3587-3602
Number of pages: 16
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 6
Publication status: Published - Jun 2022


  • attention tracking
  • Egocentric activity recognition
  • fine-grained hand-object interactions
  • multi-branch network

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering

