Abstract
Learning discriminative and robust representations from facial observations is a fundamental step towards intelligent facial expression recognition (FER). In this paper, we propose a novel geometry-aware FER framework that improves FER performance by exploiting both geometric and appearance knowledge. Specifically, we propose an encoding strategy for facial landmarks and adopt a graph convolutional network (GCN) to explore the structural information of the facial components underlying different expressions. A convolutional neural network (CNN) is further applied to the whole facial observation to learn the global characteristics of different expressions. The features from these two networks are fused into a comprehensive high-level semantic representation, which supports FER reasoning from both visual and structural perspectives. Moreover, to help the networks concentrate on the most informative facial regions and components, we introduce multi-level attention mechanisms into the proposed framework, which enhance the reliability of the learned representations for effective FER. Experiments on two challenging FER benchmarks demonstrate that attentive graph-based learning on the facial geometry boosts FER accuracy. Furthermore, the insensitivity of the geometric information to appearance variations also improves the generalization of the proposed framework.
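The abstract does not give the landmark encoding, the graph construction, or the form of the attention, so the following is only a minimal NumPy sketch of the general idea: a graph convolution over landmark nodes, attention pooling over those nodes, and concatenation with an appearance feature. Everything in it is an assumption made for illustration, including the five-landmark chain graph, the random weights, the 16-dimensional stand-in for the CNN feature, and the helper names `normalize_adjacency`, `gcn_layer`, `attention_pool`, and `fuse`. The graph convolution uses the commonly used symmetric normalization with self-loops, which may differ from the paper's formulation.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)  # >= 1 for every node because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, x, w):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(a_norm @ x @ w, 0.0)


def attention_pool(h, q):
    """Softmax attention over nodes; returns the weighted sum and the weights."""
    scores = h @ q
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ h, weights


def fuse(geometric, appearance):
    """Fuse the two feature vectors by simple concatenation (an assumption)."""
    return np.concatenate([geometric, appearance])


rng = np.random.default_rng(0)

# Hypothetical input: 5 landmarks, each encoded by its (x, y) coordinates.
num_landmarks = 5
landmarks = rng.random((num_landmarks, 2))

# Hypothetical graph: neighbouring landmarks connected in a chain.
adj = np.zeros((num_landmarks, num_landmarks))
for i in range(num_landmarks - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

a_norm = normalize_adjacency(adj)
w = rng.standard_normal((2, 8))  # untrained GCN weights (illustrative)
q = rng.standard_normal(8)       # untrained attention query (illustrative)

h = gcn_layer(a_norm, landmarks, w)        # per-landmark features, shape (5, 8)
geo_feat, weights = attention_pool(h, q)   # geometric feature, shape (8,)
cnn_feat = rng.standard_normal(16)         # stand-in for a CNN appearance feature
fused = fuse(geo_feat, cnn_feat)           # fused representation, shape (24,)
```

In a trained model the weights and the attention query would be learned jointly with the CNN, and the fused vector would feed an expression classifier; the sketch only shows how the shapes and operations fit together.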
Original language | English |
---|---|
Article number | 9454388 |
Pages (from-to) | 1-16 |
Number of pages | 16 |
Journal | IEEE Transactions on Affective Computing |
DOIs | |
Publication status | Accepted/In press - 2021 |
Keywords
- Cognition
- Convolution
- Emotion recognition
- Face recognition
- Facial expression recognition (FER)
- Feature extraction
- Feature fusion
- Geometry-aware representation learning
- Graph convolutional network (GCN)
- Multi-level attentive learning
- Semantics
- Streaming media
ASJC Scopus subject areas
- Software
- Human-Computer Interaction