TY - GEN
T1 - Using Motion Histories for Eye Contact Detection in Multiperson Group Conversations
AU - Fu, Eugene Yujun
AU - Ngai, Michael W.
N1 - Funding Information:
This work was supported, in part, by the Hong Kong Research Grant Council and the Hong Kong Polytechnic University under Grant 15600219.
Publisher Copyright:
© 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
AB - Eye contact detection in group conversations is key to developing artificial mediators that can understand and interact with a group. In this paper, we propose to model a group's appearance and behavioral features to perform eye contact detection for each participant in the conversation. Specifically, we extract the participants' appearance features at the detection moment, and extract their behavioral features from their motion history image, which encodes the participants' body movements within a small time window before the detection moment. To obtain powerful representative features from these images, we propose to train a Convolutional Neural Network (CNN) to model them. A set of relevant features is obtained from the network, which achieves an accuracy of 0.60 on the validation set of the eye contact detection challenge at ACM MM 2021. Furthermore, our experimental results demonstrate that using both the participants' appearance and behavioral features leads to higher accuracy in eye contact detection than using only one of them.
KW - deep learning
KW - eye contact detection
KW - group behavior
KW - motion history image
UR - http://www.scopus.com/inward/record.url?scp=85119383803&partnerID=8YFLogxK
U2 - 10.1145/3474085.3479230
DO - 10.1145/3474085.3479230
M3 - Conference article published in proceeding or book
AN - SCOPUS:85119383803
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 4873
EP - 4877
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -