TY - GEN
T1 - Hand-eye Coordination for Textual Difficulty Detection in Text Summarization
AU - Wang, Jun
AU - Ngai, Grace
AU - Leong, Hong Va
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/10/21
Y1 - 2020/10/21
AB - Summarizing a document is a complex task that requires a person to multitask between reading and writing processes. Since a person's cognitive load during reading or writing is known to depend on the comprehension level or difficulty of the article, it should be possible to analyze the user's cognitive process while carrying out the task, as evidenced by their eye gaze and typing behavior, to gain insight into the different difficulty levels. In this paper, we categorize the summary writing process into distinct phases and extract gaze and typing features from each phase according to the characteristics of eye-gaze behavior and typing dynamics. Combining these multimodal features, we build a classifier that achieves an accuracy of 91.0% for difficulty level detection, around a 55% improvement over the baseline and at least a 15% improvement over models built on a single modality. We also investigate the possible reasons for the superior performance of our multimodal features.
KW - eye gaze behavior
KW - human-computer interaction
KW - keyboard dynamics
KW - multimodal interaction
UR - http://www.scopus.com/inward/record.url?scp=85096669223&partnerID=8YFLogxK
U2 - 10.1145/3382507.3418831
DO - 10.1145/3382507.3418831
M3 - Conference article published in proceedings or book
AN - SCOPUS:85096669223
T3 - ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction
SP - 269
EP - 277
BT - ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction
PB - Association for Computing Machinery, Inc
T2 - 22nd ACM International Conference on Multimodal Interaction, ICMI 2020
Y2 - 25 October 2020 through 29 October 2020
ER -