Abstract
Eye movement data are used in psycholinguistic studies to infer information about cognitive processes during reading. In this paper, we describe our proposed method for Subtask 1 of the Cognitive Modeling and Computational Linguistics (CMCL) 2022 Shared Task, which involves data from multiple datasets covering six languages. We compared different regression models that use features of the target word and the preceding word, together with target-word surprisal, as regression features. Our final system, a gradient boosting regressor, achieved the lowest mean absolute error (MAE), making it the best-performing system in the competition.
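As a rough illustration of the kind of inputs described in the abstract, the sketch below builds a per-word feature row (target word, previous word, target-word surprisal) and computes MAE. It is not the authors' implementation: the abstract does not say how surprisal was estimated or which word features were used, so the add-one-smoothed bigram model, the word-length features, and the function names `bigram_surprisal`, `build_features`, and `mean_absolute_error` are all assumptions made for this example. The regressor itself is omitted.

```python
import math
from collections import Counter


def bigram_surprisal(tokens, corpus_tokens):
    """Surprisal -log2 P(w_i | w_{i-1}) under an add-one-smoothed bigram model.

    Assumption: a toy bigram model stands in for whatever language model
    the original system used to estimate surprisal.
    """
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    vocab_size = len(unigrams)
    surprisals = []
    # "<s>" is a hypothetical sentence-start symbol for the first word.
    for prev, word in zip(["<s>"] + tokens[:-1], tokens):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        surprisals.append(-math.log2(prob))
    return surprisals


def build_features(tokens, surprisals):
    """One row per word: [target length, previous-word length, surprisal].

    Assumption: word length is only an example of a word-level feature.
    """
    rows = []
    for i, word in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else ""
        rows.append([len(word), len(prev), surprisals[i]])
    return rows


def mean_absolute_error(y_true, y_pred):
    """MAE, the evaluation metric named in the abstract."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


corpus = "the cat sat on the mat".split()
sentence = ["the", "cat"]
features = build_features(sentence, bigram_surprisal(sentence, corpus))
```

Rows like these could then be passed to any regression model, for example a gradient boosting regressor, with one eye-tracking measure as the target.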
Original language | English |
---|---|
Title of host publication | Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics |
Editors | Emmanuele Chersoni, Nora Hollenstein, Cassandra Jacobs, Yohei Oseki, Laurent Prévot, Enrico Santus |
Pages | 114–120 |
Publication status | Published - Mar 2022 |
Event | Workshop on Cognitive Modeling and Computational Linguistics (CMCL) 2022, Dublin, Ireland. Duration: 26 Apr 2022 → 26 Apr 2022. https://cmclorg.github.io/ |
Competition
Competition | Workshop on Cognitive Modeling and Computational Linguistics (CMCL) 2022 |
---|---|
Country/Territory | Ireland |
City | Dublin |
Period | 26/04/22 → 26/04/22 |
Keywords
- gradient boosting
- eyetracking
- prediction
- linguistic features
- crosslingual