Comparing and Predicting Eye-tracking Data in Mandarin and Cantonese

Junlin Li, Bo Peng, Yu Yin Hsu, Emmanuele Chersoni

Research output: Chapter in book / Conference proceeding > Conference article published in proceeding or book > Academic research > peer-review

4 Citations (Scopus)

Abstract

Eye-tracking data in Chinese languages present unique challenges due to the non-alphabetic and unspaced nature of the Chinese writing systems. This paper introduces the first deeply-annotated joint Mandarin-Cantonese eye-tracking dataset, from which we achieve a unified eye-tracking prediction system for both language varieties. In addition to the commonly studied first fixation duration and the total fixation duration, this dataset also includes the second fixation duration, expressing fixation patterns that are more relevant to higher-level, structural processing. A basic comparison of the features and measurements in our dataset revealed variation between Mandarin and Cantonese on fixation patterns related to word class and word position. The test of feature usefulness suggested that traditional features are less powerful in predicting the second-pass fixation, to which the linear distance to root makes a leading contribution in Mandarin. In contrast, Cantonese eye-movement behavior relies more on word position and part of speech.

Original language: English
Title of host publication: ACL 2023 - 10th Workshop on NLP for Similar Languages, Varieties and Dialects, VarDial 2023 - Proceedings of the Workshop
Editors: Yves Scherrer, Tommi Jauhiainen, Nikola Ljubesic, Preslav Nakov, Jorg Tiedemann, Marcos Zampieri
Publisher: Association for Computational Linguistics (ACL)
Pages: 121-132
Number of pages: 12
ISBN (Electronic): 9781959429500
Publication status: Published - May 2023
Event: 10th Workshop on NLP for Similar Languages, Varieties and Dialects, VarDial 2023 - Hybrid, Dubrovnik, Croatia
Duration: 5 May 2023 → …

Conference

Conference: 10th Workshop on NLP for Similar Languages, Varieties and Dialects, VarDial 2023
Country/Territory: Croatia
City: Hybrid, Dubrovnik
Period: 5/05/23 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Software
