The Canto-Lexicon Project: A Preliminary Report

Kai Yan Lau, I. Fan Su, Yen Na Yum

Research output: Unpublished conference presentation (presented paper, abstract, poster) › Abstract › Academic research › peer-review


Introduction. The orthographic and phonological forms of Chinese differ subtly across Mainland China, Taiwan and Hong Kong. In Hong Kong, traditional characters and Cantonese are used. Previous database projects conducted in Mainland China (Liu, Shu, & Li, 2007) and Taiwan (Chang, Hsu, Tsai, Chen & Lee, 2016) have documented the significance of different psycholinguistic variables in predicting character recognition. The current study reports preliminary data from the Canto-Lexicon Project, which was conducted using traditional characters and Cantonese.

Methods. The 4376 most frequent traditional characters in Hong Kong newspapers (Leung & Lau, 2010) were selected as stimuli; 3327 of them are phonetic compounds (PC) and 1049 are non-phonetic compounds (nonPC). An equal number of pseudo-characters were constructed for the lexical decision task by shuffling the constituent radicals of the target characters. Twenty undergraduates (aged 19 to 22 years, gender balanced, with no prior linguistic training and no reported literacy problems) were recruited for both a lexical decision task and a character naming task. Another twenty undergraduates rated each real character on (1) Imageability (IMG), (2) Age of Acquisition (AoA), (3) Concreteness, and (4) Familiarity. Measures of character frequency, radical frequency, phonetic regularity, phonetic consistency, homophone number and stroke number for each selected character were obtained from the Hong Kong Corpus of Chinese News-Paper (Leung & Lau, 2010). For this report, the data of 14 participants in the lexical decision task, together with the IMG and AoA ratings, were analyzed using linear mixed effects modeling. Two models were fitted: Model 1 used RT data from both PC and nonPC characters, and Model 2 used RT data from PC only.

Results. In Model 1, after controlling for session order and within-session presentation order, the RT was significantly predicted by stroke number, frequency, AoA and IMG, and by the two-way interactions between frequency and AoA, AoA and IMG, and frequency and IMG. In Model 2, with the same controls, effects similar to those in Model 1 were found; in addition, phonetic regularity and the interactions among frequency, AoA and phonetic regularity significantly predicted the RT.

Discussion. The results of Model 1 indicate that character recognition in the lexical decision task is affected by both lexical and semantic factors. The results of Model 2 add that a phonological effect is significant in lexical decision for Chinese phonetic compounds. In general, the results of both models are consistent with previous reports on the roles of different psycholinguistic variables in character recognition. Data collection is still in progress, and results of additional analyses will be presented. Implications for theories of lexical processing in Chinese will also be discussed.
Original language: English
Publication status: Published - 9 Oct 2019
Event: Academy of Aphasia 57th Annual Meeting, Macau - Macau, China
Duration: 27 Oct 2019 to 29 Oct 2019


Competition: Academy of Aphasia 57th Annual Meeting, Macau


  • Chinese (Cantonese)
  • psycholinguistic
  • age of acquisition (AoA)
  • Lexical Processing
  • imageability

