Abstract
English and Mandarin Chinese differ in many respects, such as orthography and morphology. Previous network analyses have found high clustering coefficients (C) in English semantic networks, revealing the interconnectedness of semantic representations between words. However, it is not clear whether such properties of semantic representation are language-specific or general, or whether differences in linguistic features (e.g., subword components such as orthography and morphology) affect lexico-semantic structure. Here, we compared the Cs of words in English and Mandarin semantic networks based on (a) feature norms empirically derived from human subjects and (b) distributed semantic information from text, retrieved by word embedding models. We consistently observed higher Cs for Mandarin words than for English words, especially when the semantic network took subword features into account. Linear regressions suggested that the semantic properties of subword components significantly and positively predicted the C of words in semantic networks in Mandarin, but not in English. The results indicate an important role of language-specific properties in lexico-semantic structure and imply diversity in human language processing.
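For readers unfamiliar with the measure, the following is a minimal sketch of the standard local clustering coefficient of a node in an undirected, unweighted network: the fraction of pairs of a word's neighbours that are themselves connected. The toy words, the edges, and the helper names `build_graph` and `local_clustering` are illustrative assumptions; the sketch does not reproduce how the networks in the paper were constructed from feature norms or word embeddings.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Local clustering coefficient C of `node`.

    C = 2 * (links among neighbours) / (k * (k - 1)), where k is the
    number of neighbours. Nodes with fewer than two neighbours get 0.0.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network: an edge means two words are semantically related.
edges = [("dog", "cat"), ("dog", "wolf"), ("cat", "wolf"), ("dog", "bone")]
adj = build_graph(edges)

c_dog = local_clustering(adj, "dog")    # one of three neighbour pairs is linked
c_cat = local_clustering(adj, "cat")    # its two neighbours are linked
c_bone = local_clustering(adj, "bone")  # only one neighbour
```

In this toy network, a word whose neighbours are all related to each other has C = 1, while a word whose neighbours share no links has C = 0.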
Original language | English |
---|---|
Pages (from-to) | 3484-3491 |
Number of pages | 8 |
Journal | Proceedings of the Annual Meeting of the Cognitive Science Society |
Volume | 45 |
Publication status | Published - Jul 2023 |
Event | 44th Annual Meeting of the Cognitive Science Society (COGSCI 2023), International Convention Centre Sydney, Sydney, Australia. Duration: 26 Jul 2023 → 29 Jul 2023 |
Keywords
- Network science
- Semantic networks
- Cross-linguistic comparison
- Feature norms
- Word embeddings
- Computational modeling