Large-scale Network Analyses Reveal Cross-Language Differences in Semantic Structures: A Comparative Study

Qihui Xu, Yingying Peng, Ping Li

Research output: Journal article publication, Conference article, Academic research, peer-review

Abstract

English and Mandarin Chinese differ in many respects, such as orthography and morphology. Previous network analyses show strong clustering coefficients (C) on English semantic networks, revealing the interconnectedness of semantic representations between words. However, it is not clear whether such semantic representation properties are language-specific or general, or whether differences in linguistic features (e.g., subword components such as orthography and morphology) may affect the lexico-semantic structure. Here, we compared the Cs of words in English and Mandarin semantic networks based on a) feature norms empirically derived from human subjects and b) distributed semantic information of text retrieved by word embedding models. We consistently observed higher Cs for Mandarin words than for English words, especially when the semantic network considers subword features. Linear regressions suggested that the semantic properties of subword components in Mandarin, but not in English, could significantly and positively predict the C of words in semantic networks. The results indicate an important role of language-specific properties in lexico-semantic structures and imply the diversity of human language processing.
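
The clustering coefficient mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature sets below are invented toy data, and the helper names (`build_network`, `local_clustering`) and the `min_shared` linking threshold are assumptions made only for illustration. The sketch links two words when they share a feature, then computes each word's local clustering coefficient, i.e. the fraction of its neighbour pairs that are themselves linked.

```python
from itertools import combinations


def build_network(features, min_shared=1):
    """Link two words when they share at least `min_shared` features.

    `features` maps each word to a set of feature labels (hypothetical data).
    Returns an undirected adjacency mapping: word -> set of neighbouring words.
    """
    adj = {word: set() for word in features}
    for a, b in combinations(features, 2):
        if len(features[a] & features[b]) >= min_shared:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are directly connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        # Fewer than two neighbours: no pairs to close, so C is defined as 0.
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Invented toy feature norms, for illustration only.
toy_features = {
    "dog": {"has_fur", "is_pet", "barks"},
    "cat": {"has_fur", "is_pet", "meows"},
    "wolf": {"has_fur", "howls"},
    "goldfish": {"is_pet", "swims"},
    "shark": {"swims", "has_teeth"},
}

network = build_network(toy_features)
clustering = {word: local_clustering(network, word) for word in network}
mean_c = sum(clustering.values()) / len(clustering)
```

In a network built from word embeddings, the linking rule would instead be some similarity threshold between vectors; the clustering computation itself would stay the same.
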

Original language: English
Pages (from-to): 3484-3491
Number of pages: 8
Journal: Proceedings of the Annual Meeting of the Cognitive Science Society
Volume: 45
Publication status: Published - Jul 2023
Event: 44th Annual Meeting of the Cognitive Science Society (COGSCI 2023) - International Convention Centre Sydney, Sydney, Australia
Duration: 26 Jul 2023 to 29 Jul 2023

Keywords

  • Network science
  • Semantic networks
  • Cross-linguistic comparison
  • Feature norms
  • Word embeddings
  • Computational modeling
