TY - GEN
T1 - Discovering Semantic Subdimensions through Disentangled Conceptual Representations
AU - Zhang, Yunhao
AU - Wang, Shaonan
AU - Lin, Nan
AU - Dong, Xinyi
AU - Li, Chong
AU - Zong, Chengqing
N1 - Publisher Copyright:
©2025 Association for Computational Linguistics.
PY - 2025/11
Y1 - 2025/11
N2 - Understanding the core dimensions of conceptual semantics is fundamental to uncovering how meaning is organized in language and the brain. Existing approaches often rely on predefined semantic dimensions that offer only broad representations, overlooking finer conceptual distinctions. This paper proposes a novel framework to investigate the subdimensions underlying coarse-grained semantic dimensions. Specifically, we introduce a Disentangled Continuous Semantic Representation Model (DCSRM) that decomposes word embeddings from large language models into multiple sub-embeddings, each encoding specific semantic information. Using these sub-embeddings, we identify a set of interpretable semantic subdimensions. To assess their neural plausibility, we apply voxel-wise encoding models to map these subdimensions to brain activation. Our work offers more fine-grained interpretable semantic subdimensions of conceptual meaning. Further analyses reveal that semantic dimensions are structured according to distinct principles, with polarity emerging as a key factor driving their decomposition into subdimensions. The neural correlates of the identified subdimensions support their cognitive and neuroscientific plausibility.
AB - Understanding the core dimensions of conceptual semantics is fundamental to uncovering how meaning is organized in language and the brain. Existing approaches often rely on predefined semantic dimensions that offer only broad representations, overlooking finer conceptual distinctions. This paper proposes a novel framework to investigate the subdimensions underlying coarse-grained semantic dimensions. Specifically, we introduce a Disentangled Continuous Semantic Representation Model (DCSRM) that decomposes word embeddings from large language models into multiple sub-embeddings, each encoding specific semantic information. Using these sub-embeddings, we identify a set of interpretable semantic subdimensions. To assess their neural plausibility, we apply voxel-wise encoding models to map these subdimensions to brain activation. Our work offers more fine-grained interpretable semantic subdimensions of conceptual meaning. Further analyses reveal that semantic dimensions are structured according to distinct principles, with polarity emerging as a key factor driving their decomposition into subdimensions. The neural correlates of the identified subdimensions support their cognitive and neuroscientific plausibility.
UR - https://www.scopus.com/pages/publications/105028951808
U2 - 10.18653/v1/2025.findings-emnlp.325
DO - 10.18653/v1/2025.findings-emnlp.325
M3 - Conference article published in proceeding or book
AN - SCOPUS:105028951808
T3 - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
SP - 6127
EP - 6144
BT - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
A2 - Christodoulopoulos, Christos
A2 - Chakraborty, Tanmoy
A2 - Rose, Carolyn
A2 - Peng, Violet
PB - Association for Computational Linguistics (ACL)
T2 - 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
Y2 - 4 November 2025 through 9 November 2025
ER -