
A large dataset of semantic ratings and its computational extension

  • Shaonan Wang
  • Yunhao Zhang
  • Weiting Shi
  • Guangyao Zhang
  • Jiajun Zhang
  • Nan Lin (Corresponding Author)
  • Chengqing Zong

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

Evidence from psychology and cognitive neuroscience indicates that the human brain’s semantic system contains several specific subsystems, each representing a particular dimension of semantic information. Word ratings on these different semantic dimensions can help investigate the behavioral and neural impacts of semantic dimensions on language processes and build computational representations of language meaning according to the semantic space of the human cognitive system. Existing semantic rating databases provide ratings for hundreds to thousands of words, which can hardly support a comprehensive semantic analysis of natural texts or speech. This article reports a large database, the Six Semantic Dimension Database (SSDD), which contains subjective ratings for 17,940 commonly used Chinese words on six major semantic dimensions: vision, motor, socialness, emotion, time, and space. Furthermore, using computational models to learn the mapping relations between subjective ratings and word embeddings, we include the estimated semantic ratings for 1,427,992 Chinese and 1,515,633 English words in the SSDD. The SSDD will aid studies on natural language processing, text analysis, and semantic representation in the brain.
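The extension step described in the abstract, learning a mapping from word embeddings to subjective ratings and then applying it to unrated words, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the model, so ridge regression stands in for it; the three-component vectors (two toy features plus a constant bias term) stand in for real embeddings; and the words, ratings, and function names (fit_ridge, predict) are hypothetical. It is not the authors' pipeline.

```python
def fit_ridge(X, y, lam=1e-6):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    d = len(X[0])
    # Build the regularized normal equations A w = b.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in range(d - 1, -1, -1):
        s = sum(A[i][j] * w[j] for j in range(i + 1, d))
        w[i] = (b[i] - s) / A[i][i]
    return w


def predict(w, x):
    """Estimated rating for one embedding x under the linear map w."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical rated words: toy embedding (last component is a bias term)
# paired with a made-up rating on a single semantic dimension.
rated = {
    "word_a": ([1.0, 0.0, 1.0], 6.0),
    "word_b": ([0.0, 1.0, 1.0], 3.0),
    "word_c": ([1.0, 1.0, 1.0], 5.0),
    "word_d": ([2.0, 1.0, 1.0], 7.0),
    "word_e": ([0.0, 0.0, 1.0], 4.0),
}

X = [emb for emb, _ in rated.values()]
y = [rating for _, rating in rated.values()]
w = fit_ridge(X, y)

# Estimate the rating of a word that has an embedding but no human rating.
unrated_embedding = [3.0, 2.0, 1.0]
estimate = predict(w, unrated_embedding)
```

In a setting like the one the abstract describes, one such mapping would presumably be fitted per semantic dimension (vision, motor, socialness, emotion, time, space), with the human-rated words as training data and the much larger embedding vocabulary as the prediction targets.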

Original language: English
Article number: 106
Journal: Scientific Data
Volume: 10
Issue number: 1
Publication status: Published - 23 Feb 2023
Externally published: Yes

ASJC Scopus subject areas

  • Statistics and Probability
  • Information Systems
  • Education
  • Computer Science Applications
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences
