Evaluating Monolingual and Crosslingual Embeddings on Datasets of Word Association Norms

Trina Kwong, Emmanuele Chersoni, Rong Xiang

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

In free word association tasks, human subjects are presented with a stimulus word and are then asked to name the first word (the response word) that comes to mind. These associations, presumably learned on the basis of conceptual contiguity or similarity, have long attracted the attention of researchers in linguistics and cognitive psychology, since they are considered clues to the internal organization of lexical knowledge in semantic memory. Word association data have also been used to assess the performance of Vector Space Models for English, but evaluations for other languages have been relatively rare so far. In this paper, we introduce word association datasets for Italian, Spanish and Mandarin Chinese by extracting data from the Small World of Words project, and we propose two different tasks inspired by previous literature. We tested both monolingual and crosslingual word embeddings on the new datasets, showing that they perform similarly in the evaluation tasks.
Original language: English
Title of host publication: Proceedings of 15th Workshop on Building and Using Comparable Corpora (BUCC 2022)
Editors: Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff
Publisher: The European Language Resources Association (ELRA)
Pages: 1–7
ISBN (Print): 979-10-95546-94-8
Publication status: Published - Jun 2022
Event: 15th Workshop on Building and Using Comparable Corpora - Palais de Pharo, Marseille, France
Duration: 25 Jun 2022 → 25 Jun 2022
https://comparable.limsi.fr/bucc2022/

Conference

Conference: 15th Workshop on Building and Using Comparable Corpora
Abbreviated title: BUCC 2022
Country/Territory: France
City: Marseille
Period: 25/06/22 → 25/06/22

Keywords

  • Word Associations
  • Distributional Semantic Models
  • Crosslingual Embeddings
