Decoding Word Embeddings with Brain-Based Semantic Features

Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci

Research output: Journal article publication › Journal article › Academic research › Peer-reviewed

24 Citations (Scopus)

Abstract

Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT).
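The decoding setup the abstract describes can be pictured as learning a map from embedding space to an interpretable feature space and testing it on held-out words. The sketch below is a minimal illustration of that idea, not the authors' exact pipeline: it assumes a cross-validated ridge-regression mapping onto a 65-dimensional Binder-style feature space, with randomly generated toy data standing in for real embeddings and feature ratings, and scores each held-out word by the correlation between its predicted and gold feature vectors.

```python
# Minimal sketch of embedding-to-feature decoding (illustrative assumptions:
# ridge regression as the mapping, toy random data, correlation scoring).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Toy stand-ins: 100 "words", 300-dim embeddings, 65 Binder-style features.
n_words, emb_dim, n_features = 100, 300, 65
X = rng.standard_normal((n_words, emb_dim))  # word embeddings
Y = rng.random((n_words, n_features))        # semantic feature ratings

# Cross-validated mapping: fit on training words, predict the full
# feature vector for each held-out word.
predictions = np.zeros_like(Y)
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    model = Ridge(alpha=1.0).fit(X[train_idx], Y[train_idx])
    predictions[test_idx] = model.predict(X[test_idx])

def pearson(a, b):
    # Pearson correlation between two 1-D vectors.
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Per-word decoding score: how well the predicted feature vector
# matches the gold one for each held-out word.
scores = [pearson(predictions[i], Y[i]) for i in range(n_words)]
print(f"mean decoding correlation: {np.mean(scores):.3f}")
```

On real data, the same loop would be run separately for each embedding type (count, predict, contextualized) so that their decoding scores can be compared on a common feature space.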
Original language: English
Pages (from-to): 663–698
Journal: Computational Linguistics
Volume: 47
Issue number: 3
Publication status: Published - Sept 2021
