Abstract
In Distributional Semantic Models (DSMs), Vector Cosine is widely used to estimate similarity between word vectors, although this measure has been shown to suffer from several shortcomings. The recent literature has proposed other methods that attempt to mitigate such biases. In this paper, we investigate APSyn, a measure that computes the extent of the intersection between the most associated contexts of two target words, weighting it by context relevance. We evaluated this metric in a similarity estimation task on several popular test sets, and our results show that APSyn is in fact highly competitive, even with respect to the results reported in the literature for word embeddings. In addition, APSyn addresses some of the weaknesses of Vector Cosine, performing well also on genuine similarity estimation.
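As a rough illustration of the measure described in the abstract, the sketch below ranks each word's contexts by association strength, keeps the top N, and sums over the shared contexts the inverse of their average rank. The abstract does not spell out the weighting, so this inverse-average-rank form is an assumption. The function names (`top_contexts`, `apsyn`), the default `n`, and the toy association scores are hypothetical and chosen only for the example.

```python
from typing import Dict


def top_contexts(assoc: Dict[str, float], n: int) -> Dict[str, int]:
    """Map the n most associated contexts to their 1-based rank (hypothetical helper)."""
    ranked = sorted(assoc.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return {ctx: rank for rank, (ctx, _score) in enumerate(ranked, start=1)}


def apsyn(assoc_a: Dict[str, float], assoc_b: Dict[str, float], n: int = 1000) -> float:
    """Sketch of an APSyn-style score.

    Assumption: each shared top-n context contributes the inverse of the
    average of its ranks in the two lists, so highly ranked shared contexts
    weigh more than low-ranked ones.
    """
    ranks_a = top_contexts(assoc_a, n)
    ranks_b = top_contexts(assoc_b, n)
    shared = ranks_a.keys() & ranks_b.keys()
    return float(sum(1.0 / ((ranks_a[c] + ranks_b[c]) / 2.0) for c in shared))


# Toy association scores (for example PMI-like values), invented for illustration.
dog = {"bark": 5.0, "tail": 3.0, "pet": 2.0}
cat = {"meow": 6.0, "pet": 4.0, "tail": 1.0}

score = apsyn(dog, cat, n=3)
```

Here "tail" and "pet" are shared, each with an average rank of 2.5, so each contributes 0.4 to the score. Unlike Vector Cosine, a score of this form depends only on which top-ranked contexts overlap and how highly they rank, not on the full vectors.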
Original language | English |
---|---|
Title of host publication | Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016 |
Publisher | Institute for the Study of Language and Information |
Pages | 229-238 |
Number of pages | 10 |
ISBN (Electronic) | 9788968174285 |
Publication status | Published - 1 Jan 2016 |
Event | 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016 - Kyung Hee University, Seoul, Korea, Republic of; Duration: 28 Oct 2016 → 30 Oct 2016 |
Conference
Conference | 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016 |
---|---|
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 28/10/16 → 30/10/16 |
ASJC Scopus subject areas
- Language and Linguistics
- Computer Science (miscellaneous)
- Information Systems