EVALution 1.0: An Evolving Semantic Dataset for Training and Evaluation of Distributional Semantic Models

Enrico Santus, Frances Yung, Alessandro Lenci, Chu Ren Huang

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

70 Citations (Scopus)

Abstract

In this paper, we introduce EVALution 1.0, a dataset designed for the training and evaluation of Distributional Semantic Models (DSMs). This version consists of almost 7.5K tuples, instantiating several semantic relations between word pairs (including hypernymy, synonymy, antonymy, and meronymy). The dataset is enriched with a large amount of additional information (e.g., relation domain, word frequency, word POS, word semantic field) that can be used either for filtering the pairs or for performing an in-depth analysis of the results. The tuples were extracted from a combination of ConceptNet 5.0 and WordNet 4.0, and subsequently filtered through automatic methods and crowdsourcing to ensure their quality. The dataset is freely downloadable. An extension in RDF format, which will also include scripts for data processing, is under development.
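The abstract notes that the additional information can be used to filter the word pairs. As a minimal sketch of what such filtering might look like, the Python below assumes a hypothetical in-memory record layout; the field names (`word1`, `word2`, `relation`, `pos1`, `frequency1`), the `filter_tuples` helper, and the three sample rows are illustrative assumptions, not the dataset's actual file format or contents.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class RelationTuple:
    """Hypothetical record layout; NOT the actual EVALution file schema."""
    word1: str
    word2: str
    relation: str    # e.g. "hypernymy", "synonymy", "antonymy", "meronymy"
    pos1: str        # part of speech of word1 (assumed field)
    frequency1: int  # corpus frequency of word1 (assumed field)


def filter_tuples(
    tuples: Iterable[RelationTuple],
    relations: Optional[Set[str]] = None,
    pos: Optional[Set[str]] = None,
    min_frequency: int = 0,
) -> List[RelationTuple]:
    """Keep tuples matching every given criterion; None means no constraint."""
    kept = []
    for t in tuples:
        if relations is not None and t.relation not in relations:
            continue
        if pos is not None and t.pos1 not in pos:
            continue
        if t.frequency1 < min_frequency:
            continue
        kept.append(t)
    return kept


# Made-up rows for illustration only; they are not drawn from the dataset.
SAMPLE = [
    RelationTuple("dog", "animal", "hypernymy", "noun", 1200),
    RelationTuple("hot", "cold", "antonymy", "adjective", 900),
    RelationTuple("wheel", "car", "meronymy", "noun", 150),
]

if __name__ == "__main__":
    # Select noun pairs whose first word meets a frequency threshold.
    for t in filter_tuples(SAMPLE, pos={"noun"}, min_frequency=500):
        print(t.word1, t.relation, t.word2)
```

Treating each criterion as optional keeps the helper usable both for selecting a single relation and for combining several constraints in one pass.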

Original language: English
Title of host publication: Proceedings of the 4th Workshop on Linked Data in Linguistics
Subtitle of host publication: Resources and Applications, LDL 2015 - collocated with 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2015
Editors: Christian Chiarcos, John Philip McCrae, Petya Osenova, Philipp Cimiano, Nancy Ide
Publisher: Association for Computational Linguistics (ACL)
Pages: 64-69
Number of pages: 6
ISBN (Electronic): 9781941643570
Publication status: Published - Jul 2015
Event: 4th Workshop on Linked Data in Linguistics: Resources and Applications, LDL 2015 - Beijing, China
Duration: 31 Jul 2015 → …

Publication series

Name: Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications, LDL 2015 - collocated with 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2015

Conference

Conference: 4th Workshop on Linked Data in Linguistics: Resources and Applications, LDL 2015
Country/Territory: China
City: Beijing
Period: 31/07/15 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language
