TY - GEN
T1 - Evalution-man 2.0: Expand the evaluation dataset for vector space models
AU - Liu, Hongchao
AU - Huang, Chu-ren
PY - 2016/1/1
Y1 - 2016/1/1
N2 - We introduce EVALution 2.0, a simplified Mandarin dataset for the evaluation of Vector Space Models. We adopt a psycholinguistics-based methodology built on a verbal association task, unlike previous datasets, which construct word relation pairs from corpora and ontologies. Semantic neighbors were created for 100 target words, for which participants produced a surprising 1129 word relation pairs. In a separate agreement-rating task, only 62 pairs were rejected. The methodology has proven to be a way to expand the existing resources quickly while maintaining a high level of quality.
AB - We introduce EVALution 2.0, a simplified Mandarin dataset for the evaluation of Vector Space Models. We adopt a psycholinguistics-based methodology built on a verbal association task, unlike previous datasets, which construct word relation pairs from corpora and ontologies. Semantic neighbors were created for 100 target words, for which participants produced a surprising 1129 word relation pairs. In a separate agreement-rating task, only 62 pairs were rejected. The methodology has proven to be a way to expand the existing resources quickly while maintaining a high level of quality.
UR - http://www.scopus.com/inward/record.url?scp=85007246749&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-49508-8_25
DO - 10.1007/978-3-319-49508-8_25
M3 - Conference article published in proceeding or book
SN - 9783319495071
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 261
EP - 268
BT - Chinese Lexical Semantics - 17th Workshop, CLSW 2016, Revised Selected Papers
PB - Springer Verlag
T2 - 17th Chinese Lexical Semantics Workshop, CLSW 2016
Y2 - 20 May 2016 through 22 May 2016
ER -