Comparison study on critical components in composition model for phrase representation

Research output: Journal article publication › Journal article › Academic research › peer-review

17 Citations (Scopus)

Abstract

Phrase representation, an important step in many NLP tasks, involves representing phrases as continuous-valued vectors. This article presents detailed comparisons of the effects of word vectors, training data, and the composition and objective functions used in a composition model for phrase representation. Specifically, we first discuss how augmented word representations affect the performance of the composition model. Then, we investigate whether different types of training data influence the performance of the composition model and, if so, how. Finally, we evaluate combinations of different composition and objective functions and discuss the factors related to composition model performance. All evaluations were conducted in both English and Chinese. Our main findings are as follows: (1) the Additive model with semantically enhanced word vectors performs comparably to the state-of-the-art model; (2) the Additive model that updates augmented word vectors and the Matrix model with semantically enhanced word vectors systematically outperform the state-of-the-art model in the bigram and multi-word phrase similarity tasks, respectively; (3) representing high-frequency phrases by estimating their surrounding contexts is a good training objective for bigram phrase similarity tasks; and (4) the performance gain of the composition model with semantically enhanced word vectors is due to the composition function and the greater weight attached to important words. Previous work has focused on the composition function; however, our findings indicate that other components of the composition model (especially word representation) make a critical difference in phrase representation.
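As a rough illustration of the components the abstract compares, the sketch below shows one common way to write an additive composition function, a matrix composition function, and the two objective types named in the keywords (mean square error and max-margin). It is not the article's implementation: the function names, the toy vectors, the matrix form W[u; v], and the cosine-based margin are all assumptions chosen for illustration.

```python
import math

# Hypothetical helper names; the article's actual formulations may differ.

def additive_compose(word_vectors, weights=None):
    """Phrase vector as a (optionally weighted) sum of its word vectors."""
    if weights is None:
        weights = [1.0] * len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, word_vectors))
            for i in range(dim)]

def matrix_compose(u, v, W):
    """Bigram vector as W applied to the concatenation [u; v] (assumed form)."""
    concat = u + v  # list concatenation, length 2 * dim
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, concat)) for row in W]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mse_loss(predicted, target):
    """Mean square error between a composed vector and a target vector."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

def max_margin_loss(predicted, positive, negative, margin=1.0):
    """Hinge loss pushing the composed vector closer to the positive target
    than to the negative one by at least `margin` (cosine-based, assumed)."""
    return max(0.0, margin - cosine(predicted, positive) + cosine(predicted, negative))

# Toy 2-dimensional word vectors (made up for the example).
u, v = [1.0, 2.0], [3.0, 4.0]
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]  # with this W, matrix_compose reduces to addition

phrase_add = additive_compose([u, v])
phrase_mat = matrix_compose(u, v, W)
```

With the particular `W` above, the matrix function reproduces the additive result, which shows that the additive form is a special case of this matrix form; a weighted call such as `additive_compose([u, v], weights=[0.3, 0.7])` mimics attaching greater weight to the more important word.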

Original language: English
Article number: 16
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 16
Issue number: 3
Publication status: Published - 20 Jan 2017
Externally published: Yes

Keywords

  • Composition model
  • Max-margin
  • Mean square error
  • Phrase representation
  • Retrofitting
  • Word paraphrasing

ASJC Scopus subject areas

  • General Computer Science
