Short Text Understanding Through Lexical-Semantic Analysis

Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, Xiaofang Zhou

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

133 Citations (Scopus)

Abstract

Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better understand short texts. In this work, we use lexical-semantic knowledge provided by a well-known semantic network for short text understanding. Our knowledge-intensive approach disrupts traditional methods for tasks such as text segmentation, part-of-speech tagging, and concept labeling, in the sense that we focus on semantics in all these tasks. We conduct a comprehensive performance evaluation on real-life data. The results show that knowledge is indispensable for short text understanding, and our knowledge-intensive approaches are effective in harvesting semantics of short texts.
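As a rough illustration of what lexicon-driven segmentation and concept labeling can look like, the sketch below uses a tiny hand-written dictionary in place of a real semantic network. It is not the authors' method: the lexicon contents, the function names `segment` and `label`, and the greedy longest-match strategy are all assumptions chosen for brevity.

```python
# Hypothetical toy lexicon: term -> candidate concepts.
# It stands in for a real semantic network and contains made-up entries.
TOY_LEXICON = {
    "new york": ["city"],
    "new york times": ["newspaper"],
    "april": ["month", "person"],
    "paris": ["city", "person"],
}


def segment(text, lexicon, max_len=3):
    """Greedy longest-match segmentation over whitespace tokens.

    An assumed, simplified strategy: at each position, take the longest
    span (up to max_len tokens) found in the lexicon, else a single token.
    """
    tokens = text.lower().split()
    terms = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, falling back to one token.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon or n == 1:
                terms.append(candidate)
                i += n
                break
    return terms


def label(terms, lexicon):
    """Attach the candidate concepts (possibly none) to each term."""
    return [(term, lexicon.get(term, [])) for term in terms]


if __name__ == "__main__":
    terms = segment("New York Times reporter", TOY_LEXICON)
    print(label(terms, TOY_LEXICON))
```

Note that a term such as "paris" keeps several candidate concepts here; choosing among them from context is the kind of ambiguity the abstract refers to, and this sketch does not attempt it.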

Original language: English
Title of host publication: 2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
Publisher: IEEE Computer Society
Pages: 495-506
Number of pages: 12
ISBN (Electronic): 9781479979639
Publication status: Published - 26 May 2015
Externally published: Yes
Event: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
Duration: 13 Apr 2015 to 17 Apr 2015

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2015-May
ISSN (Print): 1084-4627

Conference

Conference: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Country/Territory: Korea, Republic of
City: Seoul
Period: 13/04/15 to 17/04/15

ASJC Scopus subject areas

  • Information Systems
