Using longest common subsequence matching for Chinese information retrieval

Xiao Yun, Wing Pong Robert Luk, K. F. Wong, K. L. Kwok

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

1 Citation (Scopus)

Abstract

This paper adopts longest common subsequence (LCS) matching for Chinese information retrieval. We re-ranked the retrieved documents by a mixture of the original similarity score and an LCS score obtained by matching each document title against the query. The LCS-based similarity score is also used in pseudo-relevance feedback (PRF) in several ways (e.g., selecting terms and filtering out documents with low LCS values). We evaluated LCS-based title re-ranking and PRF on the NTCIR-4 test collection for Chinese ad hoc information retrieval. For title queries, our best mean average precision (MAP) is 26.7% under rigid relevance judgement and 30.2% under relax relevance judgement.
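
The title re-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the LCS score is normalized or how the two scores are mixed, so the character-level matching, the normalization by query length, the linear interpolation, the weight `alpha`, and the function names `lcs_length`, `lcs_score`, and `rerank` are all assumptions made for illustration.

```python
from typing import List, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (character level)."""
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(query: str, title: str) -> float:
    """Assumption: normalize by query length so the score lies in [0, 1]."""
    if not query:
        return 0.0
    return lcs_length(query, title) / len(query)


def rerank(
    query: str,
    results: List[Tuple[str, str, float]],
    alpha: float = 0.7,
) -> List[Tuple[str, float]]:
    """Re-rank (doc_id, title, original_score) triples.

    Assumption: the mixture is a linear interpolation, and the original
    scores are already scaled to [0, 1]; alpha is a hypothetical weight.
    """
    rescored = [
        (doc_id, alpha * score + (1 - alpha) * lcs_score(query, title))
        for doc_id, title, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a document whose title contains every query character in order can overtake a document with a higher original score but an unrelated title. The same `lcs_score` could serve as a threshold for discarding feedback documents in PRF, although the abstract does not specify how that filtering was done.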
Original language: English
Title of host publication: Proceedings of the International Conference on Chinese Computing 2005, ICCC 2005
Publisher: Chinese and Oriental Languages Information Processing Society (COLIPS)
ISBN (Electronic): 9789810530075, 9810530072
Publication status: Published - 1 Jan 2005
Event: International Conference on Chinese Computing 2005, ICCC 2005 - Singapore, Singapore
Duration: 21 Mar 2005 to 23 Mar 2005

Conference

Conference: International Conference on Chinese Computing 2005, ICCC 2005
Country: Singapore
City: Singapore
Period: 21/03/05 to 23/03/05

ASJC Scopus subject areas

  • Computer Science (all)
