Fast dimension reduction for document classification based on Imprecise Spectrum Analysis

Hu Guan, Bin Xiao, Jingyu Zhou, Minyi Guo, Tao Yang

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

3 Citations (Scopus)

Abstract

This paper proposes an algorithm called Imprecise Spectrum Analysis (ISA) to carry out fast dimension reduction for document classification. ISA is based on the one-sided Jacobi method for Singular Value Decomposition (SVD). To speed up dimension reduction, it simplifies the orthogonalization process of the Jacobi computation and introduces a new mapping formula for transforming the original document-term vectors. To improve classification accuracy using ISA, a feature selection method is further developed to make inter-class feature vectors more orthogonal when building the initial weighted term-document matrix. Our experimental results show that ISA is extremely fast in handling large term-document matrices and delivers better or competitive classification accuracy compared to SVD-based LSI.
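For readers unfamiliar with the method the abstract builds on, the following is a minimal sketch of the standard one-sided (Hestenes) Jacobi iteration for singular values. It is not the ISA algorithm: the abstract does not specify ISA's simplified orthogonalization or its mapping formula, so neither is reproduced here. The function name, the tolerance `tol`, and the sweep limit `max_sweeps` are illustrative assumptions.

```python
import math


def one_sided_jacobi_singular_values(matrix, tol=1e-12, max_sweeps=50):
    """Singular values via the textbook one-sided Jacobi method.

    Illustrative sketch only (not ISA). Assumes a non-empty, rectangular
    list of rows. Pairs of columns are rotated until all are mutually
    orthogonal; the column norms are then the singular values.
    """
    a = [[float(x) for x in row] for row in matrix]  # work on a copy
    n_rows, n_cols = len(a), len(a[0])

    for _ in range(max_sweeps):
        rotated = False
        for i in range(n_cols - 1):
            for j in range(i + 1, n_cols):
                # Squared norms and inner product of columns i and j.
                alpha = sum(a[k][i] ** 2 for k in range(n_rows))
                beta = sum(a[k][j] ** 2 for k in range(n_rows))
                gamma = sum(a[k][i] * a[k][j] for k in range(n_rows))

                # Skip pairs that are already orthogonal to tolerance.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True

                # Rotation angle that zeroes the inner product.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t

                # Apply the plane rotation to the two columns.
                for k in range(n_rows):
                    x, y = a[k][i], a[k][j]
                    a[k][i] = c * x - s * y
                    a[k][j] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations means convergence

    norms = [
        math.sqrt(sum(a[k][j] ** 2 for k in range(n_rows)))
        for j in range(n_cols)
    ]
    return sorted(norms, reverse=True)
```

Calling `one_sided_jacobi_singular_values([[1, 1], [0, 1]])` returns two values whose squares sum to 3, the squared Frobenius norm of the input. The repeated pairwise orthogonalization in the inner loops is the step the abstract says ISA simplifies to gain speed.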
Original language: English
Title of host publication: CIKM'10 - Proceedings of the 19th International Conference on Information and Knowledge Management and Co-located Workshops
Pages: 1753-1756
Number of pages: 4
Publication status: Published - 1 Dec 2010
Event: 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10 - Toronto, ON, Canada
Duration: 26 Oct 2010 → 30 Oct 2010

Conference

Conference: 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10
Country/Territory: Canada
City: Toronto, ON
Period: 26/10/10 → 30/10/10

Keywords

  • Dimension reduction
  • Feature selection
  • LSI
  • SVD

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting
