Abstract
This paper proposes an algorithm called Imprecise Spectrum Analysis (ISA) to carry out fast dimension reduction for document classification. ISA is designed based on the one-sided Jacobi method for Singular Value Decomposition (SVD). To speed up dimension reduction, it simplifies the orthogonalization process of Jacobi computation and introduces a new mapping formula for transforming original document-term vectors. To improve classification accuracy using ISA, a feature selection method is further developed to make inter-class feature vectors more orthogonal in building the initial weighted term-document matrix. Our experimental results show that ISA is extremely fast in handling large term-document matrices and delivers better or competitive classification accuracy compared to SVD-based LSI.
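For orientation, the baseline that ISA starts from can be sketched as follows. This is a minimal textbook one-sided (Hestenes) Jacobi SVD written with NumPy, not the ISA algorithm itself: the abstract does not specify how ISA simplifies the orthogonalization step or what its mapping formula is, so neither is reproduced here. The function name `one_sided_jacobi_svd` and the parameters `tol` and `max_sweeps` are illustrative assumptions.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=50):
    """Textbook one-sided Jacobi SVD (assumed baseline, not the ISA variant)."""
    U = np.array(A, dtype=float)  # working copy; its columns get orthogonalized
    n = U.shape[1]
    V = np.eye(n)                 # accumulates the applied plane rotations
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                # Skip column pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                if zeta == 0.0:
                    t = 1.0
                else:
                    t = np.sign(zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to columns p and q of U and V.
                for M in (U, V):
                    mp = M[:, p].copy()
                    M[:, p] = c * mp - s * M[:, q]
                    M[:, q] = s * mp + c * M[:, q]
        if not rotated:
            break
    sigma = np.linalg.norm(U, axis=0)   # column norms are the singular values
    order = np.argsort(sigma)[::-1]     # sort in descending order
    sigma, U, V = sigma[order], U[:, order], V[:, order]
    nonzero = sigma > 0
    U[:, nonzero] /= sigma[nonzero]     # normalize to get left singular vectors
    return U, sigma, V

# Small hypothetical term-document matrix (rows: terms, columns: documents).
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 4.0],
              [1.0, 1.0, 1.0]])
U, sigma, V = one_sided_jacobi_svd(A)
```

In an LSI-style pipeline, one would typically keep only the leading `k` columns of `U` and entries of `sigma` to obtain the reduced space; the cost of the repeated sweeps over column pairs is presumably what a simplified orthogonalization aims to cut.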
Original language | English |
---|---|
Title of host publication | CIKM'10 - Proceedings of the 19th International Conference on Information and Knowledge Management and Co-located Workshops |
Pages | 1753-1756 |
Number of pages | 4 |
Publication status | Published - 1 Dec 2010 |
Event | 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10 - Toronto, ON, Canada; Duration: 26 Oct 2010 → 30 Oct 2010 |
Conference
Conference | 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10 |
---|---|
Country/Territory | Canada |
City | Toronto, ON |
Period | 26/10/10 → 30/10/10 |
Keywords
- Dimension reduction
- Feature selection
- LSI
- SVD
ASJC Scopus subject areas
- General Decision Sciences
- General Business, Management and Accounting