Abstract
With the tremendous growth in the volume of text documents available on the Internet and in digital libraries, accurate topic-specific text filtering is needed. In this paper we propose a Rough Set aided method to reduce the dimensionality of feature vectors. To extract accurate features, we also provide a novel filtering technique, called twice-filtering, that handles two different feature sets: "Inter-Keywords" and "Intra-Keyword". A simple application, an E-mail filtering system based on our topic-specific filtering technology, shows that by incorporating variant weighting methods and more accurately extracted features, our filtering algorithm can speed up the filtering operation while achieving high precision and recall.
Original language | English |
---|---|
Pages (from-to) | 1095-1098 |
Number of pages | 4 |
Journal | Canadian Conference on Electrical and Computer Engineering |
Volume | 2 |
Publication status | Published - 1 Oct 2003 |
Externally published | Yes |
Event | CCECE 2003 Canadian Conference on Electrical and Computer Engineering: Toward a Caring and Humane Technology - Montreal, Canada; Duration: 4 May 2003 → 7 May 2003 |
Keywords
- DF
- Document Filtering
- Precision
- Recall
- Rough Set
- TF-IDF
ASJC Scopus subject areas
- Hardware and Architecture
- Electrical and Electronic Engineering