Abstract
With the tremendous growth in the volume of text documents available on the Internet and in digital libraries, accurate topic-specific text filtering is needed. In this paper we propose a Rough Set aided method to reduce the dimensionality of feature vectors. To extract accurate features, we also provide a novel filtering technique, called twice-filtering, that handles two different feature sets: "Inter-Keywords" and "Intra-Keyword". A simple E-mail filtering application based on our topic-specific filtering technology shows that, with the incorporation of variant weighting methods and more accurately extracted features, our filtering algorithm can speed up the filtering operation with high precision and recall.
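
The abstract names TF-IDF and DF weighting and evaluates by precision and recall, without giving formulas. As a minimal sketch of the textbook definitions of those quantities (not the paper's own weighting variants, and not the Rough Set reduction or twice-filtering steps, whose details the abstract does not state), the following may help. The function names `tf_idf` and `precision_recall` and the toy documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute standard TF-IDF weights for a list of tokenized documents.

    Returns (weights, df): one {term: weight} dict per document, and the
    document-frequency (DF) counter. This is the generic textbook form,
    assumed here for illustration only.
    """
    n = len(docs)
    # DF: number of documents in which each term occurs at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Relative term frequency times log inverse document frequency.
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights, df


def precision_recall(predicted, relevant):
    """Precision and recall of a predicted set against the relevant set."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, invented for this sketch.
docs = [["rough", "set", "filter"], ["mail", "filter"], ["rough", "mail", "mail"]]
weights, df = tf_idf(docs)
p, r = precision_recall(predicted=[1, 2, 3], relevant=[2, 3, 4])
```

A term that appears in every document gets weight zero under this form, which is one reason DF is commonly used as a cheap first pass for discarding uninformative features before a heavier reduction step.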
| Original language | English |
|---|---|
| Pages (from-to) | 1095-1098 |
| Number of pages | 4 |
| Journal | Canadian Conference on Electrical and Computer Engineering |
| Volume | 2 |
| Publication status | Published - 1 Oct 2003 |
| Externally published | Yes |
| Event | CCECE 2003 Canadian Conference on Electrical and Computer Engineering: Toward a Caring and Humane Technology - Montreal, Canada. Duration: 4 May 2003 → 7 May 2003 |
Keywords
- DF
- Document Filtering
- Precision
- Recall
- Rough Set
- TF-IDF
ASJC Scopus subject areas
- Hardware and Architecture
- Electrical and Electronic Engineering