Abstract
Choosing representative samples and removing data redundancy are two key issues in large-scale data classification. This paper proposes a new model, named interval extreme learning machine (ELM), for big data classification with continuous-valued attributes. The interval ELM model is built on two techniques: discretization of conditional attributes and fuzzification of class labels. First, inspired by the traditional decision tree (DT) induction algorithm, each conditional attribute is discretized into a number of intervals based on an uncertainty reduction scheme. Then, the center and range of each interval are calculated as the mean and standard deviation of the values in it. Afterwards, the samples that fall in the same intervals with regard to all the conditional attributes are merged into one record, and a fuzzification process is performed on the class labels. As a result, the original data set is transformed into a smaller one with fuzzy classes, and the interval ELM model is developed. Experimental comparisons demonstrate the feasibility and effectiveness of the proposed approach.
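
The data-reduction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how the cut points are derived or how the class labels are fuzzified, so the sketch assumes that the cut points are already given (`cut_points` is a hypothetical input) and that the fuzzy membership of a merged record is the relative frequency of each class among its samples. The function name `compress_to_intervals`, the use of the population standard deviation, and the toy data are likewise assumptions. Training the ELM on the reduced data is not shown.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import fmean, pstdev


def compress_to_intervals(X, y, cut_points, n_classes):
    """Merge samples that share the same interval on every attribute.

    X          : list of rows, each a list of floats
    y          : list of integer class labels in range(n_classes)
    cut_points : one sorted list of cut points per attribute (assumed given)
    Returns a list of (intervals, memberships) records, where intervals is a
    list of (center, spread) pairs and memberships is a list of class shares.
    """
    n_attrs = len(cut_points)

    # Interval index of every sample on every attribute.
    keys = [
        tuple(bisect_right(cut_points[j], row[j]) for j in range(n_attrs))
        for row in X
    ]

    # Center and spread of each interval: mean and standard deviation of the
    # attribute values that fall in it (population std is an assumption).
    values = defaultdict(list)
    for row, key in zip(X, keys):
        for j in range(n_attrs):
            values[(j, key[j])].append(row[j])
    stats = {k: (fmean(v), pstdev(v)) for k, v in values.items()}

    # Count class labels among the samples sharing one interval combination.
    counts = defaultdict(lambda: [0] * n_classes)
    for key, label in zip(keys, y):
        counts[key][label] += 1

    records = []
    for key in sorted(counts):
        total = sum(counts[key])
        intervals = [stats[(j, key[j])] for j in range(n_attrs)]
        memberships = [c / total for c in counts[key]]  # assumed fuzzification
        records.append((intervals, memberships))
    return records


# Toy data: four samples, two attributes, one cut point per attribute.
X = [[0.1, 5.0], [0.3, 5.4], [0.2, 5.2], [0.9, 1.0]]
y = [0, 0, 1, 1]
records = compress_to_intervals(X, y, cut_points=[[0.5], [3.0]], n_classes=2)
```

In this toy run the first three samples fall in the same interval on both attributes, so they collapse into one record with mixed class memberships, while the fourth sample forms a record of its own.
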
Original language | English |
---|---|
Pages (from-to) | 2391-2403 |
Number of pages | 13 |
Journal | Journal of Intelligent and Fuzzy Systems |
Volume | 28 |
Issue number | 5 |
Publication status | Published - 23 Jun 2015 |
Keywords
- Big data
- Extreme learning machine
- Interval
- Uncertainty reduction
ASJC Scopus subject areas
- Statistics and Probability
- General Engineering
- Artificial Intelligence