TY - GEN
T1 - An improved incremental training approach for large scaled dataset based on support vector machine
AU - Guo, Jingcai
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/12/6
Y1 - 2016/12/6
N2 - The Support Vector Machine (SVM) is well known in machine learning and artificial intelligence for its high performance in data classification, regression, and forecasting. For large-scale datasets, an incremental training algorithm is usually applied to balance training cost against accuracy in SVM applications. This paper presents an improved incremental training approach for large-scale datasets on SVM. We focus on the data's own distribution information: we propose a self-adaptive clustering method to extract the area and density information of the data, and we apply border detection and an uncertainty strategy to retain the border and some potentially useful samples. The proposed method can greatly reduce the training error of incremental SVM training, especially for unevenly distributed datasets, and it allows training cost and accuracy to be balanced to achieve better performance.
KW - Classification
KW - Data distribution
KW - Incremental learning
KW - Self adaptive clustering
KW - Uncertainty
UR - http://www.scopus.com/inward/record.url?scp=85013191642&partnerID=8YFLogxK
U2 - 10.1145/3006299.3006307
DO - 10.1145/3006299.3006307
M3 - Conference article published in proceeding or book
AN - SCOPUS:85013191642
T3 - Proceedings - 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2016
SP - 149
EP - 157
BT - Proceeding of 2016 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT)
PB - Association for Computing Machinery, Inc
T2 - 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2016
Y2 - 6 December 2016 through 9 December 2016
ER -