Kernel density estimation, kernel methods, and fast learning in large data sets

Shitong Wang, Jun Wang, Fu Lai Korris Chung

Research output: Journal article publication, Journal article, Academic research, peer-review

47 Citations (Scopus)

Abstract

Kernel methods such as the standard support vector machine and support vector regression take $O(N^{3})$ time and $O(N^{2})$ space to train in their naive implementations, where $N$ is the training set size. Applying them to large data sets is thus computationally infeasible, and a replacement for the naive method of finding the quadratic programming (QP) solutions is highly desirable. Because many kernel methods can be linked to the kernel density estimate (KDE), which can be implemented efficiently with approximation techniques, a new learning method called fast KDE (FastKDE) is proposed to scale up kernel methods. It is based on establishing a connection between KDE and the QP problems formulated for kernel methods using an entropy-based integrated-squared-error criterion. As a result, FastKDE approximation methods can be applied to solve these QP problems. In this paper, the latest advance in fast data reduction via KDE is exploited. With just a simple sampling strategy, the resulting FastKDE method can be used to scale up various kernel methods with a theoretical guarantee that their performance does not degrade significantly. It has a time complexity of $O(m^{3})$, where $m$ is the number of data points sampled from the training set. Experiments on different benchmark data sets demonstrate that the proposed method performs comparably to the state-of-the-art method and that it is effective for a wide range of kernel methods to achieve fast learning in large data sets.
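The data-reduction idea in the abstract, that a KDE built on a small uniform sample can stay close to the KDE built on the full training set, can be illustrated with a minimal sketch. This is not the authors' FastKDE algorithm: the function names, the 1-D Gaussian kernel, the fixed bandwidth, and the synthetic two-component data are all assumptions chosen for illustration, and the integrated squared error (ISE) is approximated by a simple Riemann sum on a grid.

```python
import numpy as np

def gaussian_kde(points, queries, bandwidth):
    """Evaluate a 1-D Gaussian KDE built on `points` at each value in `queries`."""
    diffs = (queries[:, None] - points[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def sampled_kde_ise(data, m, bandwidth, grid, rng):
    """Approximate the ISE between the full-data KDE and a KDE on m sampled points."""
    # Hypothetical reduction step: plain uniform sampling without replacement.
    subset = rng.choice(data, size=m, replace=False)
    full = gaussian_kde(data, grid, bandwidth)
    reduced = gaussian_kde(subset, grid, bandwidth)
    step = grid[1] - grid[0]
    # Riemann-sum approximation of the integral of the squared difference.
    return float(np.sum((full - reduced) ** 2) * step)

# Synthetic data (an assumption for illustration, not taken from the paper).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 2000), rng.normal(1.0, 1.0, 2000)])
grid = np.linspace(-6.0, 6.0, 601)

ise_small = sampled_kde_ise(data, 20, 0.3, grid, rng)
ise_large = sampled_kde_ise(data, 1000, 0.3, grid, rng)
print(f"ISE with m=20:   {ise_small:.6f}")
print(f"ISE with m=1000: {ise_large:.6f}")
```

In this sketch the ISE shrinks as the sample size m grows, which is the intuition behind training a kernel method on m sampled points at $O(m^{3})$ cost instead of on all $N$ points at $O(N^{3})$ cost.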
Original language: English
Article number: 6542693
Pages (from-to): 1-20
Number of pages: 20
Journal: IEEE Transactions on Cybernetics
Volume: 44
Issue number: 1
DOIs
Publication status: Published - 1 Jan 2014

Keywords

  • Kernel density estimate (KDE)
  • kernel methods
  • quadratic programming (QP)
  • sampling
  • support vector machine (SVM)

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering
