Input Feature and Kernel Selection Improves Speed, Accuracy and Efficiency of Support Vector Machine Classification
Inventors: Olvi Mangasarian, Glenn Fung
The Wisconsin Alumni Research Foundation (WARF) is seeking commercial partners interested in developing a selection technique to increase the computational speed, accuracy and efficiency of SVM classification for data mining operations.
Support vector machines (SVMs) are powerful data classification tools that are often used in data mining operations. An SVM classifies data in extremely large data sets by identifying a linear or non-linear separating surface in the input space of a data set. Advantageously, the separating surface depends only on a subset of the original data known as a set of support vectors. To enhance the performance of an SVM classifier, the set of support vectors defining the separating surface should be made as small as possible, either by reducing the set of input features in the case of linear SVM classifiers, or by reducing the set of kernel functions in the case of non-linear SVM classifiers.
UW-Madison researchers have developed a selection technique that makes use of a fast Newton method to produce a reduced set of input features for linear SVM classifiers or a reduced set of kernel functions for non-linear SVM classifiers. The ability to suppress less useful data and select a reduced set of meaningful support vectors promises to increase the computational speed, accuracy and efficiency of SVM classification for data mining operations.
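The idea of input feature selection for a linear classifier can be illustrated with a minimal sketch. The sketch is not the inventors' fast Newton method: it swaps in a simple proximal-gradient solver for a squared hinge loss with a 1-norm penalty on the weights, a common way to drive the weights of uninformative features to exactly zero. The toy data, the function names and the parameter values (`l1_penalty`, `step`, `iterations`) are all assumptions made for illustration.

```python
import random


def make_toy_data(n_points=200, n_features=20, seed=0):
    """Hypothetical data: only features 0 and 1 determine the class label."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
         for _ in range(n_points)]
    y = [1 if row[0] - row[1] > 0 else -1 for row in X]
    return X, y


def soft_threshold(value, threshold):
    """Shrink a value toward zero; values inside the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_sparse_linear_classifier(X, y, l1_penalty=0.2, step=0.2,
                                   iterations=300):
    """Minimize mean squared hinge loss + l1_penalty * ||w||_1.

    The classifier is sign(w . x - gamma). Each iteration takes a gradient
    step on the loss, then soft-thresholds w (the offset gamma is not
    penalized).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    gamma = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * d
        grad_gamma = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) - gamma
            slack = 1.0 - yi * score
            if slack > 0.0:  # only margin violators contribute
                coeff = -2.0 * slack * yi / n
                for j in range(d):
                    grad_w[j] += coeff * xi[j]
                grad_gamma -= coeff
        w = [soft_threshold(wj - step * gj, step * l1_penalty)
             for wj, gj in zip(w, grad_w)]
        gamma -= step * grad_gamma
    return w, gamma


def selected_features(w):
    """Indices of features whose weight is not exactly zero."""
    return [j for j, wj in enumerate(w) if wj != 0.0]


def predict(w, gamma, x):
    """Classify one point as +1 or -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) - gamma > 0 else -1


if __name__ == "__main__":
    X, y = make_toy_data()
    w, gamma = train_sparse_linear_classifier(X, y)
    kept = selected_features(w)
    print("features kept:", kept, "of", len(w))
```

The resulting classifier uses only the features with non-zero weight, so the remaining columns can be dropped when scoring new data. A larger `l1_penalty` suppresses more features at the cost of a looser fit.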
- SVM classification for data mining operations, including those involving gene expression data, fraud detection, credit evaluation and medical diagnosis and prognosis
- Generates an SVM classifier that depends on very few input features, such as 7 features out of an original 28,000
- Handles classification problems in very high-dimensional spaces, e.g., more than 28,000 dimensions
- Readily implemented with a simple linear equation solver, eliminating the need for specialized and expensive linear programming packages