Turkish Journal of Electrical Engineering and Computer Sciences

Fast and de-noise support vector machine training method based on fuzzy clustering method for large real world datasets

Abstract

Classifying large, real-world datasets is a challenging problem in machine learning. Among machine learning methods, the support vector machine (SVM) is a well-known approach with high generalization ability. However, as the number of training samples grows and the data contain noise, the performance of the SVM decreases significantly. In this paper, a fast, de-noising two-stage method for training SVMs on large, real-world datasets is proposed. In the first stage, data that are noisy or suspected to be noisy are identified and eliminated from the original training dataset. The identification and elimination process is based on the movement of the center of the convex hull data in the training dataset, where the convex hull data are computed via the QHull algorithm. In the second stage, the well-known fuzzy clustering method (FCM) is applied to compress and reduce the size of the training dataset. Finally, the reduced and purified cluster centers are used to train the SVM. A set of experiments is conducted on four benchmark datasets from the UCI repository. The training time and generalization of the proposed approach are compared with those of FCM-SVM and a standard SVM. The results indicate that the proposed method reduces training time and is considerably successful in removing noisy data from the training dataset. The proposed method can therefore achieve higher generalization performance than the other methods on large, real-world datasets.
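The compression step named in the abstract can be illustrated with a textbook fuzzy c-means loop. This is only a minimal sketch, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the fixed iteration count, and the deterministic initialization are all assumptions, and the convex-hull noise-removal stage is not shown because the abstract does not specify it in enough detail.

```python
import math


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Plain fuzzy c-means; returns (centers, memberships).

    Hypothetical helper for illustration. `m` is the fuzzifier (m > 1).
    """
    dim = len(points[0])
    # Assumed initialization: pick evenly spaced samples as starting centers.
    step = max(1, len(points) // n_clusters)
    centers = [list(points[min(i * step, len(points) - 1)])
               for i in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = d_ik^(-power) / sum_j d_jk^(-power)
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                memberships.append([h / total for h in hits])
                continue
            inv = [d ** -power for d in dists]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        for i in range(n_clusters):
            weights = [u[i] ** m for u in memberships]
            wsum = sum(weights)
            centers[i] = [
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ]
    return centers, memberships


# Eight samples in two well-separated groups are compressed to two centers.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
centers, memberships = fuzzy_c_means(data, n_clusters=2)
```

In a pipeline of the kind the abstract describes, the returned centers (computed, for instance, separately per class, which is again an assumption here) would replace the full training set as input to the SVM trainer, which is where the reduction in training time comes from.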

DOI

10.3906/elk-1304-139

Keywords

Support vector machine, fuzzy clustering method, convex hull, QHull algorithm, reduction set method, noisy training dataset

First Page

219

Last Page

233

Recommended Citation

ALMASI, O. N., & ROUHANI, M. (2016). Fast and de-noise support vector machine training method based on fuzzy clustering method for large real world datasets. Turkish Journal of Electrical Engineering and Computer Sciences 24 (1): 219-233. https://doi.org/10.3906/elk-1304-139

Included in

Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons
