Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.3906/elk-1909-162
Abstract
Visualizing multidimensional data has become a crucial task in recent years, given the growing amount of data from various sources. To this end, dimensionality reduction algorithms are used to reduce the number of dimensions so that the data can be visualized on a screen. However, these algorithms may fail to represent high-dimensional data faithfully in lower dimensions, which leads to erroneous visualizations. In this work, we propose an error detection algorithm for dimensionality reduction algorithms, based on recently developed error prediction algorithms for medical image registration. The proposed algorithm matches the neighborhoods of the high- and low-dimensional data using different similarity measures and predicts the errors using a random forest classifier. The results on three datasets show that the proposed algorithm can detect errors with an accuracy of up to 86% and an area under the curve (AUC) score of 0.81.
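The neighborhood-matching idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the similarity measures, so the Jaccard overlap of k-nearest-neighbor sets, the value of k, the function names, and the toy data are all assumptions made here for illustration. In a supervised setting, per-sample scores like these could serve as input features to a random forest classifier; that step is not shown.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (Euclidean), excluding i."""
    dists = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return {j for _, j in dists[:k]}

def neighborhood_overlap(high, low, k):
    """Per-sample Jaccard similarity between k-NN sets in the two spaces.

    Hypothetical feature: 1.0 means the neighborhood is fully preserved
    by the embedding, 0.0 means no neighbor is shared.
    """
    scores = []
    for i in range(len(high)):
        nh = knn_indices(high, i, k)  # neighbors in the original space
        nl = knn_indices(low, i, k)   # neighbors in the embedding
        scores.append(len(nh & nl) / len(nh | nl))
    return scores

# Toy data (assumed): two clusters in 3-D and a hand-made 1-D "embedding"
# in which the last sample has been placed away from its own cluster.
high = [(0, 0, 0), (1, 0, 0), (0, 1, 0),
        (10, 10, 10), (11, 10, 10), (10, 11, 10)]
low = [(0.0,), (1.0,), (3.0,), (50.0,), (51.0,), (20.0,)]

scores = neighborhood_overlap(high, low, k=2)
print(scores)  # the misplaced last sample gets overlap 0.0
```

A low overlap score flags a sample whose neighbors changed during the reduction, which is the kind of signal a classifier could learn to map to an error label.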
Keywords
Dimensionality reduction, error estimation, t-SNE, random forests, matching
First Page
2883
Last Page
2894
Recommended Citation
SAYGILI, GÖRKEM (2020) "A supervised learning approach for detecting erroneous samples in embeddings," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 28: No. 5, Article 33.
https://doi.org/10.3906/elk-1909-162
Available at:
https://journals.tubitak.gov.tr/elektrik/vol28/iss5/33
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons