Academic Paper

Random-Matrix Regularized Discriminant Analysis of High-Dimensional Dataset
Document Type
Conference
Source
2018 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), pp. 204-207, Oct. 2018
Subject
Computing and Processing
Covariance matrices
Eigenvalues and eigenfunctions
Linear discriminant analysis
Estimation
Prediction algorithms
Training data
Correlation
random matrix theory, classification, covariance matrix
Language
English
ISSN
2473-3636
Abstract
Linear discriminant analysis (LDA) is one of the most popular parametric classification methods in machine learning and data mining tasks. Although it performs well in many applications, LDA is impractical for high-dimensional data sets. A primary reason is that the sample covariance matrix is no longer a good estimator of the actual covariance matrix when the feature dimension p is close to, or even larger than, the sample size n. Here we propose to regularize the LDA classifier by employing a consistent estimator of high-dimensional covariance matrices. Using theoretical tools from random matrix theory, the high-dimensional covariance matrices are estimated via linear or nonlinear shrinkage, depending on the relationship between the dimension p and the sample size n. Numerical simulations demonstrate that the regularized discriminant analysis using random matrix theory yields higher accuracy than existing competitors on a wide variety of synthetic and real data sets.
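To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of the linear-shrinkage branch of this approach: the pooled sample covariance is shrunk toward a scaled identity in a Ledoit-Wolf style, and the resulting estimate replaces the sample covariance inside a plain LDA decision rule. All function names and the simplified shrinkage-intensity formula are illustrative assumptions; the paper's nonlinear-shrinkage branch for very large p is not shown.

```python
import numpy as np

def linear_shrinkage_cov(Xc):
    """Linear shrinkage of the sample covariance toward a scaled identity
    (simplified Ledoit-Wolf-style sketch). Xc: centered data, shape (n, p)."""
    n, p = Xc.shape
    S = Xc.T @ Xc / n                        # sample covariance
    mu = np.trace(S) / p                     # scale of the identity target
    d2 = np.sum((S - mu * np.eye(p)) ** 2)   # dispersion of S around target
    # estimated estimation error of S (plug-in, capped at d2)
    b2 = sum(np.sum((np.outer(x, x) - S) ** 2) for x in Xc) / n ** 2
    b2 = min(b2, d2)
    rho = 1.0 if d2 == 0 else b2 / d2        # shrinkage intensity in [0, 1]
    return (1 - rho) * S + rho * mu * np.eye(p)

def rda_fit_predict(X_train, y_train, X_test):
    """LDA rule with the pooled covariance replaced by its shrinkage estimate."""
    classes = np.unique(y_train)
    # pooled within-class centering before estimating the covariance
    Xc = np.vstack([X_train[y_train == c] - X_train[y_train == c].mean(axis=0)
                    for c in classes])
    Sigma_inv = np.linalg.inv(linear_shrinkage_cov(Xc))
    scores = []
    for c in classes:
        mu_c = X_train[y_train == c].mean(axis=0)
        pi_c = np.mean(y_train == c)
        # linear discriminant score: x^T S^-1 mu_c - 0.5 mu_c^T S^-1 mu_c + log pi_c
        w = Sigma_inv @ mu_c
        scores.append(X_test @ w - 0.5 * mu_c @ w + np.log(pi_c))
    return classes[np.argmax(np.column_stack(scores), axis=1)]
```

In the p > n regime exercised by the paper, the plain sample covariance is singular and the classical LDA rule above could not even be evaluated; the shrinkage step keeps the estimate invertible while pulling noisy eigenvalues toward their common mean.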