Inversion of Covariance Matrix for High Dimension Data

Samruam Chongcharoen

doi:10.3844/jmssp.2011.227.229

Research Article Open Access

Abstract

Problem statement: When testing hypotheses about the mean vector of independent and identically distributed multivariate normal random vectors with unknown covariance matrix, the sample size n may not exceed the dimension p (n ≤ p). This happens, for example, with data from DNA microarrays, where a large number of gene expression levels are measured on relatively few subjects. In that case the p×p sample covariance matrix S does not have an inverse, so any statistic that involves the inverse of S does not exist. Approach: In this study, we consider a modification of S of the form S + cI and seek the smallest real value c ≠ 0 for which (S + cI)^-1 exists. Results: The study shows that, as the dimension p tends to infinity and with the smallest change to S, (S + cI)^-1 exists when c = 1. Conclusion: In statistical analysis of high-dimensional data for which the inverse of the sample covariance matrix does not exist, one way to obtain an invertible matrix is to replace the sample covariance matrix S with S + cI, and we recommend choosing c = 1.
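
The following is a minimal numerical sketch of the situation described in the abstract, not the derivation used in the article. It assumes NumPy, simulated standard normal data, and hypothetical sizes n = 10 and p = 50 chosen only for illustration. It checks that S is rank deficient when n ≤ p and that S + cI with c = 1 can be inverted. Because S is positive semidefinite, the eigenvalues of S + cI are bounded below by c whenever c > 0.

```python
import numpy as np

# Hypothetical sizes for illustration only: fewer subjects than variables.
n, p = 10, 50

rng = np.random.default_rng(0)
X = rng.standard_normal((n, p))      # simulated data, rows are subjects

# p x p sample covariance matrix (columns are variables).
S = np.cov(X, rowvar=False)

# With n <= p the rank of S is at most n - 1, so S is singular.
rank_S = np.linalg.matrix_rank(S)

# Modified matrix S + cI with the value c = 1 recommended in the abstract.
c = 1.0
S_mod = S + c * np.eye(p)

# Smallest eigenvalue of the symmetric matrix S + cI.
eig_min = np.linalg.eigvalsh(S_mod).min()

# The inverse of the modified matrix can now be computed.
S_mod_inv = np.linalg.inv(S_mod)

print("rank of S:", rank_S, "out of", p)
print("smallest eigenvalue of S + cI:", eig_min)
```

In this sketch the inverse of S + cI could then be substituted wherever a test statistic calls for the inverse of S. How that substitution affects the distribution of the statistic is not addressed here.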

Journal of Mathematics and Statistics

Volume 7 No. 3, 2011, 227-229

DOI: https://doi.org/10.3844/jmssp.2011.227.229

Submitted On: 16 March 2011 Published On: 27 July 2011

How to Cite: Chongcharoen, S. (2011). Inversion of Covariance Matrix for High Dimension Data. Journal of Mathematics and Statistics, 7(3), 227-229. https://doi.org/10.3844/jmssp.2011.227.229

Copyright: © 2011 Samruam Chongcharoen. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

DNA microarrays
eigenvalue
positive semidefinite
positive definite
gene expression
covariance matrix
statistic value
real vector
real number
determinant
symmetric matrix
definite matrix