Getting a grip on the covariance determinant

On covariance and overall variability

Neuroimaging data are multivariate by nature, and even though many of the statistics we use summarize the data (mass-univariate approaches obviously, but even multivariate approaches rarely take in the whole data space), it is useful to have a grip on the relationships between measurements.

One way to investigate the relationships between measurements is the covariance. The covariance is useful because it measures the linear relationship between variables, and most of the statistics we use are linear. I am limiting myself here to the sample covariance, that is, what we compute from data. The goal is purely educational: I myself sometimes struggled with some of these concepts, so I thought I'd lay down the basics needed to understand them.

Covariance between variables

The estimated covariance is simply the average (adjusted by N-1) of the product between centered values. Written in matrix form, this is cov = (A-mean(A))'*(B-mean(B))/(N-1). Because we take the product between values, the sign reflects the direction of the linear relationship. The magnitude of the covariance is much harder to interpret (unless normalized by the product of the standard deviations, which gives the correlation), except when the covariance equals 0, which implies that the variables are uncorrelated, i.e. linearly unrelated (for normally distributed data this also implies independence).
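As a minimal sketch in MATLAB/Octave notation (the variable names and simulated numbers below are purely illustrative), the centered-product formula can be checked against the built-in cov function, and normalizing gives the correlation:

N = 100;
A = randn(N,1);
B = 0.5*A + randn(N,1);                  % B is linearly related to A
cAB = (A-mean(A))'*(B-mean(B))/(N-1);    % formula from the text
S   = cov([A B]);                        % built-in 2x2 variance-covariance matrix
disp([cAB S(1,2)])                       % the two estimates match
r   = S(1,2)/sqrt(S(1,1)*S(2,2));        % normalizing gives the correlation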

Most often, the relationship between variables is presented via a variance-covariance matrix S, with the variances located on the main diagonal and the covariances off the diagonal. For p variables, the variance-covariance matrix is a square matrix of dimension p*p.
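A small illustration (again MATLAB/Octave, simulated data): for p = 3 variables, cov returns the 3*3 variance-covariance matrix, and its diagonal holds the individual variances.

Y = randn(100,3);        % 100 observations of 3 variables
S = cov(Y);              % 3x3 variance-covariance matrix
diag(S)'                 % variances of each variable
var(Y)                   % same values, computed column by column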


Matrix determinant

For covariance matrices, the determinant provides information about the relationship between variance and covariance values, and between the covariance values themselves. In most applications, and in particular when dealing with covariance matrices, we are working with real values and the determinant can be interpreted geometrically. If we apply a matrix as a transformation (i.e. transform a vector or the space), the sign of the determinant reflects whether the orientation is preserved or flipped, and its magnitude reflects the scaling of areas/volumes.
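To make the geometric reading concrete, here is a small sketch (MATLAB/Octave; the matrix T is an arbitrary example, not from the text): |det(T)| is the factor by which T scales areas, and a negative determinant signals a flip in orientation.

T = [2 1; 0 3];                       % a 2x2 linear transformation
u = T*[1;0];  v = T*[0;1];            % images of the unit square's edges
area = abs(u(1)*v(2) - u(2)*v(1));    % area of the transformed unit square
[det(T) area]                         % both equal 6
det([0 1; 1 0])                       % = -1: a reflection flips orientation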



Overall variability

The generalized sample variance is given by the determinant of S, det(S), while the total sample variance is given by the trace of S, tr(S) [1].
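A short sketch contrasting the two summaries (MATLAB/Octave, simulated data with an arbitrary mixing matrix):

Y = randn(200,3) * [1 0.5 0; 0 1 0; 0 0 2];   % induce correlation between variables
S = cov(Y);
det(S)          % generalized sample variance
trace(S)        % total sample variance
sum(diag(S))    % same as trace(S): the sum of the individual variances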

A p-dimensional hyperellipsoid (y-mean(y))'*inv(S)*(y-mean(y)) = a^2 is centered on mean(y), with axes proportional to the square roots of the eigenvalues of S. Since the eigenvectors of S give the main directions of the transformation and the eigenvalues their lengths, the axes of the hyperellipsoid are aligned with those directions. Importantly, if some eigenvalues are 0, the hyperellipsoid is flat in the corresponding directions: there is no variability in that subspace, the volume in p-space is 0, and det(S) = 0. A large value of det(S) indicates a large scatter in the data (large hyperellipsoid volume), while a small det(S) can indicate either a small scatter or multicollinearity.
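The multicollinearity case can be simulated directly (MATLAB/Octave sketch; variable names are illustrative): making one variable an exact linear combination of the others drives one eigenvalue, and therefore det(S), to (numerically) zero, even though the total variance stays large.

N  = 200;
x1 = randn(N,1);
x2 = randn(N,1);
x3 = x1 + x2;                % exactly a linear combination of x1 and x2
S  = cov([x1 x2 x3]);
eig(S)                       % one eigenvalue is (numerically) 0
det(S)                       % ~0: the data lie in a 2D subspace of 3D space
trace(S)                     % total variance is still large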


Conclusion

Variance-covariance matrices can be used to evaluate the linear relationships between variables, and the data scatter can be estimated using the determinant. Zero covariance implies that variables are uncorrelated, while a zero determinant implies redundancy in the data.

References

[1] Rencher, A. (2002). Methods of multivariate analysis, 2nd ed. Wiley series in probability and statistics.


 
