Covariance matrix estimation

Meva provides two functions for estimating the covariance matrix of log asset returns: cov_pca, based on principal component analysis (PCA), and cov_fa, based on factor analysis. Both functions return $B$ and $d$ such that the covariance matrix $V$ is given by

$V = \mbox{cov}(y) = BB' + \mbox{diag}(d),$

where $B$ is an $n\times k$ matrix mapping $k$ factors to $n$ assets, $d$ is…
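As a minimal sketch of how the returned pieces fit together, the following rebuilds $V$ from $B$ and $d$ with NumPy. The helper name `factor_cov` and the example numbers are hypothetical and not part of Meva; $d$ is assumed to be a length-$n$ vector of per-asset variances, consistent with the $d_j$ used in the cov_pca description.

```python
import numpy as np


def factor_cov(B, d):
    """Rebuild V = B B' + diag(d) from the factor-model pieces.

    Hypothetical helper for illustration; not a Meva function.
    B : (n, k) array mapping k factors to n assets.
    d : (n,) array of asset-specific variances (assumed layout).
    """
    B = np.asarray(B, dtype=float)
    d = np.asarray(d, dtype=float)
    return B @ B.T + np.diag(d)


# Made-up example: n = 3 assets, k = 1 factor.
B = np.array([[0.2], [0.1], [0.3]])
d = np.array([0.01, 0.02, 0.03])
V = factor_cov(B, d)
```

Storing $B$ and $d$ instead of $V$ takes $nk + n$ numbers rather than $n^2$, which matters when $n$ is much larger than $k$.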

cov_pca (PCA-based)

The cov_pca function uses principal component analysis to fit a factor-model decomposition of market variability. This model decomposes the market return $y$ into two parts:

$y = Bx + w.$

Here, $B$ is an $n\times k$ matrix mapping $k$ factors to $n$ assets, $x$ is a draw from a $k$-dimensional iid standard normal distribution, and $w$ is a draw from an independent normal distribution in which the variance of the $j$-th component is given by $d_j$.
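To make the model concrete, here is a rough sketch that simulates $y = Bx + w$ and then recovers a $B$ and $d$ from the top $k$ principal components of the sample covariance. This is one common way to do a PCA-based fit and is not necessarily what cov_pca does internally; the function name `pca_factor_fit`, the asset-by-time data layout, and the simulated parameters are all assumptions made for illustration. It also assumes complete data with no missing values.

```python
import numpy as np


def pca_factor_fit(Y, k):
    """Illustrative PCA-based factor fit (not Meva's implementation).

    Y : (n, T) array, one row per asset, one column per time period.
    Returns B (n, k) and d (n,) with diag(B B' + diag(d)) equal to the
    diagonal of the sample covariance.
    """
    Yc = Y - Y.mean(axis=1, keepdims=True)       # demean each asset
    S = Yc @ Yc.T / Yc.shape[1]                  # sample covariance
    evals, evecs = np.linalg.eigh(S)             # eigenvalues ascending
    top = np.argsort(evals)[::-1][:k]            # indices of k largest
    B = evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))
    d = np.diag(S) - np.sum(B**2, axis=1)        # leftover variance
    return B, d


# Simulate returns from the factor model with made-up parameters.
rng = np.random.default_rng(0)
n, k, T = 6, 2, 5000
B_true = rng.standard_normal((n, k))
d_true = rng.uniform(0.5, 1.0, size=n)
x = rng.standard_normal((k, T))                           # factor draws
w = rng.standard_normal((n, T)) * np.sqrt(d_true)[:, None]  # specific noise
Y = B_true @ x + w

B_hat, d_hat = pca_factor_fit(Y, k)
V_hat = B_hat @ B_hat.T + np.diag(d_hat)
```

Because $B$ is only identified up to a rotation of the factors, `B_hat` should not be compared with `B_true` entry by entry; comparing `V_hat` with the sample covariance is the more meaningful check.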

The cov_pca function handles missing data, coded as nan.

cov_fa (factor-analysis-based)

The cov_fa function fits the same factor model to the market returns. Instead of using principal component analysis to decompose the variance, it uses the EM (expectation-maximization) algorithm to iteratively find a maximum-likelihood fit.
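The textbook EM iteration for this model can be sketched as follows. This is a generic illustration rather than the cov_fa source: the function name `fa_em`, the random starting point, the fixed iteration count, and the asset-by-time data layout are assumptions, and for readability the sketch solves against the full $n\times n$ covariance at each step.

```python
import numpy as np


def fa_em(Y, k, n_iter=200, seed=0):
    """Illustrative EM fit of y = B x + w (not Meva's implementation).

    Y : (n, T) array, one row per asset. Returns B (n, k), d (n,), and
    the per-observation log-likelihood recorded before each update.
    """
    Yc = Y - Y.mean(axis=1, keepdims=True)
    n, T = Yc.shape
    S = Yc @ Yc.T / T                            # sample covariance
    rng = np.random.default_rng(seed)
    B = 0.1 * rng.standard_normal((n, k))        # arbitrary starting point
    d = np.diag(S).copy()
    loglik = []
    for _ in range(n_iter):
        V = B @ B.T + np.diag(d)
        _, logdet = np.linalg.slogdet(V)
        fit = np.trace(np.linalg.solve(V, S))
        loglik.append(-0.5 * (n * np.log(2 * np.pi) + logdet + fit))
        # E-step: posterior moments of the factors given the data.
        beta = np.linalg.solve(V, B).T           # B' V^{-1}, shape (k, n)
        Exx = np.eye(k) - beta @ B + beta @ S @ beta.T
        # M-step: update loadings, then specific variances.
        SbT = S @ beta.T                         # shape (n, k)
        B = SbT @ np.linalg.inv(Exx)
        d = np.diag(S) - np.sum(B * SbT, axis=1)
    return B, d, np.array(loglik)


# Simulated data with made-up parameters.
rng = np.random.default_rng(42)
n, k, T = 6, 2, 2000
B_true = rng.standard_normal((n, k))
d_true = rng.uniform(0.5, 1.0, size=n)
Y = B_true @ rng.standard_normal((k, T))
Y = Y + rng.standard_normal((n, T)) * np.sqrt(d_true)[:, None]

B_hat, d_hat, loglik = fa_em(Y, k)
```

A useful sanity check on any EM implementation is that the recorded log-likelihood never decreases from one iteration to the next.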

For background on the factor analysis model and its EM solution, see Andrew Ng’s freely available machine-learning notes.

The only difference between our algorithm and his is that we use the matrix identities

$\big(E - FH^{-1}G\big)^{-1} FH^{-1} = E^{-1}F \big(H - GE^{-1}F\big)^{-1},$

and

$\big(E - FH^{-1}G\big)^{-1} = E^{-1} + E^{-1}F\big(H - GE^{-1}F\big)^{-1}GE^{-1}$

to reduce $n\times n$ matrix inversions to $k \times k$ matrix inversions. These identities result from standard blockwise inversion techniques, described, for instance, on Wikipedia.
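One way to see the saving is to specialise the identities to the factor model by taking $E = \mbox{diag}(d)$, $F = B$, $H = I$, and $G = -B'$, so that $E - FH^{-1}G = BB' + \mbox{diag}(d) = V$. This substitution is our illustration and may differ from how Meva applies the identities internally; the function name `factor_cov_inverse` is hypothetical. The sketch below inverts $V$ using only a $k\times k$ solve and checks both identities numerically against a direct inversion.

```python
import numpy as np


def factor_cov_inverse(B, d):
    """Invert V = B B' + diag(d) using only a k x k linear solve.

    Hypothetical helper for illustration; not a Meva function.
    """
    Dinv_B = B / d[:, None]                      # diag(d)^{-1} B, (n, k)
    M = np.eye(B.shape[1]) + B.T @ Dinv_B        # I + B' D^{-1} B, (k, k)
    return np.diag(1.0 / d) - Dinv_B @ np.linalg.solve(M, Dinv_B.T)


rng = np.random.default_rng(1)
n, k = 8, 3
B = rng.standard_normal((n, k))
d = rng.uniform(0.5, 2.0, size=n)
V = B @ B.T + np.diag(d)

# Second identity: full inverse via the k x k system.
V_inv_fast = factor_cov_inverse(B, d)
V_inv_direct = np.linalg.inv(V)

# First identity: V^{-1} B = D^{-1} B (I + B' D^{-1} B)^{-1}.
Dinv_B = B / d[:, None]
lhs = np.linalg.solve(V, B)
rhs = Dinv_B @ np.linalg.inv(np.eye(k) + B.T @ Dinv_B)
```

Since $\mbox{diag}(d)$ is inverted elementwise, the only dense system that has to be solved is $k \times k$.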