Introduction to portfolio optimization

Objective function

In mean-variance portfolio optimization we maximize, subject to various constraints on the portfolio weights x, the objective function

(1)    L = \rho x' \mu - \frac{1}{2}x' V x,

where \rho (a scalar) is the risk tolerance, \mu (an n \times 1 vector) is the expected return, V is the n \times n covariance matrix of asset returns, x (n \times 1) is the vector of portfolio weights, and n is the number of assets.

Because we like return (the first term in Eq (1)) but don’t like variance (second term), we want to find the portfolio x that maximizes Eq (1). To do so, the derivative of L is more useful than L itself:

(2)    \frac{\partial L}{\partial x} = \rho \mu - \frac{1}{2}(V + V')x
                                     = \rho \mu - Vx,

where we have made use of the fact that \frac{\partial}{\partial x}x'Ax = (A + A')x and that the covariance matrix V is symmetric (i.e. V = V').

Here’s some more matrix algebra that will come in handy:

  • (AB)^{-1} = B^{-1}A^{-1}
  • (A+B)' = A' + B'
  • (AB)' = B' A'
  • AI=IA
  • (A')^{-1} = (A^{-1})'
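
These identities are easy to sanity-check numerically. A quick numpy sketch (the random matrices are shifted toward the identity so that the inverses exist):

>>> import numpy as np
>>> a = np.random.rand(4, 4) + 4 * np.eye(4)  # diagonally dominant, so invertible
>>> b = np.random.rand(4, 4) + 4 * np.eye(4)
>>> np.allclose(np.linalg.inv(np.dot(a, b)), np.dot(np.linalg.inv(b), np.linalg.inv(a)))
True
>>> np.allclose(np.dot(a, b).T, np.dot(b.T, a.T))
True
>>> np.allclose(np.linalg.inv(a.T), np.linalg.inv(a).T)
True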

Minimum variance

Let’s start by finding the minimum variance portfolio subject to the constraint that the weights x sum to k. (If we did not add a sum-to-k constraint, the minimum variance portfolio would place zero weight in each asset.)

Because the sum to k constraint is a linear equality constraint (x'\mathbf{1} = k), we can use the Lagrange multiplier method to add the constraint to Eq (1):

L = \rho x' \mu - \frac{1}{2} x'Vx + \lambda (x'\mathbf{1} - k)

where \mathbf{1} is an n \times 1 vector of ones.

Setting \rho to zero, since we have no tolerance for risk, gives:

L = - \frac{1}{2} x'Vx + \lambda (x'\mathbf{1} - k)

Setting the first derivative of the Lagrangian (with respect to the portfolio weights) to zero and solving for the minimum variance portfolio gives

(3)    \frac{\partial L}{\partial x} = -\frac{1}{2}(V + V')x + \lambda \mathbf{1} = 0
       Vx = \lambda \mathbf{1}
       V^{-1}Vx = \lambda V^{-1} \mathbf{1}
       x = \lambda V^{-1} \mathbf{1}

But what is the value of the Lagrange multiplier \lambda? From the sum-to-k constraint, x also satisfies

(4)    x' \mathbf{1} = k

Substituting Eq (3) into Eq (4) gives

(5)    \left(\lambda V^{-1} \mathbf{1}\right)' \mathbf{1} = k
       \lambda (V^{-1} \mathbf{1})' \mathbf{1} = k
       \lambda \mathbf{1}' V^{-1} \mathbf{1} = k
       \lambda = \frac{k}{\mathbf{1}' V^{-1} \mathbf{1}}

Finally substituting Eq (5) into Eq (3) gives the minimum variance portfolio that sums to k:

x = \left( \frac{k}{\mathbf{1}' V^{-1} \mathbf{1}} \right) V^{-1} \mathbf{1}
  = k \frac{V^{-1} \mathbf{1}}{\mathbf{1}' V^{-1} \mathbf{1}}
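
As a sketch of how the closed form maps to code (the variable names are illustrative, and np.linalg.solve is used in place of an explicit inverse):

>>> import numpy as np
>>> n = 4
>>> f = np.random.rand(n, 2)
>>> V = np.dot(f, f.T) + np.diag(np.random.rand(n))  # a positive definite covariance matrix
>>> k = 1.0
>>> w = np.linalg.solve(V, np.ones(n))  # w = V^{-1} 1
>>> x = k * w / w.sum()                 # 1' V^{-1} 1 is just w.sum()
>>> np.allclose(x.sum(), k)             # weights sum to k
True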

Mean–variance

Next, let’s find the optimal mean-variance portfolio without a constraint on the sum of the portfolio weights.

We wish to find the x that maximizes Eq (1). To do so we take the derivative of L (already done in Eq (2)), set it to zero, and solve for x to get the optimal portfolio with no constraint on the sum of the weights:

Vx = \rho \mu
x = \rho V^{-1} \mu.
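
Continuing the numpy session above, the unconstrained solution is a single line (rho and mu are illustrative names):

>>> rho = 0.5
>>> mu = np.random.rand(n)            # expected returns (illustrative)
>>> x = rho * np.linalg.solve(V, mu)  # x = rho V^{-1} mu, no explicit inverse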

Let’s now add a sum-to-k constraint.

Given that we are maximizing a concave function subject to a linear equality constraint (sum-to-k), the method of Lagrange multipliers is guaranteed to produce the optimal solution. The Lagrangian is

L = \rho x' \mu - \frac{1}{2} x'Vx + \lambda (x'\mathbf{1} - k)

Setting the first derivative equal to zero and simplifying gives

(6)    \frac{\partial L}{\partial x} = \rho \mu - Vx + \lambda \mathbf{1} = 0
       Vx = \rho \mu + \lambda \mathbf{1}
       V^{-1}Vx = \rho V^{-1} \mu + \lambda V^{-1} \mathbf{1}
       x = \rho V^{-1} \mu + \lambda V^{-1} \mathbf{1}

Substituting Eq (6) into Eq (4):

(7)    \left(\rho V^{-1} \mu + \lambda V^{-1} \mathbf{1}\right)' \mathbf{1} = k
       \rho \mu' V^{-1} \mathbf{1} + \lambda \mathbf{1}' V^{-1} \mathbf{1} = k
       \lambda \mathbf{1}' V^{-1} \mathbf{1} = k - \rho \mu' V^{-1} \mathbf{1}
       \lambda = \frac{k - \rho \mu' V^{-1} \mathbf{1}}{\mathbf{1}' V^{-1} \mathbf{1}}

Finally substituting Eq (7) into Eq (6) gives the portfolio that, given the sum-to-k constraint, makes the optimal trade-off between expected return and variance:

x = \rho V^{-1} \mu +
    \frac{k - \rho \mu' V^{-1} \mathbf{1}}
         {\mathbf{1}' V^{-1} \mathbf{1}} V^{-1} \mathbf{1}
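
Continuing the session, Eqs (6) and (7) translate directly to code (again a sketch, not meva's API):

>>> ones = np.ones(n)
>>> w1 = np.linalg.solve(V, ones)  # V^{-1} 1
>>> wm = np.linalg.solve(V, mu)    # V^{-1} mu
>>> lam = (k - rho * np.dot(mu, w1)) / np.dot(ones, w1)  # Eq (7)
>>> x = rho * wm + lam * w1        # Eq (6)
>>> np.allclose(x.sum(), k)        # the sum-to-k constraint holds
True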

General linear-equality constraints

In the preceding section we described mean-variance optimization under a linear equality constraint of x'\mathbf{1} = k. We can generalize that result so that it applies to multiple, simultaneous linear equality constraints of the form a'x = b.

We want to maximize the utility

\rho x' \mu - \frac{1}{2} x'Vx

subject to the K constraints a_k'x = b_k, k = 0, \ldots, K-1. The Lagrangian for this problem is

L = \rho x' \mu - \frac{1}{2} x'Vx + \sum_k \lambda_k (a_k'x - b_k).

Taking the partial of L with respect to x, we find that

\frac{\partial L}{\partial x} = \rho\mu - Vx + \sum_k \lambda_k a_k,

and setting this to zero yields

x = \rho V^{-1}\mu + \sum_k \lambda_k V^{-1} a_k.

Substituting this solution into each constraint yields, for each j = 0, \ldots, K-1,

\rho a_j' V^{-1}\mu + \sum_k \lambda_k a_j' V^{-1} a_k = b_j
\sum_k \lambda_k a_j' V^{-1} a_k = b_j - \rho a_j' V^{-1}\mu,

which can be expressed in matrix form as

\left[a_j' V^{-1} a_k\right] \left(\lambda_k\right)
= \left(b_j - \rho a_j' V^{-1} \mu\right),

where square brackets indicate a matrix and parentheses indicate a column vector, and where j indexes rows and k indexes columns.

By solving the matrix equation above, we can find the \lambda_k, and this in turn allows us to find the optimal x.
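
As a sketch, continuing the session: stack the a_k' as rows of a K \times n matrix, solve the K \times K system for the \lambda_k, and recover x (two illustrative constraints are used here):

>>> a = np.vstack([np.ones(n), np.random.rand(n)])  # K x n; rows are the a_k'
>>> rhs = np.array([1.0, 0.5])                      # the right-hand sides b_k
>>> W = np.linalg.solve(V, a.T)                     # columns are V^{-1} a_k
>>> M = np.dot(a, W)                                # the matrix [a_j' V^{-1} a_k]
>>> lam = np.linalg.solve(M, rhs - rho * np.dot(a, wm))  # the lambda_k
>>> x = rho * wm + np.dot(W, lam)
>>> np.allclose(np.dot(a, x), rhs)                  # both constraints hold
True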

Soft constraints

In this section we add soft linear equality constraints. Soft constraints are like equality constraints, but instead of insisting on equality at all costs, we allow a deviation from equality. A penalty proportional to the squared deviation from equality is subtracted from the objective function.

A generic quadratic penalty Z can be expressed as

Z = - \theta (x'q - q_{0})^{2},

where \theta is a scalar that sets the strength of the penalty, q is an n \times 1 vector such as a vector of stock betas, and q_{0} is the scalar target away from which values of x'q are penalized. Expanding the square gives

(8)    Z = -\theta \left( x'q q'x - 2 q_{0} x'q + q_{0}^{2} \right)
         = -\theta x'q q' x + 2 \theta q_{0} x' q - \theta q_{0}^{2}

Adding the quadratic penalty Z of Eq (8) to the objective function L in Eq (1) gives

L = \rho x' \mu - \frac{1}{2}x' V x - \theta x'q q' x + 2 \theta q_{0} x' q - \theta q_{0}^{2}
  = \rho x' \mu + 2 \theta q_{0} x' q - x' \left( \frac{1}{2} V + \theta q q' \right) x - \theta q_{0}^{2}
  = \rho x' \left( \mu + \frac{2 \theta q_{0}}{\rho} q \right) - \frac{1}{2} x' \left( V + 2 \theta q q'\right) x - \theta q_{0}^{2}
  = \rho x' \mu_{q} - \frac{1}{2} x' V_{q} x - \theta q_{0}^{2},

where

\mu_{q} = \mu + \frac{2 \theta q_{0}}{\rho} q
V_{q} = V + 2 \theta q q'.

So adding soft linear equality constraints to an algorithm that finds mean-variance optimal portfolios could not be easier: no modification to the optimizer is necessary. We need only adjust the expected return and the covariance matrix; the constant term -\theta q_{0}^{2} does not depend on x and so does not affect the optimal portfolio.
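
Continuing the session, a sketch of the adjustment (theta, q, and q0 are illustrative): build \mu_{q} and V_{q} and reuse the unconstrained solver. The larger \theta is, the closer x'q is pulled toward q_{0}:

>>> theta, q0 = 100.0, 1.0
>>> q = np.random.rand(n)                     # e.g. stock betas
>>> mu_q = mu + (2.0 * theta * q0 / rho) * q  # adjusted expected return
>>> V_q = V + 2.0 * theta * np.outer(q, q)    # adjusted covariance matrix
>>> x = rho * np.linalg.solve(V_q, mu_q)      # same solver as the unconstrained case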

Inverse of covariance matrix

The equations above take the inverse of the n \times n covariance matrix of asset returns, where n is the number of assets.

Of the two meva optimizers, aopt and nopt, only aopt takes an explicit inverse. In fact the bottleneck of the analytical optimizer aopt is typically in finding the inverse of the covariance matrix.

Here’s a trick that, for 500 assets and 10 risk factors, speeds up aopt by a factor of 7. The trick applies when the covariance matrix has the factor structure V = BB' + D, where B is the n \times k matrix of factor loadings and D is the diagonal matrix of individual variances. (The downside of the trick is that the individual variances, the diagonal of D, can no longer be zero.)

Lemma 0.28 of Edgar Brunner’s lecture notes (October 14, 2002) states: “Let E=E_{m \times n}, F=F_{n \times m}, C=C_{m \times m} and assume that the inverses C^{-1} and (C - EF)^{-1} exist. Then,

(9)    (C - EF)^{-1} = C^{-1} + C^{-1}E\left( I - FC^{-1}E\right)^{-1}FC^{-1}."

Making the following substitutions

C = D
E = -B
F = B',

Eq (9) becomes

(10)    (D + BB')^{-1} = D^{-1} - D^{-1}B\left( I + B'D^{-1}B\right)^{-1}B'D^{-1}.

Note that

\left(B'D^{-1}\right)' = D^{-1}B.

Finally, making the substitution Z = B'D^{-1} in Eq (10) gives

(11)    (D + BB')^{-1} = D^{-1} - Z'\left(I + ZB\right)^{-1}Z.
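
The timing example below uses meva.opt.util.fast_inverse. As a rough sketch, an implementation of Eq (11) might look like the following (an illustration, not necessarily meva's actual code; it assumes the factor loadings b are n \times k and the individual variances d are all nonzero):

import numpy as np

def fast_inverse(b, d):
    # Sketch of Eq (11): inverse of cov = b b' + diag(d).
    # b is n x k (factor loadings); d has length n (individual variances, nonzero).
    k = b.shape[1]
    z = b.T / d                   # Z = B' D^{-1} (D is diagonal)
    m = np.eye(k) + np.dot(z, b)  # I + Z B, a small k x k matrix
    return np.diag(1.0 / d) - np.dot(z.T, np.linalg.solve(m, z))  # Eq (11)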

Let’s try an example. Create a covariance matrix:

>>> import numpy as np
>>> n = 500; k = 10
>>> b = np.random.rand(n, k)
>>> d = np.random.rand(n)
>>> cov = np.dot(b, b.T) + np.diag(d)

Take the inverse the brute-force way:

>>> timeit np.linalg.inv(cov)
10 loops, best of 3: 59.9 ms per loop

And now the fast way:

>>> from meva.opt.util import fast_inverse
>>> timeit fast_inverse(b, d)
100 loops, best of 3: 7.1 ms per loop

Numerical optimization

For some problems an analytical solution does not exist.

We’ve seen that linear equality constraints can be easily handled analytically. But what happens when you want the sum of the shorts to be -1 and the sum of the longs to be 1? In that case we do not know ahead of time which assets will be short and which will be long.

The numerical optimizer in meva is based on a heuristic gradient method. The heuristic part refers to dealing with the complications of a long-short portfolio with constraints.

See here for background on the gradient method of portfolio optimization.