Efficient Frontier in Python – Detailed Tutorial

Implementing Modern Portfolio Theory in Python

Rian Dolphin

Jul 18, 2021

Without Packages

The output of the code contained in this article

Introduction

Harry Markowitz introduced modern portfolio theory in his 1952 paper titled _Portfolio Selection_. He begins by describing portfolio selection as a two-step process: first, an investor must assess the future performance of the available assets (in terms of both risk and return); then, a decision can be made about how to construct the portfolio (i.e. how much money to allocate to each asset).

Markowitz focuses on the portfolio construction aspect and leaves the more speculative task of predicting future performance to the reader. In fact, throughout the paper, each asset is summarised by the mean and variance of its returns alone, which is a complete description only if returns follow a simple Gaussian (Normal) distribution. This simplification underpins much of modern portfolio theory but has drawn a great deal of criticism, as stock returns have been shown not to follow a normal distribution.

However, let’s put the critics to one side and write some Python code to plot the Markowitz bullet and better understand the mathematical theory that earned Markowitz his Nobel Prize.

The Data

Our data consists of daily returns for a universe of six stocks, as shown below. The data runs from the beginning of 2000 to late 2018.

Returns Data

From this data, we can immediately get the expected annualised returns, variances, and covariance matrix using two lines of code. Note that by using past data like this, we assume that the future will follow past trends. This is a highly contentious assumption in financial markets, but it will do for now.
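As a rough illustration of those two lines, here is a minimal sketch using numpy only. The simulated `daily_returns` array is a hypothetical stand-in for the real data set, and scaling by 252 trading days is an assumed annualisation convention (compounding the mean daily return is another common choice), so treat this as one possible approach rather than the exact code behind the figures in this article.

```python
import numpy as np

# Hypothetical stand-in for the article's data: 500 days x 6 assets of
# simulated daily returns (the real data set is not reproduced here).
rng = np.random.default_rng(0)
daily_returns = rng.normal(loc=0.0005, scale=0.01, size=(500, 6))

TRADING_DAYS = 252  # assumed number of trading days per year

# The "two lines": annualised expected returns and covariance matrix.
mus = daily_returns.mean(axis=0) * TRADING_DAYS
cov = np.cov(daily_returns, rowvar=False) * TRADING_DAYS

# The individual variances sit on the diagonal of the covariance matrix.
variances = np.diag(cov)
```

Here `rowvar=False` tells `np.cov` that each column, not each row, is one asset.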

I also made a YouTube tutorial, using the same code as in this article, if video is your preferred Medium of learning!

Getting The Portfolios – Simplistic Version

Without using any packages other than numpy, we can quickly and easily create a mean-variance plot using the following code. A mean-variance plot allows us to see the trade-off between risk and return for each portfolio.
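A minimal sketch of this random sampling approach might look as follows. The inputs `mus` and `cov` are hypothetical illustrative values for four assets, not the article's data, and the names `n_assets`, `n_portfolios` and `mean_variance_pairs` are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical annualised inputs for four assets (illustrative values only).
mus = np.array([0.08, 0.12, 0.10, 0.15])
vols = np.array([0.15, 0.25, 0.20, 0.30])
corr = np.full((4, 4), 0.3) + 0.7 * np.eye(4)  # 1 on the diagonal, 0.3 elsewhere
cov = np.outer(vols, vols) * corr

n_assets = 3         # number of assets held in each sampled portfolio
n_portfolios = 1000  # number of random portfolios to draw

mean_variance_pairs = []
for _ in range(n_portfolios):
    # Pick which assets go into this portfolio, without replacement.
    assets = rng.choice(len(mus), size=n_assets, replace=False)
    # Random positive weights, normalised so they sum to one.
    weights = rng.random(n_assets)
    weights /= weights.sum()
    # Expected return w^T mu and variance w^T Sigma w for the chosen assets.
    port_return = weights @ mus[assets]
    port_variance = weights @ cov[np.ix_(assets, assets)] @ weights
    mean_variance_pairs.append((port_return, port_variance))
```

Because the weights are positive and sum to one, this sketch only covers long-only, fully invested portfolios.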

Some things to note about this code:

We use the following formula for expected return

For more on this see here
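Written out in standard notation (the symbol names are an assumption here: $w_i$ is the weight of asset $i$, $E(R_i)$ its expected return, and $\mu$ the vector of expected returns), the expected portfolio return is simply the weighted average of the individual expected returns:

```latex
E(R_p) = \sum_{i=1}^{n} w_i \, E(R_i) = w^{\top} \mu
```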

And the following formula for portfolio variance (which includes covariances as well as individual variances)

For more on this see here
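In the same assumed notation, with $\sigma_i^2$ the variance of asset $i$, $\sigma_{ij}$ the covariance between assets $i$ and $j$, and $\Sigma$ the covariance matrix, the standard expression for portfolio variance is:

```latex
\sigma_p^2
  = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \sigma_{ij}
  = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i=1}^{n} \sum_{j \neq i} w_i w_j \sigma_{ij}
  = w^{\top} \Sigma \, w
```

The first sum in the middle expression collects the individual variances and the second collects the covariances between distinct assets.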

You can adjust the number of assets to be considered in each portfolio.

Using the list of mean-variance pairs we have generated, we can now plot these portfolios 🎉

[Image: mean-variance plot of the randomly sampled portfolios]

In this plot, each point represents a portfolio. Note that the majority of the portfolios shown are not on the efficient frontier. We address this in the next section of this post.

I am using the graphing library Plotly here; if you have not tried it, I strongly recommend giving it a chance! Plots are fully interactive in Jupyter Notebook, which is a great help for data exploration. See here for the interactive version of this plot.

More Complex Version

Previously we were randomly sampling portfolio weights; let’s try to be more efficient (pun intended). We can introduce the concept of domination so that we are not unnecessarily sampling ‘bad’ portfolios.

As a (somewhat) rational investor, if presented with two portfolios with equal return, we will choose the one with lower risk, and given two portfolios with equal risk we will choose the one with a higher return. Yes? 🤞 What does this mean in terms of the plot above? The top left is the best place to be, and any portfolio with another portfolio above it and to the left is dominated and would not be chosen by any rational investor.

So, let’s ensure our code does not include any portfolios dominated by a portfolio we have already sampled.
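One way to sketch this dominance check is shown below. The helper `is_dominated`, the illustrative inputs `mus` and `cov`, and the decision to sample all assets in every portfolio are assumptions made to keep the example short, not necessarily how the plots in this article were produced.

```python
import numpy as np

def is_dominated(candidate, accepted):
    """Return True if an accepted (return, variance) pair lies above and to the left."""
    c_ret, c_var = candidate
    return any(a_ret > c_ret and a_var < c_var for a_ret, a_var in accepted)

rng = np.random.default_rng(0)

# Hypothetical annualised inputs for four assets (illustrative values only).
mus = np.array([0.08, 0.12, 0.10, 0.15])
vols = np.array([0.15, 0.25, 0.20, 0.30])
cov = np.outer(vols, vols) * (np.full((4, 4), 0.3) + 0.7 * np.eye(4))

accepted = []          # (return, variance) pairs that passed the check
accepted_weights = []  # matching weights, kept for hover text later

for _ in range(2000):
    weights = rng.random(len(mus))
    weights /= weights.sum()
    candidate = (weights @ mus, weights @ cov @ weights)
    # Skip any portfolio that an already accepted portfolio dominates.
    if not is_dominated(candidate, accepted):
        accepted.append(candidate)
        accepted_weights.append(weights)
```

Note that the check only looks backwards: a portfolio accepted early can still be dominated by one sampled later.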

In this version, we also store the asset names and weights for each portfolio. Using the interactive feature of Plotly, we can hover over each point on our plot to see the assets and weights of that portfolio. You can see this statically in the screenshot below, but click here to see the interactive plot!

[Image: screenshot of the plot of sampled portfolios, with hover text showing assets and weights]

Using this sampling technique, we end up very quickly sampling portfolios along the efficient frontier. Some of the early sampled portfolios are clearly visible below the efficient frontier; however, this can be remedied very easily by only adding portfolios after a certain iteration (similar to the idea of a ‘burn-in period’ in MCMC).
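The burn-in idea could be sketched roughly as follows, assuming the same kind of dominance check as described above. The names `burn_in`, `seen` and `kept`, along with the input values, are hypothetical: every non-dominated sample is tracked for the dominance check, but only those accepted after the burn-in period are kept for plotting.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical annualised inputs for four assets (illustrative values only).
mus = np.array([0.08, 0.12, 0.10, 0.15])
vols = np.array([0.15, 0.25, 0.20, 0.30])
cov = np.outer(vols, vols) * (np.full((4, 4), 0.3) + 0.7 * np.eye(4))

burn_in = 500  # assumed number of initial iterations to leave off the plot
seen = []      # every accepted (return, variance) pair, used for dominance checks
kept = []      # only the pairs accepted once the burn-in period is over

for i in range(3000):
    weights = rng.random(len(mus))
    weights /= weights.sum()
    ret, var = weights @ mus, weights @ cov @ weights
    # Reject the sample if an earlier accepted portfolio dominates it.
    if any(r > ret and v < var for r, v in seen):
        continue
    seen.append((ret, var))
    if i >= burn_in:
        kept.append((ret, var))
```

Plotting `kept` instead of `seen` leaves out the early samples, which were accepted before the frontier had taken shape.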

To those who made it this far, thank you for reading! Please feel free to ask any questions in the comments below. The full code in Jupyter Notebook form can be viewed here. You can also download the data and notebook used from my GitHub.

URL: https://towardsdatascience.com/efficient-frontier-in-python-detailed-tutorial-84a304f03e79/