
URL: https://towardsdatascience.com/predicting-prices-of-bitcoin-with-machine-learning-3e83bb4dd35f/




Predicting Prices of Bitcoin with Machine Learning

Using Time Series Models to Forecast Cryptocurrency Trends

9 min read

I Tried to Predict Bitcoin’s Prices with Machine Learning

Photo by André François McKenzie on Unsplash

UPDATE: click below to see the next article depicting the process of forecasting Bitcoin prices with Deep Learning

I Tried Deep Learning Models to Predict Bitcoin Prices

Predicting the future is no easy task. Many have tried and many have failed. But many of us want to know what will happen next and would go to great lengths to find out. Imagine the possibilities of knowing what will happen in the future! Imagine what you would have done back in 2012, when Bitcoin was less than $15, knowing that it would surpass $18,000! Many people may regret not buying Bitcoin back then, but how were they supposed to know in the first place? This is the dilemma we now face with regard to cryptocurrency. We do not want to miss out on the next jump in price, but we do not know when that will or will not happen. So how can we potentially solve this dilemma? Maybe machine learning can tell us the answer.

Machine learning models can likely give us the insight we need to learn about the future of cryptocurrency. They will not tell us the future, but they might tell us the general trend and the direction in which to expect prices to move. Let’s try to use these machine learning models to our advantage and predict the future of Bitcoin by coding them out in Python!



Modeling Time Series

The machine learning models we are going to implement are called Time Series models. These models will examine the past and look for patterns and trends to anticipate the future. Without these models, we would have to do all of those analyses ourselves and that would take just way too much time. Luckily, we can program these Time Series models in Python to do all of that work for us, which is what we will be doing today!

The Time Series models that we will be using today are SARIMA and an additive model implemented by Facebook Prophet. SARIMA (Seasonal ARIMA) extends the relatively basic ARIMA model with seasonal terms; we will code it out and explain its components when necessary. Facebook Prophet uses an additive model for forecasting time series data that is fast and tunable. After modeling, we will compare each model’s insights into Bitcoin’s future.

The steps to modeling SARIMA are as follows:

  1. Gather, explore, and visualize the data.
  2. Difference the data and check for stationarity.
  3. Plot the ACF and PACF for the differenced data.
  4. Start modeling by searching for the best parameters.
  5. Train and test the model with the optimized parameters.
  6. Forecast the future!

These descriptions are very brief and simplified, but we will soon go over each step in greater detail. The following code snippets are taken from the GitHub repository shared at the end.

Bitcoin Price Data

The first thing we have to do is retrieve the historical data of Bitcoin which can be downloaded as a convenient CSV file from Yahoo Finance. Once we have that, we can begin by formatting the CSV file as a Pandas DataFrame. Then, we use that same DataFrame for the rest of our plotting and calculations.
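The loading step might look like the following minimal sketch. It assumes the usual Yahoo Finance column layout (`Date`, `Open`, `High`, `Low`, `Close`, `Adj Close`, `Volume`); the helper name `load_prices`, the file name `BTC-USD.csv`, and the sample rows are placeholders for illustration, not real data or the code from the repository.

```python
import io

import pandas as pd


def load_prices(csv_source):
    """Read a Yahoo-Finance-style CSV and return closing prices indexed by date."""
    df = pd.read_csv(csv_source, parse_dates=["Date"], index_col="Date")
    # Keep only the closing price and drop days where it is missing.
    return df[["Close"]].dropna().sort_index()


# Made-up placeholder rows standing in for the downloaded file.
# With a real download this would be load_prices("BTC-USD.csv") (hypothetical file name).
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2020-01-01,100.0,110.0,95.0,105.0,105.0,1000\n"
    "2020-01-02,105.0,115.0,100.0,112.0,112.0,1200\n"
    "2020-01-03,112.0,118.0,108.0,,,\n"
)
btc = load_prices(sample_csv)
print(btc)
```

Indexing by date up front means later steps (slicing by period, differencing, forecasting) can rely on a proper `DatetimeIndex`.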

Another option would be to use a financial data API such as EOD Historical Data. It is free to sign up and you’ll have access to vast amounts of financial data. Disclosure: I earn a small commission from any purchases made through the link above.

Next, we plot our DataFrame to see Bitcoin’s price movement over the last two years. The last two years were selected because that is when Bitcoin, and cryptocurrency in general, became very popular, so they better represent current market trends.
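Restricting the data to the last two years can be sketched as below. The helper `last_n_years` is a hypothetical name, and the price series is synthetic (a straight line), used only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def last_n_years(prices, years=2):
    """Return the rows that fall within the final `years` years of the index."""
    cutoff = prices.index.max() - pd.DateOffset(years=years)
    return prices.loc[prices.index > cutoff]


# Synthetic stand-in for the price history (not real Bitcoin data).
dates = pd.date_range("2015-01-01", "2019-12-31", freq="D")
prices = pd.DataFrame({"Close": np.linspace(200.0, 7000.0, len(dates))}, index=dates)

recent = last_n_years(prices, years=2)
print(recent.index.min(), recent.index.max())
# recent["Close"].plot(title="BTC closing price") would draw the chart with matplotlib.
```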


Stationarity

Let’s prepare the data for modeling by making it stationary. We do this by differencing the data and then testing for stationarity with the Dickey-Fuller test. We are aiming for a p-value below the 5% significance level; the closer it is to zero, the stronger the evidence that the series is stationary. To get an even lower p-value, we’ll take the log of the prices and then difference the log, instead of just differencing the raw prices.

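You might be wondering why we care about stationarity. Simply put, a stationary series has statistical properties, such as its mean and variance, that do not change over time. Differencing removes the trend from the dataset, which would otherwise be extremely intrusive to our models. Basically, models like SARIMA fit and predict better when the series they work on is stationary.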

Our stationary differenced log of BTC.

ACF and PACF

Next, we’ll have to plot the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF). Since we are working with daily data, the ACF shows how strongly the current day correlates with each earlier day (each lag), including any influence carried through the days in between. The PACF shows how strongly an earlier day correlates directly with the current day once the influence of the days in between has been removed.

The ACF and PACF for the Log of BTC

SARIMA Modeling

By knowing the PACF and ACF, we now better understand our dataset and the parameters to potentially choose. Now, we can move on to modeling our data by using the SARIMA model.

Optimizing Parameters

In order to get the best performance out of the model, we must find the optimum parameters. We do this by trying many different combinations of the parameters and selecting the one with the lowest AIC (Akaike Information Criterion) score. Don’t worry, we wrote a function that will do this for us.

Depending on your computer, the process of finding the best parameters may take a while. Some of us, ourselves included, will have to settle for the best parameters found within a search range that our computer can handle in a reasonable time. Unfortunately, not all computers are equal: a more powerful machine can search a wider range of parameters and may therefore find a better-fitting model.

Fitting and Training

Now that we have our parameters, let’s go ahead and train and fit the model to Bitcoin’s prices.

To test the model’s performance even further, we can see how its predictions line up with the values that we already know by plotting them out.


The model tests okay because the actual values still remain within our confidence intervals (shaded in gray) and the prices are rising as forecasted. The rest of the training data seems to fit well within our intervals (green shade) and line up with the model’s predicted values.

Forecasting Future Prices

Now we can get to the part that we really want to know about – Predicting Bitcoin’s future prices! We do this by forecasting from the present day and seeing where it might go in the future.

General forecast of BTC

We probably need to take a closer look. . .


According to the model, it appears that Bitcoin will continue slightly upwards in the next month. However, do not take this as a fact. The shaded region shows where Bitcoin’s price may go in the next month, and it also shows that the price may go down. Still, the model seems to lean towards the price rising rather than declining.

SARIMA’s forecast should not be the only forecast to take into consideration. There are other time series models and procedures to consider and one of them was actually created by Facebook’s Data Science team!


Facebook Prophet

Using Facebook Prophet is considerably easier than modeling with SARIMA, thanks to FB Prophet’s simplicity and ease of use, as you’ll see.

The steps to using Facebook Prophet are:

  1. Format data for Prophet.
  2. Fit and train the model to the data.
  3. Create future dates to forecast.
  4. Forecast and visualize the future!

Here’s the code using the steps above for Facebook Prophet:
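The four steps can be sketched as below. This is a minimal illustration, not necessarily the code from the repository: it assumes the `prophet` package (published as `fbprophet` in older releases), the helper names `to_prophet_frame` and `prophet_forecast` are hypothetical, and the price series is synthetic.

```python
import numpy as np
import pandas as pd


def to_prophet_frame(prices):
    """Step 1: Prophet expects a DataFrame with a `ds` date column and a `y` value column."""
    return pd.DataFrame({"ds": prices.index, "y": prices.values})


def prophet_forecast(frame, days=30):
    """Steps 2-4: fit Prophet, create future dates, and forecast."""
    # Imported here so the formatting step above works even without Prophet installed.
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    model = Prophet()
    model.fit(frame)
    future = model.make_future_dataframe(periods=days)  # daily frequency by default
    forecast = model.predict(future)
    return model, forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Synthetic prices (a straight line, not real Bitcoin data).
dates = pd.date_range("2018-01-01", periods=365, freq="D")
prices = pd.Series(np.linspace(4000.0, 9000.0, 365), index=dates)

frame = to_prophet_frame(prices)
print(frame.head())

# With Prophet installed:
# model, forecast = prophet_forecast(frame, days=30)
# model.plot(forecast)  # observed points, forecast line, and uncertainty band
```

In the forecast, `yhat` is the predicted value and `yhat_lower` / `yhat_upper` bound the uncertainty interval.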

Facebook Prophet Breakdown

In the first step, we format our earlier data into two columns, one for the dates and one for the price (Prophet expects them to be named `ds` and `y`). Then, we can jump straight into modeling by fitting the model to the data! No need to tune parameters or check for stationarity!

After modeling, we can advance to forecasting by first creating the future dates we want Prophet to predict prices for. We can then plot the forecast, which also shows how the model stacks up against past values and where prices may go next.


Zoom in for a closer look at the future forecast. . .

FB Prophet zoomed in
  • Blue line = Forecasted values
  • Black dots = Observed (actual) values
  • Blue-shaded region = Uncertainty intervals

According to FB Prophet, Bitcoin will rise in the next month. But again, this is not a fact. Compared to SARIMA, FB Prophet has a clearer forecast and direction. FB Prophet has even more features and parameters to experiment with, but we did not go through all of them here. If you feel like you need to alter the model then click here for FB Prophet’s documentation.


Closing Thoughts

Now that we have two forecasts for the future of Bitcoin, feel free to make your own unique observations of both to determine the future of Bitcoin. Outside of SARIMA and FB Prophet, there are many more time series models to learn about and experiment with. Do not feel limited to only these two! We just did a brief overview of time series, modeling, and machine learning. There are many more topics to cover and research!

It’s close to impossible to predict the future of Bitcoin, but with machine learning, we can get a data-driven sense of where it might go, along with a range of uncertainty around that estimate. We wouldn’t suggest using these machine learning models to make all your investing decisions, but it’s great to see what might happen for the future of Bitcoin and cryptocurrency!



Resources:

marcosan93/BTC-Forecaster

I Tried Deep Learning Models to Predict Bitcoin Prices

Note from Towards Data Science’s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.


Written By

Marco Santos
