In simple words, "stock" is ownership of a small part of a company. The more stock you hold, the larger your ownership stake. Stock price prediction is challenging because prices are dynamic and volatile. Predicting a company's stock price with machine learning algorithms aims to forecast the future value of that stock.
We will predict Netflix stock prices in R, using the ARIMA model to forecast future prices from historical data.
For this project, we use Netflix's historical stock price data from 2002-01-01 to 2022-12-31. Of the dataset's 7 columns, we focus on NFLX.Close, the daily closing price, for stock price prediction.
You can download the dataset from here: NFLX.csv
We start by installing and loading the necessary libraries for time series analysis and forecasting.
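The original package list is not shown, so the following is a minimal sketch that assumes three packages: quantmod (for chartSeries()), forecast (for auto.arima()) and tseries. The CRAN mirror URL is also an assumption; any mirror works.

```r
# Assumed package set; the article's original list is not shown.
pkgs <- c("quantmod", "forecast", "tseries")

# Install only the packages that are missing, then attach all of them.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
invisible(lapply(pkgs, library, character.only = TRUE))
```

Checking for missing packages first avoids reinstalling on every run.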
We load the Netflix stock price data from a CSV file and print the first few rows with head(df) to verify the data structure.
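A minimal sketch of the loading step follows. So that it runs without the real file, it first writes a tiny stand-in CSV; the rows are made up, and every column name except NFLX.Close is an assumption about the file layout. With the real data, point read.csv() at your copy of NFLX.csv instead.

```r
# Stand-in for NFLX.csv: made-up rows, assumed column layout.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,NFLX.Open,NFLX.High,NFLX.Low,NFLX.Close,NFLX.Volume,NFLX.Adjusted",
  "2022-12-28,1.0,1.2,0.9,1.1,1000,1.1",
  "2022-12-29,1.1,1.3,1.0,1.2,1200,1.2",
  "2022-12-30,1.2,1.4,1.1,1.3,1100,1.3"
), csv_path)

# Read the file and convert the date column from text to Date.
df <- read.csv(csv_path, stringsAsFactors = FALSE)
df$Date <- as.Date(df$Date)

head(df)  # first rows, to verify the structure
```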
We will check the dataset's dimensions and identify any missing values to ensure the data is clean for modeling.
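The check can be sketched with base R alone. The data frame below is a toy stand-in with one deliberate gap, not the Netflix data.

```r
# Toy stand-in with one deliberately missing value.
df <- data.frame(
  Date = as.Date("2022-01-03") + 0:4,
  NFLX.Close = c(10, 11, NA, 12, 13)
)

dim(df)             # number of rows and columns
colSums(is.na(df))  # count of missing values per column
anyNA(df)           # TRUE if any value is missing
```

On a clean dataset, colSums(is.na(df)) returns zero for every column.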
The dataset has 5044 rows and 7 columns, with no missing values. Confirming this up front matters because incomplete data can degrade model accuracy.
We summarize the data to get an overview of the stock's behavior, such as its minimum, maximum and average values.
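As a small illustration, summary() on a numeric vector reports the minimum, quartiles, median, mean and maximum. The values below are toy numbers, not Netflix prices.

```r
# Toy closing prices with one large value to show how the mean and median diverge.
close_px <- c(10, 12, 11, 15, 40)

s <- summary(close_px)
s  # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
```

Calling summary(df) on the full data frame gives the same statistics for every numeric column.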
The summary provides an overview of the distribution of stock prices. The mean and median indicate the typical stock price, while the max and min give the extremes of the dataset. These values show that Netflix's stock price has fluctuated significantly over time.
We visualize the stock price data using the chartSeries() function from the quantmod package to observe the trends in stock prices.
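A minimal sketch of the charting call, assuming the closing prices are held in an xts object. The series here is simulated; the chart type, title and theme are illustrative choices, and TA = NULL is set because the simulated series has no volume column to draw.

```r
library(quantmod)  # also attaches xts and zoo

# Simulated closing prices as a stand-in for the real series.
set.seed(1)
dates <- seq(as.Date("2022-01-03"), by = "day", length.out = 60)
close_xts <- xts(100 + cumsum(rnorm(60)), order.by = dates)
colnames(close_xts) <- "NFLX.Close"

# Line chart of the close; TA = NULL suppresses the default volume panel.
chartSeries(close_xts, type = "line", TA = NULL,
            name = "Simulated close", theme = chartTheme("white"))
```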
The chartSeries() function automatically generates a suitable chart (candlestick or line chart) for the stock price, which visually displays trends, patterns and fluctuations in Netflix's stock price over time.
We check if the data is stationary using visualizations. Non-stationary data needs to be transformed before forecasting.
The histogram and density plot show that the closing prices are far from normally distributed and spread over a wide range, which is consistent with a trending, non-stationary series.
Note: A stationary time series has a constant mean, variance and autocorrelation over time. The non-stationary nature of the data indicates that it has trends and needs to be differenced or transformed before applying the ARIMA model.
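The visual check and the differencing step mentioned in the note can be sketched in base R. The series is a simulated random walk with drift, used as a stand-in for a trending price series; the variable names are illustrative.

```r
# Simulated trending series (random walk with drift), not NFLX data.
set.seed(42)
price <- 50 + cumsum(rnorm(500, mean = 0.2))

# Distribution of the price levels.
hist(price, breaks = 30, main = "Histogram of simulated prices", xlab = "Price")
plot(density(price), main = "Density of simulated prices")

# First differences remove the trend in the level.
changes <- diff(price)
plot(changes, type = "l", main = "First differences", ylab = "Change")
```

The differenced series fluctuates around a roughly constant level, which is the behavior ARIMA's d parameter is meant to achieve.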
We split the Close price (the target variable) into training and testing sets in an 80:20 ratio. The training set is used to fit the model, while the testing set validates the model's performance.
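A minimal sketch of the split on a simulated series. It assumes a chronological split (the first 80% of observations for training), which is the usual choice for time series because shuffling would leak future values into the training set.

```r
# Simulated closing prices as a stand-in for NFLX.Close.
set.seed(7)
close_px <- 100 + cumsum(rnorm(200))

# Chronological 80:20 split: the test set is strictly later in time.
n_train <- floor(0.8 * length(close_px))
train <- close_px[seq_len(n_train)]
test  <- close_px[(n_train + 1):length(close_px)]

length(train)
length(test)
```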
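We use the ARIMA model for forecasting. The auto.arima() function from the forecast package automatically selects an ARIMA model by searching over combinations of the orders (p, d, q) and keeping the one with the lowest information criterion (AICc by default).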
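A minimal sketch of the fitting step, run on a simulated random walk with drift rather than the Netflix training data. The selected orders depend on the data, so no particular result is implied.

```r
library(forecast)

# Simulated training series (random walk with drift), not NFLX data.
set.seed(123)
train <- ts(100 + cumsum(rnorm(300, mean = 0.1)))

# Automatic order selection; candidates are ranked by AICc by default.
fit <- auto.arima(train)

summary(fit)     # coefficients, information criteria, training-set errors
arimaorder(fit)  # the selected orders p, d, q
```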
We check the model's performance by comparing the forecast results on the training and test data.
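One way to make this comparison is the accuracy() function from the forecast package, which reports training-set and test-set errors side by side. The sketch below uses a simulated series and an illustrative 200/50 split.

```r
library(forecast)

# Simulated series split chronologically into training and test windows.
set.seed(123)
y <- ts(100 + cumsum(rnorm(250)))
train <- window(y, end = 200)
test  <- window(y, start = 201)

fit <- auto.arima(train)
fc  <- forecast(fit, h = length(test))  # forecast over the whole test window

# One row for the training set, one for the test set.
acc <- accuracy(fc, test)
acc[, c("ME", "RMSE", "MAE")]
```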
On the training set, the errors are small across all metrics (ME, RMSE, MAE), suggesting the model fits the training data well. The test set shows higher errors, indicating that the model generalizes poorly to unseen data (potential overfitting).
Finally, we predict the stock prices using our ARIMA model for the next 7 days, visualize the predicted stock prices and compare them with the actual test data to evaluate the model's performance.
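A minimal sketch of the 7-step forecast and the comparison plot, again on simulated data. The last 7 points are held out to play the role of the actual test values; the colors and title are illustrative.

```r
library(forecast)

# Simulated series; the last 7 points are held out as "actual" values.
set.seed(2022)
y <- ts(100 + cumsum(rnorm(207)))
train  <- window(y, end = 200)
actual <- window(y, start = 201)

fit <- auto.arima(train)
fc  <- forecast(fit, h = 7)  # point forecasts plus prediction intervals

# Plot the last 30 training points, the forecast, and the held-out values.
plot(fc, include = 30, main = "7-step forecast vs. actual (simulated data)")
lines(actual, col = "red")
legend("topleft", legend = c("Forecast", "Actual"),
       col = c("blue", "red"), lty = 1)
```

The shaded bands in the plot are the prediction intervals, which widen as the horizon grows.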
From our analysis, we concluded that predicting Netflix stock prices using the ARIMA model can provide reasonable forecasts. However, the model showed signs of overfitting, as it performed well on the training set but not on new, unseen data. This suggests that improvements such as parameter tuning or using more sophisticated models could help enhance the prediction accuracy for stock prices.