Deep learning is commonly used to analyze large datasets, but to understand its core concepts it's helpful to start with smaller, more manageable ones. One such dataset is the Wine Quality dataset, which includes information about the chemical properties of wines and their quality ratings.
In this article, we'll explore how deep learning can be applied to predict wine quality based on its chemical composition. Before loading the data, it is important to understand its features. The dataset consists of 12 variables. Here are a few of them:
Fixed Acidity: This refers to the non-volatile acids in the wine, which contribute to the wine's tartness.
Volatile Acidity: This refers to acetic acid content which can contribute to a vinegar-like taste in wine.
Citric Acid: Citric acid is one of the fixed acids in wine.
Residual Sugar: This is the sugar that remains after fermentation stops.
Chlorides: Chlorides can contribute to saltiness in wine.
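Free Sulfur Dioxide: This is the portion of sulfur dioxide that is not bound to other compounds in the wine; it helps protect the wine from oxidation and microbial spoilage.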
Total Sulfur Dioxide: This is the sum of bound and free sulfur dioxide.
We will load the dataset from the provided URL and then preprocess it so that the model can use the cleaned data for training and prediction.
type: a column added to distinguish red from white wine (1 for red wine, 0 for white wine).
pd.concat(): concatenates the two datasets (red and white) into a single DataFrame named wine.
dropna(): drops any rows with missing values so that the model is trained on clean data.
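The three steps above can be sketched as follows. This is a minimal illustration rather than the exact loading code: the two in-memory CSV strings are hypothetical stand-ins for the red and white files (their values are made up and only a few columns are shown), and the semicolon separator is an assumption about the file format. In practice each `io.StringIO(...)` object would be replaced by the dataset URL or a local path.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two CSV files (illustrative values only).
# The ";" separator is an assumption about how the files are delimited.
RED_CSV = io.StringIO(
    "fixed acidity;volatile acidity;alcohol;quality\n"
    "7.4;0.70;9.4;5\n"
    "7.8;0.88;9.8;5\n"
    "11.2;;9.8;6\n"  # this row has a missing value on purpose
)
WHITE_CSV = io.StringIO(
    "fixed acidity;volatile acidity;alcohol;quality\n"
    "7.0;0.27;8.8;6\n"
    "6.3;0.30;9.5;6\n"
)

red = pd.read_csv(RED_CSV, sep=";")
white = pd.read_csv(WHITE_CSV, sep=";")

# Add the "type" column: 1 for red wine, 0 for white wine
red["type"] = 1
white["type"] = 0

# Stack both frames into a single DataFrame, then drop incomplete rows
wine = pd.concat([red, white], ignore_index=True)
wine = wine.dropna()

print(wine)
```

Because `dropna()` removes the red row with the missing value, the resulting frame keeps only complete rows from both wine types.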
We will use matplotlib to visualize the distribution of alcohol content for red and white wines. We will draw one histogram per wine type (red and white), each with 10 bins.
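A minimal sketch of such a plot is shown below. The small `wine` frame is filled with made-up alcohol values purely for illustration, and the colors, figure size, and the `bin_counts` dictionary are arbitrary choices that do not come from the original article.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Made-up alcohol values, for illustration only
wine = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 8.8, 9.5, 10.1, 11.0, 12.8],
    "type": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # 1 = red, 0 = white
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
bin_counts = {}  # hypothetical helper: keeps the per-bin counts for inspection

for ax, (label, code, color) in zip(
    axes, [("Red wine", 1, "red"), ("White wine", 0, "gold")]
):
    values = wine.loc[wine["type"] == code, "alcohol"]
    counts, edges, _ = ax.hist(values, bins=10, color=color, alpha=0.6)
    bin_counts[label] = counts
    ax.set_title(label)
    ax.set_xlabel("Alcohol (% by volume)")
    ax.set_ylabel("Frequency")

fig.tight_layout()
```

In a notebook or an interactive session, the backend line can be dropped and `plt.show()` called at the end to display the figure.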
We will split our dataset into training and testing sets.
X contains every column except type; y holds the target variable, type.
train_test_split(): splits the dataset into training (66%) and testing (34%) sets.
random_state=45: ensures that the split is reproducible.
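The split described above might look like the sketch below. Since the real data is not loaded here, `wine` is a synthetic stand-in with 12 placeholder feature columns (the `feature_i` names are hypothetical) plus the `type` column. The `test_size=0.34` and `random_state=45` values are the ones described in the text.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the combined wine frame: 12 features plus "type"
rng = np.random.default_rng(0)
wine = pd.DataFrame(
    rng.random((100, 12)),
    columns=[f"feature_{i}" for i in range(12)],  # hypothetical column names
)
wine["type"] = rng.integers(0, 2, size=100)

# Features: every column except the target; target: the "type" column
X = wine.drop(columns="type")
y = wine["type"]

# Hold out 34% of the rows for testing; the fixed seed makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.34, random_state=45
)

print(X_train.shape, X_test.shape)
```

Running the split twice with the same `random_state` yields the same rows in each set, which makes later results comparable between runs.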
5. Creating Neural Network Model
Sequential() creates a neural network made up of 3 layers stacked in order:
Input layer: Dense(12) with 12 neurons and a ReLU activation function; input_dim=12 matches the number of features (columns) in the input data.
Hidden layer: Dense(9) with 9 neurons and ReLU activation.
Output layer: Dense(1) with a single neuron and a sigmoid activation function since this is a binary classification problem (predicting red or white wine).
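The architecture above can be sketched with the Keras API bundled in TensorFlow. One deliberate difference: instead of passing `input_dim=12` to the first `Dense` layer, this sketch declares the input with `keras.Input(shape=(12,))`, which expresses the same 12-feature input and is the form recommended by recent Keras versions. The import path is also an assumption about the installed setup.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(12,)),                     # 12 input features per sample
    keras.layers.Dense(12, activation="relu"),    # first dense layer, 12 neurons
    keras.layers.Dense(9, activation="relu"),     # hidden layer, 9 neurons
    keras.layers.Dense(1, activation="sigmoid"),  # probability of class 1 (red wine)
])

# Print the layer-by-layer structure and parameter counts
model.summary()
```

The sigmoid output is a single value between 0 and 1, which is why one output neuron is enough for a two-class problem.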
We make predictions using the trained model on the test data (X_test) and get the predicted probabilities for each wine sample. We then convert these probabilities into binary labels (1 for Red wine, 0 for White wine) and display the wine type prediction for the first 12 samples.
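The prediction step can be sketched end to end as follows. Several parts are assumptions that the text does not specify: the data is synthetic, the training configuration (Adam optimizer, binary cross-entropy loss, 3 epochs, batch size 32) is chosen only so that the example runs quickly, and the 0.5 cutoff for turning probabilities into labels is a common default rather than a confirmed setting.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in data with 12 features (not the real wine data)
rng = np.random.default_rng(0)
X_train = rng.random((200, 12)).astype("float32")
y_train = (X_train[:, 0] > 0.5).astype("float32")  # toy labels for illustration
X_test = rng.random((50, 12)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(12,)),
    keras.layers.Dense(12, activation="relu"),
    keras.layers.Dense(9, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Assumed training configuration; the article's actual settings may differ
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=3, batch_size=32, verbose=0)

# Predicted probability of class 1 (red wine) for every test sample
probabilities = model.predict(X_test, verbose=0).ravel()

# Assumed cutoff of 0.5: at or above means red (1), below means white (0)
labels = (probabilities >= 0.5).astype(int)

# Show the predicted wine type for the first 12 samples
for prob, label in zip(probabilities[:12], labels[:12]):
    wine_type = "Red wine" if label == 1 else "White wine"
    print(f"{prob:.3f} -> {wine_type}")
```

With real data, `X_train`, `y_train`, and `X_test` would come from the earlier train/test split instead of the random arrays used here.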