Feed Forward Neural Networks – How To Successfully Build Them in Python

A detailed graphical explanation of Neural Networks with a Python example using real-life data

Saul Dobilas

Dec 27, 2021

20 min read

Feed Forward Neural Networks. Image by author.

Intro

Neural Networks have been the central talking point over the last few years. While they may initially seem intimidating, I assure you that you do not need a Ph.D. to understand how they work.

In this article, I will take you through the main ideas behind basic Neural Networks, also known as Feed Forward NNs or Multilayer Perceptrons (MLPs), and show you how to build them in Python using Tensorflow and Keras libraries.

Feed Forward Neural Network’s place within the universe of Machine Learning
A visual explanation of how Feed Forward NNs work
Network structure and terminology
Parameters and activation functions
Loss functions, optimizers, and training
Python examples of how to build and train your own Feed Forward Neural Networks

Feed Forward Neural Network’s place within the universe of Machine Learning

Machine Learning is a vast and ever-expanding space, with new algorithms being developed all the time. I have attempted to bring structure to this world by categorizing some of the most commonly used algorithms in an interactive chart.

While this categorization is not perfect, it brings a general understanding of how different pieces fit together, and hopefully, it can also facilitate your data science learning journey.

I have placed Neural Networks in a distinct category recognizing their unique approach to Machine Learning. However, it is essential to remember that Neural Networks are most frequently employed to solve classification and regression problems using labeled training data. Hence, an alternative approach could be to put them under the Supervised branch of Machine Learning.

A visual explanation of how Feed Forward NNs work

Structure and terminology

First, let’s familiarize ourselves with the basic structure of a Neural Network.

Basic structure of a Feed Forward (FF) Neural Network. Image by author.

Input Layer – contains one or more input nodes. For example, suppose you want to predict whether it will rain tomorrow and base your decision on two variables, humidity and wind speed. In that case, your first input would be the value for humidity, and the second input would be the value for wind speed.
Hidden Layer – this layer houses hidden nodes, each containing an activation function (more on these later). Note that a Neural Network with multiple hidden layers is known as a Deep Neural Network.
Output Layer – contains one or more output nodes. Following the same weather prediction example above, you could choose to have only one output node generating a rain probability (where >0.5 means rain tomorrow, and ≤0.5 no rain tomorrow). Alternatively, you could have two output nodes, one for rain and another for no rain. Note, you can use a different activation function for output nodes vs. hidden nodes.
Connections – lines joining different nodes are known as connections. These contain kernels (weights) and biases, the parameters that get optimized during the training of a neural network.
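
To make the two output-layer options above more concrete, here is a minimal NumPy sketch. It assumes a sigmoid activation for the single-node version and a softmax activation for the two-node version (softmax is not covered in this article; it is my assumption of the usual choice for multiple output nodes). The raw score of 0.8 is a made-up number.

```python
import numpy as np

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(scores):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical raw score for "rain"; the "no rain" score is fixed at 0
rain_score = 0.8

one_node = sigmoid(rain_score)                     # single output node
two_nodes = softmax(np.array([rain_score, 0.0]))   # [P(rain), P(no rain)]

print(f"One output node : P(rain) = {one_node:.4f}")
print(f"Two output nodes: P(rain) = {two_nodes[0]:.4f}, P(no rain) = {two_nodes[1]:.4f}")
```

Under these assumptions, both setups give the same rain probability, which is why a single sigmoid output node is usually enough for a yes/no question.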

Parameters and activation functions

Let’s take a closer look at kernels (weights) and biases to understand what they do. For simplicity, we create a basic neural network with one input node, two hidden nodes, and one output node (1–2–1).

Detailed view of how weights and biases are applied within the Feed Forward (FF) Neural Network. Image by author.

Kernels (weights) – used to scale input and hidden node values. Each connection typically holds a different weight.
Biases – used to adjust scaled values before passing them through an activation function.
Activation functions – think of activation functions as standard curves (building blocks) used by the Neural Network to create a custom curve to fit the training data. Passing different input values through the network selects different sections of the standard curve, which are then assembled into a final custom-fit curve.

There are many activation functions to choose from, with Softplus, ReLU, and Sigmoid being the most commonly used. Here are the shapes and equations of six frequently used activation functions in Neural Networks:

Activation functions. Image by author.
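
For readers who prefer code to pictures, here is a small NumPy sketch of the three activation functions named above. The formulas for Softplus and Sigmoid match the ones quoted in the code comments later in this article; the input values are arbitrary.

```python
import numpy as np

def softplus(x):
    """softplus(x) = log(exp(x) + 1), a smooth version of ReLU."""
    return np.logaddexp(0.0, x)  # numerically stable log(exp(0) + exp(x))

def relu(x):
    """relu(x) = max(0, x)."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """sigmoid(x) = 1 / (1 + exp(-x)), always between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])  # arbitrary example inputs
for name, fn in [("softplus", softplus), ("relu", relu), ("sigmoid", sigmoid)]:
    print(f"{name:8s}", np.round(fn(x), 4))
```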

As we are now familiar with kernels (weights), biases, and activation functions, let’s use the same Neural Network to calculate the probability of rain tomorrow based on today’s humidity.

Note, I have already trained this Neural Network (see Python section below). Hence, we already know the values for kernels (weights) and biases. The illustration below shows, step by step, how the FF Neural Network takes an input value and produces the answer (output value).

Example calculation performed by Feed Forward (FF) Neural Network. Image by author.

As you can see, the above Neural Network tells us that a 50% humidity today implies a 33% probability of rain tomorrow.
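
The same kind of calculation can be sketched in a few lines of NumPy. Please note that the weights and biases below are placeholders I made up for illustration, not the trained values from the image, so the resulting probability will not match the 33% above. I am assuming Softplus in the hidden layer and Sigmoid in the output layer, as in the Python section later on.

```python
import numpy as np

def softplus(x):
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder parameters (NOT the trained values from the illustration)
W_hidden = np.array([[0.02, -0.03]])   # shape (1 input, 2 hidden nodes)
b_hidden = np.array([0.1, 0.2])        # one bias per hidden node
W_output = np.array([[1.2], [-0.8]])   # shape (2 hidden nodes, 1 output)
b_output = np.array([-0.5])            # one bias for the output node

def forward(x):
    """Run inputs of shape (n_samples, 1) through the 1-2-1 network."""
    hidden = softplus(x @ W_hidden + b_hidden)    # scale, shift, activate
    return sigmoid(hidden @ W_output + b_output)  # combine into a probability

humidity = np.array([[50.0]])  # today's humidity in percent
prob = forward(humidity)
print(f"Predicted probability of rain tomorrow: {prob[0, 0]:.4f}")
```

Every prediction is just this chain: multiply by weights, add biases, apply the activation function, and repeat for the next layer.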

Loss functions, optimizers, and training

Training Neural Networks involves a complicated process known as backpropagation. I will not go through a step-by-step explanation of how backpropagation works since it is a big enough topic deserving a separate article.

Instead, let me briefly introduce you to loss functions and optimizers and summarize what happens when we "train" a Neural Network.

Loss – represents the "size" of error between the true values/labels and the predicted values/labels. The goal of training a Neural Network is to minimize this loss. The smaller the loss, the closer the match between the true and the predicted data. There are many loss functions to choose from, with BinaryCrossentropy, CategoricalCrossentropy, and MeanSquaredError being the most common.
Optimizers – the algorithms that update kernels (weights) and biases using the gradients computed by backpropagation. The goal of an optimizer is to find the set of kernels (weights) and biases that minimizes the loss. Optimizers typically use a gradient descent approach, which allows them to iteratively move towards the "best" possible configuration of weights and biases. The most commonly used ones are SGD, Adam, and RMSProp.
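
To show how a loss function and an optimizer work together, here is a minimal sketch using plain gradient descent on a model with a single weight and a single bias (effectively one Sigmoid node). The dataset, learning rate, and number of steps are all made up for illustration, and real optimizers such as Adam add refinements that are not shown here.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny made-up dataset: one input feature and a 0/1 label
x = np.array([-2.0, -1.0, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

w, b = 0.0, 0.0        # start from an uninformative model
learning_rate = 0.5    # hypothetical step size
losses = []

for step in range(50):
    p = sigmoid(w * x + b)                    # forward pass
    losses.append(binary_crossentropy(y, p))  # measure the error
    grad_w = np.mean((p - y) * x)             # d(loss)/d(w) for sigmoid + cross-entropy
    grad_b = np.mean(p - y)                   # d(loss)/d(b)
    w -= learning_rate * grad_w               # step against the gradient
    b -= learning_rate * grad_b

print(f"loss before training: {losses[0]:.4f}, loss after training: {losses[-1]:.4f}")
```

The loop is the whole idea in miniature: predict, measure the loss, and nudge the parameters in the direction that reduces it.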

Training a Neural Network is basically fitting a custom curve through the training data until it can approximate it as well as possible. The graph below illustrates what a custom-fitted curve could look like in a specific scenario. This example contains a set of data that seem to flip between 0 and 1 as the value for input increases.

Fitting a curve to training data. Image by author.

In general, the wide selection of activation functions combined with the ability to add as many hidden nodes as we wish (provided we have sufficient computational power) means that Neural Networks can approximate a curve of almost any shape to fit the data.

However, having this extreme flexibility may sometimes lead to overfitting the data. Hence, we must always ensure that we validate the model on the test/validation set before using it to make predictions.
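
Here is a small sketch of why a held-out set matters. To keep it short, I am using polynomial curves as a stand-in for Neural Networks (a simplification on my part): a degree-9 polynomial plays the role of a very flexible model, and a degree-1 polynomial the role of a simple one. The data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: a noisy straight line
x = np.linspace(-1.0, 1.0, 30)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

# Hold out every third point for validation
val_mask = np.arange(x.size) % 3 == 0
x_train, y_train = x[~val_mask], y[~val_mask]
x_val, y_val = x[val_mask], y[val_mask]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

results = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # fit on training points only
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(f"degree {degree}: train MSE = {results[degree][0]:.4f}, "
          f"validation MSE = {results[degree][1]:.4f}")
```

The more flexible curve always fits the training points at least as well as the simple one, but that tells us nothing about unseen data. Comparing training error with validation error is what reveals whether the extra flexibility is helping or merely memorizing noise.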

Summarizing what we have learned

Feed Forward Neural Networks take one or multiple input values and apply transformations using kernels (weights) and biases before passing results through activation functions. In the end, we get an output (prediction), which is a result of this complex set of transformations optimized through training.

We train Neural Networks by fitting a custom curve through the training data, guided by loss minimization and achieved through parameter (kernels and biases) optimization.

Building and training Feed Forward Neural Networks in Python

Let’s now have some fun and build our own Neural Network. We will use historic Australian weather data to train a Neural Network that predicts whether it will rain tomorrow or not.

Setup

We’ll need the following data and libraries:

Australian weather data from Kaggle (license: Creative Commons, original source of the data: Commonwealth of Australia, Bureau of Meteorology).
Pandas and Numpy for data manipulation
Plotly for data visualizations
Tensorflow/Keras for Neural Networks
Scikit-learn library for splitting the data into train-test samples, and for some basic model evaluation

Let’s import all the libraries:

# Tensorflow / Keras
from tensorflow import keras # for building Neural Networks
print('Tensorflow/Keras: %s' % keras.__version__) # print version
from tensorflow.keras.models import Sequential # for creating a linear stack of layers for our Neural Network
from tensorflow.keras import Input # for instantiating a keras tensor
from tensorflow.keras.layers import Dense # for creating regular densely-connected NN layers.

# Data manipulation
import pandas as pd # for data manipulation
print('pandas: %s' % pd.__version__) # print version
import numpy as np # for data manipulation
print('numpy: %s' % np.__version__) # print version

# Sklearn
import sklearn # for model evaluation
print('sklearn: %s' % sklearn.__version__) # print version
from sklearn.model_selection import train_test_split # for splitting data into train and test samples
from sklearn.metrics import classification_report # for model evaluation metrics

# Visualization
import plotly 
import plotly.express as px
import plotly.graph_objects as go
print('plotly: %s' % plotly.__version__) # print version

The above code prints package versions used in this example:

Tensorflow/Keras: 2.7.0
pandas: 1.3.4
numpy: 1.21.4
sklearn: 1.0.1
plotly: 5.4.0

Next, we download and ingest Australian weather data (source: Kaggle). We also do some simple data manipulations and derive new variables for our models.

# Set Pandas options to display more columns
pd.options.display.max_columns=50

# Read in the weather data csv
df=pd.read_csv('weatherAUS.csv', encoding='utf-8')

# Drop records where target RainTomorrow=NaN
df=df[df['RainTomorrow'].notna()]

# For other columns with missing values, fill them in with column mean (numeric columns only)
df=df.fillna(df.mean(numeric_only=True))

# Create a flag for RainToday and RainTomorrow, note RainTomorrowFlag will be our target variable
df['RainTodayFlag']=df['RainToday'].apply(lambda x: 1 if x=='Yes' else 0)
df['RainTomorrowFlag']=df['RainTomorrow'].apply(lambda x: 1 if x=='Yes' else 0)

# Show a snapshot of data
df

And this is what the data looks like:

A snippet of Kaggle’s Australian weather data with some modifications. Image by author.
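
If you would like to see exactly what the preparation steps above do, here is a sketch that applies the same logic to a tiny frame. The four rows are made up by me; only the column names come from the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows mimicking a few columns of weatherAUS.csv
toy = pd.DataFrame({
    'Humidity3pm':  [40.0, np.nan, 80.0, 60.0],
    'RainToday':    ['No', np.nan, 'Yes', 'No'],
    'RainTomorrow': ['No', 'Yes', 'Yes', np.nan],
})

# Drop records where the target is missing (the last row here)
toy = toy[toy['RainTomorrow'].notna()].copy()

# Fill missing numeric values with the column mean, computed after the drop
toy = toy.fillna(toy.mean(numeric_only=True))

# Turn Yes/No text into 1/0 flags; anything other than 'Yes' becomes 0
toy['RainTodayFlag'] = (toy['RainToday'] == 'Yes').astype(int)
toy['RainTomorrowFlag'] = (toy['RainTomorrow'] == 'Yes').astype(int)

print(toy)
```

One thing worth noticing is that a missing RainToday value ends up as a 0 flag, the same as a 'No'. That is a simplification you may want to revisit for your own data.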

Neural Networks

Now we train and evaluate our Feed Forward (FF) Neural Network. I have extensively commented the code below to provide you with a clear understanding of what each part does. Hence, I will not repeat the same in the body of the article.

Using one input (Humidity3pm)

In short, we are using humidity at 3 pm today to predict whether it will rain tomorrow or not. Our Neural Network has a simple structure (1–2–1) analyzed earlier in this article: one input node, two hidden nodes, and one output node.

A couple of things to note:

The below code performs validation twice, once on a portion of X_train data (see validation_split in step 5) and another time on a test sample created in step 2. Of course, there is no need to do it twice, so feel free to use either method to validate your model.
The data is imbalanced (more sunny days than rainy days), so I’ve adjusted class_weight in step 5.

##### Step 1 - Select data for modeling
X=df[['Humidity3pm']]
y=df['RainTomorrowFlag'].values

##### Step 2 - Create training and testing samples
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

##### Step 3 - Specify the structure of a Neural Network
model = Sequential(name="Model-with-One-Input") # Model
model.add(Input(shape=(1,), name='Input-Layer')) # Input Layer - need to specify the shape of inputs
model.add(Dense(2, activation='softplus', name='Hidden-Layer')) # Hidden Layer, softplus(x) = log(exp(x) + 1)
model.add(Dense(1, activation='sigmoid', name='Output-Layer')) # Output Layer, sigmoid(x) = 1 / (1 + exp(-x))

##### Step 4 - Compile keras model
model.compile(optimizer='adam', # default='rmsprop', an algorithm to be used in backpropagation
 loss='binary_crossentropy', # Loss function to be optimized. A string (name of loss function), or a tf.keras.losses.Loss instance.
 metrics=['Accuracy', 'Precision', 'Recall'], # List of metrics to be evaluated by the model during training and testing. Each of this can be a string (name of a built-in function), function or a tf.keras.metrics.Metric instance. 
 loss_weights=None, # default=None, Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs.
 weighted_metrics=None, # default=None, List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.
 run_eagerly=None, # Defaults to False. If True, this Model's logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function.
 steps_per_execution=None # Defaults to 1. The number of batches to run during each tf.function call. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or small models with a large Python overhead.
 )

##### Step 5 - Fit keras model on the dataset
model.fit(X_train, # input data
 y_train, # target data
 batch_size=10, # Number of samples per gradient update. If unspecified, batch_size will default to 32.
 epochs=3, # default=1, Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided
 verbose='auto', # default='auto', ('auto', 0, 1, or 2). Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. 'auto' defaults to 1 for most cases, but 2 when used with ParameterServerStrategy.
 callbacks=None, # default=None, list of callbacks to apply during training. See tf.keras.callbacks
 validation_split=0.2, # default=0.0, Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. 
 #validation_data=(X_test, y_test), # default=None, Data on which to evaluate the loss and any model metrics at the end of each epoch. 
 shuffle=True, # default=True, Boolean (whether to shuffle the training data before each epoch) or str (for 'batch').
 class_weight={0 : 0.3, 1 : 0.7}, # default=None, Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
 sample_weight=None, # default=None, Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only).
 initial_epoch=0, # Integer, default=0, Epoch at which to start training (useful for resuming a previous training run).
 steps_per_epoch=None, # Integer or None, default=None, Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. 
 validation_steps=None, # Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch.
 validation_batch_size=None, # Integer or None, default=None, Number of samples per validation batch. If unspecified, will default to batch_size.
 validation_freq=3, # default=1, Only relevant if validation data is provided. If an integer, specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs.
 max_queue_size=10, # default=10, Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.
 workers=1, # default=1, Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1.
 use_multiprocessing=False, # default=False, Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. 
 )

##### Step 6 - Use model to make predictions
# Predict class labels on training data
pred_labels_tr = (model.predict(X_train) > 0.5).astype(int)
# Predict class labels on test data
pred_labels_te = (model.predict(X_test) > 0.5).astype(int)

##### Step 7 - Model Performance Summary
print("")
print('-------------------- Model Summary --------------------')
model.summary() # print model summary
print("")
print('-------------------- Weights and Biases --------------------')
for layer in model.layers:
 print("Layer: ", layer.name) # print layer name
 print(" --Kernels (Weights): ", layer.get_weights()[0]) # weights
 print(" --Biases: ", layer.get_weights()[1]) # biases

print("")
print('---------- Evaluation on Training Data ----------')
print(classification_report(y_train, pred_labels_tr))
print("")

print('---------- Evaluation on Test Data ----------')
print(classification_report(y_test, pred_labels_te))
print("")

Training a Feed Forward (FF) Neural Network. Gif image by author.

The above code prints the following summary and evaluation metrics for our 1–2–1 Neural Network:

1–2–1 Feed Forward (FF) Neural Network performance. Image by author.
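
A quick word on the class_weight setting used in step 5. The values {0: 0.3, 1: 0.7} were set by hand. One common heuristic for deriving weights from the data (the "balanced" formula, which I am introducing here as an alternative rather than the method used above) is n_samples / (n_classes * class_count). A minimal sketch with made-up labels:

```python
import numpy as np

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Made-up labels: 7 dry days (0) and 3 rainy days (1)
y_example = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
weights = balanced_class_weights(y_example)
print(weights)
print("rainy-to-dry weight ratio:", round(weights[1] / weights[0], 3))
```

With a 70/30 split like this one, the rarer class gets 7/3 times the weight of the common class, which is the same ratio as 0.7 to 0.3.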

Note that weights and biases for this model are different from the ones in the calculated example earlier in this article. This is because Neural Network training involves randomness: the starting weights are typically initialized at random, and the training data is shuffled before each epoch (shuffle=True in step 5). Hence, your model will be different every time you re-train it.
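
The role of randomness is easy to demonstrate with a seeded random number generator. The sketch below uses NumPy only, and the init_weights helper is a hypothetical stand-in for a framework's weight initializer. In TensorFlow, tf.random.set_seed plays a similar role, although full reproducibility can depend on other factors as well.

```python
import numpy as np

def init_weights(seed, shape=(1, 2)):
    """Draw small random starting weights from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-0.5, 0.5, size=shape)

same_a = init_weights(seed=42)
same_b = init_weights(seed=42)
different = init_weights(seed=7)

print("seed 42 twice gives identical weights:", np.array_equal(same_a, same_b))
print("seed 42 and seed 7 give identical weights:", np.array_equal(same_a, different))
```

Different starting points usually lead the optimizer to different final weights, even when the resulting prediction curves look very much alike.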

Let’s now plot the prediction curve on a chart.

# Create 100 evenly spaced points from smallest X to largest X
X_range = np.linspace(X.min(), X.max(), 100)
# Predict probabilities for rain tomorrow
y_predicted = model.predict(X_range.reshape(-1, 1))

# Create a scatter plot
fig = px.scatter(x=X_range.ravel(), y=y_predicted.ravel(), 
 opacity=0.8, color_discrete_sequence=['black'],
 labels=dict(x="Value of Humidity3pm", y="Predicted Probability of Rain Tomorrow",))

# Change chart background color
fig.update_layout(dict(plot_bgcolor = 'white'))

# Update axes lines
fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey', 
 zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey', 
 showline=True, linewidth=1, linecolor='black')

fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey', 
 zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey', 
 showline=True, linewidth=1, linecolor='black')

# Set figure title
fig.update_layout(title=dict(text="Feed Forward Neural Network (1 Input) Model Results", 
 font=dict(color='black')))
# Update marker size
fig.update_traces(marker=dict(size=7))

fig.show()

Prediction curve produced by the Neural Network with one input. Image by author.

Using two inputs (WindGustSpeed and Humidity3pm)

Let’s see how the network and predictions change when we use two inputs (WindGustSpeed and Humidity3pm) to train a Neural Network that has a 2–2–1 structure.

Feel free to experiment in your own time by training a model with 17 inputs and a different number of hidden nodes.
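
Before adding more inputs or hidden nodes, it helps to know how quickly the number of trainable parameters grows. Here is a small helper (count_dense_params is my own illustrative function, not part of Keras) that counts the weights and biases of a fully connected network; the totals should agree with what model.summary() reports for the same structure.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists node counts from input to output, e.g. [1, 2, 1].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per receiving node
    return total

for sizes in ([1, 2, 1], [2, 2, 1], [17, 2, 1]):
    print(sizes, "->", count_dense_params(sizes), "trainable parameters")
```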

##### Step 1 - Select data for modeling
X=df[['WindGustSpeed', 'Humidity3pm']]
y=df['RainTomorrowFlag'].values

##### Step 2 - Create training and testing samples
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

##### Step 3 - Specify the structure of a neural network
model2 = Sequential(name="Model-with-Two-Inputs") # Model
model2.add(Input(shape=(2,), name='Input-Layer')) # Input Layer - need to specify the shape of inputs
model2.add(Dense(2, activation='softplus', name='Hidden-Layer')) # Hidden Layer, softplus(x) = log(exp(x) + 1)
model2.add(Dense(1, activation='sigmoid', name='Output-Layer')) # Output Layer, sigmoid(x) = 1 / (1 + exp(-x))

##### Step 4 - Compile the keras model
model2.compile(optimizer='adam', # default='rmsprop', an algorithm to be used in backpropagation
 loss='binary_crossentropy', # Loss function to be optimized. A string (name of loss function), or a tf.keras.losses.Loss instance.
 metrics=['Accuracy', 'Precision', 'Recall'], # List of metrics to be evaluated by the model during training and testing. Each of this can be a string (name of a built-in function), function or a tf.keras.metrics.Metric instance. 
 loss_weights=None, # default=None, Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs.
 weighted_metrics=None, # default=None, List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.
 run_eagerly=None, # Defaults to False. If True, this Model's logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function.
 steps_per_execution=None # Defaults to 1. The number of batches to run during each tf.function call. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or small models with a large Python overhead.
 )

##### Step 5 - Fit keras model on the dataset
model2.fit(X_train, # input data
 y_train, # target data
 batch_size=10, # Number of samples per gradient update. If unspecified, batch_size will default to 32.
 epochs=3, # default=1, Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided
 verbose='auto', # default='auto', ('auto', 0, 1, or 2). Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. 'auto' defaults to 1 for most cases, but 2 when used with ParameterServerStrategy.
 callbacks=None, # default=None, list of callbacks to apply during training. See tf.keras.callbacks
 validation_split=0.2, # default=0.0, Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. 
 #validation_data=(X_test, y_test), # default=None, Data on which to evaluate the loss and any model metrics at the end of each epoch. 
 shuffle=True, # default=True, Boolean (whether to shuffle the training data before each epoch) or str (for 'batch').
 class_weight={0 : 0.3, 1 : 0.7}, # default=None, Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
 sample_weight=None, # default=None, Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only).
 initial_epoch=0, # Integer, default=0, Epoch at which to start training (useful for resuming a previous training run).
 steps_per_epoch=None, # Integer or None, default=None, Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. 
 validation_steps=None, # Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch.
 validation_batch_size=None, # Integer or None, default=None, Number of samples per validation batch. If unspecified, will default to batch_size.
 validation_freq=3, # default=1, Only relevant if validation data is provided. If an integer, specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs.
 max_queue_size=10, # default=10, Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.
 workers=1, # default=1, Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1.
 use_multiprocessing=False, # default=False, Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. 
 )

##### Step 6 - Use model to make predictions
# Predict class labels on training data
pred_labels_tr = (model2.predict(X_train) > 0.5).astype(int)
# Predict class labels on test data
pred_labels_te = (model2.predict(X_test) > 0.5).astype(int)

##### Step 7 - Model Performance Summary
print("")
print('-------------------- Model Summary --------------------')
model2.summary() # print model summary
print("")
print('-------------------- Weights and Biases --------------------')
for layer in model2.layers:
 print("Layer: ", layer.name) # print layer name
 print(" --Kernels (Weights): ", layer.get_weights()[0]) # kernels (weights)
 print(" --Biases: ", layer.get_weights()[1]) # biases

print("")
print('---------- Evaluation on Training Data ----------')
print(classification_report(y_train, pred_labels_tr))
print("")

print('---------- Evaluation on Test Data ----------')
print(classification_report(y_test, pred_labels_te))
print("")

And the results are:

2–2–1 Feed Forward (FF) Neural Network model performance. Image by author.

Since we used two inputs, we can still visualize the predictions. However, this time we need a 3D chart to do it:

def Plot_3D(X, X_test, y_test, clf, x1, x2, mesh_size, margin):

 # mesh_size sets the step between grid points; margin pads the grid beyond the data range

 # Create a mesh grid on which we will run our model
 x_min, x_max = X.iloc[:, 0].min() - margin, X.iloc[:, 0].max() + margin
 y_min, y_max = X.iloc[:, 1].min() - margin, X.iloc[:, 1].max() + margin
 xrange = np.arange(x_min, x_max, mesh_size)
 yrange = np.arange(y_min, y_max, mesh_size)
 xx, yy = np.meshgrid(xrange, yrange)

 # Calculate Neural Network predictions on the grid
 Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
 Z = Z.reshape(xx.shape)

 # Create a 3D scatter plot
 fig = px.scatter_3d(x=X_test[x1], y=X_test[x2], z=y_test,
 opacity=0.8, color_discrete_sequence=['black'], height=900, width=1000)

 # Set figure title and colors
 fig.update_layout(#title_text="Scatter 3D Plot with FF Neural Network Prediction Surface",
 paper_bgcolor = 'white',
 scene_camera=dict(up=dict(x=0, y=0, z=1), 
 center=dict(x=0, y=0, z=-0.1),
 eye=dict(x=0.75, y=-1.75, z=1)),
 margin=dict(l=0, r=0, b=0, t=0),
 scene = dict(xaxis=dict(title=x1,
 backgroundcolor='white',
 color='black',
 gridcolor='#f0f0f0'),
 yaxis=dict(title=x2,
 backgroundcolor='white',
 color='black',
 gridcolor='#f0f0f0'
 ),
 zaxis=dict(title='Probability of Rain Tomorrow',
 backgroundcolor='lightgrey',
 color='black', 
 gridcolor='#f0f0f0', 
 )))

 # Update marker size
 fig.update_traces(marker=dict(size=1))

 # Add prediction plane
 fig.add_traces(go.Surface(x=xrange, y=yrange, z=Z, name='FF NN Prediction Plane',
 colorscale='Bluered',
 reversescale=True,
 showscale=False, 
 contours = {"z": {"show": True, "start": 0.5, "end": 0.9, "size": 0.5}}))
 fig.show()
 return fig

# Call the above function
fig = Plot_3D(X, X_test, y_test, model2, x1='WindGustSpeed', x2='Humidity3pm', mesh_size=1, margin=0)

Curved prediction surface produced by the Neural Network with two inputs. Image by author.
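
The least obvious part of the plotting function is how the grid of points is built and then folded back into a surface. The sketch below walks through that step on a deliberately tiny grid. The stand_in_model function is a hypothetical replacement for the trained network's predict method, so that the example runs without TensorFlow.

```python
import numpy as np

# Coarse, made-up axis ranges for the two inputs
wind = np.arange(0, 3, 1)       # 3 values along the x axis
humidity = np.arange(0, 4, 1)   # 4 values along the y axis
xx, yy = np.meshgrid(wind, humidity)   # both have shape (4, 3)

# One row per grid point, one column per input feature
grid_points = np.c_[xx.ravel(), yy.ravel()]   # shape (12, 2)

def stand_in_model(points):
    """Hypothetical replacement for model.predict: returns shape (n, 1)."""
    return (0.1 * points[:, 0] + 0.2 * points[:, 1]).reshape(-1, 1)

# Predict for every grid point, then fold the results back into the grid shape
Z = stand_in_model(grid_points).reshape(xx.shape)
print(xx.shape, grid_points.shape, Z.shape)
```

Each entry Z[i, j] holds the prediction for humidity[i] and wind[j], which is the layout the surface plot expects.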

Conclusions

Neural Networks are not as scary as they seem at first. I sincerely hope you enjoyed reading this article and obtained some new knowledge.

Feel free to use the code provided in this article to build your own Neural Networks. Also, you can find the complete Jupyter Notebook in my GitHub repository.

As I try to make my articles more useful for readers, I would appreciate it if you could let me know what has driven you to read this piece and whether it has given you the answers you were looking for. If not, what was missing?

Cheers! 👏 Saul Dobilas

URL: https://towardsdatascience.com/feed-forward-neural-networks-how-to-successfully-build-them-in-python-74503409d99a/