
URL: https://www.analyticsvidhya.com/blog/2021/06/a-comprehensive-tutorial-on-deep-learning-part-2/


A comprehensive tutorial on Deep Learning - Part 2

Sion Last Updated : 13 Jun, 2021
6 min read

This article was published as a part of the Data Science Blogathon

Introduction

Hello readers. This is Part 2 in the series A Comprehensive Tutorial on Deep Learning. If you haven't read the first part, you can find it here:

A comprehensive tutorial on Deep Learning - Part 1 | Sion

In the first part we discussed the following topics:

  • About Deep Learning
  • Importing the dataset and Overview of the Data
  • Computational Graph
  • Initializing weights and parameters
  • Forward Propagation
  • Gradient Descent
  • Logistic regression with Sklearn

In this article, we will continue the discussion and introduce Artificial Neural Networks (ANN).

Contents

1) Artificial Neural Networks

  • What does a Neural network mean?
  • How many layers are needed to call it a "Deep" neural network?
  • Why are the layers called hidden?

2) 2 Layered Neural Network

  • Creating layers and initializing parameter weights and biases
  • Forward Propagation
  • Cost Function and Loss Function
  • Back Propagation
  • Updating Parameters
  • Prediction
  • Creating Model

3) What to expect in the next article

4) Endnotes

Artificial Neural Network (ANN)

In this series, this is another name for a deep neural network, the kind of model used in deep learning.

What does a Neural Network mean?

A neural network essentially means that we take logistic regression and repeat it multiple times. In a normal logistic regression, we have an input layer and an output layer. In a neural network, there is at least one hidden layer of regression units between the input and output layers.
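
To make the idea of "repeating" logistic regression concrete, here is a minimal sketch on toy data. The sizes (4 features, 5 samples, 3 hidden nodes) and all variable names are my own assumptions for illustration, not values from the dataset used in this series.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))          # toy data: 4 features, 5 samples

# Logistic regression: Input => Output
w = rng.normal(size=(1, 4)) * 0.1
b = np.zeros((1, 1))
y_logreg = sigmoid(w @ x + b)        # shape (1, 5)

# One hidden layer: Input => Hidden Layer => Output
W1 = rng.normal(size=(3, 4)) * 0.1   # 3 hidden nodes (an assumed size)
b1 = np.zeros((3, 1))
W2 = rng.normal(size=(1, 3)) * 0.1
b2 = np.zeros((1, 1))
hidden = np.tanh(W1 @ x + b1)        # shape (3, 5)
y_nn = sigmoid(W2 @ hidden + b2)     # shape (1, 5)

print(y_logreg.shape, hidden.shape, y_nn.shape)
```

Both models end with one probability per sample. The only structural difference is the extra weighted sum and activation in the middle.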

How many layers are needed to call it a "Deep" neural network?

Well, of course there is no specific number of layers that makes a neural network deep. The term "deep" is, quite frankly, relative to the problem. The better question to ask is "How deep?". For example, the question "How deep is your swimming pool?" can be answered in multiple ways. It could be 2 meters deep or 10 meters deep, but either way it has depth. The same goes for our neural network: it can have 2 hidden layers or thousands of hidden layers (yes, you read that correctly).

So I'd like to just stick with the question of "How deep?" for the time being.

Why are the layers called hidden?

They are called hidden because the training set does not show what their values should be: the data contains the inputs and the expected outputs, but nothing about the layers in between. For example, let's say you have a NN with an input layer, one hidden layer, and an output layer. When asked how many layers your NN has, your answer should be "It has 2 layers", because by convention the input layer is not counted.

Let me help you visualize what a 2-layer neural network looks like:

Step by step we shall understand this image.

1) As you can see here, we have a 2-layered Artificial Neural Network. Neural networks were created to loosely mimic the biological neurons of the human brain. In our ANN we have k nodes in the hidden layer. The number of nodes is a hyperparameter, which essentially means that it is chosen by the practitioner building the model.

2) The input and output layers do not change. We have "n" input features and 3 possible outcomes.

3) Unlike logistic regression, which uses the sigmoid function you are quite familiar with, the hidden layer of our neural network uses the tanh function as its activation function. The reason is that the mean of its output is closer to 0, which keeps the data better centered as input to the next layer. tanh is also non-linear, and it is this non-linearity that lets the network learn patterns a purely linear model cannot. The output layer still uses the sigmoid function, since we want a probability between 0 and 1.

4) In normal logistic regression: Input => Output.

  Whereas in a Neural network: Input => Hidden Layer => Output. The hidden layer can be imagined as the output of part 1 and input of part 2 of our ANN.
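
The point about tanh being centered around 0 can be checked with a few lines of NumPy. This is only a sketch on an assumed grid of inputs that is symmetric around 0, not on the real data.

```python
import numpy as np

z = np.linspace(-3.0, 3.0, 101)          # assumed inputs, symmetric around 0
tanh_out = np.tanh(z)
sigmoid_out = 1.0 / (1.0 + np.exp(-z))

# tanh outputs average out to about 0, sigmoid outputs to about 0.5
print("mean of tanh output:   ", tanh_out.mean())
print("mean of sigmoid output:", sigmoid_out.mean())
```

For inputs like these, the tanh outputs are balanced around 0 while the sigmoid outputs are all positive and balanced around 0.5.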

Now let us have a more practical approach to a 2 Layered Neural Network.

(Important Note: We shall continue where we left off in the previous article. I'm not going to waste your time and mine by loading the dataset again and preparing it. The link to Part 1 of this series is given above.)

2 Layered Neural Network

1) Creating layers and initializing parameter weights and biases

Our training set has 348 samples, so x_train has 348 columns, one per sample. In logistic regression, we initialized the bias at 0 and the weights at 0.01. This time we shall initialize the weights randomly, because if we initialize them all with 0, every neuron in the 1st layer computes exactly the same thing as the other neurons and they never become different from each other. These initial weights should also be small: if they are large in the beginning, the inputs to tanh become large, the gradients end up close to 0, and the optimization algorithm becomes slow.

Biases can be made 0 initially.

# initialize parameters and layer sizes
import numpy as np

def initialize_parameters_and_layer_sizes_NN(x_train, y_train):
    # hidden layer has 3 nodes; small random weights, zero biases
    parameters = {"weight1": np.random.randn(3, x_train.shape[0]) * 0.1,
                  "bias1": np.zeros((3, 1)),
                  "weight2": np.random.randn(y_train.shape[0], 3) * 0.1,
                  "bias2": np.zeros((y_train.shape[0], 1))}
    return parameters
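
Both initialization arguments can be seen on toy data. The sketch below uses assumed sizes (4 features, 6 samples, 3 hidden nodes) and compares the slope of tanh, 1 - tanh(z)^2, for small and large weights.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 6))                 # toy data: 4 features, 6 samples

# Zero initialization: every hidden node computes the same value.
W_zero = np.zeros((3, 4))
A_zero = np.tanh(W_zero @ x)
print("rows identical with zero init:", np.allclose(A_zero, A_zero[0]))

# Same random weights at two scales: compare the tanh slope 1 - tanh(z)^2.
W = rng.normal(size=(3, 4))
slope_small = 1 - np.tanh((W * 0.1) @ x) ** 2
slope_large = 1 - np.tanh((W * 10.0) @ x) ** 2
print("mean slope, small weights:", slope_small.mean())
print("mean slope, large weights:", slope_large.mean())
```

With the large weights, most inputs land on the flat parts of tanh, so the slope (and with it the gradient) shrinks towards 0.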

2) Forward Propagation

Forward propagation is almost the same as the one we used in logistic regression. The only difference is that we do the process twice: once for the hidden layer, using the tanh function, and once for the output layer, using the sigmoid function. The tanh function is included in the NumPy module.

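def forward_propagation_NN(x_train, parameters):
    # hidden layer: weighted sum followed by tanh
    Z1 = np.dot(parameters["weight1"], x_train) + parameters["bias1"]
    A1 = np.tanh(Z1)
    # output layer: weighted sum followed by sigmoid
    # (sigmoid is assumed to be the helper function from Part 1)
    Z2 = np.dot(parameters["weight2"], A1) + parameters["bias2"]
    A2 = sigmoid(Z2)

    # keep the intermediate values for backward propagation
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}

    return A2, cache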

3) Cost Function and Loss Function

The loss and cost functions are the same as the logistic regression.

The cross-entropy loss:

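# Compute cost
def compute_cost_NN(A2, Y, parameters):
    # binary cross-entropy: counts both the y = 1 and the y = 0 samples
    logprobs = np.multiply(np.log(A2), Y) + np.multiply(np.log(1 - A2), 1 - Y)
    cost = -np.sum(logprobs) / Y.shape[1]
    return cost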
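
To get a feeling for the numbers, here is a small worked example of the standard binary cross-entropy, which penalizes mistakes on both classes. The helper name binary_cross_entropy and the toy labels and predictions are assumptions made for this sketch.

```python
import numpy as np

def binary_cross_entropy(A, Y):
    """Mean binary cross-entropy for predictions A and labels Y, both of shape (1, m)."""
    m = Y.shape[1]
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

Y = np.array([[1, 0, 1, 0]])                         # toy labels
confident_right = np.array([[0.9, 0.1, 0.9, 0.1]])
unsure = np.array([[0.5, 0.5, 0.5, 0.5]])
confident_wrong = np.array([[0.1, 0.9, 0.1, 0.9]])

cost_right = binary_cross_entropy(confident_right, Y)
cost_unsure = binary_cross_entropy(unsure, Y)
cost_wrong = binary_cross_entropy(confident_wrong, Y)
print(cost_right, cost_unsure, cost_wrong)
```

The cost is small when the model is confident and right (about 0.105), equals ln 2 (about 0.693) when it always answers 0.5, and grows quickly when it is confident and wrong (about 2.303).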

4) Back Propagation

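Backward propagation essentially means computing derivatives: we work backwards from the output and use the chain rule to find how much the cost changes with each weight and bias. This is a very extensive and crucial topic which deserves an article of its own. Be sure to check out my future articles for a tutorial on Backpropagation. Let us write the code: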

# Backward Propagation
def backward_propagation_NN(parameters, cache, X, Y):
    # output layer gradients
    dZ2 = cache["A2"] - Y
    dW2 = np.dot(dZ2, cache["A1"].T) / X.shape[1]
    db2 = np.sum(dZ2, axis=1, keepdims=True) / X.shape[1]
    # hidden layer gradients; (1 - A1^2) is the derivative of tanh
    dZ1 = np.dot(parameters["weight2"].T, dZ2) * (1 - np.power(cache["A1"], 2))
    dW1 = np.dot(dZ1, X.T) / X.shape[1]
    db1 = np.sum(dZ1, axis=1, keepdims=True) / X.shape[1]
    grads = {"dweight1": dW1,
             "dbias1": db1,
             "dweight2": dW2,
             "dbias2": db2}
    return grads
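
If you want to convince yourself that these formulas are right without going through the full derivation, a numerical gradient check is a handy trick. The sketch below is my own illustration on assumed toy data: it compares the analytic gradient for one weight with a central-difference estimate, using the two-term binary cross-entropy as the cost.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fn(W1, b1, W2, b2, X, Y):
    """Forward pass plus binary cross-entropy; returns cost and activations."""
    A1 = np.tanh(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    m = X.shape[1]
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return cost, A1, A2

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                       # toy data: 4 features, 8 samples
Y = (rng.random((1, 8)) > 0.5).astype(float)      # toy binary labels
W1 = rng.normal(size=(3, 4)) * 0.1
b1 = np.zeros((3, 1))
W2 = rng.normal(size=(1, 3)) * 0.1
b2 = np.zeros((1, 1))

# Analytic gradient of the cost with respect to W1 (same formulas as above).
_, A1, A2 = cost_fn(W1, b1, W2, b2, X, Y)
m = X.shape[1]
dZ2 = A2 - Y
dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
dW1 = dZ1 @ X.T / m

# Numerical gradient for the single entry W1[0, 0] via central differences.
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
cost_plus = cost_fn(W1_plus, b1, W2, b2, X, Y)[0]
cost_minus = cost_fn(W1_minus, b1, W2, b2, X, Y)[0]
numeric = (cost_plus - cost_minus) / (2 * eps)

print("analytic:", dW1[0, 0])
print("numeric: ", numeric)
```

The two numbers should agree to several decimal places. If they do not, there is usually a bug in either the forward or the backward pass.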

5) Updating Parameters

Updating the parameters is also the same as in logistic regression: we subtract the gradient, scaled by the learning rate, from each parameter. This is the reason why you should read Part 1 of this series. We shall use logistic regression several times, since it is the building block of an Artificial Neural Network.

# update parameters
def update_parameters_NN(parameters, grads, learning_rate=0.01):
    # one gradient descent step for every weight and bias
    parameters = {"weight1": parameters["weight1"] - learning_rate * grads["dweight1"],
                  "bias1": parameters["bias1"] - learning_rate * grads["dbias1"],
                  "weight2": parameters["weight2"] - learning_rate * grads["dweight2"],
                  "bias2": parameters["bias2"] - learning_rate * grads["dbias2"]}

    return parameters
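
Here is a single update step worked out on made-up numbers, so you can follow the arithmetic by hand. The dictionary keys and values below are assumptions chosen only for this illustration.

```python
import numpy as np

# Toy parameters and gradients (made-up values)
parameters = {"weight": np.array([[0.5, -0.2]]), "bias": np.array([[0.1]])}
grads = {"dweight": np.array([[0.1, -0.3]]), "dbias": np.array([[0.2]])}
learning_rate = 0.01

# new value = old value - learning_rate * gradient
updated = {
    "weight": parameters["weight"] - learning_rate * grads["dweight"],
    "bias": parameters["bias"] - learning_rate * grads["dbias"],
}
print(updated["weight"])
print(updated["bias"])
```

Each parameter moves a small step against its gradient: 0.5 becomes 0.499, -0.2 becomes -0.197, and the bias 0.1 becomes 0.098.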

6) Prediction

Now we create a function for prediction:

# prediction
def predict_NN(parameters, x_test):
    # x_test is the input for forward propagation
    A2, cache = forward_propagation_NN(x_test, parameters)
    Y_prediction = np.zeros((1, x_test.shape[1]))
    # if A2 is bigger than 0.5, our prediction is class one (y_head=1),
    # otherwise our prediction is class zero (y_head=0)
    for i in range(A2.shape[1]):
        if A2[0, i] <= 0.5:
            Y_prediction[0, i] = 0
        else:
            Y_prediction[0, i] = 1

    return Y_prediction
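
The same thresholding can also be written without a loop. This is just an alternative sketch on assumed example probabilities, showing that the vectorized comparison gives the same result as the loop.

```python
import numpy as np

A2 = np.array([[0.2, 0.5, 0.51, 0.97]])   # example output probabilities

# Loop version, mirroring the function above.
loop_pred = np.zeros((1, A2.shape[1]))
for i in range(A2.shape[1]):
    loop_pred[0, i] = 0 if A2[0, i] <= 0.5 else 1

# Vectorized version: compare the whole array at once.
vector_pred = (A2 > 0.5).astype(float)

print(vector_pred)
```

Note that a probability of exactly 0.5 is assigned to class 0 in both versions.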

7) Creating Model

Now it is time to put everything together and see the magic happen:

# 2 - Layer neural network
import matplotlib.pyplot as plt

def two_layer_neural_network(x_train, y_train, x_test, y_test, num_iterations):
    cost_list = []
    index_list = []
    # initialize parameters and layer sizes
    parameters = initialize_parameters_and_layer_sizes_NN(x_train, y_train)

    for i in range(0, num_iterations):
        # forward propagation
        A2, cache = forward_propagation_NN(x_train, parameters)
        # compute cost
        cost = compute_cost_NN(A2, y_train, parameters)
        # backward propagation
        grads = backward_propagation_NN(parameters, cache, x_train, y_train)
        # update parameters
        parameters = update_parameters_NN(parameters, grads)

        # record and print the cost every 100 iterations
        if i % 100 == 0:
            cost_list.append(cost)
            index_list.append(i)
            print("Cost after iteration %i: %f" % (i, cost))

    # plot the cost curve once training has finished
    plt.plot(index_list, cost_list)
    plt.xticks(index_list, rotation='vertical')
    plt.xlabel("Number of Iterations")
    plt.ylabel("Cost")
    plt.show()

    # predict
    y_prediction_test = predict_NN(parameters, x_test)
    y_prediction_train = predict_NN(parameters, x_train)

    # Print train/test accuracy
    print("train accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_train - y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_test - y_test)) * 100))
    return parameters

parameters = two_layer_neural_network(x_train, y_train, x_test, y_test, num_iterations=2500)
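
The accuracy formula in the print statements may look a bit cryptic, so here it is on four made-up predictions. Since labels and predictions are both 0 or 1, the absolute difference is 1 exactly where the model is wrong.

```python
import numpy as np

y_true = np.array([[1, 0, 0, 1]])   # made-up labels
y_pred = np.array([[1, 0, 1, 1]])   # made-up predictions, one of them wrong

error_rate = np.mean(np.abs(y_pred - y_true))   # fraction of wrong predictions
accuracy = 100 - error_rate * 100
print("accuracy: {} %".format(accuracy))
```

One wrong prediction out of four gives an error rate of 0.25, so the accuracy is 75.0 %.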

In the previous article, logistic regression gave us an accuracy of 92%. With the 2-layer network, we get a substantially higher accuracy by just adding one extra layer of logistic regression. This is one of the reasons why neural networks are among the cutting-edge technologies nowadays.

What to expect in the next article?

In the next article, which I plan to publish shortly, we shall generalize the number of layers and build a model accordingly. In this article, we used just 1 hidden layer, and as you can see it performed really well. In the next article, we will create an L-layered Neural Network. Since it is really tedious to create functions and classes for each layer independently, we will use libraries like Keras and PyTorch for more efficient code.

Endnotes

Today you learned how to build your first neural network from scratch, congratulations! Although the same can be done with Keras in just a few lines of code, it's always better to know what is happening under the hood. You can read the 3rd article in the link below after it's published:

Sion | Author at Analytics Vidhya

Thank you and have a nice day, Cheers!!

The media shown in this article are not owned by Analytics Vidhya and are used at the Author's discretion.
