A Perceptron is the simplest form of a neural network that makes decisions by combining inputs with weights and applying an activation function. It is mainly used for binary classification problems. It forms the basic building block of many deep learning models.
1. Inputs ($x_i$)
These are the features or measurable attributes of a data point that the perceptron uses to make a decision. Each input provides a signal that contributes to the final output.
Example: For an OR gate, the inputs are binary pairs $(x_1, x_2)$: $(0, 0)$, $(0, 1)$, $(1, 0)$ and $(1, 1)$.
An input affects the output only through the weight it is multiplied by.
2. Weights
Weights determine how strongly each input contributes to the prediction. A larger weight means the corresponding input has a higher impact.
Weights are learned during training, adjusting based on errors.
They act like importance scores for each feature.
3. Bias ($b$)
The bias is a constant value added to the weighted sum to shift the decision boundary.
It lets the perceptron output either class even when all input features are zero, a case in which the weighted sum alone is always exactly zero.
Bias ensures the model is not forced to pass the decision boundary through the origin.
Difference Between Weights and Bias
Weights control how much each input influences the output.
Bias controls when the perceptron activates, independent of any input.
Geometrically, the weights set the tilt (orientation) of the decision boundary, while the bias shifts the boundary away from the origin without changing its tilt.
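To make the distinction concrete, here is a small sketch with hand-picked (not learned) values. The weights stay fixed at 1 and only the bias changes, which turns an OR-like unit into an AND-like one. The function name `fires` and the convention that the unit outputs 1 when the sum is at least 0 are assumptions made for illustration.

```python
def fires(x1, x2, w1, w2, b):
    """Return 1 if the weighted sum plus bias reaches 0, else 0."""
    return 1 if w1 * x1 + w2 * x2 + b >= 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Same weights, different bias: only the activation threshold moves.
or_like = [fires(x1, x2, 1.0, 1.0, -0.5) for x1, x2 in inputs]
and_like = [fires(x1, x2, 1.0, 1.0, -1.5) for x1, x2 in inputs]

print(or_like)   # [0, 1, 1, 1]
print(and_like)  # [0, 0, 0, 1]
```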
4. Net Input (Weighted Sum)
This is the combined effect of all inputs and their weights:
$$z = \sum_{i=1}^{n} w_i x_i + b$$
It represents the activation strength before passing through the activation function.
Whether $z$ falls above or below the activation threshold determines the final class.
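The weighted sum can be sketched in a few lines. The weights and bias below are hypothetical values chosen by hand, and the helper name `net_input` is an assumption, not something defined by the article.

```python
def net_input(x, w, b):
    """Weighted sum of inputs plus bias (computed before any activation)."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

# Hypothetical weights and bias, chosen by hand for illustration.
w = [0.5, 0.5]
b = -0.25

z_values = [net_input(x, w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(z_values)  # [-0.25, 0.25, 0.25, 0.75]
```

With these values only the input (0, 0) yields a negative net input, so a threshold at zero would separate it from the other three OR inputs.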
5. Activation Function (Step Function)
The activation function converts the net input $z$ into a binary output:
$$\hat{y} = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
The step itself is non-linear, but the decision boundary it produces (the set of points where $z = 0$) is still linear.
The output is always 0 or 1, which makes perceptrons suitable for binary classification.
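A step activation is short enough to write out directly. This sketch assumes the unit fires when the net input is exactly 0; some texts use a strict inequality instead, so treat the boundary case as a convention rather than a rule.

```python
def step(z):
    """Threshold activation: 1 when z is at least 0, otherwise 0."""
    return 1 if z >= 0 else 0

print([step(z) for z in (-2.0, -0.25, 0.0, 0.25, 3.0)])  # [0, 0, 1, 1, 1]
```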
Fundamentals of Neural Networks
A neural network extends the perceptron by connecting many neurons across multiple layers.
1. Input layer: The input layer provides the network with the raw feature vector:
$$x = [x_1, x_2, \dots, x_n]^T$$
No computation happens here.
It simply passes the input values to the next layer.
2. Hidden layers: Hidden layers contain multiple perceptrons (neurons) that learn intermediate representations of the data.
Hidden Layer Computation:
$$h = f(W_1 x + b_1)$$
where:
$W_1$: weight matrix for the hidden layer
$b_1$: bias vector
$f$: non-linear activation function (ReLU, Sigmoid, Tanh, etc.)
Hidden layers identify complex patterns not visible from raw input alone.
Adding more hidden layers generally increases the model's expressiveness, at the cost of more parameters to train.
3. Output layer: The output layer produces the final prediction, which may be binary, multi-class or a continuous value.
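Output Layer Computation:
$$\hat{y} = g(W_2 h + b_2)$$
where $h$ is the output of the last hidden layer, $W_2$ and $b_2$ are the output layer's weight matrix and bias vector, and $g$ is the output activation.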
Output activation depends on the task:
Sigmoid: binary classification
Softmax: multi-class classification
Linear: regression
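The three output activations listed above can be sketched in plain Python. The function names are illustrative, and the max-subtraction inside `softmax` is a common numerical-stability trick added here, not something the article prescribes.

```python
import math

def sigmoid(z):
    """Squash a real number into (0, 1); used for binary classification."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(z):
    """Identity activation; used for regression outputs."""
    return z

print(sigmoid(0.0))             # 0.5
probs = softmax([2.0, 1.0, 0.1])
print(probs.index(max(probs)))  # 0
```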
Because of multiple layers and non-linear activations, neural networks can model complex, non-linear decision boundaries, while a single perceptron can only model a linear one (a straight line in two dimensions).
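As an illustration of this difference, the sketch below runs a forward pass through a tiny network with two inputs, two ReLU hidden neurons and one linear output. All parameters are hand-picked assumptions (nothing is learned here), chosen so that the network reproduces XOR, a function no single perceptron can represent.

```python
import numpy as np

def relu(z):
    """ReLU activation: element-wise max(0, z)."""
    return np.maximum(0.0, z)

# Hand-picked (not learned) parameters for a 2-2-1 network.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])   # hidden-layer weights, one row per neuron
b1 = np.array([0.0, -1.0])    # hidden-layer biases
W2 = np.array([1.0, -2.0])    # output-layer weights
b2 = 0.0                      # output-layer bias

def forward(x):
    h = relu(W1 @ x + b1)     # hidden layer: non-linear transformation
    return W2 @ h + b2        # output layer: linear activation

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = [float(forward(x)) for x in X]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The second hidden neuron only becomes active when both inputs are 1, and the output layer uses it to cancel the first neuron's response. That is the kind of intermediate representation a hidden layer can provide.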
Working
Training a perceptron means finding suitable weights $w_i$ and bias $b$ such that most training points are correctly classified.
1. Compute the Weighted Sum
The perceptron first calculates a weighted combination of the input features, along with a bias term that helps shift the decision boundary.
2. Apply the Activation Function (Step Function)
The perceptron uses a simple threshold activation to convert the numerical value into a binary class label.
3. Compare Prediction with Actual Output
The perceptron checks if the predicted output matches the true label.
4. Update the Weights (Learning Rule)
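Whenever the perceptron misclassifies a sample, it updates each weight by an amount proportional to the error and the input value:
$$w_i \leftarrow w_i + \eta \, (y - \hat{y}) \, x_i$$
Here $y$ is the true label, $\hat{y}$ is the predicted label and $\eta$ is the learning rate, a small positive constant that sets the step size.
5. Update the Bias Term
The bias is adjusted similarly to shift the decision boundary:
$$b \leftarrow b + \eta \, (y - \hat{y})$$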
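Steps 4 and 5 can be sketched as a single update for one sample. The helper name `update` and the learning-rate parameter `lr` (with a default of 0.1) are assumptions for illustration; the text only states that the change is proportional to the error and the input.

```python
def update(w, b, x, y_true, y_pred, lr=0.1):
    """Apply one perceptron update for a single sample.

    The change is zero when the prediction is already correct.
    """
    error = y_true - y_pred                      # -1, 0 or +1
    new_w = [w_i + lr * error * x_i for w_i, x_i in zip(w, x)]
    new_b = b + lr * error
    return new_w, new_b

# A misclassified sample: true label 1, predicted 0.
w, b = update([0.0, 0.0], -0.1, x=(0, 1), y_true=1, y_pred=0)
print(w, b)  # [0.0, 0.1] 0.0
```

Only the weight attached to the active input (the second one) changes, because the first input is 0 and contributes nothing to the update.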
6. Repeat for All Samples Across Multiple Epochs
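The perceptron cycles through the entire dataset several times (epochs), refining the weights gradually until an epoch produces no misclassifications or a maximum number of epochs is reached. A mistake-free epoch is only guaranteed to occur eventually when the data is linearly separable.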
7. Final Learned Model
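After training, the perceptron produces predictions using:
$$\hat{y} = \begin{cases} 1 & \text{if } \sum_{i} w_i x_i + b \ge 0 \\ 0 & \text{otherwise} \end{cases}$$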
Implementation
Let's implement the model:
Step 1: Import Libraries and Create the Dataset
We import NumPy for numerical operations and Matplotlib for visualizations. The dataset represents the OR logic gate, which is linearly separable and suitable for perceptron learning.
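A minimal sketch of this step could look as follows. The variable names `X` and `y` are assumptions, and the Matplotlib import is left out because this fragment does no plotting.

```python
import numpy as np

# OR gate: each row of X is one input pair, y holds the matching label.
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
y = np.array([0, 1, 1, 1])

print(X.shape, y.shape)  # (4, 2) (4,)
```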
Step 2: Define the Perceptron Class
This step defines the entire Perceptron class: the constructor, predict() and fit(). The fit() method trains the model by adjusting the weights and bias whenever a misclassification occurs, and it tracks the number of errors per epoch.
Step 3: Train the Perceptron on OR Data
We create a perceptron instance and train it on the OR dataset.
After training, we print the learned weights, bias and predictions (which should be [0 1 1 1] for the OR gate).
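The class and training run described in Steps 2 and 3 can be sketched together as below. This is one possible implementation under stated assumptions, not the article's original code: the constructor parameters `learning_rate` and `epochs`, the `errors` attribute, zero initialization and the `z >= 0` threshold are all choices made for this example. The dataset is repeated so the block runs on its own.

```python
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.1, epochs=10):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weights = None
        self.bias = 0.0
        self.errors = []          # misclassifications per epoch

    def predict(self, X):
        """Step activation applied to the weighted sum."""
        z = np.dot(X, self.weights) + self.bias
        return np.where(z >= 0, 1, 0)

    def fit(self, X, y):
        self.weights = np.zeros(X.shape[1])
        self.bias = 0.0
        self.errors = []
        for _ in range(self.epochs):
            mistakes = 0
            for x_i, target in zip(X, y):
                # Non-zero only when the sample is misclassified.
                update = self.learning_rate * (target - self.predict(x_i))
                if update != 0:
                    self.weights += update * x_i
                    self.bias += update
                    mistakes += 1
            self.errors.append(mistakes)
        return self

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

model = Perceptron(learning_rate=0.1, epochs=10).fit(X, y)
print("Weights:", model.weights)
print("Bias:", model.bias)
print("Predictions:", model.predict(X))  # [0 1 1 1]
```

The `errors` list can be plotted against the epoch number (for example with Matplotlib) to see when training stops making mistakes.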