A Boltzmann Machine is an unsupervised generative neural network that models data using energy-based states with fully connected bidirectional neurons. Because of this full connectivity, training is computationally expensive and inefficient in practice.
A Restricted Boltzmann Machine (RBM) is a simplified version of the Boltzmann Machine designed to make training feasible. It consists of a visible layer and a hidden layer with no connections allowed between neurons within the same layer. RBMs learn latent features from unlabeled data and are widely used for representation learning and dimensionality reduction.
RBM is an unsupervised, generative, energy-based model that learns a probability distribution over input data.
It has only two layers: a visible (input) layer and a hidden (feature) layer, with no intra-layer connections.
RBMs are trained using Contrastive Divergence, which is an efficient approximation to maximum likelihood learning.
They can discover latent features that explain correlations in the input data.
RBMs are commonly used in feature learning, collaborative filtering, dimensionality reduction and as building blocks of Deep Belief Networks (DBNs).
How RBM Works
A Restricted Boltzmann Machine (RBM) is a generative stochastic neural network consisting of two layers: a visible layer and a hidden layer. The term "restricted" means there are no connections within the same layer, only between visible and hidden units.
Figure: Parameter Learning vs. Sample Generation
The image shows the two phases of a Restricted Boltzmann Machine (RBM).
During learning, the visible units receive training data, and the weights connecting them to the hidden units are adjusted to model the data distribution.
During generation, the trained RBM samples back and forth between hidden and visible units to produce new data samples similar to the training data.
RBM Architecture
Visible units (v): Represent the input data
Hidden units (h): Capture latent features
Weights (W): Connections between visible and hidden units. $w_{ij}$ connects visible unit $v_i$ to hidden unit $h_j$.
Biases: Visible bias $a_i$ and hidden bias $b_j$
Energy Function
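The RBM assigns an energy to each configuration of visible and hidden units:
$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{ij} h_j$$
where
$v_i$: state of visible unit $i$
$h_j$: state of hidden unit $j$
$w_{ij}$: weight between visible unit $i$ and hidden unit $j$
$a_i$, $b_j$: biases of visible unit $i$ and hidden unit $j$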
Configurations with lower energy are assigned higher probability.
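To make the energy function concrete, here is a minimal NumPy sketch that evaluates it for a tiny RBM. The weights, the zero biases, and the function name rbm_energy are illustrative assumptions, not values from a trained model.

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of one joint configuration: E = -a.v - b.h - v^T W h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

# Hypothetical RBM with 3 visible units and 2 hidden units.
W = np.array([[ 1.0, -1.0],
              [ 0.5,  0.0],
              [-0.5,  2.0]])
a = np.zeros(3)  # visible biases (set to zero for simplicity)
b = np.zeros(2)  # hidden biases (set to zero for simplicity)

v = np.array([1.0, 0.0, 1.0])
h_first = np.array([1.0, 0.0])   # only the first hidden unit is on
h_second = np.array([0.0, 1.0])  # only the second hidden unit is on

e_first = rbm_energy(v, h_first, W, a, b)
e_second = rbm_energy(v, h_second, W, a, b)
print(e_first, e_second)
```

For this input, the second hidden configuration has the lower energy, so the model would consider it the more probable partner for v.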
Learning Process of Restricted Boltzmann Machine
The learning process of an RBM aims to reduce the reconstruction error. This is achieved by iteratively updating the weights so that the reconstructed data becomes closer to the original data distribution.
1. Reconstruction Error
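Reconstruction error is defined as the squared difference between the input and its reconstruction:
$$\text{Error} = \lVert v - v' \rVert^2$$
where
$v$: original input (visible units)
$v'$: reconstructed input
The goal of learning is to minimize this error over successive training iterations by adjusting the weights $W$.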
2. Forward Pass
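In the forward pass, we compute the probability of activating each hidden unit given the visible input:
$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big)$$
where $\sigma$ is the logistic sigmoid and $b_j$ is the bias of hidden unit $j$.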
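The forward pass can be sketched in a few lines of NumPy. The layer sizes, the random untrained weights, and the helper name hidden_probs are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, b):
    """P(h_j = 1 | v) for every hidden unit; v has shape (batch, n_visible)."""
    return sigmoid(v @ W + b)

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # untrained, illustrative weights
b = np.zeros(n_hidden)                                # hidden biases

v = np.array([[1, 0, 1, 1, 0, 0]], dtype=float)       # one binary input vector
p_h = hidden_probs(v, W, b)

# Hidden units are stochastic: each one turns on with its own probability.
h = (rng.random(p_h.shape) < p_h).astype(float)
print(p_h.shape, h)
```

Sampling binary hidden states, rather than using the probabilities directly, is what makes the network stochastic.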
3. Backward Pass
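In the backward pass, the RBM reconstructs the input from the hidden activations, using the same weights in the opposite direction:
$$P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j w_{ij} h_j\Big)$$
where $a_i$ is the bias of visible unit $i$.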
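As a rough illustration, the sketch below runs one forward pass followed by one backward pass and measures the squared reconstruction error. The weights are random and untrained, and all sizes are assumptions, so the reconstruction here is not expected to be good.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # untrained, illustrative weights
a = np.zeros(n_visible)                               # visible biases
b = np.zeros(n_hidden)                                # hidden biases

v = np.array([[1, 0, 1, 1, 0, 0]], dtype=float)

# Forward pass: sample binary hidden states from P(h | v).
p_h = sigmoid(v @ W + b)
h = (rng.random(p_h.shape) < p_h).astype(float)

# Backward pass: the transposed weights map hidden states back to the visible layer.
p_v = sigmoid(h @ W.T + a)
v_recon = (rng.random(p_v.shape) < p_v).astype(float)

# Squared reconstruction error between the input and the reconstruction probabilities.
recon_error = float(np.sum((v - p_v) ** 2))
print(v_recon, recon_error)
```

Note that the backward pass reuses W (transposed) instead of a separate set of weights; this weight sharing follows from the connections being bidirectional.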
4. Joint Probability Distribution (Gibbs Distribution)
The joint probability of a visible–hidden configuration is:
$$P(v, h) = \frac{1}{Z} e^{-E(v, h)}$$
where the partition function is:
$$Z = \sum_{v, h} e^{-E(v, h)}$$
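For a very small RBM the partition function can be computed by brute force, which makes the normalization visible. The sketch below assumes 3 visible and 2 hidden binary units with random parameters; for realistic sizes this enumeration is intractable, which is why training relies on sampling instead.

```python
import itertools
import numpy as np

def energy(v, h, W, a, b):
    return -(a @ v) - (b @ h) - (v @ W @ h)

rng = np.random.default_rng(0)
n_visible, n_hidden = 3, 2
W = rng.normal(0.0, 0.5, size=(n_visible, n_hidden))  # illustrative parameters
a = rng.normal(0.0, 0.5, size=n_visible)
b = rng.normal(0.0, 0.5, size=n_hidden)

# Enumerate every binary configuration of each layer (2^3 and 2^2 states).
states_v = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n_visible)]
states_h = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n_hidden)]

# Partition function: sum of exp(-E) over all joint configurations.
Z = sum(np.exp(-energy(v, h, W, a, b)) for v in states_v for h in states_h)

def joint_prob(v, h):
    return np.exp(-energy(v, h, W, a, b)) / Z

total = sum(joint_prob(v, h) for v in states_v for h in states_h)
print(Z, total)
```

Dividing by Z is what turns the unnormalized scores into a valid distribution whose probabilities sum to 1.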
5. Generative Learning Perspective
RBM performs reconstruction, not classification or regression.
It does not map inputs to labels
Instead, it learns the probability distribution of the input data
Hence RBM is a generative model, unlike discriminative models used in classification.
6. Error Minimization Using KL-Divergence
The difference between the data distribution and the model distribution represents the learning error and is measured using Kullback–Leibler (KL) divergence:
$$D_{KL}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$$
where
$p$: true data distribution
$q$: model’s reconstructed distribution
KL-divergence measures how much information is lost when $q$ approximates $p$.
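A small numeric example shows how KL-divergence behaves. The three distributions below are made-up values over three outcomes, and the helper assumes q is nonzero wherever p is nonzero.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.3, 0.2]          # hypothetical "true" distribution
q_close = [0.45, 0.35, 0.2]  # model that is close to p
q_far = [0.1, 0.1, 0.8]      # model that is far from p

kl_same = kl_divergence(p, p)
kl_close = kl_divergence(p, q_close)
kl_far = kl_divergence(p, q_far)
print(kl_same, kl_close, kl_far)
```

The divergence is zero when the two distributions match and grows as the model drifts away from the data, which is why lowering it is a sensible training objective.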
7. Weight Update Rule (Contrastive Divergence)
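To reduce the KL-divergence, the RBM updates its weights using:
$$\Delta w_{ij} = \eta \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right)$$
where
$\eta$: learning rate
$\langle v_i h_j \rangle_{\text{data}}$: expectation from real data
$\langle v_i h_j \rangle_{\text{recon}}$: expectation from reconstructed data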
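The update rule can be sketched as a single-step Contrastive Divergence (CD-1) routine. This is a minimal illustration under stated assumptions: the toy two-pattern dataset stands in for binarized images, and the function name cd1_step, the layer sizes, and the learning rate are hypothetical choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr, rng):
    """One CD-1 update on a batch v0 of shape (batch, n_visible).

    Updates W, a, b in place and returns the mean squared reconstruction error.
    """
    # Positive phase: hidden statistics driven by the real data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible layer, then recompute hidden probabilities.
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    batch = v0.shape[0]
    # Weight update: <v h>_data minus <v h>_recon, averaged over the batch.
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a += lr * np.mean(v0 - pv1, axis=0)
    b += lr * np.mean(ph0 - ph1, axis=0)
    return float(np.mean((v0 - pv1) ** 2))

rng = np.random.default_rng(0)
# Toy data: two binary patterns repeated many times.
patterns = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
data = patterns[rng.integers(0, 2, size=200)]

W = rng.normal(0.0, 0.01, size=(6, 4))
a = np.zeros(6)
b = np.zeros(4)

errors = [cd1_step(data, W, a, b, lr=0.2, rng=rng) for _ in range(300)]
print(errors[0], errors[-1])
```

Using probabilities rather than sampled states in the negative phase is a common variance-reduction choice, not a requirement of the rule.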
Step By Step Implementation
In this code, we train a Restricted Boltzmann Machine (RBM) on binarized MNIST images to learn feature representations. We then visualize the images reconstructed by the RBM and generate new digit samples using Gibbs sampling.
Step 1: Import Required Libraries
numpy is used for numerical computations and matrix operations
matplotlib is used for visualizing images and reconstructions
This output shows the original MNIST images and their reconstructions generated by the RBM. It shows how well the Restricted Boltzmann Machine has learned to capture the underlying patterns of the digits.
Step 11: Generate New Samples using Gibbs Sampling
Starts from random noise
Alternates between visible and hidden layers
Produces new digit-like samples that reflect the learned data distribution
This output shows new digit-like images generated entirely by the RBM from random noise. After multiple Gibbs sampling steps, the RBM produces samples that resemble the patterns it learned from the training data.
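The generation loop described above can be sketched as follows. The parameters here are random stand-ins for a trained RBM (784 visible units for 28x28 images, 64 hidden units), and the function name gibbs_generate is hypothetical; with untrained weights the output is noise, not digits.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_generate(W, a, b, n_samples, n_steps, rng):
    """Run n_steps (>= 1) of alternating Gibbs sampling starting from random noise."""
    # Start from random binary noise in the visible layer.
    v = (rng.random((n_samples, W.shape[0])) < 0.5).astype(float)
    for _ in range(n_steps):
        # Sample hidden units given the current visible state.
        ph = sigmoid(v @ W + b)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Sample visible units given the hidden state.
        pv = sigmoid(h @ W.T + a)
        v = (rng.random(pv.shape) < pv).astype(float)
    # Return the final visible probabilities, which give smoother images than binary samples.
    return pv

rng = np.random.default_rng(0)
n_visible, n_hidden = 784, 64
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))  # stand-in for trained weights
a = np.zeros(n_visible)
b = np.zeros(n_hidden)

samples = gibbs_generate(W, a, b, n_samples=5, n_steps=100, rng=rng)
images = samples.reshape(-1, 28, 28)  # ready to display, e.g. with matplotlib's imshow
print(images.shape)
```

With parameters from a trained model, longer chains generally move the samples further from the initial noise and closer to the learned distribution.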