The Hidden Markov Model (HMM) is a widely used probabilistic model in machine learning for sequential data whose underlying states are not directly visible. It assumes that a system moves through hidden states over time and that each hidden state produces an observable output according to a probability distribution.
This example shows a Hidden Markov Model where the hidden states are weather conditions (Rainy, Cloudy, Sunny) and the observations are emotions (Happy, Neutral, Sad).
The green arrows represent transition probabilities, showing how likely the weather is to change from one state to another each day.
Red arrows represent emission probabilities, showing how likely each emotion is given the current weather.
Since we only observe the emotions and not the actual weather, the HMM helps infer the most probable hidden weather pattern behind those observations.
Assumptions of HMM
1. Hidden States
The actual state of the system is not visible.
Example: Weather (Sunny/Rainy) is hidden.
2. Observations
We only see the outcomes produced by the hidden states.
Example: Friend’s mood (Happy/Sad) is observed.
3. Markov Property
The model assumes that the future state depends only on the current state, not on the entire history.
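The Markov property can be written compactly as an equation. The notation $q_t$ for the hidden state at time $t$ is an assumption introduced here for illustration:

```latex
P(q_{t+1} \mid q_t, q_{t-1}, \dots, q_1) = P(q_{t+1} \mid q_t)
```

In the weather example, this says that tomorrow's weather depends only on today's weather, not on the weather of earlier days.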
Components of a Hidden Markov Model (HMM)
A Hidden Markov Model is defined by
$\lambda = (A, B, \pi)$
where
$A$ (Transition Matrix): Probability of moving from one hidden state to another.
$B$ (Emission Matrix): Probability of observing a symbol given a hidden state.
$\pi$ (Initial State Distribution): Probability of starting in each hidden state.
1. Hidden States (N): These are the internal states of the system, which are not directly observable. N denotes the number of hidden states.
2. Observations (M): These are the visible outputs generated by the hidden states. M denotes the number of distinct observation symbols.
3. Initial State Distribution ($\pi$): Represents the probability of starting in each hidden state.
4. Transition Probabilities (A): Defines the probability of moving from one hidden state to another.
5. Emission Probabilities (B): Defines the probability of producing a particular observed output from a given hidden state.
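As a minimal sketch, the components above could be written down for the weather and emotion example as NumPy arrays. All probability values below are made up for illustration (the source does not give them), and `is_valid_hmm` is a hypothetical helper, not part of any library.

```python
import numpy as np

# Hidden states and observation symbols from the weather/emotion example.
states = ["Rainy", "Cloudy", "Sunny"]    # N = 3 hidden states
symbols = ["Happy", "Neutral", "Sad"]    # M = 3 observation symbols

# NOTE: every number below is an illustrative assumption, not from the source.
pi = np.array([0.3, 0.3, 0.4])           # pi[i] = P(first state is i)

A = np.array([                           # A[i, j] = P(next state j | current state i)
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
])

B = np.array([                           # B[i, k] = P(symbol k | state i)
    [0.1, 0.3, 0.6],
    [0.3, 0.4, 0.3],
    [0.7, 0.2, 0.1],
])

def is_valid_hmm(pi, A, B):
    """Check shapes, non-negativity, and that pi and each row of A and B sum to 1."""
    n = len(pi)
    return bool(
        A.shape == (n, n)
        and B.shape[0] == n
        and (pi >= 0).all() and (A >= 0).all() and (B >= 0).all()
        and np.isclose(pi.sum(), 1.0)
        and np.allclose(A.sum(axis=1), 1.0)
        and np.allclose(B.sum(axis=1), 1.0)
    )

print(is_valid_hmm(pi, A, B))
```

Each row of A and B is a probability distribution, which is why the check sums along `axis=1`.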
Three Fundamental Problems in Hidden Markov Models (HMMs)
Hidden Markov Models solve three core problems related to sequences of observations generated by hidden states.
1. Evaluation Problem (Forward Algorithm)
Problem: How to compute the probability of an observation sequence?
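Mathematically, given an observation sequence:
$O = (o_1, o_2, \dots, o_T)$
and a model:
$\lambda = (A, B, \pi)$
we compute:
$P(O \mid \lambda) = \sum_{Q} P(O, Q \mid \lambda)$
Because summing over all possible state sequences $Q = (q_1, q_2, \dots, q_T)$ is computationally expensive (there are $N^T$ of them), the Forward Algorithm efficiently computes this probability using dynamic programming with the forward variable:
$\alpha_t(i) = P(o_1, o_2, \dots, o_t, q_t = i \mid \lambda)$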
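The following is a minimal sketch of the Forward Algorithm next to the brute-force sum it replaces. The parameter values are illustrative assumptions, observations are encoded as integer indices (0 = Happy, 1 = Neutral, 2 = Sad), and no scaling is applied, so it is only suitable for short sequences.

```python
import itertools
import numpy as np

def forward(obs, pi, A, B):
    """Return (alpha, likelihood), where alpha[t, i] = P(o_1..o_t, q_t = i | model)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialization
    for t in range(1, T):
        # Sum over previous states, then weight by the emission probability.
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, float(alpha[-1].sum())              # termination

def brute_force_likelihood(obs, pi, A, B):
    """Sum P(O, Q | model) over all N**T state sequences (for checking only)."""
    total = 0.0
    for q in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[q[0]] * B[q[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[q[t - 1], q[t]] * B[q[t], obs[t]]
        total += p
    return float(total)

# Illustrative parameters (assumed values, not from the source).
pi = np.array([0.3, 0.3, 0.4])
A = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]])
B = np.array([[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.7, 0.2, 0.1]])
obs = [0, 2, 1]                                       # Happy, Sad, Neutral

alpha, likelihood = forward(obs, pi, A, B)
print(likelihood, brute_force_likelihood(obs, pi, A, B))
```

Both functions compute the same quantity, but the forward recursion needs on the order of N²T operations instead of enumerating all N^T state sequences.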
2. Decoding Problem (Viterbi Algorithm)
Problem: How to find the most likely hidden state sequence that explains the observations?
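Mathematically, it finds the state sequence $Q = (q_1, q_2, \dots, q_T)$ that is most probable given the observations $O$ and the model $\lambda$:
$Q^* = \arg\max_{Q} P(Q \mid O, \lambda)$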
The Viterbi Algorithm efficiently computes this using dynamic programming by keeping track of the maximum probability path to each state at each time step.
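A minimal sketch of the Viterbi Algorithm is shown below. The parameter values are illustrative assumptions, observations are integer indices, and `path_probability` is a hypothetical helper included only to check the result. Maximizing the joint probability P(Q, O) gives the same sequence as maximizing P(Q | O), because P(O) does not depend on Q.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Return (best_path, best_prob) for the most likely hidden state sequence."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))             # delta[t, j]: best path probability ending in j at t
    psi = np.zeros((T, N), dtype=int)    # psi[t, j]: best predecessor of state j at time t
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A        # scores[i, j] = delta[t-1, i] * A[i, j]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    path.reverse()
    return path, float(delta[-1].max())

def path_probability(path, obs, pi, A, B):
    """Joint probability P(O, Q | model) of one specific state path."""
    p = pi[path[0]] * B[path[0], obs[0]]
    for t in range(1, len(obs)):
        p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
    return float(p)

# Illustrative parameters (assumed values, not from the source).
pi = np.array([0.3, 0.3, 0.4])
A = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]])
B = np.array([[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.7, 0.2, 0.1]])
states = ["Rainy", "Cloudy", "Sunny"]
obs = [0, 2, 1]                                   # Happy, Sad, Neutral

path, best_prob = viterbi(obs, pi, A, B)
print([states[i] for i in path], best_prob)
```

For long sequences the products underflow, so practical implementations usually work with log probabilities and replace the multiplications with additions.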
3. Learning Problem (Baum–Welch Algorithm / EM)
Problem: How to train the HMM to fit the observed data?
Here, the goal is to estimate the model parameters $\lambda = (A, B, \pi)$ that maximize the likelihood of the observation sequence $O$:
$\lambda^* = \arg\max_{\lambda} P(O \mid \lambda)$
The Baum–Welch Algorithm (a type of Expectation-Maximization) iteratively updates estimates of $A$, $B$, and $\pi$ using the state occupancy probabilities and transition probabilities computed from the observation sequence.
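The update can be sketched for a single observation sequence as follows. This is a simplified, unscaled version under stated assumptions: the starting parameters and the observation sequence are made up for illustration, every symbol appears at least once in the sequence, and the sequence is short enough that probabilities do not underflow.

```python
import numpy as np

def forward(obs, pi, A, B):
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(obs, A, B):
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))               # beta[t, i] = P(o_{t+1}..o_T | q_t = i)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(obs, pi, A, B):
    """One EM iteration. Returns new (pi, A, B) and the likelihood under the old ones."""
    obs = np.asarray(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = alpha[-1].sum()
    # E-step: state occupancy gamma[t, i] and transition posteriors xi[t, i, j].
    gamma = alpha * beta / likelihood
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood
    # M-step: re-estimate parameters from expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B, float(likelihood)

# Illustrative starting guess and data (assumed values, not from the source).
pi = np.array([0.3, 0.3, 0.4])
A = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]])
B = np.array([[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.7, 0.2, 0.1]])
obs = [0, 0, 1, 2, 2, 1, 0, 0, 2, 2]

likelihoods = []
for _ in range(10):
    pi, A, B, lik = baum_welch_step(obs, pi, A, B)
    likelihoods.append(lik)
print(likelihoods)
```

Each EM iteration cannot decrease the likelihood, so the printed values should be non-decreasing. The algorithm converges to a local maximum, so the result depends on the starting guess.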
HMM Algorithm
Step 1 Define State and Observation Space
Hidden States: Possible internal states the system can be in.
Observations: Visible outputs generated by hidden states.
Step 2 Define Initial State Distribution ($\pi$): Set probabilities of starting in each hidden state.
Step 3 Define Transition Matrix (A): Set probabilities for moving from one hidden state to another.
Step 4 Define Emission Matrix (B): Set probabilities for each observation being emitted by a hidden state.
Step 5 Train Model: Estimate model parameters using:
Baum–Welch Algorithm: Trains the HMM by estimating A, B, and $\pi$.
Forward–Backward Algorithm: Efficiently computes the probability of each hidden state at each time step, which Baum–Welch uses in its expectation step.
Step 6 Decode Hidden States: Determine the most likely hidden state sequence using the Viterbi algorithm.
Step 7 Evaluate Model Performance: When the true hidden states are known for some sequences, compare them with the decoded states using metrics such as accuracy, precision, and recall.
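The decoding and evaluation steps can be sketched together on simulated data, where the true hidden states are known. Everything here is an assumption for illustration: the parameter values, the sequence length, the random seed, and the helper names `sample_hmm`, `viterbi_log`, and `per_state_metrics`. The decoder works in log space to avoid underflow on the longer sequence.

```python
import numpy as np

def sample_hmm(T, pi, A, B, rng):
    """Draw a hidden state sequence and its observations from the model."""
    states = np.zeros(T, dtype=int)
    obs = np.zeros(T, dtype=int)
    states[0] = rng.choice(len(pi), p=pi)
    obs[0] = rng.choice(B.shape[1], p=B[states[0]])
    for t in range(1, T):
        states[t] = rng.choice(len(pi), p=A[states[t - 1]])
        obs[t] = rng.choice(B.shape[1], p=B[states[t]])
    return states, obs

def viterbi_log(obs, pi, A, B):
    """Viterbi decoding with log probabilities (assumes all parameters are > 0)."""
    T, N = len(obs), len(pi)
    log_A, log_B = np.log(A), np.log(B)
    delta = np.log(pi) + log_B[:, obs[0]]
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A           # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = psi[t, path[t]]
    return path

def per_state_metrics(true, pred, state):
    """Precision and recall for one hidden state, treated as the positive class."""
    tp = np.sum((pred == state) & (true == state))
    fp = np.sum((pred == state) & (true != state))
    fn = np.sum((pred != state) & (true == state))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(precision), float(recall)

# Illustrative parameters (assumed values, not from the source).
pi = np.array([0.3, 0.3, 0.4])
A = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]])
B = np.array([[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.7, 0.2, 0.1]])

rng = np.random.default_rng(0)
true_states, obs = sample_hmm(200, pi, A, B, rng)
decoded = viterbi_log(obs, pi, A, B)

accuracy = float(np.mean(decoded == true_states))
print("accuracy:", accuracy)
for s, name in enumerate(["Rainy", "Cloudy", "Sunny"]):
    print(name, per_state_metrics(true_states, decoded, s))
```

On real data the true hidden states are often unavailable, in which case these metrics cannot be computed directly.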