BBN: Bayesian Belief Networks – How to Build Them Effectively in Python?
A detailed explanation of Bayesian Belief Networks using real-life data to build a model in Python
Intro
Most of you may already be familiar with the Naive Bayes algorithm, a fast and simple modeling technique used in classification problems. While it is used widely due to its speed and relatively good performance, Naive Bayes is built on the assumption that all variables (model features) are independent, which in reality is often not true.
In some cases, you may want to build a model where you can specify which variables are dependent, independent, or conditionally independent (this is explained in the next section). You may also want to track in real time how event probabilities change as new evidence is introduced to the model.
This is where the Bayesian Belief Networks come in handy as they allow you to construct a model with nodes and directed edges by clearly outlining the relationships between variables.
Contents
- The category of algorithms Bayesian Belief Networks (BBN) belong to
- Introduction to Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)
- Bayesian Belief Network Python example using real-life data
  - Directed Acyclic Graph for weather prediction
  - Data and Python library setup
  - BBN setup
  - Using BBN for predictions
- Conclusions
What category of algorithms do Bayesian Belief Networks (BBN) belong to?
Technically there is no training happening within BBN. We simply define how different nodes in the network are linked together. Then we observe how the probabilities change after passing some evidence into specific nodes. Hence, I have put Probabilistic Graphical Models into their own category (see below).
Side note, I have put Neural Networks in a category of their own due to their unique approach to Machine Learning. However, they can be used to solve a wide range of problems, including but not limited to classification and regression.
Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)
A Bayesian Belief Network (BBN) is a Probabilistic Graphical Model (PGM) that represents a set of variables and their conditional dependencies via a Directed Acyclic Graph (DAG).
To understand what this means, let’s draw a DAG and analyze the relationship between different nodes. The DAG has four nodes: A points to B, and both B and C point to D.
Using the above, we can state the relationship between variables (nodes):
- Independence: A and C are independent of each other. So are B and C. This is because knowing whether C has happened does not change our knowledge about A or B and vice versa.
- Dependence: B is dependent on A since A is the parent of B. This relationship can be written as a conditional probability: P(B|A). D is also dependent on other variables, and in this case, it depends on two of them: B and C. Again, this can be written as a conditional probability: P(D|B,C).
- Conditional Independence: D is considered conditionally independent of A. This is because as soon as we know whether event B has happened, A becomes irrelevant from the perspective of D. In other words, the following is true: P(D|B,A) = P(D|B).
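To make the conditional independence statement concrete, here is a minimal sketch that checks it numerically. It assumes the DAG described above (A is the parent of B, and B and C are the parents of D). All probability values are made up for illustration, and the helper names joint and prob are my own.

```python
from itertools import product

# Hypothetical CPTs for the DAG A -> B -> D <- C (numbers are invented for illustration)
p_a = {True: 0.3, False: 0.7}                      # P(A)
p_c = {True: 0.6, False: 0.4}                      # P(C)
p_b_given_a = {True: 0.9, False: 0.2}              # P(B=True | A)
p_d_given_bc = {(True, True): 0.95, (True, False): 0.7,
                (False, True): 0.4, (False, False): 0.05}  # P(D=True | B, C)

def joint(a, b, c, d):
    """P(A, B, C, D) as the product of each node's CPT entry."""
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    pd_ = p_d_given_bc[(b, c)] if d else 1 - p_d_given_bc[(b, c)]
    return p_a[a] * pb * p_c[c] * pd_

def prob(query, given):
    """P(query | given), summing the joint over every consistent assignment."""
    num = den = 0.0
    for a, b, c, d in product([True, False], repeat=4):
        world = {"A": a, "B": b, "C": c, "D": d}
        if all(world[k] == v for k, v in given.items()):
            p = joint(a, b, c, d)
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
    return num / den

p_d_b = prob({"D": True}, {"B": True})               # P(D | B)
p_d_ba = prob({"D": True}, {"B": True, "A": True})   # P(D | B, A)
p_d_a = prob({"D": True}, {"A": True})               # P(D | A), B unknown
print(p_d_b, p_d_ba, p_d_a)
```

The first two numbers agree, so once B is known, A adds nothing. The third differs from them, which shows that A still matters to D for as long as B is unobserved.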
Bayesian Belief Network Python example using real-life data
Directed Acyclic Graph for weather prediction
Let’s use Australian weather data to build a BBN. This will enable us to predict if it will rain tomorrow based on a few weather observations from today.
First, let’s take a look at a DAG before we go through the details of how to build it. Note, I have displayed probabilities for all the different event combinations. You will see how we calculate these using our weather data in the following few sections.
Data and Python library setup
We will use the following data and libraries:
- Australian weather data from Kaggle
- PyBBN for creating Bayesian Belief Networks
- Pandas for data manipulation
- NetworkX and Matplotlib for drawing graphs
Let’s import all the libraries:
import pandas as pd # for data manipulation
import networkx as nx # for drawing graphs
import matplotlib.pyplot as plt # for drawing graphs
# for creating Bayesian Belief Networks (BBN)
from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.jointree import EvidenceBuilder
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController
Then we get the Australian weather data from Kaggle, which you can download following this link: https://www.kaggle.com/jsphyg/weather-dataset-rattle-package.
We ingest the data and derive a few new variables for usage in the model.
# Set Pandas options to display more columns
pd.options.display.max_columns=50
# Read in the weather data csv
df=pd.read_csv('weatherAUS.csv', encoding='utf-8')
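# Drop records where target RainTomorrow=NaN
df=df[df['RainTomorrow'].notna()]
# For other numeric columns with missing values, fill them in with column mean
# (numeric_only=True is needed on recent Pandas versions, where df.mean() fails on text columns)
df=df.fillna(df.mean(numeric_only=True))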
# Create bands for variables that we want to use in the model
df['WindGustSpeedCat']=df['WindGustSpeed'].apply(lambda x: '0.<=40' if x<=40 else
'1.40-50' if 40<x<=50 else '2.>50')
df['Humidity9amCat']=df['Humidity9am'].apply(lambda x: '1.>60' if x>60 else '0.<=60')
df['Humidity3pmCat']=df['Humidity3pm'].apply(lambda x: '1.>60' if x>60 else '0.<=60')
# Show a snapshot of data
df
Here is a snapshot of the data:
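A quick way to sanity-check the banding rules is to run them on boundary values. The sketch below restates the same conditions as the lambdas above in two standalone functions (the names wind_band and humidity_band are mine, not part of the original code). It also shows why filling missing values before banding matters: every comparison with NaN is False, so a NaN would quietly fall into the last branch.

```python
import math

def wind_band(x):
    """Same rule as the lambda used for WindGustSpeedCat."""
    return '0.<=40' if x <= 40 else '1.40-50' if 40 < x <= 50 else '2.>50'

def humidity_band(x):
    """Same rule as the lambdas used for Humidity9amCat and Humidity3pmCat."""
    return '1.>60' if x > 60 else '0.<=60'

# Values right at and just above the band edges
edge_bands = [wind_band(v) for v in (40, 40.1, 50, 50.1)]
print(edge_bands)

# NaN fails every comparison, so it ends up in whichever branch is the fallback
nan_wind = wind_band(math.nan)
nan_humidity = humidity_band(math.nan)
print(nan_wind, nan_humidity)
```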
Setting up Bayesian Belief Network
Now that we have all the libraries and data ready, it is time to set up a BBN. The first stage requires us to define nodes.
# Create nodes by manually typing in probabilities
H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), [0.30658, 0.69342])
H3pm = BbnNode(Variable(1, 'H3pm', ['<=60', '>60']), [0.92827, 0.07173,
0.55760, 0.44240])
W = BbnNode(Variable(2, 'W', ['<=40', '40-50', '>50']), [0.58660, 0.24040, 0.17300])
RT = BbnNode(Variable(3, 'RT', ['No', 'Yes']), [0.92314, 0.07686,
0.89072, 0.10928,
0.76008, 0.23992,
0.64250, 0.35750,
0.49168, 0.50832,
0.32182, 0.67818])
A few things to note:
- Probabilities here are normalized frequencies of the variable categories from the data. E.g., the "H9am" variable has 43,594 observations where the value is ≤60 and 98,599 observations where the value is >60.
- While I have used normalized frequencies (probabilities), it also works if you put actual frequencies instead. In that case, your code would look like this:
H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), [43594, 98599])
- For child nodes, like "Humidity3pmCat", which has a parent "Humidity9amCat", we need to provide probabilities (or frequencies) for each combination as shown in the DAG (note that each row adds up to 1).
- You can do this by calculating probabilities/frequencies of "H3pm" twice – the first time by taking a subset of data where "H9am"≤60 and the second time by taking a subset of data where "H9am">60.
- Since calculating frequencies one at a time is time-consuming, I have written a short function that gives us what we need.
# This function calculates the probability distribution that goes into the BBN (note: it handles up to 2 parents)
def probs(data, child, parent1=None, parent2=None):
    if parent1 is None:
        # No parents: overall distribution of the child variable
        prob=pd.crosstab(data[child], 'Empty', margins=False, normalize='columns').sort_index().to_numpy().reshape(-1).tolist()
    elif parent2 is None:
        # One parent: distribution of the child for each parent category
        prob=pd.crosstab(data[parent1],data[child], margins=False, normalize='index').sort_index().to_numpy().reshape(-1).tolist()
    else:
        # Two parents: distribution of the child for each combination of parent categories
        prob=pd.crosstab([data[parent1],data[parent2]],data[child], margins=False, normalize='index').sort_index().to_numpy().reshape(-1).tolist()
    return prob
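If the flattened list layout feels opaque, the following pure-Python sketch shows the same idea without Pandas. The helper names normalise and flat_cpt are my own, and the toy (parent, child) pairs are invented; only the two "H9am" counts come from the data described earlier.

```python
from collections import Counter

def normalise(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def flat_cpt(pairs):
    """Row-normalised P(child | parent), flattened one parent category at a time in sorted order."""
    counts = Counter(pairs)
    parents = sorted({p for p, _ in pairs})
    children = sorted({c for _, c in pairs})
    flat = []
    for p in parents:
        flat.extend(normalise([counts[(p, c)] for c in children]))
    return flat

# Root node: the H9am counts quoted above give the probabilities typed in manually
h9am = [round(p, 5) for p in normalise([43594, 98599])]
print(h9am)

# Child node with one parent: invented (parent, child) observations
toy_pairs = ([('0.<=60', '0.<=60')] * 9 + [('0.<=60', '1.>60')] * 1 +
             [('1.>60', '0.<=60')] * 3 + [('1.>60', '1.>60')] * 2)
toy_cpt = flat_cpt(toy_pairs)
print(toy_cpt)
```

Each consecutive group of values in the flattened list corresponds to one parent category and sums to 1, which is the layout used for the child nodes in the manually typed version.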
So, instead of manually typing in all the probabilities, let’s use the probs function defined above. At the same time, we will create an actual network:
# Create nodes by using our earlier function to automatically calculate probabilities
H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), probs(df, child='Humidity9amCat'))
H3pm = BbnNode(Variable(1, 'H3pm', ['<=60', '>60']), probs(df, child='Humidity3pmCat', parent1='Humidity9amCat'))
W = BbnNode(Variable(2, 'W', ['<=40', '40-50', '>50']), probs(df, child='WindGustSpeedCat'))
RT = BbnNode(Variable(3, 'RT', ['No', 'Yes']), probs(df, child='RainTomorrow', parent1='Humidity3pmCat', parent2='WindGustSpeedCat'))
# Create Network
bbn = (Bbn()
    .add_node(H9am)
    .add_node(H3pm)
    .add_node(W)
    .add_node(RT)
    .add_edge(Edge(H9am, H3pm, EdgeType.DIRECTED))
    .add_edge(Edge(H3pm, RT, EdgeType.DIRECTED))
    .add_edge(Edge(W, RT, EdgeType.DIRECTED)))
# Convert the BBN to a join tree
join_tree = InferenceController.apply(bbn)
Note, if you are working with a small data sample, there is a risk of some event combinations not being present. In such a scenario, you would get a "list index out of range" error. A solution could be to expand your data to include all possible event combinations, or to identify missing combinations and add them in.
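One possible way to identify missing combinations before building the nodes is sketched below. The function name missing_combinations and the small sample are hypothetical. With a DataFrame, you could build the observed pairs by zipping the two parent columns.

```python
from itertools import product

def missing_combinations(observed, *state_lists):
    """Return the combinations of parent categories that never occur in `observed`."""
    seen = set(observed)
    return [combo for combo in product(*state_lists) if combo not in seen]

h3pm_states = ['0.<=60', '1.>60']
wind_states = ['0.<=40', '1.40-50', '2.>50']

# Hypothetical small sample: one (Humidity3pmCat, WindGustSpeedCat) pair per record
sample = [('0.<=60', '0.<=40'), ('0.<=60', '1.40-50'), ('1.>60', '0.<=40'),
          ('1.>60', '2.>50'), ('0.<=60', '0.<=40')]

gaps = missing_combinations(sample, h3pm_states, wind_states)
print(gaps)
```

Any combination returned here has no records at all, so there is nothing to estimate a conditional distribution from, and you would need to decide how to fill it in.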
Now, we want to draw the graph to check that we have set it up as intended:
# Set node positions
pos = {0: (-1, 2), 1: (-1, 0.5), 2: (1, 0.5), 3: (0, -1)}
# Set options for graph looks
options = {
"font_size": 16,
"node_size": 4000,
"node_color": "white",
"edgecolors": "black",
"edge_color": "red",
"linewidths": 5,
"width": 5,}
# Generate graph
n, d = bbn.to_nx_graph()
nx.draw(n, with_labels=True, labels=d, pos=pos, **options)
# Update margins and print the graph
ax = plt.gca()
ax.margins(0.10)
plt.axis("off")
plt.show()
Here is the resulting graph, which matches our intended design:
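If you prefer a structural check that does not rely on a plot, a minimal sketch using only the standard library (Python 3.9+) is shown below. The parents dictionary simply restates the three edges added to the network, and the function name topological_order is my own.

```python
from graphlib import TopologicalSorter, CycleError

# Each key maps a node to its parents, mirroring the three edges added to the BBN
parents = {'H9am': [], 'W': [], 'H3pm': ['H9am'], 'RT': ['H3pm', 'W']}

def topological_order(parent_map):
    """Return a parents-before-children ordering; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(parent_map).static_order())

order = topological_order(parents)
print(order)

# Adding a hypothetical edge RT -> H9am would close a loop, so the graph stops being a DAG
cyclic = dict(parents, H9am=['RT'])
try:
    topological_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```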
Using BBN for predictions
With our model being ready, we can use it to predict whether it will rain tomorrow.
First, let’s print the marginal probabilities for each node without passing any additional information to the graph. Note, I have set up a simple function so we don’t have to retype the same code later on, as we will want to regenerate the results multiple times.
# Define a function for printing marginal probabilities
def print_probs():
    for node in join_tree.get_bbn_nodes():
        potential = join_tree.get_bbn_potential(node)
        print("Node:", node)
        print("Values:")
        print(potential)
        print('----------------')
# Use the above function to print marginal probabilities
print_probs()
The above code prints the following:
As you can see, this gives us the likelihood of each event occurring with a "Rain Tomorrow (RT)" probability of 22%. While this is cool, we could have got the same 22% probability by looking at the frequency of the "RainTomorrow" variable in our original dataset.
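For intuition, the no-evidence marginals can also be reproduced by hand from the conditional probability tables. The sketch below is not how PyBBN computes them internally. It simply sums over the parent states, using the probabilities from the manually typed node definitions earlier in the article.

```python
from itertools import product

# Values copied from the manually typed node definitions
p_h9am = [0.30658, 0.69342]                        # P(H9am): <=60, >60
p_h3pm = [[0.92827, 0.07173], [0.55760, 0.44240]]  # P(H3pm | H9am), one row per H9am category
p_w = [0.58660, 0.24040, 0.17300]                  # P(W): <=40, 40-50, >50
p_rain = [[0.07686, 0.10928, 0.23992],             # P(RT=Yes | H3pm<=60, W)
          [0.35750, 0.50832, 0.67818]]             # P(RT=Yes | H3pm>60, W)

# Marginal of H3pm: sum over the H9am categories
m_h3pm = [sum(p_h9am[i] * p_h3pm[i][j] for i in range(2)) for j in range(2)]

# Marginal of RT=Yes: sum over every (H3pm, W) combination
m_rain = sum(m_h3pm[j] * p_w[k] * p_rain[j][k]
             for j, k in product(range(2), range(3)))

print(round(m_h3pm[1], 3), round(m_rain, 3))
```

The second value lands at roughly 22%, in line with the "Rain Tomorrow (RT)" figure discussed above.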
That said, the following step is where we get a lot of value out of our BBN. We can pass evidence into the BBN and see how that affects probabilities for every node in the network.
Let’s say it is 9 am right now, and we have measured the humidity outside. It says 72, which obviously belongs to the ">60" band. Hence, let’s pass this evidence into the BBN and see what happens. Note, I have created another small function to help us with that.
# To add evidence of events that happened so probability distribution can be recalculated
def evidence(ev, nod, cat, val):
    # Note: the ev argument is only a label; it is replaced by the evidence object built below
    ev = (EvidenceBuilder()
        .with_node(join_tree.get_bbn_node_by_name(nod))
        .with_evidence(cat, val)
        .build())
    join_tree.set_observation(ev)
# Use above function to add evidence
evidence('ev1', 'H9am', '>60', 1.0)
# Print marginal probabilities
print_probs()
This gives us the following results:
As you can see, "Humidity9am>60" is now equal to 100%, and the likelihood of "Humidity3pm>60" has increased from 32.8% to 44.2%. At the same time, the chance of "RainTomorrow" has gone up to 26.1%.
Also, note how probabilities for "WindGustSpeed" did not change since "W" and "H9am" are independent of each other.
You can run the same evidence code one more time to remove the evidence from the network. After that, let’s pass two pieces of evidence for "H3pm" and "W."
# Add more evidence
evidence('ev1', 'H3pm', '>60', 1.0)
evidence('ev2', 'W', '>50', 1.0)
# Print marginal probabilities
print_probs()
And here are the results:
Unsurprisingly, this tells us that the chance of rain tomorrow has gone up to 67.8%. Note how "H9am" probabilities also changed, which tells us that despite us only measuring humidity at 3 pm, we are 93% certain that humidity was also above 60 at 9 am this morning.
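To see where these numbers come from, here is a minimal sketch of inference by brute-force enumeration. It builds the joint probability of every combination, keeps only those consistent with the evidence, and renormalizes. PyBBN uses a join tree rather than this approach, and the function name posterior is my own. The probabilities are the ones typed in manually earlier in the article.

```python
from itertools import product

p_h9am = [0.30658, 0.69342]                        # P(H9am): <=60, >60
p_h3pm = [[0.92827, 0.07173], [0.55760, 0.44240]]  # P(H3pm | H9am)
p_w = [0.58660, 0.24040, 0.17300]                  # P(W): <=40, 40-50, >50
p_rain_yes = [[0.07686, 0.10928, 0.23992],         # P(RT=Yes | H3pm, W)
              [0.35750, 0.50832, 0.67818]]

def posterior(target, evidence):
    """Distribution of `target` given `evidence` (a dict of variable name -> category index)."""
    sizes = {'H9am': 2, 'H3pm': 2, 'W': 3, 'RT': 2}
    names = list(sizes)
    totals = [0.0] * sizes[target]
    for states in product(*(range(sizes[n]) for n in names)):
        world = dict(zip(names, states))
        # Skip combinations that contradict the evidence
        if any(world[k] != v for k, v in evidence.items()):
            continue
        h9, h3, w, rt = states
        yes = p_rain_yes[h3][w]
        p = p_h9am[h9] * p_h3pm[h9][h3] * p_w[w] * (yes if rt == 1 else 1 - yes)
        totals[world[target]] += p
    z = sum(totals)
    return [t / z for t in totals]

ev = {'H3pm': 1, 'W': 2}           # H3pm > 60 and W > 50
rain = posterior('RT', ev)         # chance of rain tomorrow
h9am = posterior('H9am', ev)       # what the evidence implies about the morning
print(round(rain[1], 3), round(h9am[1], 3))
```

The same function can reproduce the earlier single-evidence case, for example posterior('H3pm', {'H9am': 1}). Enumeration like this only scales to tiny networks, which is one reason libraries rely on smarter algorithms.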
Conclusions
There are many use cases for Bayesian Belief Networks, from helping to diagnose diseases to real-time predictions of a race outcome.
You can also build BBNs to help you with marketing decisions. Say, I may want to know how likely this article is to reach 10K views. Hence, I can build a BBN to tell me the probability of certain events occurring, such as posting a link to this article on Twitter and then evaluating how this probability changes as I get ten retweets.
At the end of the day, the possibilities are almost limitless, with the ability to generate real-time predictions that automatically update the entire network as soon as new evidence is introduced.
I hope you found Bayesian Belief Networks just as exciting as I did. Feel free to reach out if you have any questions or suggestions. Thanks for reading!
Cheers! 👏 Saul Dobilas
If you would like to continue with a Bayesian theme, you can check out the below article on Naive Bayes Classifier.
Naive Bayes Classifier – How to Successfully Use It in Python?