BBN: Bayesian Belief Networks – How to Build Them Effectively in Python?

A detailed explanation of Bayesian Belief Networks using real-life data to build a model in Python

Saul Dobilas

Apr 6, 2021

Image by Gerd Altmann from Pixabay

Intro

Most of you may already be familiar with the Naive Bayes algorithm, a fast and simple modeling technique used in classification problems. While it is widely used due to its speed and relatively good performance, Naive Bayes is built on the assumption that all variables (model features) are independent of one another given the class, which in reality is often not true.

In some cases, you may want to build a model where you can specify which variables are dependent, independent, or conditionally independent (this is explained in the next section). You may also want to track in real time how event probabilities change as new evidence is introduced to the model.

This is where Bayesian Belief Networks come in handy: they let you construct a model with nodes and directed edges that clearly outline the relationships between variables.

The category of algorithms Bayesian Belief Networks (BBN) belong to
Introduction to Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)
Bayesian Belief Network Python example using real-life data
Directed Acyclic Graph for weather prediction
Data and Python library setup
BBN setup
Using BBN for predictions
Conclusions

What category of algorithms do Bayesian Belief Networks (BBN) belong to?

Technically, there is no training happening within a BBN. We simply define how different nodes in the network are linked together. Then we observe how the probabilities change after passing some evidence into specific nodes. Hence, I have put Probabilistic Graphical Models into their own category.

As a side note, I have also put Neural Networks in a category of their own due to their unique approach to Machine Learning. However, they can be used to solve a wide range of problems, including but not limited to classification and regression.

Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)

A Bayesian Belief Network (BBN) is a Probabilistic Graphical Model (PGM) that represents a set of variables and their conditional dependencies via a Directed Acyclic Graph (DAG).

To understand what this means, let’s draw a DAG and analyze the relationship between different nodes.

Directed Acyclic Graph (DAG). Image by author.

Using the above, we can state the relationship between variables (nodes):

Independence: A and C are independent of each other. So are B and C. This is because knowing whether C has happened does not change our knowledge about A or B and vice versa.
Dependence: B is dependent on A since A is the parent of B. This relationship can be written as a conditional probability: P(B|A). D is also dependent on other variables, and in this case, it depends on two of them – B and C. Again, this can be written as a conditional probability: P(D|B,C).
Conditional Independence: D is conditionally independent of A given B. This is because as soon as we know whether event B has happened, A becomes irrelevant from the perspective of D. In other words, the following is true: P(D|B,A) = P(D|B).
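
To make these three relationships concrete, here is a minimal sketch that builds the joint distribution for the DAG above (A → B → D ← C) and checks the statements numerically. All probability values are made up for illustration and are not taken from any dataset; only the graph structure comes from the text.

```python
from itertools import product

# Hypothetical CPTs for the DAG A -> B -> D <- C (all events are True/False).
p_a = {True: 0.3, False: 0.7}                      # P(A)
p_c = {True: 0.6, False: 0.4}                      # P(C)
p_b_given_a = {True: 0.9, False: 0.2}              # P(B=True | A)
p_d_given_bc = {(True, True): 0.95, (True, False): 0.7,
                (False, True): 0.4, (False, False): 0.05}  # P(D=True | B, C)


def bern(p_true, value):
    """Probability of a binary value, given the probability of True."""
    return p_true if value else 1.0 - p_true


# Joint distribution from the DAG factorization P(A) P(B|A) P(C) P(D|B,C).
joint = {}
for a, b, c, d in product([True, False], repeat=4):
    joint[(a, b, c, d)] = (p_a[a] * bern(p_b_given_a[a], b)
                           * p_c[c] * bern(p_d_given_bc[(b, c)], d))

NAMES = ("a", "b", "c", "d")


def prob(**fixed):
    """Sum the joint over all outcomes consistent with the fixed values."""
    return sum(p for outcome, p in joint.items()
               if all(outcome[NAMES.index(k)] == v for k, v in fixed.items()))


# Independence: P(A, C) equals P(A) * P(C).
indep_gap = abs(prob(a=True, c=True) - prob(a=True) * prob(c=True))

# Dependence: P(B | A) differs from P(B).
p_b_given_a_true = prob(b=True, a=True) / prob(a=True)
p_b = prob(b=True)

# Conditional independence: P(D | B, A) equals P(D | B).
p_d_given_b_a = prob(d=True, b=True, a=True) / prob(b=True, a=True)
p_d_given_b = prob(d=True, b=True) / prob(b=True)
cond_gap = abs(p_d_given_b_a - p_d_given_b)

print("A and C independent:", indep_gap < 1e-12)
print("B depends on A:", abs(p_b_given_a_true - p_b) > 1e-6)
print("D independent of A given B:", cond_gap < 1e-12)
```

Changing the made-up numbers should not change the independence results, since they follow from the structure of the graph rather than from the specific values.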

Bayesian Belief Network Python example using real-life data

Directed Acyclic Graph for weather prediction

Let’s use Australian weather data to build a BBN. This will enable us to predict if it will rain tomorrow based on a few weather observations from today.

First, let’s take a look at a DAG before we go through the details of how to build it. Note, I have displayed probabilities for all the different event combinations. You will see how we calculate these using our weather data in the following few sections.

Directed Acyclic Graph (DAG) for a Bayesian Belief Network (BBN) to forecast whether it will rain tomorrow. Image by author.

Data and Python library setup

We will use the following data and libraries:

Australian weather data from Kaggle
PyBBN for creating Bayesian Belief Networks
Pandas for data manipulation
NetworkX and Matplotlib for drawing graphs

Let’s import all the libraries:

import pandas as pd # for data manipulation 
import networkx as nx # for drawing graphs
import matplotlib.pyplot as plt # for drawing graphs

# for creating Bayesian Belief Networks (BBN)
from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.jointree import EvidenceBuilder
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

Then we get the Australian weather data from Kaggle, which you can download following this link: https://www.kaggle.com/jsphyg/weather-dataset-rattle-package.

We ingest the data and derive a few new variables for usage in the model.

# Set Pandas options to display more columns
pd.options.display.max_columns=50

# Read in the weather data csv
df=pd.read_csv('weatherAUS.csv', encoding='utf-8')

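# Drop records where target RainTomorrow=NaN
df=df[df['RainTomorrow'].notna()]

# For other numeric columns with missing values, fill them in with the column mean
# (numeric_only=True is required in recent pandas versions, where the mean of text columns raises an error)
df=df.fillna(df.mean(numeric_only=True))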

# Create bands for variables that we want to use in the model
df['WindGustSpeedCat']=df['WindGustSpeed'].apply(lambda x: '0.<=40' if x<=40 else
 '1.40-50' if 40<x<=50 else '2.>50')
df['Humidity9amCat']=df['Humidity9am'].apply(lambda x: '1.>60' if x>60 else '0.<=60')
df['Humidity3pmCat']=df['Humidity3pm'].apply(lambda x: '1.>60' if x>60 else '0.<=60')

# Show a snapshot of data
df

Here is a snapshot of the data:

A snippet of Kaggle’s Australian weather data with some modifications. Image by the author.

Setting up Bayesian Belief Network

Now that we have all the libraries and data ready, it is time to set up a BBN. The first stage requires us to define nodes.

# Create nodes by manually typing in probabilities
H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), [0.30658, 0.69342])
H3pm = BbnNode(Variable(1, 'H3pm', ['<=60', '>60']), [0.92827, 0.07173, 
 0.55760, 0.44240])
W = BbnNode(Variable(2, 'W', ['<=40', '40-50', '>50']), [0.58660, 0.24040, 0.17300])
RT = BbnNode(Variable(3, 'RT', ['No', 'Yes']), [0.92314, 0.07686, 
 0.89072, 0.10928, 
 0.76008, 0.23992, 
 0.64250, 0.35750, 
 0.49168, 0.50832, 
 0.32182, 0.67818])
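
When typing tables like the ones above by hand, a quick sanity check can catch typos before the network is built. The sketch below is plain Python rather than part of PyBBN, and the helper names `cpt_rows` and `rows_sum_to_one` are my own. It splits each flat list into rows (one value per child state) and confirms that every row sums to 1.

```python
def cpt_rows(values, n_states):
    """Split a flat probability list into rows of n_states values each."""
    if len(values) % n_states != 0:
        raise ValueError("length is not a multiple of the number of states")
    return [values[i:i + n_states] for i in range(0, len(values), n_states)]


def rows_sum_to_one(values, n_states, tol=1e-6):
    """Return True if every row of the flat table sums to 1 within tol."""
    return all(abs(sum(row) - 1.0) <= tol for row in cpt_rows(values, n_states))


# The same numbers as in the manually defined nodes.
h9am = [0.30658, 0.69342]
h3pm = [0.92827, 0.07173,
        0.55760, 0.44240]
w = [0.58660, 0.24040, 0.17300]
rt = [0.92314, 0.07686,
      0.89072, 0.10928,
      0.76008, 0.23992,
      0.64250, 0.35750,
      0.49168, 0.50832,
      0.32182, 0.67818]

for name, values, n_states in [("H9am", h9am, 2), ("H3pm", h3pm, 2),
                               ("W", w, 3), ("RT", rt, 2)]:
    print(name, "rows:", len(cpt_rows(values, n_states)),
          "valid:", rows_sum_to_one(values, n_states))
```

The row counts also double as a structure check: "RT" has two parents with 2 and 3 categories, so it should have 2 × 3 = 6 rows.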

A few things to note:

Probabilities here are normalized frequencies of the variable categories from the data. E.g., the "H9am" variable has 43,594 observations where the value is ≤60 and 98,599 observations where the value is >60.

Variable value counts. Image by author.

While I have used normalized frequencies (probabilities), it also works if you put actual frequencies instead. In that case, your code would look like this: H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), [43594, 98599]).
For child nodes, like "Humidity3pmCat", which has a parent "Humidity9amCat", we need to provide probabilities (or frequencies) for each combination as shown in the DAG (note each row adds up to 1):

"Humidity3pmCat" normalized frequencies (probabilities). Image by author.

You can do this by calculating probabilities/frequencies of "H3pm" twice – the first time by taking a subset of data where "H9am"≤60 and the second time by taking a subset of data where "H9am">60.
Since calculating frequencies one at a time is time-consuming, I have written a short function that gives us what we need.

# This function calculates the probability distribution that goes into a BBN node (note, it can handle up to 2 parents)
def probs(data, child, parent1=None, parent2=None):
    if parent1 is None:
        # No parents: marginal distribution of the child
        table = pd.crosstab(data[child], 'Empty', margins=False, normalize='columns')
    elif parent2 is None:
        # One parent: distribution of the child within each parent category
        table = pd.crosstab(data[parent1], data[child], margins=False, normalize='index')
    else:
        # Two parents: distribution of the child within each combination of parent categories
        table = pd.crosstab([data[parent1], data[parent2]], data[child], margins=False, normalize='index')
    # Flatten the table row by row into the list format used for the nodes
    return table.sort_index().to_numpy().reshape(-1).tolist()
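
If it helps to see what this function computes without pandas, here is a rough pure-Python equivalent run on a tiny made-up dataset. The name `flat_cpt` and the eight records are illustrative assumptions, not part of the article's data. The output order matches the tables shown earlier: one row per parent category (in sorted order), with the child categories varying fastest.

```python
from collections import Counter
from itertools import product


def flat_cpt(records, child, parents=()):
    """Return P(child | parents) as a flat list.

    Parent combinations are visited in sorted order and the child states
    (also sorted) vary fastest. A parent combination that never occurs in
    the data would cause a ZeroDivisionError in this simple sketch.
    """
    child_states = sorted({r[child] for r in records})
    parent_states = [sorted({r[p] for r in records}) for p in parents]
    counts = Counter((tuple(r[p] for p in parents), r[child]) for r in records)
    flat = []
    for combo in product(*parent_states):
        total = sum(counts[(combo, s)] for s in child_states)
        flat.extend(counts[(combo, s)] / total for s in child_states)
    return flat


# Tiny hypothetical sample: 4 dry mornings and 4 humid mornings.
records = (
    [{"H9am": "0.<=60", "H3pm": "0.<=60"}] * 3
    + [{"H9am": "0.<=60", "H3pm": "1.>60"}] * 1
    + [{"H9am": "1.>60", "H3pm": "0.<=60"}] * 2
    + [{"H9am": "1.>60", "H3pm": "1.>60"}] * 2
)

marginal = flat_cpt(records, "H9am")
conditional = flat_cpt(records, "H3pm", parents=("H9am",))
print(marginal)     # P(H9am)
print(conditional)  # P(H3pm | H9am), one row per H9am category
```

The category labels carry the "0." and "1." prefixes for the same reason as in the banding code earlier: sorting the labels then gives the intended order.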

So, instead of manually typing in all the probabilities, let’s use the above function. At the same time, we will create an actual network:

# Create nodes by using our earlier function to automatically calculate probabilities
H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), probs(df, child='Humidity9amCat'))
H3pm = BbnNode(Variable(1, 'H3pm', ['<=60', '>60']), probs(df, child='Humidity3pmCat', parent1='Humidity9amCat'))
W = BbnNode(Variable(2, 'W', ['<=40', '40-50', '>50']), probs(df, child='WindGustSpeedCat'))
RT = BbnNode(Variable(3, 'RT', ['No', 'Yes']), probs(df, child='RainTomorrow', parent1='Humidity3pmCat', parent2='WindGustSpeedCat'))

# Create Network
bbn = Bbn() \
    .add_node(H9am) \
    .add_node(H3pm) \
    .add_node(W) \
    .add_node(RT) \
    .add_edge(Edge(H9am, H3pm, EdgeType.DIRECTED)) \
    .add_edge(Edge(H3pm, RT, EdgeType.DIRECTED)) \
    .add_edge(Edge(W, RT, EdgeType.DIRECTED))

# Convert the BBN to a join tree
join_tree = InferenceController.apply(bbn)

Note, if you are working with a small data sample, there is a risk of some event combinations not being present. In such a scenario, you would get a "list index out of range" error. A solution could be to expand your data to include all possible event combinations, or to identify missing combinations and add them in.
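
One possible way to guarantee that every combination is present is sketched below. It uses additive (Laplace) smoothing, which is a technique I am swapping in for illustration and not something the code in this article does: a small pseudo-count is added to every cell, so combinations missing from the data still get a row. The function name `smoothed_cpt`, the `states` dictionary, and the three records are all hypothetical.

```python
from collections import Counter
from itertools import product


def smoothed_cpt(records, child, parents, states, alpha=1.0):
    """Flat P(child | parents) with a pseudo-count alpha added to every cell.

    `states` lists every category of every variable explicitly, so parent
    combinations that never appear in the data still produce a row.
    """
    counts = Counter((tuple(r[p] for p in parents), r[child]) for r in records)
    flat = []
    for combo in product(*(states[p] for p in parents)):
        row = [counts[(combo, s)] + alpha for s in states[child]]
        total = sum(row)
        flat.extend(value / total for value in row)
    return flat


# All categories are declared up front rather than inferred from the data.
states = {
    "H3pm": ["<=60", ">60"],
    "W": ["<=40", "40-50", ">50"],
    "RT": ["No", "Yes"],
}

# A deliberately tiny sample in which most (H3pm, W) combinations are missing.
records = [
    {"H3pm": "<=60", "W": "<=40", "RT": "No"},
    {"H3pm": "<=60", "W": "<=40", "RT": "No"},
    {"H3pm": ">60", "W": ">50", "RT": "Yes"},
]

cpt = smoothed_cpt(records, "RT", ["H3pm", "W"], states)
print(len(cpt))  # 6 parent combinations x 2 child states
print(cpt)
```

Unseen combinations end up with a uniform row, which is a modeling assumption in its own right. With a dataset as large as the weather data, the effect of a pseudo-count of 1 would be negligible.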

Now, we want to draw the graph to check that we have set it up as intended:

# Set node positions
pos = {0: (-1, 2), 1: (-1, 0.5), 2: (1, 0.5), 3: (0, -1)}

# Set options for graph looks
options = {
 "font_size": 16,
 "node_size": 4000,
 "node_color": "white",
 "edgecolors": "black",
 "edge_color": "red",
 "linewidths": 5,
 "width": 5,}

# Generate graph
n, d = bbn.to_nx_graph()
nx.draw(n, with_labels=True, labels=d, pos=pos, **options)

# Update margins and print the graph
ax = plt.gca()
ax.margins(0.10)
plt.axis("off")
plt.show()

Here is the resulting graph, which matches our intended design:

Directed Acyclic Graph (DAG) for weather prediction BBN. Image by author.

Using BBN for predictions

With our model ready, we can use it to predict whether it will rain tomorrow.

First, let’s print the probabilities for each node without passing any additional information to the graph. Note, I have set up a simple function so we don’t have to retype the same code later on, as we will want to regenerate the results multiple times.

# Define a function for printing marginal probabilities
def print_probs():
    for node in join_tree.get_bbn_nodes():
        potential = join_tree.get_bbn_potential(node)
        print("Node:", node)
        print("Values:")
        print(potential)
        print('----------------')

# Use the above function to print marginal probabilities
print_probs()

The above code prints the following:

Original BBN probabilities. Image by author.

As you can see, this gives us the likelihood of each event occurring with a "Rain Tomorrow (RT)" probability of 22%. While this is cool, we could have got the same 22% probability by looking at the frequency of the "RainTomorrow" variable in our original dataset.
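
To see where the 22% comes from, the marginals can also be worked out by hand from the node tables. The sketch below is plain Python and does not use PyBBN; it reuses the probabilities typed in manually earlier and simply sums over the parents of each node. The variable names are my own.

```python
# Probabilities copied from the manually defined nodes.
p_h9 = [0.30658, 0.69342]                  # P(H9am): <=60, >60
p_h3_given_h9 = [[0.92827, 0.07173],       # P(H3pm | H9am<=60)
                 [0.55760, 0.44240]]       # P(H3pm | H9am>60)
p_w = [0.58660, 0.24040, 0.17300]          # P(W): <=40, 40-50, >50
p_rt_given_h3_w = [[0.92314, 0.07686],     # H3pm<=60, W<=40
                   [0.89072, 0.10928],     # H3pm<=60, W 40-50
                   [0.76008, 0.23992],     # H3pm<=60, W>50
                   [0.64250, 0.35750],     # H3pm>60,  W<=40
                   [0.49168, 0.50832],     # H3pm>60,  W 40-50
                   [0.32182, 0.67818]]     # H3pm>60,  W>50

# P(H3pm) = sum over H9am of P(H9am) * P(H3pm | H9am)
p_h3 = [sum(p_h9[i] * p_h3_given_h9[i][j] for i in range(2))
        for j in range(2)]

# P(RT) = sum over H3pm and W of P(H3pm) * P(W) * P(RT | H3pm, W)
p_rt = [sum(p_h3[j] * p_w[k] * p_rt_given_h3_w[j * 3 + k][r]
            for j in range(2) for k in range(3))
        for r in range(2)]

print("P(H3pm>60) =", round(p_h3[1], 4))
print("P(RT=Yes)  =", round(p_rt[1], 4))
```

The product P(H3pm) × P(W) is valid here because "H3pm" and "W" have no common ancestor in the graph, so they are independent before any evidence is added.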

That said, the following step is where we get a lot of value out of our BBN. We can pass evidence into the BBN and see how that affects probabilities for every node in the network.

Let’s say it is 9 am right now, and we have measured the humidity outside. It says 72, which obviously belongs to the ">60" band. Hence, let’s pass this evidence into the BBN and see what happens. Note, I have created another small function to help us with that.

# Add evidence of an event that happened so the probability distribution can be recalculated
# (the ev argument is only a label; it is overwritten by the evidence object built below)
def evidence(ev, nod, cat, val):
    ev = EvidenceBuilder() \
        .with_node(join_tree.get_bbn_node_by_name(nod)) \
        .with_evidence(cat, val) \
        .build()
    join_tree.set_observation(ev)

# Use above function to add evidence
evidence('ev1', 'H9am', '>60', 1.0)

# Print marginal probabilities
print_probs()

This gives us the following results:

BBN probabilities with "H9am" evidence. Image by author.

As you can see, "Humidity9am>60" is now equal to 100%, and the likelihood of "Humidity3pm>60" has increased from 32.8% to 44.2%. At the same time, the chance of "RainTomorrow" has gone up to 26.1%.

Also, note how probabilities for "WindGustSpeed" did not change since "W" and "H9am" are independent of each other.
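
For a network this small, the same update can be reproduced by brute force, which is a handy way to check your understanding of what the join tree is doing. The sketch below enumerates all 24 outcomes of the joint distribution and conditions on the evidence. It is plain Python using the manually typed probabilities from earlier, not PyBBN's algorithm, and the `posterior` helper is a name I made up.

```python
from itertools import product

# CPT values copied from the manually defined nodes.
p_h9 = {"<=60": 0.30658, ">60": 0.69342}
p_h3 = {"<=60": {"<=60": 0.92827, ">60": 0.07173},    # p_h3[h9][h3]
        ">60": {"<=60": 0.55760, ">60": 0.44240}}
p_w = {"<=40": 0.58660, "40-50": 0.24040, ">50": 0.17300}
p_rt_yes = {("<=60", "<=40"): 0.07686, ("<=60", "40-50"): 0.10928,
            ("<=60", ">50"): 0.23992, (">60", "<=40"): 0.35750,
            (">60", "40-50"): 0.50832, (">60", ">50"): 0.67818}

NAMES = ("H9am", "H3pm", "W", "RT")

# Full joint distribution over the 2 * 2 * 3 * 2 = 24 outcomes.
joint = {}
for h9, h3, w, rt in product(p_h9, p_h3["<=60"], p_w, ("No", "Yes")):
    p_rain = p_rt_yes[(h3, w)] if rt == "Yes" else 1.0 - p_rt_yes[(h3, w)]
    joint[(h9, h3, w, rt)] = p_h9[h9] * p_h3[h9][h3] * p_w[w] * p_rain


def posterior(query, evidence):
    """P(query node | evidence), found by summing the matching joint entries."""
    qi = NAMES.index(query)
    totals = {}
    for outcome, p in joint.items():
        if all(outcome[NAMES.index(n)] == v for n, v in evidence.items()):
            totals[outcome[qi]] = totals.get(outcome[qi], 0.0) + p
    norm = sum(totals.values())
    return {state: p / norm for state, p in totals.items()}


ev = {"H9am": ">60"}
post_h3 = posterior("H3pm", ev)
post_w = posterior("W", ev)
post_rt = posterior("RT", ev)
print("P(H3pm>60 | H9am>60) =", round(post_h3[">60"], 4))
print("P(W>50 | H9am>60)    =", round(post_w[">50"], 4))
print("P(RT=Yes | H9am>60)  =", round(post_rt["Yes"], 4))
```

Enumeration like this grows exponentially with the number of nodes, which is why libraries rely on structures such as join trees for larger networks.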

You can run the same evidence code one more time to remove the evidence from the network. After that, let’s pass two pieces of evidence for "H3pm" and "W."

# Add more evidence
evidence('ev1', 'H3pm', '>60', 1.0)
evidence('ev2', 'W', '>50', 1.0)
# Print marginal probabilities
print_probs()

And here are the results:

BBN probabilities with "H3pm" and "W" evidence. Image by author.

Unsurprisingly, this tells us that the chance of rain tomorrow has gone up to 67.8%. Note how "H9am" probabilities also changed, which tells us that despite us only measuring humidity at 3 pm, we are 93% certain that humidity was also above 60 at 9 am this morning.
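
Both numbers can be traced back to the node tables. With "H3pm" and "W" observed, the rain probability is simply the matching row of the "RT" table (0.67818). The backward update of "H9am" is an application of Bayes' rule, sketched below in plain Python with the manually typed probabilities from earlier. The wind evidence plays no part in this calculation, because "W" is connected to "H9am" only through the unobserved "RT" node.

```python
# Prior for H9am and the likelihood of observing H3pm>60 under each H9am state,
# both copied from the manually defined nodes.
prior_h9 = {"<=60": 0.30658, ">60": 0.69342}
lik_h3_high = {"<=60": 0.07173, ">60": 0.44240}   # P(H3pm>60 | H9am)

# Bayes' rule: posterior is proportional to prior * likelihood.
unnormalized = {s: prior_h9[s] * lik_h3_high[s] for s in prior_h9}
evidence_prob = sum(unnormalized.values())        # P(H3pm>60)
post_h9 = {s: v / evidence_prob for s, v in unnormalized.items()}

print("P(H3pm>60)           =", round(evidence_prob, 4))
print("P(H9am>60 | H3pm>60) =", round(post_h9[">60"], 4))
```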

Conclusions

There are many use cases for Bayesian Belief Networks, from helping to diagnose diseases to real-time predictions of a race outcome.

You can also build BBNs to help you with marketing decisions. Say, I may want to know how likely this article is to reach 10K views. Hence, I can build a BBN to tell me the probability of certain events occurring, such as posting a link to this article on Twitter and then evaluating how this probability changes as I get ten retweets.

At the end of the day, the possibilities are almost limitless, with the ability to generate real-time predictions that automatically update the entire network as soon as new evidence is introduced.

I hope you found Bayesian Belief Networks just as exciting as I did. Feel free to reach out if you have any questions or suggestions. Thanks for reading!

Cheers! 👏 Saul Dobilas

If you would like to continue with a Bayesian theme, you can check out the below article on Naive Bayes Classifier.

Naive Bayes Classifier – How to Successfully Use It in Python?

URL: https://towardsdatascience.com/bbn-bayesian-belief-networks-how-to-build-them-effectively-in-python-6b7f93435bba/