Multinomial Naive Bayes

Last Updated : 2 May, 2026

Multinomial Naive Bayes is a variation of the Naive Bayes algorithm designed for discrete data. It is commonly used in text classification, where features represent word counts or frequencies.

Works with word frequencies by modeling how often words appear in a document.
Assumes a multinomial distribution for features like words.
Commonly used in spam detection, document classification, and sentiment analysis.

Working of Multinomial Naive Bayes

Multinomial Naive Bayes classifies text using word frequencies. 'Naive' refers to the assumption that words are conditionally independent of each other given the class, while 'Multinomial' refers to counting how often words appear in a document. The model learns from training data by analyzing how often words occur in each class, such as spam or not spam.

Example: If the word 'free' appears frequently in spam emails, the model uses this information to predict whether a new email is spam. The likelihood of a document's word counts given a class c is calculated using the class-conditional multinomial distribution:

P(x_1, \dots, x_k \mid c) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \prod_{i=1}^{k} p_i^{x_i}

Where:

n is the total number of trials (for text, the number of words in the document)
x_i is the count of occurrences for outcome i
p_i is the probability of outcome i
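
As a quick numerical check of the multinomial distribution, the sketch below evaluates it with Python's standard library. The function name `multinomial_pmf` and the example counts and probabilities are illustrative assumptions, not part of the original article.

```python
from fractions import Fraction
from math import factorial


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` given outcome probabilities `probs`."""
    n = sum(counts)  # total number of trials
    # Multinomial coefficient: n! / (x_1! * x_2! * ... * x_k!)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)
    # Product of p_i ** x_i over all outcomes
    likelihood = Fraction(1)
    for x, p in zip(counts, probs):
        likelihood *= Fraction(p) ** x
    return coefficient * likelihood


# Hypothetical two-word vocabulary: word A seen twice, word B seen once
example = multinomial_pmf([2, 1], [Fraction(1, 2), Fraction(1, 2)])
print(example)  # 3/8
```

Exact fractions are used here only to keep the arithmetic easy to verify by hand; a practical classifier would typically work with log probabilities instead.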

To estimate how likely each word is in a particular class like spam or not spam, we use Maximum Likelihood Estimation (MLE) based on the word counts in the training data, with add-one (Laplace) smoothing so that unseen words do not receive zero probability. The formula is:

\theta_{c,i} = \frac{\text{count}(w_i, c) + 1}{N_c + V}

Where:

\text{count}(w_i, c) is the number of times word w_i appears in documents of class c.
N_c is the total number of words in documents of class c.
V is the vocabulary size.
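
The smoothed estimate can be written as a one-line function. This is a minimal sketch: the function name, its parameters, and the example counts are assumptions chosen for illustration, and `alpha=1` corresponds to add-one (Laplace) smoothing.

```python
from fractions import Fraction


def smoothed_word_probability(word_count, total_words_in_class, vocabulary_size, alpha=1):
    """Estimate P(word | class) with add-alpha smoothing."""
    return Fraction(word_count + alpha, total_words_in_class + alpha * vocabulary_size)


# Hypothetical counts: the word appears 2 times among 6 words, vocabulary of 10 words
print(smoothed_word_probability(2, 6, 10))  # 3/16
# A word never seen in the class still gets a non-zero probability
print(smoothed_word_probability(0, 6, 10))  # 1/16
```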

Multinomial Naive Bayes for Spam Detection

To understand how Multinomial Naive Bayes works, consider a simple example where we classify a message as Spam or Not Spam.

Message ID	Message Text	Class
M1	"buy cheap now"	Spam
M2	"limited offer buy"	Spam
M3	"meet me now"	Not Spam
M4	"let's catch up"	Not Spam

1. Vocabulary

First, extract all unique words from the dataset.

Vocabulary = {buy, cheap, now, limited, offer, meet, me, let's, catch, up}

Vocabulary size: V = 10

2. Word Frequencies by Class

Spam Class (M1, M2):

buy: 2
cheap: 1
now: 1
limited: 1
offer: 1

Total words: 6

Not Spam Class (M3, M4):

meet: 1
me: 1
now: 1
let's: 1
catch: 1
up: 1

Total words: 6
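
The vocabulary and per-class counts above can be reproduced with a few lines of standard-library Python. This is a sketch that assumes simple whitespace tokenization; the variable names are illustrative.

```python
from collections import Counter

# The four training messages from the table above
messages = [
    ("buy cheap now", "Spam"),
    ("limited offer buy", "Spam"),
    ("meet me now", "Not Spam"),
    ("let's catch up", "Not Spam"),
]

# Count word occurrences separately for each class
word_counts = {"Spam": Counter(), "Not Spam": Counter()}
for text, label in messages:
    word_counts[label].update(text.split())

# The vocabulary is the set of unique words across all messages
vocabulary = set()
for counts in word_counts.values():
    vocabulary.update(counts)

print(len(vocabulary))                        # 10
print(word_counts["Spam"]["buy"])             # 2
print(sum(word_counts["Spam"].values()))      # 6
print(sum(word_counts["Not Spam"].values()))  # 6
```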

3. Test Message

Test Message: "buy now"

4. Applying Multinomial Naive Bayes

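The probability formula:

P(c \mid d) \propto P(c) \prod_{i} P(w_i \mid c)

Prior Probabilities:

P(\text{Spam}) = \frac{2}{4} = 0.5
P(\text{Not Spam}) = \frac{2}{4} = 0.5

Apply Laplace Smoothing:

To avoid zero probabilities for words that never appear in a class (such as 'buy' in Not Spam), we apply Laplace smoothing:

P(w \mid c) = \frac{\text{count}(w, c) + 1}{N_c + V}

where N_c = 6 is the total number of words in class c and V = 10 is the vocabulary size.

Spam Class:

P(\text{buy} \mid \text{Spam}) = \frac{2 + 1}{6 + 10} = \frac{3}{16}
P(\text{now} \mid \text{Spam}) = \frac{1 + 1}{6 + 10} = \frac{2}{16}

Not Spam Class:

P(\text{buy} \mid \text{Not Spam}) = \frac{0 + 1}{6 + 10} = \frac{1}{16}
P(\text{now} \mid \text{Not Spam}) = \frac{1 + 1}{6 + 10} = \frac{2}{16}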

5. Final Classification

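Since

P(\text{Spam} \mid \text{buy now}) \propto 0.5 \times \frac{3}{16} \times \frac{2}{16} = \frac{3}{256}
P(\text{Not Spam} \mid \text{buy now}) \propto 0.5 \times \frac{1}{16} \times \frac{2}{16} = \frac{1}{256}

and \frac{3}{256} > \frac{1}{256}, the message "buy now" is classified as Spam.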
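
The hand calculation for the test message can be reproduced with exact fractions. The sketch below hard-codes the counts from step 2 and the priors from step 4; the constant and function names are illustrative assumptions.

```python
from fractions import Fraction

VOCABULARY_SIZE = 10
TOTAL_WORDS = {"Spam": 6, "Not Spam": 6}
PRIORS = {"Spam": Fraction(2, 4), "Not Spam": Fraction(2, 4)}
# Counts of the test-message words in each class, taken from step 2
WORD_COUNTS = {
    "Spam": {"buy": 2, "now": 1},
    "Not Spam": {"buy": 0, "now": 1},
}


def class_score(words, label):
    """Unnormalized posterior: prior times the Laplace-smoothed word likelihoods."""
    score = PRIORS[label]
    for word in words:
        count = WORD_COUNTS[label].get(word, 0)
        score *= Fraction(count + 1, TOTAL_WORDS[label] + VOCABULARY_SIZE)
    return score


test_words = "buy now".split()
scores = {label: class_score(test_words, label) for label in PRIORS}
prediction = max(scores, key=scores.get)

print(scores["Spam"])      # 3/256
print(scores["Not Spam"])  # 1/256
print(prediction)          # Spam
```

The scores are proportional to the posterior probabilities; dividing each by their sum gives 3/4 for Spam and 1/4 for Not Spam.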

Implementation

Let’s understand the implementation with an example of spam email detection, where emails are classified into spam or not spam.

1. Importing Libraries

First, we import the required libraries used for data processing, model training and evaluation.

pandas: Used to handle data in DataFrame format.
CountVectorizer: Converts text documents into a matrix of word counts.
train_test_split: Splits the dataset into training and testing sets.
MultinomialNB: A Naive Bayes classifier used for discrete features such as word counts.
accuracy_score: Measures how accurately the model predicts the correct class.

2. Creating the Dataset

Next, we create a simple dataset containing text messages labelled as spam or not spam. This dataset is stored in a pandas DataFrame for easier processing.

3. Mapping Labels to Numerical Values

Next, the labels spam and not spam are converted into numerical values. This step is required because machine learning models work with numerical data.

spam: 1
not spam: 0
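
A minimal sketch of the dataset creation and label mapping steps is shown below. The messages and the column names `text` and `label` are hypothetical, chosen only for illustration; the article's original dataset is not reproduced here.

```python
import pandas as pd

# Hypothetical messages for illustration (not the article's dataset)
data = {
    "text": [
        "Win a free prize now",
        "Are we still meeting for lunch today",
        "Claim your free cash reward",
        "Please review the attached report",
    ],
    "label": ["spam", "not spam", "spam", "not spam"],
}
df = pd.DataFrame(data)

# Convert the string labels to integers: spam -> 1, not spam -> 0
df["label"] = df["label"].map({"spam": 1, "not spam": 0})

print(df["label"].tolist())  # [1, 0, 1, 0]
```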

4. Splitting the Data

X contains the text messages (features) and y contains the labels (target).
The dataset is split into training (70%) and testing (30%) sets using train_test_split.

5. Vectorizing the Text Data

Next, the text data is converted into numerical form using CountVectorizer. This method transforms text into vectors by counting the occurrences of each word.

fit_transform(): Learns the vocabulary from the training data and converts it into a feature matrix.
transform(): Converts the test data into the same feature space using the learned vocabulary.
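
The difference between the two calls can be seen on a tiny example. This sketch reuses the two spam messages from the worked example as training texts; the test message "buy now today" is a hypothetical input, chosen so that one of its words is missing from the learned vocabulary.

```python
from sklearn.feature_extraction.text import CountVectorizer

train_texts = ["buy cheap now", "limited offer buy"]
test_texts = ["buy now today"]  # "today" never appears in the training texts

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # learn the vocabulary, then count
X_test = vectorizer.transform(test_texts)        # reuse the learned vocabulary

# List the learned words in column order
feature_names = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
print(feature_names)
# ['buy', 'cheap', 'limited', 'now', 'offer']
print(X_train.toarray().tolist())
# [[1, 1, 0, 1, 0], [1, 0, 1, 0, 1]]
print(X_test.toarray().tolist())
# [[1, 0, 0, 1, 0]]
```

Because "today" is not in the learned vocabulary, transform() silently ignores it, which is why the test row contains counts only for "buy" and "now".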

6. Training the Naive Bayes Model

Next, a Multinomial Naive Bayes classifier is created and trained using the vectorized training data and the corresponding labels.

7. Making Predictions and Evaluating Accuracy

After training the model, we use it to predict labels for the test data and then evaluate its performance using accuracy.

model.predict(): Generates predicted labels for the test dataset.
accuracy_score(): Compares predicted labels with the actual labels to measure model accuracy.

Output:

Accuracy: 66.67%
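
The accuracy reported above comes from the article's own dataset, which is not reproduced here. The sketch below runs the same steps (split, vectorize, train, evaluate) on a hypothetical ten-message dataset, so its accuracy will generally differ; the messages and the `random_state` value are assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical messages (1 = spam, 0 = not spam); not the article's dataset
texts = [
    "Win a free prize now",
    "Claim your free cash reward today",
    "Limited offer buy cheap pills",
    "Congratulations you won a free ticket",
    "Free entry in a weekly prize draw",
    "Are we still meeting for lunch today",
    "Please review the attached report",
    "Can you call me when you are free",
    "See you at the gym tomorrow",
    "The project deadline moved to Friday",
]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# 70% training, 30% testing; random_state is an arbitrary choice for repeatability
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, random_state=42
)

# Learn the vocabulary on the training split only, then reuse it for the test split
vectorizer = CountVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Train the classifier on word counts and evaluate on the held-out messages
model = MultinomialNB()
model.fit(X_train_vec, y_train)

y_pred = model.predict(X_test_vec)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
```

With only three test messages, each prediction shifts the accuracy by a third, so a figure such as 66.67% simply means two of the three were classified correctly.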

8. Predicting for a Custom Message

Finally, we test the model with a custom message to see how it classifies new input data.

vectorizer.transform(): Converts the custom message into a numerical vector using the learned vocabulary.
model.predict(): Predicts whether the message is spam or not spam.
Interpret result: 1 represents Spam and 0 represents Not Spam.

Output:

Congratulations, you've won a free vacation
Prediction for custom message: Spam
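
The same two calls can be checked against the hand-worked example from earlier in the article. The sketch below trains on its four messages and classifies "buy now"; the variable names are illustrative, and the default `alpha=1.0` of MultinomialNB is what corresponds to Laplace smoothing.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# The four messages from the worked example (1 = Spam, 0 = Not Spam)
train_texts = ["buy cheap now", "limited offer buy", "meet me now", "let's catch up"]
train_labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()
model = MultinomialNB()  # default alpha=1.0, i.e. add-one smoothing
model.fit(vectorizer.fit_transform(train_texts), train_labels)

custom_message = ["buy now"]
custom_vector = vectorizer.transform(custom_message)  # reuse the learned vocabulary
prediction = model.predict(custom_vector)[0]
spam_probability = float(model.predict_proba(custom_vector)[0][1])

print("Spam" if prediction == 1 else "Not Spam")  # Spam
print(round(spam_probability, 2))                 # 0.75
```

CountVectorizer's default tokenizer keeps only tokens of two or more characters, so "let's" becomes "let" and the vocabulary still has ten entries. The normalized probability of 0.75 agrees with the hand calculation, where the Spam score is three times the Not Spam score.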


Multinomial Naive Bayes vs Gaussian Naive Bayes

Multinomial Naive Bayes and Gaussian Naive Bayes are both variants of the same algorithm. However, they differ in several ways, which are summarized below:

Multinomial Naive Bayes	Gaussian Naive Bayes
It is specially designed for discrete data particularly text data.	It is suitable for continuous data where features follow a Gaussian distribution.
It assumes features represent counts, such as word counts.	It assumes a Gaussian distribution for the likelihood.
It is commonly used in NLP for document classification tasks.	It is commonly used in tasks involving continuous data such as medical diagnosis, fraud detection and weather prediction.
The likelihood of each feature is calculated using the multinomial distribution.	The likelihood of each feature is modelled using the Gaussian distribution.
It is more efficient when the number of features is very high like in text datasets with thousands of words.	It may not perform well on non-normal or sparse data.


URL: https://www.geeksforgeeks.org/machine-learning/multinomial-naive-bayes/