What is data poisoning?

Last Updated : 23 Jul, 2025

Data poisoning is a cyberattack that targets the training data used to build machine learning models. Attackers try to slip misleading or incorrect information into the training dataset, which can make the model produce incorrect outputs. They do this by adding new data, changing existing data, or deleting data.


Types of data poisoning attacks

1. Targeted attacks

These attacks try to change what a machine learning model predicts for specific inputs, in a way the attacker chooses.
For example: suppose a user asks a chatbot for advice. An attacker who has slipped poisoned data into the chatbot's training set can make it give wrong answers on purpose to particular questions.

2. Non-targeted attacks

These attacks don't aim for a specific result. They try to degrade the model's overall performance.
For example: in a self-driving car, a degraded model might see a stop sign but classify it as a yield sign, which could be very dangerous.

3. Clean label poisoning

In clean label attacks, the poisoned data looks normal and is correctly labeled, which makes it hard to detect.
The attacker changes some features of the data so that the model learns a harmful pattern without anyone noticing.
For example: an image classifier might be trained on images that look like normal animals, but small changes in the pixels cause the model to misclassify certain inputs later.

4. Backdoor attacks

Backdoor attacks create a hidden rule inside the model that only activates under specific conditions.
The attacker poisons the training data with special inputs that act like a secret key. When the model sees this key later, it produces the wrong output.
For example: a backdoor might cause a traffic sign detection system to read a stop sign as a speed limit sign, but only when there is a certain sticker on the sign.
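
The sticker example can be sketched in a few lines. This is a minimal illustration, assuming a grayscale image stored as a list of lists; the helper names `add_trigger` and `backdoor_sample`, the patch size, and the label string `"speed_limit"` are hypothetical and not taken from any particular attack implementation.

```python
import copy

def add_trigger(image, size=2, value=255):
    """Return a copy of a 2D grayscale image with a bright square
    stamped in the bottom-right corner (the hypothetical trigger)."""
    stamped = copy.deepcopy(image)
    rows, cols = len(stamped), len(stamped[0])
    for r in range(rows - size, rows):
        for c in range(cols - size, cols):
            stamped[r][c] = value
    return stamped

def backdoor_sample(image, target_label):
    """Pair the triggered image with the label the attacker wants."""
    return add_trigger(image), target_label

# A blank 4x4 image stands in for a stop sign photo.
stop_sign = [[0] * 4 for _ in range(4)]
poisoned_image, poisoned_label = backdoor_sample(stop_sign, "speed_limit")
```

A model trained on enough samples like this may learn to associate the corner patch with the attacker's label while still behaving normally on images without the patch.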

5. Data injection

In a data injection attack, the attacker adds fake or harmful data to the training dataset of a machine learning model. The goal is to change what the model learns so that it makes mistakes later.
For example: in a spam filter model, an attacker could inject emails that are actually spam but label them as 'not spam'. Later, the filter might let real spam messages through because it learned the wrong pattern.
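
The spam filter scenario can be reproduced with a toy word-count classifier. This is a sketch under simplifying assumptions: the scoring rule, the function names `train_word_counts` and `predict`, the label `"ham"` for 'not spam', and all messages are made up for illustration and are far simpler than a real filter.

```python
from collections import Counter

def train_word_counts(messages):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in messages:
        counts[label].update(text.split())
    return counts

def predict(counts, text):
    """Call a message spam when its words were seen more often
    in spam than in ham during training."""
    score = sum(counts["spam"][w] - counts["ham"][w] for w in text.split())
    return "spam" if score > 0 else "ham"

clean_train = [
    ("win free prize now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
# Injected samples: spam-like text deliberately labeled as ham.
injected = [("free prize", "ham")] * 3

clean_model = train_word_counts(clean_train)
poisoned_model = train_word_counts(clean_train + injected)

message = "claim free prize"
clean_prediction = predict(clean_model, message)
poisoned_prediction = predict(poisoned_model, message)
print(clean_prediction, poisoned_prediction)
```

With the clean data the message is scored as spam. After the three injected samples, the words "free" and "prize" count more toward ham, and the same message is let through.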

6. Label flipping

Label flipping is a type of data poisoning attack where the attacker deliberately changes the labels of training examples to incorrect ones.
For example: if you are training a model to recognize animals and you feed it a picture of a cat but tell it that it is a dog, it will learn the wrong thing. Later, it might predict cats as dogs.
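
A minimal sketch of label flipping, assuming the labels are stored in a plain Python list. The function name `flip_labels` and its parameters are hypothetical and chosen only for this illustration.

```python
import random

def flip_labels(labels, source, target, fraction, seed=0):
    """Return a copy of `labels` in which a given fraction of the
    `source` labels has been replaced by `target`."""
    rng = random.Random(seed)
    source_idx = [i for i, y in enumerate(labels) if y == source]
    n_flip = round(len(source_idx) * fraction)
    flipped = list(labels)
    for i in rng.sample(source_idx, n_flip):
        flipped[i] = target
    return flipped

clean = ["cat"] * 10 + ["dog"] * 10
poisoned = flip_labels(clean, source="cat", target="dog", fraction=0.3)
print(poisoned.count("cat"), poisoned.count("dog"))
```

The images themselves are untouched; only the labels of 3 of the 10 cat examples change, which is why this kind of poisoning can be caught by re-checking a sample of labels by hand.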

Impact of data poisoning on models

1. Model Degradation

Data poisoning can significantly degrade a machine learning model's performance by introducing incorrect or misleading examples into the training set.
When the model learns from this corrupted data, it can form inaccurate patterns or generalizations.
When the model is then evaluated on clean test data, its accuracy and reliability can drop dramatically. Without such an evaluation, the damage often goes unnoticed until the model fails in production.
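
One way to see the drop is to train the same model with and without poisoned samples and score both on the same clean test set. The sketch below assumes a toy one-dimensional nearest-centroid classifier; the data, the injected points, and the function names are all invented for illustration.

```python
def fit_centroids(samples):
    """Return the mean feature value for each label."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def accuracy(centroids, samples):
    correct = sum(predict(centroids, x) == y for x, y in samples)
    return correct / len(samples)

clean_train = [(0, 0), (1, 0), (2, 0), (8, 1), (9, 1), (10, 1)]
# Poisoned points: far-away feature values labeled as class 0,
# which drags the class 0 centroid past the class 1 centroid.
poison = [(20, 0), (20, 0), (20, 0)]
clean_test = [(0, 0), (2, 0), (4, 0), (6, 1), (8, 1), (10, 1)]

clean_acc = accuracy(fit_centroids(clean_train), clean_test)
poisoned_acc = accuracy(fit_centroids(clean_train + poison), clean_test)
print(clean_acc, poisoned_acc)
```

In this toy setup, three injected points are enough to take accuracy on the clean test set from perfect to below chance, which is why keeping a trusted, clean evaluation set matters.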

2. Backdoor Attacks

A more insidious form of data poisoning is the backdoor attack, where an attacker embeds a hidden pattern or trigger in the training data.
The model learns to behave normally in most cases but exhibits malicious behavior when the trigger is present in the input.
These attacks are dangerous because anyone who knows the trigger can control the model's outputs from the outside, while the model appears to work correctly on ordinary inputs.

3. Targeted Misclassification

In a targeted data poisoning attack, the attacker manipulates the training data so that the model intentionally misclassifies specific inputs or classes.
This is done subtly to avoid detection, and the resulting model can be exploited for impersonation or evasion in security systems.

4. Data Privacy Leaks

Attackers can insert poisoned data that encodes sensitive or private information into the training set, and the model might later reproduce this data verbatim, especially if it is a language model or another generative model.
This creates serious risks, particularly in domains where confidentiality is critical, such as healthcare or legal applications.

Prevention techniques

1. Data Validation and Sanitization

The first line of defense against data poisoning is rigorous validation and sanitization of training data. This includes checking for duplicates, anomalies, outliers and mislabeled samples before the data is used in training.
Manual review, statistical analysis and automated data cleaning tools can help identify suspicious patterns or corrupted inputs. Ensuring that only high-quality data reaches training reduces the risk of poisoning.
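
Two of these checks, duplicates and outliers, can be sketched with the standard library. The outlier test below uses the modified z-score based on the median absolute deviation with the commonly used cutoff of 3.5; that choice, the function names, and the sample values are assumptions made for this illustration, and flagged rows should be reviewed rather than deleted automatically.

```python
import statistics

def find_duplicates(rows):
    """Return indices of rows that repeat an earlier row exactly."""
    seen, dupes = set(), []
    for i, row in enumerate(rows):
        key = tuple(row)
        if key in seen:
            dupes.append(i)
        else:
            seen.add(key)
    return dupes

def find_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

rows = [(1, 2), (3, 4), (1, 2)]
values = [10, 11, 9, 10, 12, 95]
duplicate_rows = find_duplicates(rows)
outlier_rows = find_outliers(values)
print(duplicate_rows, outlier_rows)
```

The median-based score is used here instead of the mean and standard deviation because the poisoned values themselves would otherwise inflate the statistics used to detect them.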

2. Robust Learning Algorithms

Using robust training methods can make models less sensitive to poisoned data.
Techniques such as loss function reweighting, noise-tolerant learning and robust optimization help limit the influence of any single data point or small group of data points.
These algorithms are designed to downplay the impact of outliers or mislabeled data, which improves the model's resistance to poisoning attempts.
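
As one concrete illustration of limiting a single point's influence, the sketch below compares the ordinary squared loss with the Huber loss, a standard robust loss. Huber loss is used here as an example of the general idea, not as the specific method the techniques above refer to; the residual values and `delta=1.0` are arbitrary.

```python
def squared_grad(residual):
    """Gradient of 0.5 * residual**2: grows without bound."""
    return residual

def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear beyond `delta`."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r ** 2
    return delta * (r - 0.5 * delta)

def huber_grad(residual, delta=1.0):
    """Gradient of the Huber loss: clipped to [-delta, delta]."""
    return max(-delta, min(delta, residual))

# The last residual stands in for a poisoned or mislabeled sample.
residuals = [0.2, -0.5, 0.1, 10.0]
squared_pull = [squared_grad(r) for r in residuals]
huber_pull = [huber_grad(r) for r in residuals]
print(squared_pull)
print(huber_pull)
```

Under squared loss the suspicious sample pulls on the parameters with a gradient of 10, while under Huber loss its pull is capped at 1, so it can no longer outweigh the well-behaved samples by a wide margin.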

3. Ensemble and Redundant Models

Training multiple models independently on different subsets of data, or validating performance across redundant systems, can help identify anomalies introduced by poisoned data.
If one model behaves abnormally compared to the rest, it may indicate data integrity issues in its training set. This approach improves both robustness and fault detection.
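
The comparison step can be sketched as a majority vote followed by an agreement check. The example assumes each model's predictions on a shared validation set are already available as lists; the function names, the sample predictions, and the 0.5 agreement cutoff are hypothetical.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Return the most common prediction for each validation sample."""
    return [Counter(column).most_common(1)[0][0]
            for column in zip(*predictions_per_model)]

def agreement_rates(predictions_per_model):
    """Return, per model, the share of samples on which it
    matches the majority vote."""
    consensus = majority_vote(predictions_per_model)
    return [sum(p == c for p, c in zip(preds, consensus)) / len(consensus)
            for preds in predictions_per_model]

# Predictions of three models, each trained on a different data subset.
predictions = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],  # this model disagrees on most samples
]
rates = agreement_rates(predictions)
flagged = [i for i, rate in enumerate(rates) if rate < 0.5]
print(rates, flagged)
```

A flagged model is only a hint: its training subset should be inspected, since low agreement can also come from an unlucky or unrepresentative split.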

4. Differential Privacy

Differential privacy adds noise to the training process in a controlled way, which limits and obscures the influence of any individual training example.
This not only protects against data leakage but also makes it harder for a small number of injected malicious samples to strongly influence the model.
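
The core of this idea, as used in differentially private gradient descent, is to clip each example's gradient and then add noise before averaging. The sketch below shows only that step, assuming gradients are plain lists of floats; the function names and parameter values are hypothetical, and the code does not track a privacy budget, so it should not be read as a complete differential privacy implementation.

```python
import math
import random

def clip_gradient(grad, max_norm):
    """Scale a gradient down so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip every per-example gradient, sum them, add Gaussian noise
    scaled to the clipping bound, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[d] for g in clipped) for d in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]

# The first gradient is unusually large, as a poisoned sample's might be.
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)
noisy_update = private_mean_gradient(grads, max_norm=1.0,
                                     noise_multiplier=1.1, rng=rng)
print(noisy_update)
```

Clipping bounds how far any one sample can move the update, and the noise masks what remains of its contribution. Larger noise gives stronger protection but usually costs accuracy.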


URL: https://www.geeksforgeeks.org/machine-learning/what-is-data-poisoning/