Pandas DataFrame.sample() | Python

Last Updated : 11 Jul, 2025

The Pandas DataFrame.sample() function randomly selects rows or columns from a DataFrame. It is particularly helpful with large datasets, where we want to test or analyze a small representative subset. We can set the number of items to sample with n, or the proportion with frac, and make the result reproducible with random_state.

Example: Sampling a Single Random Row

In this example, we load a dataset and generate a single random row using the sample() method by setting n=1.

Output

[Image: one row of the DataFrame]

Calling sample(n=1) selects one random row from the DataFrame.
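A minimal sketch of the single-row case follows. The small DataFrame and its column names (name, team, score) are made up for illustration and stand in for the article's dataset, so the printed row will differ from the output described above.

```python
import pandas as pd

# Hypothetical stand-in data; not the dataset used in the article.
df = pd.DataFrame(
    {
        "name": ["Ava", "Ben", "Cal", "Dee", "Eli", "Fay", "Gus", "Hal"],
        "team": ["A", "B", "A", "C", "B", "C", "A", "B"],
        "score": [12, 7, 19, 4, 15, 9, 11, 6],
    }
)

# Draw one row at random. The result is a one-row DataFrame
# that keeps the original index label of the chosen row.
one_row = df.sample(n=1)
print(one_row)
```

Because no random_state is set here, a different row may be printed on each run.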

Syntax

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Parameters:

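n: int, optional. Number of items (rows by default) to return. Defaults to 1 if frac is also None. Cannot be used with frac.
frac: float, optional. Fraction of items to return, so the sample holds about frac * len(DataFrame) rows. Cannot be used with n.
replace: bool, default False. If True, sample with replacement, so the same row can be drawn more than once.
weights: str or array-like, optional. Sampling weights; the default None gives every item equal probability.
random_state: int or numpy.random.RandomState, optional. Setting it to a fixed integer returns the same sample on every call.
axis: 0 or 'index' for rows and 1 or 'columns' for columns.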

Return Type: A new object of the same type as the caller.
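The axis and weights arguments from the signature can be sketched as below. The DataFrame and the column name w are hypothetical, chosen only to make the behavior easy to see.

```python
import pandas as pd

# Hypothetical data; column "w" holds per-row sampling weights.
df = pd.DataFrame(
    {
        "a": [10, 20, 30, 40],
        "b": [1.5, 2.5, 3.5, 4.5],
        "c": ["p", "q", "r", "s"],
        "w": [0, 1, 2, 3],
    }
)

# axis=1 samples columns instead of rows: all 4 rows, 2 random columns.
two_cols = df.sample(n=2, axis=1, random_state=0)
print(two_cols.shape)

# weights="w" uses the values in column "w" as relative probabilities.
# The first row has weight 0, so it can never be drawn.
weighted = df.sample(n=2, weights="w", random_state=0)
print(weighted)
```

The weights do not need to sum to 1; pandas normalizes them before sampling.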


Examples of Pandas DataFrame.sample()

Example 1: Sample 25% of the DataFrame

In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.

Output

[Image: 25% of the DataFrame]

As the output shows, the sample contains 25% of the rows in the DataFrame, and those rows are chosen at random.
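The same idea can be sketched on a made-up 20-row DataFrame (the id and value columns are assumptions for illustration), where 25% works out to exactly 5 rows.

```python
import pandas as pd

# Hypothetical 20-row DataFrame standing in for the article's dataset.
df = pd.DataFrame(
    {
        "id": range(1, 21),
        "value": [x * 1.5 for x in range(1, 21)],
    }
)

# frac=0.25 asks for a quarter of the rows, drawn without replacement.
quarter = df.sample(frac=0.25)

print(len(df), len(quarter))  # 20 rows in, 5 rows out
print(quarter)
```

Passing both n and frac in the same call raises a ValueError, so only one of the two should be given.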

Example 2: Sampling with Replacement and a Fixed Random State

This example demonstrates how to sample multiple rows with replacement (i.e., allowing repetition of rows) and ensures reproducibility using a fixed random seed.

Output

[Image: sampling with replacement]

Setting replace=True allows the same row to be sampled more than once, which makes it well suited to bootstrapping. Setting random_state=42 makes the result reproducible across runs, which is very useful during testing and debugging.
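A minimal sketch of both parameters together, using a hypothetical 5-row DataFrame: drawing 10 rows from 5 forces repeats, and reusing the same seed yields the same sample.

```python
import pandas as pd

# Hypothetical 5-row DataFrame; not the article's dataset.
df = pd.DataFrame({"id": [1, 2, 3, 4, 5], "label": list("abcde")})

# Draw 10 rows from 5 with replacement, so some rows must repeat.
boot_a = df.sample(n=10, replace=True, random_state=42)
boot_b = df.sample(n=10, replace=True, random_state=42)

print(boot_a.equals(boot_b))         # same seed, same sample
print(boot_a.index.has_duplicates)   # repeated rows share an index label
```

Without replace=True, asking for more rows than the DataFrame holds raises a ValueError.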
