The Pandas DataFrame.sample() method randomly selects rows or columns from a DataFrame. It is particularly helpful when working with large datasets where we want to test or analyze a small representative subset. We can set the number or proportion of items to sample and control randomness through parameters such as n, frac and random_state.
In this example, we load a dataset and generate a single random row using the sample() method by setting n=1.
Calling sample(n=1) selects one row at random from the DataFrame.
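The single-row case described above can be sketched as follows. The small DataFrame and its column names are hypothetical stand-ins for the article's dataset, which is not reproduced here.

```python
import pandas as pd

# Hypothetical stand-in data (assumption): the article's CSV is not available here.
df = pd.DataFrame({
    "Name": ["Ava", "Ben", "Cara", "Dev", "Eli", "Fay", "Gus", "Hana"],
    "Team": ["A", "B", "A", "C", "B", "C", "A", "B"],
    "Age": [25, 31, 22, 28, 35, 27, 30, 24],
})

# Draw exactly one row at random; the result is still a DataFrame.
one_row = df.sample(n=1)
print(one_row)
```

Because no random_state is set, the printed row will generally differ from run to run.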
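DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)
Parameters:
n: int, number of items to return. Cannot be used together with frac; defaults to 1 when frac is also None.
frac: float, fraction of the axis items to return. Cannot be used together with n.
replace: bool, whether the same item may be sampled more than once. Default False.
weights: str or array-like, optional sampling weights. The default None gives every item equal probability.
random_state: int seed (or random generator object) that makes the result reproducible.
axis: 0 or 'index' to sample rows, 1 or 'columns' to sample columns. The default None samples rows for a DataFrame.
ignore_index: bool, if True the result is relabeled 0, 1, ..., n - 1. Default False. Added in pandas 1.3.0.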
Return Type: New object of same type as caller.
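As a rough illustration of two parameters from the signature that the examples do not otherwise exercise, the sketch below samples columns with axis=1 and uses weights to bias which row is drawn. The data and the weight values are made up for illustration.

```python
import pandas as pd

# Hypothetical data (assumption), used only to illustrate axis and weights.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": [100, 200, 300, 400],
})

# axis=1 samples columns instead of rows: 2 of the 3 columns are kept.
two_cols = df.sample(n=2, axis=1, random_state=0)
print(two_cols)

# weights gives each row a relative probability. With all the weight on
# the third row (index label 2), that row is the only one that can be drawn.
weighted = df.sample(n=1, weights=[0, 0, 1, 0])
print(weighted)
```

In both calls the return value is a new DataFrame, which matches the return type stated above.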
In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.
The generated sample contains 25% of the rows of the DataFrame, and those rows are selected at random.
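A minimal sketch of fractional sampling, assuming a hypothetical 20-row DataFrame in place of the article's dataset:

```python
import pandas as pd

# Hypothetical 20-row DataFrame (assumption) standing in for the article's data.
df = pd.DataFrame({
    "id": range(20),
    "value": [i * i for i in range(20)],
})

# frac=0.25 asks for 25% of the rows: 0.25 * 20 = 5 rows.
sample_25 = df.sample(frac=0.25)

print(len(df), len(sample_25))
print(sample_25)
```

Since replace defaults to False, no row appears twice in the sample.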
This example demonstrates how to sample multiple rows with replacement (i.e., allowing the same row to appear more than once) and how to make the result reproducible with a fixed random seed.
The replace=True parameter allows the same row to be sampled more than once, which is useful for bootstrapping. Setting random_state=42 makes the result reproducible across multiple runs, which is very useful during testing and debugging.
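Sampling with replacement and a fixed seed might look like the sketch below. The five-row DataFrame is a hypothetical example; drawing 8 rows from it is only possible because replace=True.

```python
import pandas as pd

# Hypothetical five-row DataFrame (assumption) for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Kyiv", "Cork"],
    "score": [71, 64, 88, 59, 93],
})

# Draw 8 rows from 5: this requires replace=True, so some rows must repeat.
first = df.sample(n=8, replace=True, random_state=42)

# The same seed reproduces exactly the same sample.
second = df.sample(n=8, replace=True, random_state=42)

print(first)
print(first.equals(second))
```

Requesting n larger than the number of rows without replace=True raises a ValueError, so replacement is what makes oversized bootstrap-style samples possible.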