Pandas Introduction

Last Updated : 8 Jun, 2026

Pandas is an open-source Python library used for data manipulation, analysis and cleaning. It provides fast and flexible tools to work with tabular data, similar to spreadsheets or SQL tables.

Pandas is used in data science and analytics due to its integration with libraries such as:

NumPy: numerical operations
Matplotlib and Seaborn: data visualization
SciPy: statistical analysis
Scikit-learn: machine learning workflows

Pandas allows efficient handling and analysis of data in a few lines of code.

[Figure: Pandas Basic Operations]

Installation

Before using Pandas, make sure it is installed:

pip install pandas

Once Pandas is installed, import the library before using it:

import pandas as pd

Note: pd is just an alias for Pandas. It’s not required but using it makes the code shorter when calling methods or properties.

Data Structures in Pandas

Pandas provides two main data structures for manipulating data:

1. Pandas Series

A Pandas Series is a one-dimensional labeled array capable of holding data of any type. It can be created from Python lists, NumPy arrays, dictionaries, scalar values or data loaded from external sources such as CSV files, Excel files and databases. The labels associated with a Series are called its index.
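The code that produced the output below is not reproduced here. A minimal sketch that would print the same values, assuming the Series is built from a NumPy array of single characters (the variable names `data` and `ser` are illustrative), might look like this:

```python
import numpy as np
import pandas as pd

# Character values matching the output shown in this section
data = np.array(['g', 'e', 'e', 'k', 's'])

# Convert the NumPy array into a Series; a default integer index (0 to 4) is assigned
ser = pd.Series(data)

print("Pandas Series:")
print(ser)
```

The dtype reported on the last line of the printout can vary with the installed pandas version.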

Output

Pandas Series:
0    g
1    e
2    e
3    k
4    s
dtype: object

Explanation:

Creates a NumPy array containing character values.
Uses pd.Series() to convert the array into a Pandas Series.
Each element is assigned a default integer index starting from 0.

2. Pandas DataFrame

A Pandas DataFrame is a two-dimensional data structure with labeled axes (rows and columns). It is often created by loading a dataset from existing storage such as a SQL database, a CSV file or an Excel file. It can also be created directly from Python objects such as lists, dictionaries or a list of dictionaries.
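The code behind the output below is missing from this page. As a sketch, assuming it first prints an empty DataFrame and then one built from a flat list of words (the names `empty_df`, `words` and `df` are illustrative), it could be written as:

```python
import pandas as pd

# An empty DataFrame has no columns and no index entries
empty_df = pd.DataFrame()
print(empty_df)

# A DataFrame built from a flat list gets a single column labeled 0
words = ['Geeks', 'For', 'Geeks', 'is', 'portal', 'for', 'Geeks']
df = pd.DataFrame(words)
print(df)
```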

Output

Empty DataFrame
Columns: []
Index: []
        0
0   Geeks
1     For
2   Geeks
3      is
4  portal
5     for
6   Geeks

Operations in Pandas

Pandas provides essential operations for working with structured data efficiently. The sections below introduce the most commonly used functionalities with short explanations and simple examples.

1. Loading Data: This operation reads data from files such as CSV, Excel or JSON into a DataFrame.

Output

[Figure: Output of Loading Dataset]

Explanation: pd.read_csv("data.csv") reads the CSV file and loads it into a DataFrame, and df.head() shows the first 5 rows of the data.

You can download the data.csv file from here
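Since the original data.csv is not reproduced here, the sketch below first writes a small hypothetical file and then loads it. The column names and values are assumptions made for illustration only, so the printed rows will not match the figure above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the article's data.csv
csv_text = """name,age,category,a,b
Asha,31,A,1,10
Ben,24,B,2,20
Chen,28,A,3,30
Dita,22,B,4,40
Eli,35,A,5,50
Fay,27,B,6,60
"""

# Write the sample to a temporary file so read_csv has something to load
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w") as f:
    f.write(csv_text)

df = pd.read_csv(path)  # parse the CSV file into a DataFrame
print(df.head())        # head() returns the first 5 rows by default
```

Excel and JSON files can be loaded in a similar way with pd.read_excel() and pd.read_json().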

2. Viewing and Exploring Data: After loading data, it is important to understand its structure and content. These methods allow you to inspect rows, summary statistics and metadata.

Output

[Figure: Output of df.info()]

Explanation: df.info() displays information about the DataFrame, including column names, data types, non-null counts, and memory usage.
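As a rough illustration of the inspection methods mentioned in this step, the following sketch runs them on a small hypothetical DataFrame (the columns and values are assumptions, not the article's dataset):

```python
import pandas as pd

# Hypothetical data used only for illustration
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dita"],
    "age": [31, 24, 28, 22],
    "category": ["A", "B", "A", "B"],
})

print(df.head(2))   # first 2 rows
print(df.shape)     # tuple of (number of rows, number of columns)
df.info()           # prints column names, dtypes, non-null counts and memory usage

# describe() summarizes numeric columns (count, mean, std, min, quartiles, max)
stats = df.describe()
print(stats)
```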

3. Handling Missing Data: Datasets often contain empty or missing values. Pandas provides functions to detect, remove or replace these values.

Output

[Figure: No columns have NaN values]

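Explanation: df.fillna(0) returns a new DataFrame in which missing values are replaced with 0. The original df is unchanged unless the result is assigned back to it.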
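Detecting, removing and replacing missing values can be sketched as follows. The DataFrame here is hypothetical and deliberately contains NaN entries, unlike the dataset in the figure above.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 5.0, 6.0],
})

missing_per_column = df.isnull().sum()  # detect: count missing values per column
print(missing_per_column)

dropped = df.dropna()   # remove: keep only rows with no missing values
filled = df.fillna(0)   # replace: substitute 0 for every missing value

print(dropped)
print(filled)
```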

4. Selecting and Filtering Data: This operation retrieves specific columns, rows or records that match a condition. It allows precise extraction of required information.

Output

[Figure: Output of Filtering Data]

Explanation: df[df['age'] > 25] returns rows where the "age" value is greater than 25.
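The selection and filtering forms described in this step might be sketched like this, using a small hypothetical DataFrame with "name" and "age" columns (assumed for illustration):

```python
import pandas as pd

# Hypothetical data used only for illustration
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dita"],
    "age": [31, 24, 28, 22],
})

ages = df["age"]              # one column, returned as a Series
subset = df[["name", "age"]]  # several columns, returned as a DataFrame
first_row = df.iloc[0]        # one row selected by position

# Boolean mask: keep only the rows where the condition is True
over_25 = df[df["age"] > 25]
print(over_25)
```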

5. Adding and Removing Columns: You can create new columns based on existing ones or delete unwanted columns from the DataFrame.

Output

[Figure: Adding new column "total"]

Explanation: df['total'] = df['a'] + df['b'] creates a new column named "total" that holds the row-wise sum of columns "a" and "b".
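A short sketch of both halves of this step, adding a derived column and then dropping one, is given below. The columns "a" and "b" follow the explanation above; their values are made up for the example.

```python
import pandas as pd

# Hypothetical values for the columns "a" and "b"
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Add: a new column computed element-wise from existing columns
df["total"] = df["a"] + df["b"]
print(df)

# Remove: drop() returns a new DataFrame without the listed columns
df = df.drop(columns=["b"])
print(df)
```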

6. Grouping Data (GroupBy): Grouping allows you to organize data into categories and compute values for each group, for example sums, counts or averages.

Output

[Figure: Grouping Data]

Explanation: df.groupby('category') divides the dataset based on the "category" column.
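To show how a grouping is usually followed by an aggregation, here is a minimal sketch. The "category" column follows the explanation above, while the "value" column and all of the data are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: two categories with a numeric column to aggregate
df = pd.DataFrame({
    "category": ["A", "B", "A", "B", "A"],
    "value": [10, 20, 30, 40, 50],
})

# Split the rows by category, then select the column to aggregate
grouped = df.groupby("category")["value"]

sums = grouped.sum()      # total per category
means = grouped.mean()    # average per category
counts = grouped.count()  # number of non-null values per category

print(sums)
print(means)
print(counts)
```

Each result is a Series indexed by the distinct values of "category".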

To learn Pandas from basic to advanced topics, refer to the Pandas tutorial.


URL: https://www.geeksforgeeks.org/pandas/introduction-to-pandas-in-python/