
URL: https://www.analyticsvidhya.com/blog/2024/01/how-to-sort-pandas-dataframe/

How to Sort Pandas DataFrame?

Deepsandhya Shukla Last Updated : 30 Jan, 2024
5 min read

Introduction

Pandas DataFrame is a powerful data structure in Python for efficient data manipulation and analysis. Sorting is a fundamental operation when working with data, because it organizes a dataset and makes it easier to understand. This article explores the sorting techniques and methods available for a Pandas DataFrame, with examples.

What is Pandas DataFrame?

Pandas DataFrame is a two-dimensional labeled data structure with columns of potentially different types. It is similar to a table in a relational database or a spreadsheet with rows and columns. Each column in a DataFrame can be of a different data type, such as integers, floats, strings, or even complex objects.

Why Is Sorting Important in Pandas DataFrame?

Sorting is important in Pandas DataFrame for several reasons. It helps in:

Organizing the data

Sorting allows us to arrange the data in a specific order, making it easier to analyze and interpret.

Identifying patterns

Sorting helps identify patterns and trends in the data by arranging it meaningfully.

Filtering and querying

Sorting can be useful when filtering or querying the data based on specific criteria.

Data visualization

Sorting the data can enhance data visualization by presenting it in a more structured and meaningful way.

Sorting Techniques in Pandas DataFrame

There are several techniques available in Pandas DataFrame for sorting the data:

Sorting by Single Column

Sorting by a single column is the most common sorting technique. It arranges the rows of the DataFrame based on the values in a single column. For example, we can sort a DataFrame of students based on their grades in ascending or descending order.

Sorting by Multiple Columns

Sorting by multiple columns allows us to sort the DataFrame based on multiple criteria. For example, we can sort a DataFrame of employees based on their salary and age.
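As a minimal sketch of a multi-column sort, the snippet below passes a list of column names to sort_values(). The employee data is invented for illustration. The ascending parameter also accepts a list, so each column can have its own direction.

```python
import pandas as pd

# Hypothetical employee data, invented for this illustration
employees = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob', 'Dana'],
    'Age': [25, 30, 20, 28],
    'Salary': [50000, 60000, 50000, 60000],
})

# Highest salary first; rows with equal salary are ordered youngest first
by_salary_then_age = employees.sort_values(
    by=['Salary', 'Age'], ascending=[False, True]
)
print(by_salary_then_age)
```

The first column in the list is the primary sort key, and later columns only break ties among rows that share the earlier values.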

Sorting in Ascending Order

Sorting in ascending order arranges the data from the smallest value to the largest value. It is the default sorting order in Pandas DataFrame.

Sorting in Descending Order

Sorting in descending order arranges the data from the largest value to the smallest value. It can be useful when we want to find the top or bottom values in the data.

Sorting with Null Values

Sorting with null values can be tricky. By default, sort_values() places null values (NaN) at the end of the result, whether the order is ascending or descending. The na_position parameter changes this behavior: na_position='first' moves null values to the start.
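The following sketch shows both placements of null values. The scores data is made up for illustration; na_position is a documented parameter of sort_values().

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one missing value
scores = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                       'Score': [88.0, np.nan, 75.0]})

# Default behavior: the NaN row is placed last
nulls_last = scores.sort_values(by='Score')

# na_position='first' places the NaN row first
nulls_first = scores.sort_values(by='Score', na_position='first')

print(nulls_last)
print(nulls_first)
```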

Sorting Methods in Pandas DataFrame

Pandas provides several methods for sorting the DataFrame:

sort_values() Method

The sort_values() method is the primary method for sorting a DataFrame. It allows us to sort the DataFrame based on one or more columns. We can specify the sorting order (ascending or descending) and how to handle null values.

Example

import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 20],
                   'Salary': [50000, 60000, 45000]})
sorted_df = df.sort_values(by='Salary', ascending=False)
print(sorted_df)

Output

    Name  Age  Salary
1  Alice   30   60000
0   John   25   50000
2    Bob   20   45000

sort_index() Method

The sort_index() method allows us to sort the DataFrame based on the index. It rearranges the rows of the DataFrame based on the index values.

Example

import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 20],
                   'Salary': [50000, 60000, 45000]})
sorted_df = df.sort_index()
print(sorted_df)

Output

    Name  Age  Salary
0   John   25   50000
1  Alice   30   60000
2    Bob   20   45000
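In the example above the index is already in order, so the output matches the input. As a small sketch of a case where sort_index() changes the row order, the snippet below assigns an out-of-order index (chosen arbitrarily for illustration) and sorts it in both directions.

```python
import pandas as pd

# Same kind of data, but with a deliberately shuffled index
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 20]},
                  index=[2, 0, 1])

# Ascending index order: 0, 1, 2
by_index = df.sort_index()

# Descending index order: 2, 1, 0
by_index_desc = df.sort_index(ascending=False)

print(by_index)
print(by_index_desc)
```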

nsmallest() and nlargest() Methods

The nsmallest() and nlargest() methods allow us to find the n smallest or largest values in a DataFrame. These methods are useful to find the top or bottom values based on a specific column.

Example

import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 20],
                   'Salary': [50000, 60000, 45000]})
top_2_earners = df.nlargest(2, 'Salary')
print(top_2_earners)

Output

    Name  Age  Salary
1  Alice   30   60000
0   John   25   50000
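The nsmallest() method works the same way in the other direction. A brief sketch using the same data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 20],
                   'Salary': [50000, 60000, 45000]})

# The two rows with the lowest salaries, smallest first
lowest_2_earners = df.nsmallest(2, 'Salary')
print(lowest_2_earners)
```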

Let’s explore some examples of sorting in Pandas DataFrame:

Sorting Numerical Data

Sorting numerical data is straightforward. We can use the sort_values() method to sort the DataFrame based on a numerical column.

Example

import pandas as pd
df = pd.DataFrame({'Numbers': [5, 2, 8, 1, 3]})
sorted_df = df.sort_values(by='Numbers')
print(sorted_df)

Output

   Numbers
3        1
1        2
4        3
0        5
2        8

Sorting Categorical Data

Categorical data stored as strings, such as names, is sorted alphabetically by the sort_values() method. The example below sorts the rows by the 'Names' column in ascending order.

Example

import pandas as pd
# Creating a DataFrame with a categorical column
df = pd.DataFrame({'Names': ['Alice', 'Bob', 'Charlie', 'Alice', 'David', 'Bob'],
                'Age': [25, 30, 22, 28, 35, 32],
                'Salary': [50000, 60000, 45000, 55000, 70000, 62000]})
# Sorting the DataFrame based on the 'Names' column in ascending order
sorted_df = df.sort_values(by='Names', ascending=True)
# Displaying the sorted DataFrame
print(sorted_df)

Output

     Names  Age  Salary
0    Alice   25   50000
3    Alice   28   55000
1      Bob   30   60000
5      Bob   32   62000
2  Charlie   22   45000
4    David   35   70000
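Alphabetical order is not always the meaningful one. As a sketch, assuming a hypothetical 'Size' column whose labels have a natural order, the column can be converted to an ordered categorical type with pd.Categorical so that sort_values() follows the declared category order instead of the alphabet.

```python
import pandas as pd

# Hypothetical product data, invented for this illustration
sizes = pd.DataFrame({'Item': ['shirt', 'hat', 'coat', 'scarf'],
                      'Size': ['medium', 'small', 'large', 'small']})

# Declare the logical order of the labels
size_order = ['small', 'medium', 'large']
sizes['Size'] = pd.Categorical(sizes['Size'],
                               categories=size_order,
                               ordered=True)

# Rows now sort as small < medium < large rather than alphabetically
sorted_sizes = sizes.sort_values(by='Size')
print(sorted_sizes)
```

Sorted as plain strings, 'large' would come before 'medium' and 'small'.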

Sorting DateTime Data

Sorting DateTime data is similar to sorting numerical data. We can use the sort_values() method to sort the DataFrame based on a DateTime column.

Example

import pandas as pd
df = pd.DataFrame({'Date': ['2022-01-01', '2022-02-01', '2022-03-01'],
                   'Sales': [100, 200, 150]})
df['Date'] = pd.to_datetime(df['Date'])
sorted_df = df.sort_values(by='Date')
print(sorted_df)

Output

        Date  Sales
0 2022-01-01    100
1 2022-02-01    200
2 2022-03-01    150
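The conversion with pd.to_datetime() matters most when dates are stored as text in a format that does not sort chronologically. The sketch below uses made-up day/month/year strings to compare the two orderings; the format string is an assumption about how the dates are written.

```python
import pandas as pd

# Hypothetical sales data with day/month/year date strings
sales = pd.DataFrame({'Date': ['15/01/2022', '01/11/2022', '01/03/2022'],
                      'Sales': [100, 200, 150]})

# Sorting the raw strings compares them character by character
as_text = sales.sort_values(by='Date')

# After conversion, the sort is chronological
sales['Date'] = pd.to_datetime(sales['Date'], format='%d/%m/%Y')
as_dates = sales.sort_values(by='Date')

print(as_text)
print(as_dates)
```

Sorted as text, the row for 15 January ends up last because its string starts with '1', while the datetime sort places it first.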

Sorting with Custom Functions

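We can also sort the DataFrame using custom functions. The key parameter of the sort_values() method accepts a function that is applied to the column values before sorting. The function receives the whole column as a Series and must return a Series of the same length; the rows are then ordered by the returned values.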

Example

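import pandas as pd
df = pd.DataFrame({'Numbers': [5, 2, 8, 1, 3]})
# The key maps each value to its remainder mod 2, so even numbers (0) sort before odd numbers (1)
# kind='stable' keeps the original row order within each group
sorted_df = df.sort_values(by='Numbers', key=lambda x: x % 2, kind='stable')
print(sorted_df)

Output

   Numbers
1        2
2        8
0        5
3        1
4        3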
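Another common use of the key parameter is case-insensitive sorting of text. The following is a minimal sketch with invented names; the key function lowercases the column before comparison while the original values stay unchanged in the result.

```python
import pandas as pd

# Hypothetical names with mixed capitalization
people = pd.DataFrame({'Name': ['bob', 'Carol', 'alice']})

# Default sort: uppercase letters come before lowercase ones
case_sensitive = people.sort_values(by='Name')

# Key function lowercases the values only for the comparison
case_insensitive = people.sort_values(by='Name',
                                      key=lambda s: s.str.lower())

print(case_sensitive)
print(case_insensitive)
```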


Common Errors and Troubleshooting

Here are some common errors and troubleshooting tips when sorting Pandas DataFrame:

Handling Missing Values during Sorting

Missing values can affect the sorting order. We need to handle missing values appropriately to ensure the desired sorting behavior.

Dealing with Memory Errors during Sorting

Sorting large datasets can consume a significant amount of memory. We can optimize memory usage by selecting only the necessary columns for sorting or using chunking techniques.
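One way to apply the column-selection idea is to select the needed columns first and sort only that subset. This is a rough sketch with a tiny invented DataFrame; how much memory it saves in practice depends on the size and types of the columns left out.

```python
import pandas as pd

# Hypothetical wide table; 'notes' stands in for columns we do not need
wide = pd.DataFrame({'id': [3, 1, 2],
                     'value': [30, 10, 20],
                     'notes': ['c', 'a', 'b']})

# Sort only the columns required for the analysis
slim_sorted = wide[['id', 'value']].sort_values(by='id')
print(slim_sorted)
```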

Sorting Large Datasets Efficiently

Sorting large datasets can be time-consuming. Parallel processing or distributed computing techniques can improve sorting performance.

Conclusion

In conclusion, sorting is a crucial operation in Pandas DataFrame that significantly contributes to efficient data manipulation and analysis. Throughout this article, we delved into the importance of sorting in organizing and understanding data, identifying patterns, facilitating filtering and querying, and enhancing data visualization.

Mastering sorting techniques and methods in Pandas empowers data analysts and scientists to efficiently organize and analyze diverse datasets, unlocking valuable insights for informed decision-making.

