URL: https://www.analyticsvidhya.com/blog/2020/07/5-striking-pandas-tips-and-tricks-for-analysts-and-data-scientists/

5 Striking Pandas Tips and Tricks for Analysts and Data Scientists

Ram Dewani Last Updated : 29 Oct, 2024
5 min read

Overview

  • Pandas provides tools and techniques to make data analysis easier in Python
  • We’ll discuss tips and tricks that will help you become a better and more efficient analyst

Introduction

Efficiency has become a key ingredient for the timely completion of work. One is not expected to spend more than a reasonable amount of time to get things done, especially when the task involves basic coding. One area where data scientists are expected to be fastest is working with the Pandas library in Python.

Pandas is an open-source package for data analysis and data manipulation in Python. It provides fast and flexible data structures that make it easy to work with relational and structured data.

If you’re new to Pandas, go ahead and enroll in this free course. It will guide you through all the ins and outs of this wonderful Python library and set you up for your data analysis journey. This is the sixth part of my Data Science hacks, tips, and tricks series. I highly recommend going through the previous articles to become a more efficient data scientist or analyst.

I have also converted my learning into a free course that you can check out:

Also, if you have your own Data Science hacks, tips, and tricks, you can share them with the open community on this GitHub repository: Data Science hacks, tips and tricks on GitHub.

Table of Contents

  • Pandas Hack #1 – Conditional Selection of Rows
  • Pandas Hack #2 – Binning of data
  • Pandas Hack #3 – Grouping Data
  • Pandas Hack #4 – Pandas mapping
  • Pandas Hack #5 – Conditional Formatting Pandas DataFrame

Pandas Hack #1 – Conditional Selection of Rows

Data exploration is an integral step in finding out the properties of a dataset. Pandas provides a quick and easy way to perform all sorts of analysis. One important analysis is the conditional selection of rows, also known as filtering.

The conditional selection of rows can be based on a single condition or on multiple conditions combined with logical operators in a single statement.

For example, I’m taking up a dataset on loan prediction. You can check out the dataset here.

We are going to select the rows of customers who haven’t graduated and have an income of 5400 or less. Let us see how to do it.

Note: Remember to put each condition inside its own parentheses. The & operator binds more tightly than comparison operators such as == and <=, so without the parentheses you’ll set yourself up for an error.

The following code performs this selection:

import pandas as pd

# Load the loan prediction training data
data = pd.read_csv('loan_train.csv')
print(data[['Education', 'ApplicantIncome']].head())

print('\n\nConditional Selection of Rows\n\n')

# Keep rows where both conditions hold; each condition sits in its own parentheses
data_2 = data.loc[(data['Education'] == 'Not Graduate') & (data['ApplicantIncome'] <= 5400)]

print('\n\nFiltered Data\n\n')
print(data_2[['Education', 'ApplicantIncome']].head())
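
If you don't have the CSV file at hand, here is a minimal, self-contained sketch of the same idea. The small DataFrame is a made-up stand-in for the loan data (the values are purely illustrative), and the names not_grad and low_income are my own. Giving each condition a name is one way to sidestep the parentheses issue, and it also shows how the OR (|) and NOT (~) operators combine conditions.

```python
import pandas as pd

# Hypothetical stand-in for the loan data; the values are made up for illustration.
data = pd.DataFrame({
    "Education": ["Graduate", "Not Graduate", "Not Graduate", "Graduate"],
    "ApplicantIncome": [5849, 4583, 6000, 2583],
})

# Store each condition as a named boolean mask (one True/False per row).
not_grad = data["Education"] == "Not Graduate"
low_income = data["ApplicantIncome"] <= 5400

both = data.loc[not_grad & low_income]         # AND: both conditions hold
either = data.loc[not_grad | low_income]       # OR: at least one condition holds
neither = data.loc[~(not_grad | low_income)]   # NOT: neither condition holds

print(both)
```

Named masks keep long filters readable, and they can be reused across several selections.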

Pandas Hack #2 – Binning of data

Data can be of two types, continuous and categorical, and which one we need depends on our analysis. Sometimes we do not require the exact value of a continuous variable, only the group it belongs to. This is where binning comes into play.

For instance, suppose you have a continuous variable in your data, age, but your analysis requires an age group such as child, teenager, adult, or senior citizen. Binning solves exactly this problem.

To perform binning, we use the cut() function. This is useful for going from a continuous variable to a categorical variable.

Let us check out the video to get a better idea!

https://youtu.be/WQagYXIFjns
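
For readers who prefer code to video, the age-group example can be sketched with pd.cut() roughly as follows. The ages, the bin edges, and the variable names are my own assumptions for illustration. They are not taken from the video, so adjust the cut points to whatever your analysis needs.

```python
import pandas as pd

# Made-up ages for illustration.
ages = pd.Series([4, 15, 32, 70, 12, 19], name="age")

# Assumed bin edges: (0, 12], (12, 19], (19, 59], (59, 120].
# By default pd.cut includes the right edge of each interval.
bins = [0, 12, 19, 59, 120]
labels = ["child", "teenager", "adult", "senior citizen"]

# Convert the continuous ages into an ordered categorical variable.
age_group = pd.cut(ages, bins=bins, labels=labels)

print(pd.DataFrame({"age": ages, "age_group": age_group}))
```

Note that there must be exactly one label fewer than there are bin edges, and any value falling outside the outermost edges becomes NaN.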

Pandas Hack #3 – Grouping Data

Grouping is an operation that data scientists and analysts perform almost daily. Pandas provides an essential function for it: groupby().

The Groupby operation involves the splitting of an object based on certain conditions, applying a function, and then combining the results.

Let us again take the loan prediction dataset. Say I want to look at the average loan amount given to people from different property areas: Rural, Semiurban, and Urban. Take a moment to understand this problem statement and think about how you can solve it.

Well, pandas groupby can solve this problem very efficiently. First, we split the data according to the property area. Second, we apply the mean() function to each of the categories. Finally, we combine the results and print them as a new dataframe.
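
The split, apply, and combine steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the column names Property_Area and LoanAmount are my guess at how the loan dataset labels these fields, and the numbers are made up so the example runs without the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for the loan data; column names and values are assumptions.
data = pd.DataFrame({
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban", "Rural", "Semiurban"],
    "LoanAmount": [120.0, 100.0, 150.0, 180.0, 140.0, 130.0],
})

# Split by property area, apply mean() to the loan amount of each group,
# and combine the per-group results into a single Series.
mean_loan = data.groupby("Property_Area")["LoanAmount"].mean()

# Turn the result into a regular dataframe with a descriptive column name.
mean_loan_df = mean_loan.reset_index(name="MeanLoanAmount")

print(mean_loan_df)
```

Swapping mean() for another aggregation such as median() or count() works the same way.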

Pandas Hack #4 – Pandas mapping

This is yet another important operation that provides high flexibility and practical applications.

Pandas map() is used to map each value in a Series to another value according to an input correspondence. This input may be a Series, a dictionary, or even a function.

Let us take up an interesting example. We have a dummy employee dataset consisting of the following columns: name, age, profession, and city. Now you want to add another column stating the corresponding state. How would you do it? If the dataset has only ten rows, you might do it manually, but what if you have thousands of rows? It would be much more advantageous to use the pandas map.

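Note – map() as described here, accepting a Series, dictionary, or function, is defined on Series. (Since pandas 2.1, DataFrame also has a map() method, but it only applies a function elementwise.)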
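
Here is a minimal sketch of the employee example using a dictionary as the input correspondence. The employee records and the city-to-state lookup are invented for illustration, and the names employees and city_to_state are my own.

```python
import pandas as pd

# Dummy employee data; all records are made up for illustration.
employees = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", "John"],
    "age": [28, 35, 41, 30],
    "profession": ["Analyst", "Engineer", "Doctor", "Teacher"],
    "city": ["Mumbai", "Bengaluru", "Jaipur", "Mumbai"],
})

# Hypothetical lookup from city to state.
city_to_state = {
    "Mumbai": "Maharashtra",
    "Bengaluru": "Karnataka",
    "Jaipur": "Rajasthan",
}

# map() looks up each city in the dictionary and returns the matching state.
employees["state"] = employees["city"].map(city_to_state)

print(employees)
```

A city that is missing from the dictionary would come back as NaN, which is a handy way to spot gaps in the lookup.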

Pandas Hack #5 – Conditional Formatting Pandas DataFrame

This is one of my favorite Pandas hacks. It gives me the power to visually pinpoint the data that meets a certain condition.

You can use the Pandas style property to apply conditional formatting to your dataframe. Conditional formatting is the operation in which you apply visual styling to the dataframe based on some condition.

While Pandas provides an abundant number of styling operations, I’m going to show you a simple one here. For example, we have the sales data for each salesperson, and I want to highlight in green the sales values that are higher than 80.

Note – We have used the applymap() function here since we want to apply our style function elementwise. (In pandas 2.1 and later, Styler.applymap() is deprecated in favor of Styler.map().)
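
A rough sketch of this highlighting might look like the following. The sales figures and the helper names highlight_high and style_sales are my own assumptions, not the article's original code. The helper picks Styler.map() when it is available and falls back to Styler.applymap() on older pandas releases.

```python
import pandas as pd


def highlight_high(value, threshold=80):
    """Return a CSS rule for one cell: green text above the threshold, else no style."""
    return "color: green" if value > threshold else ""


def style_sales(df, columns):
    """Apply highlight_high elementwise to the given columns and return a Styler."""
    styler = df.style  # needs the optional jinja2 dependency
    # Newer pandas calls the elementwise method map(); older releases call it applymap().
    elementwise = styler.map if hasattr(styler, "map") else styler.applymap
    return elementwise(highlight_high, subset=columns)


# Made-up sales data for illustration.
sales = pd.DataFrame({
    "salesperson": ["Asha", "Ravi", "Meera", "John"],
    "sales": [72, 95, 80, 88],
})
```

In a Jupyter notebook, evaluating style_sales(sales, ["sales"]) as the last expression of a cell should render the table with 95 and 88 shown in green. The value 80 stays unstyled because the condition is strictly "higher than 80".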

End Notes

To summarize, in this article we covered five useful Pandas hacks, tips, and tricks across various pandas modules and functions. I hope these hacks will help you with day-to-day niche tasks and save you a lot of time. In case you are completely new to Python, I highly recommend this free course.

Let me know your Data Science hacks, tips, and tricks in the comments section below!

Product Growth Analyst at Analytics Vidhya. I'm always curious to deep dive into data, process it, polish it so as to create value. My interest lies in the field of marketing analytics.

Responses From Readers

The data frame styling is interesting. Is this part of pandas or does it only apply within a Jupyter notebook? Can I use it work with styles in an Excel spreadsheet?

Manuel Cestari

The explanation about pandas is very valuable and understandable. Thanks!

Olabode James

Nice work! Can be better with embedding the code snippet in the post like the conditional selection trick.
