CRUD stands for Create, Read, Update and Delete. These are the four fundamental operations we use when working with data in Pandas. Whether we are creating a DataFrame from scratch, analyzing existing data, modifying values or saving our results, these operations underpin most Pandas workflows.
Let's go through each operation step by step to see how it makes data manipulation easier.
Creating a dataset in Pandas means building a DataFrame, which is the main data structure in Pandas. We can create a DataFrame in several ways, such as reading from a file or building one directly from Python objects like dictionaries, lists or arrays.
1. Creating a DataFrame from a Dictionary
This is one of the easiest and most commonly used ways to create a dataset in Pandas.
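A minimal sketch of the dictionary approach follows. The column names match the output shown later in this article, while the names and ages are hypothetical sample values.

```python
import pandas as pd

# Hypothetical sample data; only the column names and cities come from the article
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
}

# Each dictionary key becomes a column; each list holds that column's values
df = pd.DataFrame(data)
print(df)
```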
Output:
2. Creating a DataFrame from Lists
We can also create a DataFrame by combining lists.
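One way to combine lists is to zip them into rows and name the columns explicitly. This is a sketch with hypothetical values, not necessarily the exact code the article used.

```python
import pandas as pd

# Hypothetical sample lists, one per column
names = ["Alice", "Bob", "Charlie"]
ages = [25, 30, 35]
cities = ["NY", "LA", "SF"]

# zip() pairs the lists element by element, producing one tuple per row
df = pd.DataFrame(list(zip(names, ages, cities)), columns=["Name", "Age", "City"])
print(df)
```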
Output:
3. Creating a DataFrame from a CSV File
We can also create a DataFrame by reading an external file such as a CSV. This example uses a sample car.csv dataset.
You can download the dataset from here.
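The sketch below shows the general pattern with pd.read_csv. Because the article's car.csv is not reproduced here, the code first writes a small stand-in file; its columns (Brand, Model, Year) are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Write a tiny stand-in CSV so the example is self-contained.
# These columns are hypothetical, not those of the article's car.csv.
csv_text = "Brand,Model,Year\nToyota,Corolla,2018\nHonda,Civic,2020\n"
path = os.path.join(tempfile.mkdtemp(), "car.csv")
with open(path, "w") as f:
    f.write(csv_text)

# read_csv uses the header row for column names and infers column types
df = pd.read_csv(path)
print(df.head())
```

With a real file, only the pd.read_csv(path) call is needed.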
Output:
Now that we've created a dataset using the Create operation, let's look at the Read operation. This step is all about accessing and understanding our data. Pandas provides simple methods to view our dataset, check its structure and analyze its contents.
1. Viewing Rows in a DataFrame
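A short sketch of viewing rows with head() and tail(), using hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

first_rows = df.head(3)  # first 3 rows (head() defaults to 5)
last_rows = df.tail(2)   # last 2 rows
print(first_rows)
print(last_rows)
```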
Output:
2. Exploring Columns of the dataset
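The column labels are available through the columns attribute. A minimal sketch, assuming the same hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# df.columns is an Index object holding the column labels
cols = df.columns
print(cols)
```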
Output:
Index(['Name', 'Age', 'City'], dtype='object')
3. Checking Data Types with dtypes
We use df.dtypes to check the data type of each column before performing further operations.
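A brief sketch of inspecting column types with the dtypes attribute, using hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# dtypes is an attribute (no parentheses) returning one dtype per column
types = df.dtypes
print(types)
```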
Output:
4. Generating Descriptive Statistics with describe()
describe() summarizes the numerical columns with statistics such as count, mean, standard deviation, minimum, quartiles and maximum, which gives a quick overview of the data during exploration.
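As an illustration, the sketch below calls describe() on hypothetical sample data. By default only numeric columns are summarized, so here the result contains just the Age column.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

# Rows of the result are the statistics; columns are the numeric columns of df
stats = df.describe()
print(stats)
```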
Output:
3. Filtering Columns
Accessing a single Column.
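Selecting one column with square brackets returns a Series. A minimal sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# A single column label inside [] returns a pandas Series
ages = df["Age"]
print(ages)
```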
Output:
4. Accessing Multiple columns
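Passing a list of column labels returns a smaller DataFrame. A sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# Note the double brackets: the inner list names the columns to keep
subset = df[["Name", "City"]]
print(subset)
```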
Output:
5. Finding Unique Values in a Column
The unique() method returns the distinct values in a column, with duplicates removed.
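A small sketch of unique() follows. The sample data is hypothetical and deliberately repeats one city so the effect is visible.

```python
import pandas as pd

# Hypothetical sample data; "NY" appears twice on purpose
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve", "Frank"],
    "City": ["NY", "LA", "SF", "Houston", "Seattle", "NY"],
})

# unique() returns distinct values in order of first appearance
cities = df["City"].unique()
print(cities)
```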
Output:
['NY' 'LA' 'SF' 'Houston' 'Seattle']
6. Filtering Rows (Conditional Filtering)
Single Condition Filtering.
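Filtering on one condition uses a boolean mask. The threshold and data below are hypothetical:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

# df["Age"] > 30 yields a True/False Series; indexing with it keeps the True rows
older = df[df["Age"] > 30]
print(older)
```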
Output:
7. Filtering with Multiple Conditions (AND/OR Logic)
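Conditions can be combined with & (AND) and | (OR). Each condition needs its own parentheses because these operators bind more tightly than comparisons. The sketch uses hypothetical data and thresholds:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

# AND: both conditions must hold
and_result = df[(df["Age"] > 30) & (df["City"] != "SF")]

# OR: at least one condition must hold
or_result = df[(df["Age"] < 30) | (df["City"] == "Seattle")]

print(and_result)
print(or_result)
```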
Output:
8. Indexing in Pandas
Integer-Based Indexing with iloc.
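With iloc, rows and columns are addressed by integer position, starting at 0. A sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

row = df.iloc[1]       # second row, returned as a Series
value = df.iloc[0, 2]  # first row, third column (City)
print(row)
print(value)
```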
Output:
Output:
NY
10. Slicing Rows
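Row slices follow Python's usual rule that the end position is excluded. This sketch uses iloc on hypothetical data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
})

# Positions 1, 2 and 3; position 4 is excluded
sliced = df.iloc[1:4]
print(sliced)
```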
Output:
11. Label-Based Indexing
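With loc, rows and columns are addressed by label, and label slices include the end label. The sketch below uses hypothetical data and, as an assumption for illustration, also sets Name as the index to show lookups by a string label.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

# Label slice 1:3 includes rows labelled 1, 2 and 3
part = df.loc[1:3, ["Name", "City"]]

# Using a column as the index allows lookups by its values
by_name = df.set_index("Name")
bob_city = by_name.loc["Bob", "City"]

print(part)
print(bob_city)
```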
Output:
Now, we will continue with the Update (U) and Delete (D) operations, which are important for modifying and managing data efficiently.
The Update operation allows us to modify existing data within a DataFrame. Whether we are changing specific values, updating entire columns or applying conditions to update data, Pandas makes it simple.
We will use the following dataset for the update operations.
Output:
1. Updating a Single Value: We can update a single value in a specific row and column using loc or iloc.
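A minimal sketch of both forms, with hypothetical data and replacement values:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

df.loc[0, "Age"] = 26       # by row label and column label
df.iloc[1, 2] = "Chicago"   # by row position and column position (City)
print(df)
```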
Output:
2. Updating an Entire Column: We can update an entire column by assigning a new list, series or value.
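The sketch below assigns a list to one column and a single value to another. The data and new values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# A list must have one entry per row
df["Age"] = [26, 31, 36]

# A single value is broadcast to every row
df["City"] = "Remote"
print(df)
```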
Output:
3. Updating Based on a Condition: We can apply conditions to update values in a DataFrame.
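One common pattern passes a boolean mask to loc together with the column to change. The condition and replacement value here are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["NY", "LA", "SF", "Houston", "Seattle"],
})

# Only rows where Age > 30 have their City overwritten
df.loc[df["Age"] > 30, "City"] = "Remote"
print(df)
```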
Output:
The Delete operation allows us to remove data from a DataFrame. We can drop rows, columns or specific values, which gives us flexibility in cleaning and manipulating datasets. For the delete operations we will use the dataset below.
Output:
1. Delete a Column: We can delete a column using the drop() method.
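A sketch of dropping a column from hypothetical data. Note that drop() returns a new DataFrame and leaves the original unchanged unless the result is assigned back.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "SF"],
})

# columns= names the column labels to remove
without_city = df.drop(columns=["City"])
print(without_city)
```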
Output:
2. Delete a Row: Similarly we can delete rows by specifying the index.
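Dropping a row by its index label looks like this on hypothetical data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

# index= names the row labels to remove; remaining labels are not renumbered
without_bob = df.drop(index=1)
print(without_bob)
```

Calling reset_index(drop=True) on the result would renumber the rows from 0 if that is preferred.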
Output:
3. Delete Rows Based on Condition: We can delete rows based on conditions.
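Two equivalent sketches follow: keeping the rows that fail the condition, or dropping the index labels of the rows that match it. The data and threshold are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 35, 40, 45],
})

# Option 1: keep only rows where Age is at most 30
kept = df[df["Age"] <= 30]

# Option 2: drop the index labels of rows where Age is above 30
dropped = df.drop(df[df["Age"] > 30].index)

print(kept)
```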
Output:
4. Delete entire dataset: To delete the entire DataFrame, we can use the del statement or reassign it to an empty DataFrame.
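Both approaches can be sketched as follows, using hypothetical data. The scratch variable exists only to demonstrate del without losing df.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Option 1: del removes the variable name entirely
scratch = df.copy()
del scratch
name_removed = False
try:
    scratch  # the name no longer exists, so this raises NameError
except NameError:
    name_removed = True

# Option 2: reassign to an empty DataFrame; the name stays but holds no data
df = pd.DataFrame()
```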
Deleting the dataset produces no output because no data is left to display. With these basic CRUD operations we can handle even complex data manipulation tasks in Pandas.