Retrieving unique values from a column in a Pandas DataFrame helps identify distinct elements, analyze categorical data, or detect duplicates. For example, if a column B contains ['B1', 'B2', 'B3', 'B4', 'B4'], the unique values are ['B1', 'B2', 'B3', 'B4'].
Here is the sample DataFrame used in this article:
    A   B   C   D   E
0  A1  B1  C1  D1  E1
1  A2  B2  C2  D2  E1
2  A3  B3  C3  D2  E1
3  A4  B4  C3  D2  E1
4  A5  B4  C3  D2  E1
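One way to build this DataFrame is from a dictionary of lists. The sketch below is a minimal reconstruction from the table above; the variable name df is an assumption made for illustration.

```python
import pandas as pd

# Sample data reconstructed from the table above.
# The variable name `df` is an assumption chosen for illustration.
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

print(df)
```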
Let's explore different methods to get unique values from a column in Pandas.
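The unique() method returns the distinct values of a column as a NumPy array (or as a pandas extension array when the column uses an extension dtype such as categorical). The values are not sorted; they appear in the order of their first occurrence.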
Example: In this example, we retrieve the unique values from column 'B'.
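A minimal sketch of this example, assuming the sample DataFrame is rebuilt under the illustrative name df:

```python
import pandas as pd

# Rebuild the sample DataFrame (the name `df` is an assumption).
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

# Distinct values of column 'B', in order of first appearance.
unique_b = df['B'].unique()
print(unique_b)
```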
Output
['B1' 'B2' 'B3' 'B4']
The nunique() method counts the number of unique values in a column. It is useful when you only need the count of distinct values rather than the values themselves. By default, missing values (NaN) are not included in the count.
Example: Here we get the number of unique values in columns 'A', 'B', 'C', and 'D'.
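A minimal sketch of this example, again assuming the sample DataFrame is available as df (an illustrative name); the loop and the dictionary name unique_counts are choices made here, not part of any fixed recipe:

```python
import pandas as pd

# Rebuild the sample DataFrame (the name `df` is an assumption).
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

# Count the distinct values in each of the four columns.
unique_counts = {col: df[col].nunique() for col in ['A', 'B', 'C', 'D']}

for col, count in unique_counts.items():
    print(f"Unique count in {col}: {count}")
```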
Output
Unique count in A: 5
Unique count in B: 4
Unique count in C: 3
Unique count in D: 2
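The drop_duplicates() method removes repeated values, keeping the first occurrence of each by default. When called on a single column, it returns a Series containing only the unique values. The index labels of the original DataFrame are preserved, so the result is not renumbered.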
Example: This code retrieves unique values from column 'C'.
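A minimal sketch of this example, assuming the sample DataFrame is rebuilt as df (an illustrative name):

```python
import pandas as pd

# Rebuild the sample DataFrame (the name `df` is an assumption).
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

# Keep the first occurrence of each value in column 'C'.
# The result is a Series that retains the original index labels.
unique_c = df['C'].drop_duplicates()
print(unique_c)
```

If a fresh 0-based index is preferred, calling reset_index(drop=True) on the result renumbers it.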
Output
0 C1
1 C2
2 C3
Name: C, dtype: object
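The value_counts() method counts the occurrences of each unique value in the column and returns the result as a Series. The unique values form the index of that Series, and by default the entries are sorted by count in descending order.
Example: Here we count how often each value occurs in column 'D' and then extract the unique values from the index of the result.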
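A minimal sketch of this example, assuming the sample DataFrame is rebuilt as df; the names counts and unique_d are illustrative:

```python
import pandas as pd

# Rebuild the sample DataFrame (the name `df` is an assumption).
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

# Occurrences of each value in column 'D', most frequent first.
counts = df['D'].value_counts()
print("Value counts in D:")
print(counts)

# The index of the result holds the unique values themselves.
unique_d = counts.index.tolist()
print("Unique values in D:", unique_d)
```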
Output
Value counts in D:
D
D2 4
D1 1
Name: count, dtype: int64
Unique values in D: ['D2', 'D1']
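You can also use Python's built-in set() function, which converts the column values into a set and thereby removes duplicates. A set is unordered, so the original order of the values is not preserved and the printed order may vary.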
Example: Here we get unique values from column 'D' using set().
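A minimal sketch of this example, assuming the sample DataFrame is rebuilt as df (an illustrative name):

```python
import pandas as pd

# Rebuild the sample DataFrame (the name `df` is an assumption).
df = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1'],
})

# Iterating a Series yields its values, so set() collapses duplicates.
# Sets are unordered: the element order in the printout is not guaranteed.
unique_d_set = set(df['D'])
print(unique_d_set)
```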
Output
{'D2', 'D1'}