URL: https://www.geeksforgeeks.org/python/string-manipulations-in-pandas-dataframe/

String Manipulations in Pandas DataFrame

Last Updated : 10 Dec, 2025

String manipulation means cleaning, transforming, and processing text data so that it is suitable for analysis. Pandas exposes a large set of string methods through the .str accessor, which makes it easy to work with text columns in a DataFrame: converting case, trimming whitespace, splitting, extracting patterns, replacing values, and more.

In this article, we will perform string manipulation using the dataset shown below:
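The code that creates this dataset is not reproduced here. A minimal sketch, assuming the values from the table that follows and the hypothetical variable name df (a reconstruction, not necessarily the author's original code):

```python
import numpy as np
import pandas as pd

# Reconstructed example data; np.nan marks the missing name in row 5.
df = pd.DataFrame({
    'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena'],
    'City': ['Berlin', 'Madrid', 'Tokyo', 'Warsaw', 'Athens', 'Oslo', 'Lisbon'],
})
print(df)
```

The exact dtype labels printed by later examples may differ slightly between pandas versions.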


Output
      Name    City
0    Lukas  Berlin
1    Sofia  Madrid
2  Hiroshi   Tokyo
3    Marta  Warsaw
4   Yannis  Athens
5      NaN    Oslo
6    Elena  Lisbon

Column Datatype in Pandas

Columns that look like strings are sometimes stored internally under a different dtype. To get consistent string operations, it is often useful to convert the selected columns to the string dtype.

Below, we convert the entire DataFrame to string type using .astype('string').
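A minimal sketch of the conversion, assuming the example DataFrame is named df. Since astype returns a new DataFrame, the result is stored under the hypothetical name df_str and df itself is left unchanged:

```python
import numpy as np
import pandas as pd

# Reconstructed example data (an assumption based on the article's table).
df = pd.DataFrame({
    'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena'],
    'City': ['Berlin', 'Madrid', 'Tokyo', 'Warsaw', 'Athens', 'Oslo', 'Lisbon'],
})

# Convert every column to pandas' dedicated string dtype.
df_str = df.astype('string')
print(df_str.dtypes)
```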

After the conversion, every column can be used with the .str accessor, including columns that originally held non-string values.

String Operations in Pandas

Below are the commonly used string manipulation methods in Pandas, explained with short examples.

1. lower(): This method converts every character in the column to lowercase, ensuring consistent text formatting.
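A minimal sketch of the call, assuming the example data lives in a DataFrame named df (rebuilt here so the snippet runs on its own):

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Lowercase every name; the missing value stays missing.
lowered = df['Name'].str.lower()
print(lowered)
```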

Output

0 lukas
1 sofia
2 hiroshi
3 marta
4 yannis
5 NaN
6 elena
Name: Name, dtype: object

2. upper(): This method transforms all characters in the column to uppercase for uniform, standardized text.
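The same pattern works for uppercase. A short sketch, again assuming a DataFrame named df that holds the example names:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Uppercase every name; the missing value stays missing.
uppered = df['Name'].str.upper()
print(uppered)
```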

Output

0 LUKAS
1 SOFIA
2 HIROSHI
3 MARTA
4 YANNIS
5 NaN
6 ELENA
Name: Name, dtype: object

3. strip(): This method removes leading and trailing whitespace (spaces, tabs, newlines) from each string to clean the data.
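A minimal sketch, assuming a DataFrame named df with the example names. The reconstructed names carry no padding, so the hypothetical padded Series is added only to make the effect visible:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Strip surrounding whitespace from every name.
stripped = df['Name'].str.strip()
print(stripped)

# Hypothetical padded values, not part of the article's data.
padded = pd.Series(['  Lukas ', 'Sofia\t'])
cleaned = padded.str.strip().tolist()  # ['Lukas', 'Sofia']
```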

Output

0 Lukas
1 Sofia
2 Hiroshi
3 Marta
4 Yannis
5 NaN
6 Elena
Name: Name, dtype: object

4. split(): This method splits each string into a list of parts based on a given separator.
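A minimal sketch, assuming a DataFrame named df with the example names. The separator 'a' is inferred from the output that follows and is an assumption:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Split each name on the letter 'a'; each result is a list of parts.
parts = df['Name'].str.split('a')
print(parts)
```

A name that ends in the separator, such as Sofia, yields an empty string as its last part.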

Output

0 [Luk, s]
1 [Sofi, ]
2 [Hiroshi]
3 [M, rt, ]
4 [Y, nnis]
5 NaN
6 [Elen, ]
Name: Name, dtype: object

5. len(): This method returns the number of characters in each string in the column. Because the column contains a missing value, the result shown here is stored as float64.
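A minimal sketch, assuming a DataFrame named df with the example names:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Character count per name; the missing name yields a missing length.
lengths = df['Name'].str.len()
print(lengths)
```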

Output

0 5.0
1 5.0
2 7.0
3 5.0
4 6.0
5 NaN
6 5.0
Name: Name, dtype: float64

6. cat(): This method concatenates all strings in the column into a single string using a chosen separator. By default, missing values are omitted from the result.
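A minimal sketch, assuming a DataFrame named df with the example names and a comma-plus-space separator (inferred from the output that follows):

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Join all names into one string; with no `others` argument the NaN is skipped.
joined = df['Name'].str.cat(sep=', ')
print(joined)
```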

Output

Lukas, Sofia, Hiroshi, Marta, Yannis, Elena

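7. get_dummies(): This method creates one indicator (one-hot encoded) column for each unique string in the column, which is useful for modeling. Strings that hold several values are split on a separator first (the default is '|').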
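A minimal sketch, assuming the encoding is applied to the City column of a DataFrame named df (inferred from the column names in the output that follows):

```python
import pandas as pd

# Reconstructed City column from the article's example table.
df = pd.DataFrame({'City': ['Berlin', 'Madrid', 'Tokyo', 'Warsaw', 'Athens', 'Oslo', 'Lisbon']})

# One indicator column per distinct city, sorted alphabetically.
dummies = df['City'].str.get_dummies()
print(dummies)
```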

Output

   Athens  Berlin  Lisbon  Madrid  Oslo  Tokyo  Warsaw
0       0       1       0       0     0      0       0
1       0       0       0       1     0      0       0
2       0       0       0       0     0      1       0
3       0       0       0       0     0      0       1
4       1       0       0       0     0      0       0
5       0       0       0       0     1      0       0
6       0       0       1       0     0      0       0

8. startswith(): This method checks whether each string begins with the specified prefix.
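A minimal sketch, assuming a DataFrame named df with the example names. The prefix 'E' is an assumption inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# True where the name begins with 'E'.
starts = df['Name'].str.startswith('E')
print(starts)
```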

Output

0 False
1 False
2 False
3 False
4 False
5 NaN
6 True
Name: Name, dtype: object

9. endswith(): This method checks whether each string ends with the specified suffix.
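A minimal sketch, assuming a DataFrame named df with the example names. The suffix 'a' is an assumption inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# True where the name ends with 'a'.
ends = df['Name'].str.endswith('a')
print(ends)
```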

Output

0 False
1 True
2 False
3 True
4 False
5 NaN
6 True
Name: Name, dtype: object

10. replace(): This method replaces occurrences of a specific substring or pattern with a new value.
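A minimal sketch, assuming a DataFrame named df with the example names. The replacement of 'Elena' with 'Emily' is inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Literal (non-regex) replacement of one name with another.
replaced = df['Name'].str.replace('Elena', 'Emily', regex=False)
print(replaced)
```

Passing regex=False treats the search text literally; regex=True would interpret it as a regular expression.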

Output

0 Lukas
1 Sofia
2 Hiroshi
3 Marta
4 Yannis
5 NaN
6 Emily
Name: Name, dtype: object

11. repeat(): This method duplicates each string a given number of times.
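A minimal sketch, assuming a DataFrame named df with the example names and a repeat count of 2 (inferred from the output that follows):

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Repeat every name twice, back to back.
repeated = df['Name'].str.repeat(2)
print(repeated)
```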

Output

0 LukasLukas
1 SofiaSofia
2 HiroshiHiroshi
3 MartaMarta
4 YannisYannis
5 NaN
6 ElenaElena
Name: Name, dtype: object

12. count(): This method counts how many times a specific substring or pattern appears in each string.
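A minimal sketch, assuming a DataFrame named df with the example names. The counted pattern 'a' is an assumption inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Number of lowercase 'a' characters in each name.
counts = df['Name'].str.count('a')
print(counts)
```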

Output

0 1.0
1 1.0
2 0.0
3 2.0
4 1.0
5 NaN
6 1.0
Name: Name, dtype: float64

13. find(): This method returns the lowest index at which a substring occurs in each string, or -1 if the substring is not found.
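A minimal sketch, assuming a DataFrame named df with the example names. The searched substring 'a' is an assumption inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Zero-based position of the first 'a' in each name, -1 when absent.
positions = df['Name'].str.find('a')
print(positions)
```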

Output

0 3.0
1 4.0
2 -1.0
3 1.0
4 1.0
5 NaN
6 4.0
Name: Name, dtype: float64

14. findall(): This method returns a list of all occurrences of a pattern found in each string.
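A minimal sketch, assuming a DataFrame named df with the example names. The pattern 'a' is an assumption inferred from the output that follows:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Every match of the regular expression 'a' in each name, as a list.
matches = df['Name'].str.findall('a')
print(matches)
```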

Output

0 [a]
1 [a]
2 []
3 [a, a]
4 [a]
5 NaN
6 [a]
Name: Name, dtype: object

15. islower(): This method checks whether all characters in each string are lowercase.
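A minimal sketch, assuming a DataFrame named df with the example names. Every name starts with a capital letter, so no row qualifies as lowercase:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# True only for strings whose letters are all lowercase.
is_lower = df['Name'].str.islower()
print(is_lower)
```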

Output

0 False
1 False
2 False
3 False
4 False
5 NaN
6 False
Name: Name, dtype: object

16. isupper(): This method checks whether all characters in each string are uppercase.
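A minimal sketch, assuming a DataFrame named df with the example names. The names are in title case, so no row qualifies as uppercase:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# True only for strings whose letters are all uppercase.
is_upper = df['Name'].str.isupper()
print(is_upper)
```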

Output

0 False
1 False
2 False
3 False
4 False
5 NaN
6 False
Name: Name, dtype: object

17. isnumeric(): This method checks whether each string contains only numeric characters.
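A minimal sketch, assuming a DataFrame named df with the example names. The hypothetical samples Series is not part of the article's data and is included only to show a case that returns True:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Names contain letters only, so every non-missing row is False.
is_numeric = df['Name'].str.isnumeric()
print(is_numeric)

# Hypothetical contrast values: only the all-digit string qualifies.
samples = pd.Series(['123', '12a', '4.5'])
sample_flags = samples.str.isnumeric().tolist()  # [True, False, False]
```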

Output

0 False
1 False
2 False
3 False
4 False
5 NaN
6 False
Name: Name, dtype: object

18. swapcase(): This method swaps uppercase letters to lowercase and lowercase letters to uppercase for each string.
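A minimal sketch, assuming a DataFrame named df with the example names:

```python
import numpy as np
import pandas as pd

# Reconstructed Name column from the article's example table.
df = pd.DataFrame({'Name': ['Lukas', 'Sofia', 'Hiroshi', 'Marta', 'Yannis', np.nan, 'Elena']})

# Flip the case of every letter in each name.
swapped = df['Name'].str.swapcase()
print(swapped)
```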

Output

0 lUKAS
1 sOFIA
2 hIROSHI
3 mARTA
4 yANNIS
5 NaN
6 eLENA
Name: Name, dtype: object
