Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages.
Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas str.partition() works similarly to str.split(). Instead of splitting the string at every occurrence of the separator/delimiter, it splits the string only at the first occurrence. With split, the separator is not stored anywhere; only the text around it is stored in a new list/data frame. With str.partition(), the separator is also stored, so each string yields three parts: the text before the separator, the separator itself, and the text after it.
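The difference can be seen on a tiny Series. This is a minimal sketch; the sample strings are made up for illustration and are not taken from the article's data.

```python
import pandas as pd

# Made-up sample strings, each containing the separator ', ' at least once
s = pd.Series(["a, b, c", "x, y"])

# split() breaks at every occurrence of the separator and discards it
split_result = s.str.split(", ")

# partition() breaks only at the first occurrence and keeps the separator
part_result = s.str.partition(", ", expand=False)

print(split_result.tolist())
print(part_result.tolist())
```

In the first row, split yields three pieces, while partition yields the text before the first ', ', the separator itself, and the untouched remainder 'b, c'.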
.str has to be prefixed every time before calling this method to differentiate it from Python's built-in str.partition(); otherwise, pandas raises an AttributeError, because a Series has no partition() method of its own.
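A short sketch of what happens with and without the .str prefix. The sample name is made up, and the variable names are only for illustration.

```python
import pandas as pd

s = pd.Series(["Smith, John"])  # made-up sample value

try:
    # Missing .str: the Series object itself has no partition() method
    s.partition(", ")
    raised = False
except AttributeError:
    raised = True

# Correct call: go through the .str accessor
result = s.str.partition(", ")

print(raised)
print(result)
```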
Syntax: Series.str.partition(sep=' ', expand=True)
Parameters:
sep: String value, the separator or delimiter to split the string at. Default is ' ' (a single space). Older pandas versions named this parameter pat.
expand: Boolean value. If True, returns a data frame with the three parts in separate columns; if False, returns a series holding one 3-element tuple per string. Default is True.
Return Type: Data frame if expand is True, otherwise a series of tuples
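To make the effect of the expand parameter concrete, here is a minimal sketch that relies on the default separator. The sample names are illustrative only.

```python
import pandas as pd

# Illustrative sample names separated by spaces
s = pd.Series(["Linda van der Berg", "George Pascal"])

# Defaults: separator ' ' and expand=True, so a data frame with 3 columns
as_frame = s.str.partition()

# expand=False keeps one 3-element value per row in a series
as_series = s.str.partition(expand=False)

print(type(as_frame).__name__)
print(type(as_series).__name__)
print(as_frame)
```

With expand=True the resulting columns are labelled 0, 1 and 2, holding the text before the separator, the separator, and the text after it.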
To download the CSV used in the code, click here.
In the following examples, the data frame used contains data on some employees. An image of the data frame before any operations is attached below.
[Image: data frame before any operations]
Example #1: Splitting String into List
In this example, the Name column is split at the first occurrence of ', '. The expand parameter is set to False so that each result stays in a single list-like value (a 3-element tuple in current pandas) instead of being expanded into a data frame.
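A minimal sketch of this step. The small data frame below is a made-up stand-in for the CSV (the names and the Team column are assumptions, not the actual data), so the printed rows will differ from the output image.

```python
import pandas as pd

# Made-up stand-in for the employee CSV; not the article's real data
data = pd.DataFrame({
    "Name": ["Doe, Jane", "Roe, Richard", "Smith, Anna, Jr."],
    "Team": ["Sales", "Finance", "HR"],
})

# Split Name at the first ', ' only; expand=False keeps one value per row
data["Name"] = data["Name"].str.partition(", ", expand=False)

print(data)
```

The third row contains ', ' twice, and only the first occurrence is used as the split point.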
Output:
As shown in the output image, the Name column was split into a list-like value at the first occurrence of ', '. As can be seen, ', ' is also stored as a separate element.
Note: Do not get confused by the two adjacent commas in each row: one belongs to the separator element ', ' and the other is the element separator.
[Image: output of Example #1]
Example #2: Splitting String into Data frame
In this example, the First Name and Last Name are separated from the Name column and stored in separate columns of the data frame.
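A minimal sketch of this step, again on a made-up data frame rather than the CSV. It assumes names are written as "Last, First"; if the real data uses the opposite order, the two column assignments would simply be swapped.

```python
import pandas as pd

# Made-up stand-in data; assumes the "Last, First" name order
data = pd.DataFrame({"Name": ["Doe, Jane", "Roe, Richard"]})

# expand=True returns a data frame with columns 0, 1, 2:
# text before the separator, the separator, text after it
parts = data["Name"].str.partition(", ", expand=True)

# Keep the two name parts and ignore the separator column (1)
data["Last Name"] = parts[0]
data["First Name"] = parts[2]

print(data)
```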