In this article, we will discuss how to join two PySpark DataFrames on multiple columns using Python.
Let's create the first dataframe:
Let's create the second dataframe:
We can join on multiple columns by passing a compound condition to the join() function, combining the per-column equality checks with the & operator.
Syntax: dataframe.join(dataframe1, (dataframe.column1 == dataframe1.column1) & (dataframe.column2 == dataframe1.column2))
where,
- dataframe is the first dataframe
- dataframe1 is the second dataframe
- column1 is the first matching column in both the dataframes
- column2 is the second matching column in both the dataframes