
URL: https://www.geeksforgeeks.org/python/convert-pyspark-rdd-to-dataframe/


Convert PySpark RDD to DataFrame

Last Updated : 2 Nov, 2022

In this article, we will discuss how to convert an RDD to a DataFrame in PySpark. There are two approaches:

  1. Using spark.createDataFrame(rdd, schema)
  2. Using rdd.toDF(schema)

Before converting an RDD to a DataFrame, let's first create an RDD.

Example:

Output:

<class 'pyspark.rdd.RDD'>

Method 1: Using the createDataFrame() function

After creating the RDD, we convert it to a DataFrame with the createDataFrame() function, passing it the RDD and a schema that defines the DataFrame's columns.

Syntax:

spark.createDataFrame(rdd, schema)


Method 2: Using the toDF() function

After creating the RDD, we convert it to a DataFrame by calling the toDF() function on the RDD and passing it the schema for the DataFrame.

Syntax:

rdd.toDF(schema)

