We can do it by making use of the createDataFrame() method of the SparkSession.
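from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # reuse the active session or create one
# Each tuple is one row: (Name, Age)
data = [('Harry', 20),
        ('Ron', 20),
        ('Hermoine', 20)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data=data, schema=columns)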
This creates the dataframe as shown below:
+-----------+----------+
| Name      | Age      |
+-----------+----------+
| Harry     | 20       |
| Ron       | 20       |
| Hermoine  | 20       |
+-----------+----------+
We can get the schema of the dataframe by using df.printSchema():
>>> df.printSchema()
root
 |-- Name: string (nullable = true)
 |-- Age: long (nullable = true)