How to Define a Schema for a Spark DataFrame

admin

6/18/2023

  #spark #scala #bigdata

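A DataFrame is a table-structured object that lets users perform SQL-like operations such as select, filter, group by, and aggregate very easily. A schema specifies the name, data type, and nullability of each column. The PySpark example below defines one explicitly with StructType and StructField.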

 

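from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Each StructField takes a column name, a data type, and a nullable flag.
data_schema = [
    StructField("ID", IntegerType(), True),
    StructField("NAME", StringType(), True),
    StructField("EXPERTISE", StringType(), True),
    StructField("ADDRESS", StringType(), True),
    StructField("MOBILE", StringType(), True),
]

# Wrap the list of fields in a StructType to form the schema.
struct_schema = StructType(fields=data_schema)
print(struct_schema)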