How to Define a Schema for a Spark DataFrame
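A DataFrame is a table-structured object that lets users perform SQL-like operations such as select, filter, group by, and aggregate with little effort. The example below defines a schema explicitly, giving each column a name, a data type, and a nullable flag.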
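# Import only the types the schema needs.
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Each StructField takes a column name, a data type, and a nullable flag.
data_schema = [
    StructField("ID", IntegerType(), True),
    StructField("NAME", StringType(), True),
    StructField("EXPERTISE", StringType(), True),
    StructField("ADDRESS", StringType(), True),
    StructField("MOBILE", StringType(), True),
]

# Wrap the field list in a StructType to get the complete schema.
struct_schema = StructType(fields=data_schema)
print(struct_schema)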