How to Build a SparkSession in Spark and Scala

admin

6/17/2023
All Articles

#scala #spark #bigdata


class pyspark.sql.SparkSession(sparkContext, jsparkSession=None) is the entry point for programming Spark with the Dataset and DataFrame API. A SparkSession can be used to create DataFrames, register DataFrames as tables, execute SQL over tables, cache tables, and read Parquet files. To create a SparkSession in PySpark:
 
from pyspark.sql import SparkSession

# Build a SparkSession (or reuse the active one), then read a CSV with a header row
spark = SparkSession.builder.appName('abc').getOrCreate()
df = spark.read.csv('filename.csv', header=True)
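
Note that getOrCreate() returns the SparkSession that already exists in the current JVM if there is one; only otherwise does it build a new session from the options set on the builder, so it is safe to call from several places in a job.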
 
The first step of any Spark application is to create this SparkSession instance. In Scala, we can build one like this:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local")       // run locally in a single JVM
  .appName("abc")        // a descriptive name shown in the Spark UI
  .enableHiveSupport()   // requires Hive dependencies on the classpath
  .getOrCreate()
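
With the session in hand, the operations listed above — registering a DataFrame as a table, executing SQL over it, caching it, and reading Parquet files — look roughly like the following sketch in Scala (the file paths and the view name people are illustrative, not part of any real dataset):

import org.apache.spark.sql.SparkSession

// getOrCreate() returns the session built above if it is still active
val spark = SparkSession.builder().master("local").appName("abc").getOrCreate()

// Read a CSV file into a DataFrame; the path is illustrative
val df = spark.read.option("header", "true").csv("filename.csv")

// Register the DataFrame as a temporary view so SQL can reference it by name
df.createOrReplaceTempView("people")

// Execute SQL over the registered view
val result = spark.sql("SELECT * FROM people")
result.show()

// Cache the view in memory for repeated access
spark.catalog.cacheTable("people")

// Read a Parquet file into a DataFrame; the path is illustrative
val parquetDf = spark.read.parquet("data.parquet")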