Building a SparkSession object
In Scala and Python programs, you build a SparkSession object with the following builder pattern:
val sparkSession = SparkSession.builder
  .master(master_path)
  .appName("application name")
  .config("configuration key", "configuration value")
  .getOrCreate()
Tip
While you can hardcode all of these values, it's better to read them from the environment with reasonable defaults. This approach provides maximum flexibility to run the code in a changing environment without having to recompile. Using local as the default value for the master makes it easy to launch your application locally in a test environment. By carefully selecting the defaults, you can avoid having to overspecify them.
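As a minimal sketch of this approach, the following reads the master and application name from environment variables and falls back to sensible defaults; the variable names SPARK_MASTER and SPARK_APP_NAME are illustrative choices, not standard Spark settings:

import org.apache.spark.sql.SparkSession

// Read the master URL and app name from the environment, falling back to
// defaults that are convenient for local testing. The variable names here
// are assumptions chosen for this example.
val master  = sys.env.getOrElse("SPARK_MASTER", "local[*]")
val appName = sys.env.getOrElse("SPARK_APP_NAME", "my-spark-app")

val sparkSession = SparkSession.builder
  .master(master)
  .appName(appName)
  .getOrCreate()

With these defaults, running the program on a developer machine needs no extra configuration, while a cluster deployment only has to set the environment variables.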
The spark-shell and pyspark shells create the SparkSession object automatically and assign it to the spark variable.
The SparkSession object contains the SparkContext object, which you can access as spark.sparkContext.
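As a quick sketch, inside spark-shell the spark variable already exists; in a standalone program you would use the session you built yourself, as shown below:

// Inside spark-shell, replace sparkSession with the predefined spark variable.
val sc = sparkSession.sparkContext
println(sc.appName)   // the application name set on the builder
println(sc.master)    // the master URL the session is connected to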
As we will see later, the SparkSession object unifies more than the context; it also unifies...