Spark’s primary abstraction is a distributed collection of items called a Dataset. Datasets can be created from Hadoop InputFormats (such as HDFS files) or by transforming other Datasets.

Some Spark runtime environments come with pre-instantiated Spark Sessions. The getOrCreate() method will use an existing Spark Session or create a new Spark Session if one does