Spark - Interview Questions
How do we create RDDs in Spark?
Spark provides two methods to create RDDs:
 
* By parallelizing an existing collection in your driver program. This uses SparkContext's parallelize method:

  val DataArray = Array(2, 4, 6, 8, 10)
  val DataRDD = sc.parallelize(DataArray)
* By loading a dataset from external storage such as HDFS, HBase, or a shared file system, for example with SparkContext's textFile method.
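A minimal sketch of the second approach, assuming a file at a hypothetical HDFS path (the namenode address and file name are placeholders, not from the original answer):

  // Load a text file from external storage into an RDD of lines.
  // "hdfs://namenode:9000/data/input.txt" is a hypothetical path.
  val fileRDD = sc.textFile("hdfs://namenode:9000/data/input.txt")

  // Each element of fileRDD is one line of the file; a local path or
  // a glob such as "/data/*.txt" would work the same way.
  val lineCount = fileRDD.count()

Like parallelize, textFile is lazy: no data is read until an action such as count() is invoked.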