What are the different levels of persistence in Spark?

DISK_ONLY : Stores the RDD partitions only on disk

MEMORY_ONLY_SER : Stores the RDD as serialized Java objects, with one byte array per partition. This is generally more space-efficient than deserialized objects but costs more CPU to read

MEMORY_ONLY : Stores the RDD as deserialized Java objects in the JVM. If the RDD does not fit in the available memory, some partitions won't be cached and are recomputed each time they are needed

OFF_HEAP : Works like MEMORY_ONLY_SER but stores the data in off-heap memory

MEMORY_AND_DISK : Stores the RDD as deserialized Java objects in the JVM. If the RDD does not fit in memory, the partitions that don't fit are stored on disk and read from there when needed

MEMORY_AND_DISK_SER : Like MEMORY_ONLY_SER, except that partitions that do not fit in memory are spilled to disk instead of being recomputed each time they are needed
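
The practical difference between these levels is what happens to a partition that does not fit in memory. The sketch below is a toy model in plain Python, not Spark code: the `Level` class, the `LEVELS` table, and `place_partitions` are hypothetical names made up for illustration, and the first-come placement rule is a simplifying assumption rather than Spark's actual eviction policy. OFF_HEAP is left out because the model does not distinguish on-heap from off-heap memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Toy stand-in for a persistence level (not Spark's StorageLevel class)."""
    use_memory: bool
    use_disk: bool


# Hypothetical table for the levels described above. The _SER variants
# place partitions the same way as their deserialized counterparts; they
# differ only in storing each partition as a byte array, which this toy
# model does not size differently.
LEVELS = {
    "DISK_ONLY": Level(use_memory=False, use_disk=True),
    "MEMORY_ONLY": Level(use_memory=True, use_disk=False),
    "MEMORY_ONLY_SER": Level(use_memory=True, use_disk=False),
    "MEMORY_AND_DISK": Level(use_memory=True, use_disk=True),
    "MEMORY_AND_DISK_SER": Level(use_memory=True, use_disk=True),
}


def place_partitions(level_name, partition_sizes, memory_budget):
    """Return where each partition ends up: 'memory', 'disk', or 'recompute'.

    Assumption: partitions are placed in order and never evicted later.
    """
    level = LEVELS[level_name]
    free = memory_budget
    placement = []
    for size in partition_sizes:
        if level.use_memory and size <= free:
            free -= size
            placement.append("memory")
        elif level.use_disk:
            placement.append("disk")
        else:
            # Not cached anywhere: it must be rebuilt from lineage on each use.
            placement.append("recompute")
    return placement


if __name__ == "__main__":
    sizes = [40, 40, 40]  # three partitions, arbitrary units
    for name in LEVELS:
        print(name, place_partitions(name, sizes, memory_budget=100))
```

In a real job the level is chosen when persisting, for example in PySpark with `rdd.persist(StorageLevel.MEMORY_AND_DISK)` after `from pyspark import StorageLevel`. Note that PySpark always stores data in serialized form, so the deserialized/serialized distinction above applies to the Scala and Java APIs.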