Spark - Interview Questions
What is a Parquet file?
Parquet is a columnar file format supported by many data processing systems. Spark SQL can both read and write Parquet files, and Parquet is widely considered one of the best formats for big data analytics so far.
 
The advantages of columnar storage are as follows:
 
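* Columnar storage limits I/O operations, because a query reads only the columns it references.
* It can fetch just the specific columns that you need to access, instead of whole rows.
* Columnar storage consumes less space, because values of the same type are stored together and compress well.
* It gives better-summarized data and uses type-specific encoding for each column.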
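
The points above can be illustrated with a small toy model in plain Python. This is only a sketch of the idea, not how Parquet is implemented internally: the sample records, the helper names `to_columns` and `run_length_encode`, and the choice of run-length encoding as the example encoding are all assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical sample records, invented for this illustration.
rows = [
    {"id": 1, "country": "IN", "amount": 10.0},
    {"id": 2, "country": "IN", "amount": 12.5},
    {"id": 3, "country": "IN", "amount": 7.5},
    {"id": 4, "country": "US", "amount": 20.0},
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    return {name: [record[name] for record in records] for name in records[0]}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    return [[value, len(list(group))] for value, group in groupby(values)]


columns = to_columns(rows)

# Fetching specific columns: summing "amount" touches a single list
# and never looks at "id" or "country".
total_amount = sum(columns["amount"])

# Less space: same-typed, repetitive values sit next to each other,
# so a simple encoding shrinks them.
encoded_country = run_length_encode(columns["country"])

print(total_amount)
print(encoded_country)
```

In Spark itself, the equivalent of the column fetch would be something like `spark.read.parquet(path).select("amount")`, where `path` is a placeholder for wherever the Parquet data lives.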