| Features | Spark | Hadoop |
| --- | --- | --- |
| Data processing | Part of the Hadoop ecosystem, hence batch-oriented processing | Batch processing, even for high volumes |
| Streaming engine | Apache Spark Streaming (micro-batches) | Map-Reduce |
| Data flow | Directed Acyclic Graph (DAG) | Map-Reduce |
| Computation model | Collect and process | Map-Reduce batch-oriented model |
| Performance | Slow due to batch processing | Slow due to batch processing |
| Memory management | Automatic memory management in the latest release | Dynamic and static (configurable) |
| Fault tolerance | Recovery available without extra code | Highly fault-tolerant due to Map-Reduce |
| Scalability | Highly scalable (Spark cluster of 8,000 nodes) | Highly scalable; clusters can grow to a large number of nodes |
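
The difference between the "collect and process" micro-batch model and the batch-oriented Map-Reduce model in the comparison above can be sketched in a few lines of plain Python. This is a toy illustration only: it uses neither the Spark nor the Hadoop API, every function name (`map_phase`, `reduce_phase`, `run_batch`, `micro_batches`, `run_micro_batched`) is hypothetical, and it assumes micro-batches are cut by record count, whereas a real streaming engine typically cuts them by time interval.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Reduce step: sum the counts emitted for each word."""
    totals: Counter = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def run_batch(lines: Iterable[str]) -> Counter:
    """Batch model: one Map-Reduce pass over a complete, bounded input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return reduce_phase(pairs)


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Collect records from a stream into small batches of at most batch_size."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_micro_batched(stream: Iterable[str], batch_size: int) -> List[Counter]:
    """Micro-batch model: run the same batch job on each collected batch."""
    return [run_batch(batch) for batch in micro_batches(stream, batch_size)]


if __name__ == "__main__":
    # Hypothetical input records, used only for illustration.
    events = ["spark hadoop", "spark", "hadoop hadoop", "spark spark", "hadoop"]
    print("batch:", dict(run_batch(events)))
    for index, counts in enumerate(run_micro_batched(events, batch_size=2)):
        print("micro-batch", index, dict(counts))
```

In the batch case a single result appears only after the whole input has been read. In the micro-batch case partial results become available after each small batch, which is why micro-batching is often described as a streaming approximation built on a batch engine.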