Data Warehouse | Data Lake |
Holds relational data from transactional systems and operational databases. | Holds both relational and non-relational data from various sources such as IoT devices, mobile apps, websites, and social media. |
Provides the fastest query results, at a high storage cost. | Provides fast, steadily improving query results at a low storage cost. |
Used by business analysts. | Used by data scientists, data developers, and business analysts. |
Supports batch reporting, BI, and visualizations. | Supports analytics such as machine learning, predictive analytics, data discovery, and profiling. |
Horizontal Scale | Vertical Scale |
Adds new resources, along with new hardware devices, to support the infrastructure. | Increases the power of the current machine by upgrading it. |
Used in distributed systems. | Used in virtualization. |
Resilient to system failure. | Single point of failure. |
Utilizes network calls. | Uses interprocess communication. |
Connects multiple system entities, both hardware and software, so that they work as a single logical unit. | Increases the capacity of existing hardware or software by adding resources. |
Difficult to implement. | Easy to implement. |
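The capacity and failure behaviour compared above can be illustrated with a small sketch. This is a toy model, not a real provisioning API: the `Node` class, the function names, and the use of CPU cores as the only measure of capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical machine, described only by its CPU core count."""
    cpu_cores: int


def scale_vertically(node: Node, extra_cores: int) -> Node:
    """Vertical scale: upgrade the one existing machine."""
    return Node(node.cpu_cores + extra_cores)


def scale_horizontally(cluster: List[Node], new_nodes: int, cores_each: int) -> List[Node]:
    """Horizontal scale: add more machines alongside the existing ones."""
    return cluster + [Node(cores_each) for _ in range(new_nodes)]


def total_capacity(cluster: List[Node]) -> int:
    """Total cores available across all machines."""
    return sum(node.cpu_cores for node in cluster)


def capacity_after_failure(cluster: List[Node]) -> int:
    """Cores that remain if the largest machine fails."""
    if not cluster:
        return 0
    return total_capacity(cluster) - max(node.cpu_cores for node in cluster)


# Both approaches reach 8 cores starting from one 4-core machine.
vertical = [scale_vertically(Node(4), 4)]
horizontal = scale_horizontally([Node(4)], new_nodes=1, cores_each=4)

print(total_capacity(vertical), capacity_after_failure(vertical))
print(total_capacity(horizontal), capacity_after_failure(horizontal))
```

Both setups end with the same total capacity, but losing one machine leaves the vertically scaled setup with nothing, while the horizontally scaled cluster keeps half of its cores.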
EC2 | S3 |
|
|
|
|
Spot Instance | On-demand Instance |
With Spot Instances, customers can purchase compute capacity with no upfront commitment at all. | With On-demand Instances, users can launch instances at any time, based on demand. |
Spot Instances are spare Amazon EC2 instances that you can bid for. | On-demand Instances are suitable for the high-availability needs of applications. |
The instance is launched automatically when your bid price exceeds the spot price, which fluctuates with supply and demand for instances. | On-demand Instances are launched only by users, under the pay-as-you-go model. |
When the spot price rises above your bid price, Amazon reclaims the instance after a two-minute interruption notice. | On-demand Instances persist and are not automatically terminated by Amazon. |
Spot Instances are charged at the current spot price, which is quoted per instance-hour. | On-demand Instances are charged at a fixed rate, billed on a per-second basis. |
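The launch and termination rule in the table above can be sketched as a small simulation. This is a minimal illustration of the bid-versus-spot-price comparison only, not the EC2 API: the function name `simulate_spot`, the state labels, and the handling of an exact tie (keep the previous state) are assumptions, since the table does not cover that case.

```python
from typing import List


def simulate_spot(bid_price: float, spot_prices: List[float]) -> List[str]:
    """Return the instance state after each observed spot price.

    "running"   : the bid exceeds the spot price, so the instance is launched or kept.
    "reclaimed" : the bid is below the spot price, so Amazon takes the instance back.
    On an exact tie the previous state is kept (an assumption of this sketch).
    """
    states: List[str] = []
    for spot_price in spot_prices:
        if bid_price > spot_price:
            states.append("running")
        elif bid_price < spot_price:
            states.append("reclaimed")
        else:
            # Tie: not covered by the table; carry the previous state forward.
            states.append(states[-1] if states else "pending")
    return states


# Hypothetical hourly spot prices in USD; the bid stays fixed at 0.05.
print(simulate_spot(0.05, [0.03, 0.04, 0.06, 0.02]))
```

An On-demand Instance has no equivalent of this loop: once launched, it keeps running until the user stops it, regardless of how the spot price moves.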