| Best of Hive | Best of Presto |
|---|---|
| Large data aggregations | Interactive queries (where you want to wait for the answer) |
| Large Fact-to-Fact joins | Quickly exploring the data (e.g. what types of records are found in the table) |
| Large distincts (aka de-duplication jobs) | Joins with a large Fact table and many smaller Dimension tables |
| Batch jobs that can be scheduled | |

| | Hive | Presto |
|---|---|---|
| Optimized for | Throughput | Interactivity |
| SQL standard compliance | HiveQL (subset of common data warehousing SQL) | Designed to comply with ANSI SQL |
| Window functions | Yes | Yes |
| Large JOINs | Very good for large Fact-to-Fact joins | Optimized for star schema joins (one large Fact table and many smaller Dimension tables) |
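
To make the "star schema join" row concrete, here is a minimal sketch of the query shape: one Fact table joined to several small Dimension tables and then aggregated. Everything in it is an assumption made for illustration. The table and column names (`fact_sales`, `dim_product`, `dim_store`) are hypothetical, and an in-memory SQLite database stands in for the engine only so the example runs anywhere. It says nothing about how Hive or Presto actually plan or execute the join.

```python
import sqlite3

# Hypothetical star schema: one Fact table and two small Dimension tables.
# SQLite is only a stand-in so the query shape can be run locally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, region   TEXT);
CREATE TABLE fact_sales  (product_id INTEGER, store_id INTEGER, amount REAL);

INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_store   VALUES (10, 'east'), (20, 'west');
INSERT INTO fact_sales  VALUES (1, 10, 5.0), (1, 20, 7.0),
                               (2, 10, 11.0), (2, 10, 13.0);
""")

# Star schema join: the Fact table sits in the middle and each
# Dimension table is joined to it on its own key.
STAR_JOIN = """
SELECT p.category, s.region, SUM(f.amount) AS total
FROM fact_sales AS f
JOIN dim_product AS p ON f.product_id = p.product_id
JOIN dim_store   AS s ON f.store_id   = s.store_id
GROUP BY p.category, s.region
ORDER BY p.category, s.region
"""

rows = conn.execute(STAR_JOIN).fetchall()
for row in rows:
    print(row)
```

A Fact-to-Fact join, by contrast, would join two tables that are both large (for example, two event logs on a shared key), which is the case the table above lists as a strength of Hive.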