Hive Note

Hive Basics

ACID vs. non-ACID tables: prefer non-ACID tables (ACID tables also need compaction, which merges the delta folders into the base folder, and they cannot be read properly by Spark)
Choose a suitable file format and compression for tables
Enable the storage index: 'orc.create.index'='true'
Bucketing
Bloom filter
Partitioning: avoid too many partitions
Use SORT BY when inserting records
For non-ACID tables, concatenate small files
Run ANALYZE TABLE to update statistics
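
Example

The points above could look roughly like this in HiveQL. This is only a sketch: the table names (sales_orc, sales_bucketed, sales_staging), the columns, the ZLIB codec, the bucket count, and the partition value are all made up for illustration, and the right choices depend on the data and the Hive version. Depending on the distribution, a managed ORC table may default to ACID, in which case an external table may be needed to stay non-ACID.

```sql
-- Hypothetical names and values throughout; adjust to the real schema.

-- ORC table with compression, storage index, and a bloom filter,
-- partitioned by a low-cardinality column to avoid too many partitions.
CREATE TABLE sales_orc (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(12,2)
)
PARTITIONED BY (order_date STRING)
STORED AS ORC
TBLPROPERTIES (
  'orc.compress'='ZLIB',                    -- assumed codec
  'orc.create.index'='true',                -- storage index
  'orc.bloom.filter.columns'='customer_id'  -- bloom filter on a lookup column
);

-- Bucketed variant: rows are hashed on customer_id into a fixed number of files.
CREATE TABLE sales_bucketed (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(12,2)
)
CLUSTERED BY (customer_id) SORTED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC;

-- Insert with SORT BY so that each written file is ordered on the filter column.
INSERT OVERWRITE TABLE sales_orc PARTITION (order_date = '2024-01-01')
SELECT order_id, customer_id, amount
FROM sales_staging
WHERE order_date = '2024-01-01'
SORT BY customer_id;

-- Merge small files in one partition (non-ACID ORC table).
ALTER TABLE sales_orc PARTITION (order_date = '2024-01-01') CONCATENATE;

-- Refresh table-level and column-level statistics for the optimizer.
ANALYZE TABLE sales_orc PARTITION (order_date = '2024-01-01') COMPUTE STATISTICS;
ANALYZE TABLE sales_orc PARTITION (order_date = '2024-01-01') COMPUTE STATISTICS FOR COLUMNS;
```

Sorting on the same column that carries the bloom filter is a deliberate choice here: it keeps the min/max ranges in the ORC index narrow, so more stripes can be skipped on point lookups.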