Partitioning in Hive

By Sai Kumar on August 25, 2017

Partition improves query performance
The way Hive structures data storage changes with Partitioning
Partitions are stored as sub-directories in the table directory
Over Partitioning to be avoided
– Each partition creates an HDFS directory with many files in it
– It increases large number of small sized files in HDFS
– It eventually consume the capacity of namenode as the metadata is kept in main memory by Hadoop
Use a partition scheme that creates partitions with size in multiples of the HDFS block size
Hive supports dynamic partitions also where partitions are created from query parameters
Static partitions are created by the ‘PARTITIONED BY’ clause
Dynamic partitions are not enabled by default and if enabled , it works in ‘strict’ mode.
The maximum number of dynamic partitions are limited by default
Dynamic and static partitions can be mixed
– INSERT OVERWRITE TABLE employees PARTITION (country = ‘IN’, state)
SELECT …, se.cnty, se.st FROM staged_employees se WHERE se.cnty = ‘IN’’;

Static Partitions

create table emplocpartitions ( empno int, empname string , empsal float )
PARTITIONED BY ( emploc STRING) ;

INSERT OVERWRITE TABLE EMPLOCPARTITIONS
PARTITION (emploc=’TVM’)
select empno,empname,empsal from emplocations
where emploc LIKE ‘TVM’;

Dynamic Partitions

set hive.exec.dynamic.partition.mode=nonstrict
set hive.exec.dynamic.partition=true

insert overwrite table emplocpartitions
PARTITION ( emploc )
Select e.empno,e.empname,e.empsal,e.emploc from emplocations e;

– Setting the property hive.mapred.mode=strict; prohibits executing
queries with out WHERE clause on partitioned tables

Sai Kumar

An Ambivert, music lover, enthusiast, artist, designer, coder, gamer, content writer. He is Professional Software Developer with hands-on experience in Spark, Kafka, Scala, Python, Hadoop, Hive, Sqoop, Pig, php, html,css. Know more about him at www.24tutorials.com/sai

Static Partitions

Dynamic Partitions

Share This Post

Related Articles

Hive Tables

Hive Databases

Hive Architecture

Login

Lost Password

Register