Hive

Data Model and Datatypes in Hive

Data in Hive is organised into:

  • Databases: Namespaces that keep tables and other data units separate
  • Tables: Homogeneous collections of data that share the same schema
  • Partitions: Divisions of a table's data based on the value of a partition key
  • Buckets: Divisions of the data within a partition based on the hash value of a particular column
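To make the four levels concrete, here is a minimal HiveQL sketch. All names in it (sales_db, orders, order_date, customer_id) and the bucket count of 8 are hypothetical, chosen only for illustration.

```sql
-- Hypothetical database: a namespace for the tables below.
CREATE DATABASE IF NOT EXISTS sales_db;

USE sales_db;

-- Hypothetical table: every row shares the same schema.
-- PARTITIONED BY divides the data by the value of order_date.
-- CLUSTERED BY divides each partition into 8 buckets using the hash of customer_id.
CREATE TABLE IF NOT EXISTS orders (
  order_id    BIGINT,
  customer_id INT,
  amount      DOUBLE
)
PARTITIONED BY (order_date STRING)
CLUSTERED BY (customer_id) INTO 8 BUCKETS
STORED AS ORC;
```

A query that filters on the partition column, for example WHERE order_date = '2020-01-01', lets Hive read only the matching partition instead of the whole table.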

Hive Data Types:

    • Hive supports primitive data types and four complex (collection) types.
    • Primitive types:
      tinyint, smallint, int,
      bigint, boolean, string,
      timestamp, float, double, binary
    • Collection types:

1. Struct
address STRUCT<city:STRING, state:STRING>
Eg: struct('Bengaluru', 'Karnataka'); a field is accessed with dot notation, so address.city = 'Bengaluru'

2. Array
names ARRAY<STRING>
Eg: array('Hari', 'Sai'); indexing is zero-based, so names[1] = 'Sai'

3. Map
name MAP<STRING, STRING>
Eg: map('first', 'Mahendra', 'last', 'Dhoni'); values are looked up by key, so name['first'] = 'Mahendra'

4. Union
value UNIONTYPE<INT, STRING>
Holds a value of exactly one of the listed types at any one time
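The struct, array and map types above can be declared as table columns and then read with dot notation, an index, or a key. The following is a small sketch; the table name players and the column id are hypothetical, while names, name and address follow the examples above.

```sql
-- Hypothetical table using the three most common collection types.
CREATE TABLE IF NOT EXISTS players (
  id      INT,
  names   ARRAY<STRING>,
  name    MAP<STRING, STRING>,
  address STRUCT<city:STRING, state:STRING>
);

-- names[1] reads the second array element (indexes start at 0),
-- name['first'] looks up a map value by its key,
-- address.city reads one field of the struct.
SELECT
  names[1]      AS second_name,
  name['first'] AS first_name,
  address.city  AS city
FROM players;
```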

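  • All data types are implemented in Java
  • Type casting between data types is available and behaves much as in Java: smaller numeric types are implicitly widened to larger ones, and other conversions use an explicit CAST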
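As a brief illustration of type casting, explicit conversion in HiveQL is written with CAST(expr AS type). The literals and column aliases below are made up for the example.

```sql
-- Explicit conversion from STRING to INT.
-- A string that cannot be parsed as a number yields NULL rather than an error.
SELECT
  CAST('42' AS INT)  AS parsed_number,
  CAST('abc' AS INT) AS not_a_number;

-- Implicit widening: the INT literal is promoted to BIGINT before the addition.
SELECT 1 + CAST(2 AS BIGINT) AS widened_sum;
```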

An ambivert, music lover, enthusiast, artist, designer, coder, gamer and content writer. He is a professional software developer with hands-on experience in Spark, Kafka, Scala, Python, Hadoop, Hive, Sqoop, Pig, PHP, HTML and CSS. Know more about him at www.24tutorials.com/sai
