URL: https://www.analyticsvidhya.com/blog/2020/10/apache-hive-table-types/

Types of Tables in Apache Hive – A Quick Overview

Siddharth Sonkar Last Updated : 28 Dec, 2020
5 min read

Overview

  • Apache Hive is a must-know tool for anyone interested in data science and data engineering
  • Learn about the different types of tables in Apache Hive

Introduction

I’ve spent over half a decade working with the Big Data Technology stack and consulting with clients across various domains. One thing I have noticed is how frequently Hive is used as a warehousing solution across business domains.

You simply can’t ignore Apache Hive when you are learning Apache Hadoop.

Hive is a part of the large Hadoop Ecosystem that lets you provide a schema to large data residing in HDFS. Most of you will be aware of RDBMS and its tables. We use them so often that it has become a part of our lives now. And here’s the thing – tables in Hive are no different.

Have you ever wondered what might be the different types of tables in Hive? That’s what we’ll discuss in this article!

Table of Contents

  1. What is Apache Hive?
  2. Types of Table in Apache Hive #1: Managed Tables
  3. Types of Table in Apache Hive #2: External Tables
  4. Managed vs External Table – What’s the Difference?
  5. Identify the Type of Apache Hive Table

What is Apache Hive?

Apache Hive is a data warehouse system for Apache Hadoop. It provides SQL-like access to data in HDFS so that Hadoop can be used as a data warehouse. Hive lets you impose structure on largely unstructured data. After you define the structure, you can use Hive to query the data without any knowledge of Java or MapReduce.

The Hive Query Language (HQL) has semantics and functions similar to those of standard SQL in relational databases, so experienced database analysts can easily access the data.
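
As a rough illustration of how close HQL is to SQL, here is a minimal sketch of an aggregate query. It assumes a table shaped like the stocks table defined later in this article; the alias highest_price is made up for the example.

```sql
-- Assumes a table named stocks with symbol and price_high columns
-- (see the CREATE TABLE examples later in this article).
-- Find the ten symbols with the highest recorded price.
SELECT symbol,
       MAX(price_high) AS highest_price
FROM stocks
GROUP BY symbol
ORDER BY highest_price DESC
LIMIT 10;
```

Hive compiles a statement like this into jobs on the cluster, so no Java or MapReduce code has to be written by hand.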

What are the features provided by Hive?

Apache Hive provides the following features:

  1. Apache Hive provides a simpler query model with less coding than MapReduce
  2. HQL and SQL have similar syntax
  3. It provides lots of functions that lead to easier analytics usage
  4. The response time is typically much faster than other types of queries on the same type of huge datasets
  5. Apache Hive supports running on different computing frameworks
  6. It supports ad hoc querying of data on HDFS
  7. Apache Hive supports user-defined functions, scripts, and a customized I/O format to extend its functionality
  8. It is scalable and extensible to various types of data and bigger datasets
  9. Matured JDBC and ODBC drivers allow many applications to pull Hive data for seamless reporting
  10. Hive allows users to read data in arbitrary formats, using SerDes and Input/Output formats
  11. Hive has a well-defined architecture for metadata management, authentication, and query optimizations
  12. There is a big community of practitioners and developers working on and using Hive
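
As a small illustration of the point about computing frameworks, the execution engine can usually be switched per session. This is only a sketch: which engines are available depends on your Hive version and on how the cluster was installed.

```sql
-- Show the engine currently used for this session.
SET hive.execution.engine;

-- Switch the session to Tez. This assumes Tez is installed on the cluster;
-- other values such as mr or spark depend on the Hive version in use.
SET hive.execution.engine=tez;
```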

Types of Tables in Apache Hive

Here are the types of tables in Apache Hive:

Managed Tables

In a managed table, both the table data and the table schema are managed by Hive. The data will be located in a folder named after the table within the Hive data warehouse, which is essentially just a file location in HDFS.

The warehouse location is user-configurable when Hive is installed (the default location is /user/hive/warehouse). By managed or controlled, we mean that if you drop (delete) a managed table, Hive will delete both the schema (the description of the table) and the data files associated with the table.

Syntax to Create Managed Table

-- `exchange` is backtick-quoted because EXCHANGE is a reserved keyword in recent Hive versions
CREATE TABLE IF NOT EXISTS stocks (
  `exchange` STRING,
  symbol STRING,
  price_open FLOAT,
  price_high FLOAT,
  price_low FLOAT,
  price_adj_close FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
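
To see what "managed" means in practice, the following sketch loads a file into the managed stocks table and then drops it. The file path /tmp/stocks.csv is a made-up example, and the comments describe the behavior explained above rather than captured output.

```sql
-- Hypothetical local CSV file; Hive copies it into the table's folder
-- under the warehouse directory.
LOAD DATA LOCAL INPATH '/tmp/stocks.csv' INTO TABLE stocks;

-- The Location row shows where Hive keeps the data files for this table.
DESCRIBE FORMATTED stocks;

-- Dropping a managed table removes the schema and the data files.
-- If HDFS trash is configured, the files are moved to the trash first.
DROP TABLE stocks;
```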

You can also copy the schema (but not the data) of an existing table using LIKE. The following statement creates an external table with the same schema as mydb.employees:

CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3
LIKE mydb.employees
LOCATION '/path/to/data';

External Tables

An external table is one where only the table schema is controlled by Hive. In most cases, the user will set up the folder location within HDFS and copy the data file(s) there. This location is included as part of the table definition statement. When an external table is dropped, Hive will only delete the schema associated with the table. The data files are not affected.

Syntax to Create External Table

-- `exchange` is backtick-quoted because EXCHANGE is a reserved keyword in recent Hive versions
CREATE EXTERNAL TABLE IF NOT EXISTS stocks (
  `exchange` STRING,
  symbol STRING,
  price_open FLOAT,
  price_high FLOAT,
  price_low FLOAT,
  price_adj_close FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/stocks';
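
The drop behavior described above can be sketched as follows. It assumes the external stocks table already exists with data files in /data/stocks and that it uses default table properties. Only two columns are re-declared to keep the sketch short; in practice you would repeat the full column list.

```sql
-- Removes only the table definition from the metastore.
-- The files under /data/stocks are left in place.
DROP TABLE stocks;

-- Re-creating a table over the same location makes the existing files
-- queryable again, with no reload needed.
CREATE EXTERNAL TABLE IF NOT EXISTS stocks (
  `exchange` STRING,
  symbol STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/stocks';

SELECT COUNT(*) FROM stocks;
```

This is why external tables are a common choice when other tools also read or write the same files.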

Managed vs. External Table – What’s the Difference?

| Managed Table | External Table |
| --- | --- |
| Hive assumes that it owns the data for managed tables. | For external tables, Hive assumes that it does not manage the data. |
| If a managed table or partition is dropped, the data and metadata associated with that table or partition are deleted. | Dropping the table does not delete the data, although the metadata for the table will be deleted. |
| Hive stores the data in its warehouse directory. | Hive stores the data in the LOCATION specified during creation of the table (generally not in the warehouse directory). |
| Provides ACID/transactional support. | Does not provide ACID/transactional support. |
| Statements ARCHIVE, UNARCHIVE, TRUNCATE, MERGE, and CONCATENATE are supported. | These statements are not supported. |
| Query result caching is supported (saves the results of an executed Hive query for reuse). | Query result caching is not supported. |
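
One row of the comparison, TRUNCATE support, can be tried out with a sketch like the one below. The table names stocks_managed and stocks_external are hypothetical, and the exact behavior and error message can differ between Hive versions and table properties.

```sql
-- Hypothetical managed table: all rows are removed,
-- and the table definition is kept.
TRUNCATE TABLE stocks_managed;

-- Hypothetical external table: per the comparison above, Hive is
-- expected to reject this statement instead of deleting the files.
TRUNCATE TABLE stocks_external;
```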

Identify the Type of Apache Hive Table

You can tell whether a table is managed or external from the output of DESCRIBE EXTENDED tablename.

Near the end of the Detailed Table Information output, you will see the following for managed tables:

... tableType: MANAGED_TABLE)

For external tables, you will see the following:

... tableType: EXTERNAL_TABLE)
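
For example, assuming the stocks table from the earlier sections, either of the following statements reports the table type. DESCRIBE FORMATTED prints the same metadata in a more readable layout.

```sql
-- Compact output: look for tableType near the end of
-- the Detailed Table Information line.
DESCRIBE EXTENDED stocks;

-- Tabular output: look for the "Table Type:" row,
-- which reads MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED stocks;
```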

Note: When you copy a schema with CREATE TABLE ... LIKE, the type of the new table depends on the EXTERNAL keyword and on the original table. If you omit the EXTERNAL keyword and the original table is external, the new table will also be external. If you omit EXTERNAL and the original table is managed, the new table will also be managed. However, if you include the EXTERNAL keyword and the original table is managed, the new table will be external. Even in this scenario, the LOCATION clause is still optional.
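
The cases in the note can be sketched as follows. mydb.employees is the table from the earlier LIKE example; the names employees_copy and employees_ext are made up. Because this behavior may vary between Hive versions, the sketch ends by checking the result instead of assuming it.

```sql
-- No EXTERNAL keyword: per the note, the copy takes the same type
-- (managed or external) as mydb.employees.
CREATE TABLE IF NOT EXISTS mydb.employees_copy
LIKE mydb.employees;

-- EXTERNAL keyword: the copy is external even if mydb.employees is managed.
-- LOCATION is optional here and is left out.
CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees_ext
LIKE mydb.employees;

-- Confirm the resulting type in the "Table Type:" row.
DESCRIBE FORMATTED mydb.employees_copy;
DESCRIBE FORMATTED mydb.employees_ext;
```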

End Notes

In this article, we learned about Apache Hive and its table types. Hopefully, you now have a good overview of the types of tables in Hive. The differences listed here are not exhaustive, so please feel free to add more in the comments section below.

I hope you liked the article. If you have any questions about it, let me know in the comments section below.

Recommended reading-

https://cwiki.apache.org/confluence/display/Hive/Home
