Set Hive in Local / Auto Mode

Setting Hive in “AUTO/LOCAL MODE” comes in very handy when you are dealing with a small amount of data.
How does this work?

Well, suppose you are running a complex Hive query. You know that it will trigger a MapReduce job in the background and then give you the output.

This approach works well if the Hive table holds a huge amount of data, but it becomes a problem if the table is small.
The main reason is that a MapReduce job is launched at the server/cluster level. So every time you run a complex query, it goes to the cluster, launches MapReduce there, and only then returns the output. For a small table, the time spent launching the job on the cluster can be longer than the time spent processing the data.

In LOCAL/AUTO MODE, if your complex query has a small input (with the default settings, fewer than 4 map tasks), Hive runs the MapReduce job locally instead of submitting it to the cluster, and you get the output of your query in less time.

The following are the three settings you need to run at the Hive prompt to enable local/auto mode.


SET hive.exec.mode.local.auto=true;

SET hive.exec.mode.local.auto.inputbytes.max=50000000;

SET hive.exec.mode.local.auto.input.files.max=5;
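To make the effect of these three settings concrete, here is a minimal Python sketch of the kind of check Hive performs before choosing local mode. It is a simplified model, not Hive's actual code: the names `LocalModeSettings` and `would_run_locally` are hypothetical, the "at most one reducer" condition comes from the Hive documentation rather than from the settings above, and the handling of values exactly at a threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LocalModeSettings:
    """Hypothetical holder for the three settings shown above."""
    auto: bool = True                    # hive.exec.mode.local.auto
    input_bytes_max: int = 50_000_000    # hive.exec.mode.local.auto.inputbytes.max
    input_files_max: int = 5             # hive.exec.mode.local.auto.input.files.max


def would_run_locally(settings: LocalModeSettings,
                      total_input_bytes: int,
                      input_file_count: int,
                      reduce_tasks: int) -> bool:
    """Simplified model of the local-mode decision (assumption, not Hive source)."""
    if not settings.auto:
        # Automatic local mode is switched off: always use the cluster.
        return False
    return (
        total_input_bytes <= settings.input_bytes_max   # input is small enough
        and input_file_count <= settings.input_files_max  # few enough input files
        and reduce_tasks <= 1                           # at most one reducer
    )


if __name__ == "__main__":
    settings = LocalModeSettings()
    # A 10 MB input spread over 3 files with a single reducer.
    print(would_run_locally(settings, 10_000_000, 3, 1))
    # An 80 MB input exceeds the 50,000,000-byte limit set above.
    print(would_run_locally(settings, 80_000_000, 3, 1))
```

In this model a query falls back to the cluster as soon as any one limit is exceeded, which is why raising only one of the two thresholds may not be enough for a table with many small files.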

In conclusion, if you want the output of a query on a small table in less time, you should go for “LOCAL/AUTO MODE”.
