Setting Hive to “AUTO/LOCAL MODE” comes in very handy when you are dealing with small amounts of data.
How does this work?
Well, suppose you are running a complex Hive query. You know that it will trigger a MapReduce job in the background and then give you the output.
This approach works well when the Hive table holds a huge amount of data, but it becomes a problem when the table is small.
The main reason is that the MapReduce job is triggered at the server/cluster level. Every time you run a complex query, Hive submits the job to the cluster, launches MapReduce there, and only then returns the output, so for a small table the cost of launching the job on the cluster can outweigh the actual work.
In LOCAL/AUTO MODE, if your complex query launches fewer than 4 map tasks, the MapReduce job runs locally and you get the output of your query in less time.
The following are the three commands that you need to set at the Hive command prompt to run Hive in local/auto mode.
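As a sketch, assuming the standard Hive configuration properties for automatic local mode, the settings would look something like the block below. The property names are the ones documented for Hive, but the threshold values are assumptions chosen for illustration (the file limit of 4 mirrors the threshold mentioned above), and older Hive releases used `hive.exec.mode.local.auto.tasks.max` in place of the third property, so check them against your own version.

```sql
-- Let Hive decide automatically whether a query is small enough to run locally
SET hive.exec.mode.local.auto=true;

-- Assumed value: only run locally if the total input size is below this many bytes (128 MB)
SET hive.exec.mode.local.auto.inputbytes.max=134217728;

-- Assumed value: only run locally if the number of input files / map tasks is below this limit
SET hive.exec.mode.local.auto.input.files.max=4;
```

These settings apply only to the current Hive session. To make them permanent they would have to go into the Hive configuration (for example `hive-site.xml`).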
In conclusion, if you want the output of your query in less time on small data, you should go for “LOCAL/AUTO MODE”.