In this tutorial, we are going to install and configure Apache Drill 1.3.0 on Ubuntu 16.04.

But before starting with the installation and configuration, let us get to know Apache Drill.

Here is the minimum we should know before going ahead with the Apache Drill installation and configuration.

  • Drill reads CSV files and NoSQL databases on the fly so that you can query them using SQL.
  • It removes most of the overhead of extracting analysis from the data, so users can focus on the things that matter.
  • It is an Apache project that lets users query files and NoSQL tables directly to extract useful analysis.

Now, let us look at the installation document we are going to follow to install Apache Drill 1.3.0.

Please go through the document below, which I have uploaded to my GitHub profile.

As mentioned at the end of that document, the process that runs for Apache Drill is SqlLine. The following screenshot shows the same thing.

[Screenshot: Drill Process Execution Confirmation]

Now it is time to log into Apache Drill.

There are two ways to do this. The first is the command prompt, and the second, which I recommend, is the web interface.

We will look at each approach in turn.

  1. COMMAND PROMPT

Use the following command to log into the Drill command prompt.

$ sqlline -u jdbc:drill:zk=local

The following screenshot shows the expected output of the above command.

[Screenshot: Apache Drill Successful Login]

  2. WEB INTERFACE

Enter the following address in your browser to access the Apache Drill web interface. This is the preferred way of accessing Drill.

The address pattern is as follows.

http://<IP ADDRESS>:8047

There are plenty of ways to find the machine's IP address. I am showing the most convenient one below.

You can run the command $ hostname -I to get the IP address. The screenshot below shows the expected output.

[Screenshot: hostname command]

Take this IP address, add the port :8047 as in the pattern above, and enter it in the web browser. Voila, you will get the Apache Drill web interface. It looks something like this.

[Screenshot: Apache Drill Web Interface]
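
If you would rather script the address lookup, here is a minimal Python sketch. It assumes that hostname -I prints a space-separated list of addresses and that the web interface listens on port 8047, as in the pattern above. The function names first_ip and drill_web_url are my own and are not part of Drill, and the sample address is made up.

```python
def first_ip(hostname_output):
    """Return the first address from the text printed by `hostname -I`."""
    addresses = hostname_output.split()
    if not addresses:
        raise ValueError("no IP address found in hostname output")
    return addresses[0]


def drill_web_url(ip_address, port=8047):
    """Build the web interface address in the form http://<IP ADDRESS>:8047."""
    return "http://{}:{}".format(ip_address, port)


# Example with a made-up address. On the Drill machine you could pass
# subprocess.check_output(["hostname", "-I"], text=True) instead.
print(drill_web_url(first_ip("192.168.1.10 172.17.0.1\n")))
```

If the machine has several network interfaces, the first address may not be the one you want, so check the printed URL before using it.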

Before verifying whether Drill works, let us take a close look at the tabs shown in the above screenshot.

The Query tab is where we are going to write our SQL queries to get useful information out of unstructured data.

It looks something like this.

[Screenshot: Apache Drill Query Window]

The Storage tab shows the various storage types and nodes available for use. We are going to use the dfs storage type, which lets us access any file loaded onto the system.

It looks something like this.

[Screenshot: Apache Drill Storage Types]
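
The same list can probably be read without the browser. The sketch below is an assumption on my part: it relies on the /storage.json endpoint described in the Drill REST API documentation, which I have not verified against version 1.3.0, and the sample response is hypothetical and trimmed to the two fields used here. The function names are my own.

```python
import json
import urllib.request


def fetch_storage_plugins(base_url):
    """GET <base_url>/storage.json and return the parsed JSON list."""
    with urllib.request.urlopen(base_url + "/storage.json") as response:
        return json.loads(response.read().decode("utf-8"))


def enabled_plugin_names(plugins):
    """Return the names of plugins whose config has enabled set to true."""
    return [p["name"] for p in plugins if p.get("config", {}).get("enabled")]


# Hypothetical response, trimmed to the fields used here.
sample = [
    {"name": "dfs", "config": {"type": "file", "enabled": True}},
    {"name": "hive", "config": {"type": "hive", "enabled": False}},
]
print(enabled_plugin_names(sample))
```

On a running installation you would call fetch_storage_plugins("http://<IP ADDRESS>:8047") and pass the result to enabled_plugin_names instead of the sample list.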

Now, it is time to test our Drill Interface.

For that, we are going to use the JSON file below.

[Screenshot: Input JSON file for Drill Query]

Our job is to query the above data with the help of the Drill interface.

For that, we are going to follow the steps below.

  1. Load this file onto your system.
  2. Go to the Query tab of the Apache Drill web interface.
  3. Write the desired query in the query window.
  4. Click Submit.
  5. Observe the output.

The screenshots below walk through this process step by step.

[Screenshot: Input File Transfer]

I am using the FileZilla client to copy the above-mentioned file from my Windows system to the Ubuntu system on which Drill is installed.

Note that the complete path of the input file is ‘/home/hduser/drill_input.json’.

[Screenshot: Apache Drill Query Output]

As you can see from the above screenshot, I am running a simple select * query to get the output from the input JSON file which we uploaded a few moments ago.

I think this explains quite clearly how Drill works. It took the file from the path mentioned in the SQL query and converted it on the fly into a table, giving us the result in tabular format as shown above.
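
For readers who cannot see the screenshot, here is a rough Python sketch of what such a query looks like and how it might be submitted without the browser. The exact query text from the screenshot is not reproduced here; the form below (dfs followed by the file path in backticks) is the usual way to address a file through the dfs storage type. The /query.json endpoint and its payload come from the Drill REST API documentation and are an assumption for version 1.3.0. All function names are my own.

```python
import json
import urllib.request


def build_select_all(path, plugin="dfs"):
    """Build a SELECT * statement over a file, with the path in backticks."""
    return "SELECT * FROM {}.`{}`".format(plugin, path)


def build_query_request(base_url, sql):
    """Prepare (but do not send) a POST to <base_url>/query.json."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url, sql):
    """Send the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_query_request(base_url, sql)) as response:
        return json.loads(response.read().decode("utf-8"))


sql = build_select_all("/home/hduser/drill_input.json")
print(sql)
```

The printed statement can be pasted into the query window as it is. Calling run_query("http://<IP ADDRESS>:8047", sql) should return the same result as JSON, provided the REST endpoint is available on your installation.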

The last thing that we should always do is close the Apache Drill connection. This can be done at the command prompt with the !quit command. Do not forget the leading exclamation mark.

The screenshot below might be helpful.

[Screenshot: quit command in drill]

This completes the Apache Drill installation and configuration.

Hope you people had a good read.

Thanks.

Cheers!