Hi everyone, hope you are finding these tutorials quite helpful.

Today, we are going to target the 4th objective in data ingestion category of the HDPCD certification.

We are going to perform the Sqoop Export operation. In this tutorial, we saw the sqoop import operation, which is the reverse of the sqoop export operation.

So, let us begin.

The sqoop export operation helps us to transfer data from an HDFS location to the MySQL table. There are few conditions which must be satisfied before performing the sqoop export operation.

  • MySQL table should already exist.
  • MySQL table must follow same schema (columns and datatypes) as the HDFS data.
  • The user, which is used to perform sqoop export operation must have sufficient privileges to perform the sqoop export task.

Once above three conditions are satisfied, we can perform the sqoop export operation.

We will consider that these conditions are satisfied in our case and start off with the sqoop export operation.

Let us first create the input data in HDFS.

You can download this file by clicking here. It looks something like this.

You can use the following commands to load this file to HDFS.

hadoop fs -mkdir /user/horton/weather

hadoop fs -put sfo_weather.csv /user/horton/weather

Once the data is pushed to HDFS successfully, you can use the following commands to get the sense of how the data looks like.

hadoop fs -ls /user/horton/weather

hadoop fs -cat /user/horton/weather/sfo_weather.csv

Following screenshot shows the output given by above commands.

HDFS directory to export
HDFS directory to export

As can be seen from above screenshot, our HDFS data contains 7 columns. The first column is the string column and the rest of the columns are integer columns, separated by a comma(,).

As the output of our sqoop export command, our expectation should be that the target MySQL table should also contain 7 columns with their correct respective datatypes.

So, let us first check whether the MySQL table exists or not. We will run show tables; command to see the weather table exists or not.

mysql> show tables;

Following screenshot shows the table existence.

Table existence in MySQL
Table existence in MySQL

As can be seen from above screenshot, you can see that weather table does not exist.

Let us create the MySQL table with the required schema. We will log into MySQL table and use the following schema to create weather table.

create table weather (
airport_name varchar(3),
year int,
day int,
month int,
mintemp int,
maxtemp int,
precipitation int);

The output of above command looks as shown in the following screenshot.

MySQL Table Creation
MySQL Table Creation

Now, MySQL table is created. Let us export some data into it using Sqoop Export command.

The command looks something like this.

sqoop export \
–connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db \
–username retail_dba \
–password hadoop \
–export-dir /user/horton/weather \
–table weather

Let us see how this command executes in below screenshot.

Sqoop Export Command
Sqoop Export Command

This is the initialization part of the sqoop export command. Here is the output part of this command.

Sqoop Export Command Output
Sqoop Export Command Output

As you can see from above screenshot, a total of 5 records are exported into a MySQL table called weather. Let us log into the MySQL database server and check the total number of records in weather table in the retail_db database.

mysql> select * from weather;

This is the output of above command.

MySQL After Export - 1

As you can see, 5 records were successfully loaded into weather table. This confirms the successful execution of Sqoop Export operation.

Now, it is time to test what happens if we run Sqoop Export command one more time.

Let us check that out, then.

Sqoop Export Output - 2
Sqoop Export Output – 2

Let us check MySQL table. The weather should have 10 records. Following screenshot explains the outcome.

MySQL Table Records - 2
MySQL Table Records – 2

From above step, we can conclude that exported records by Sqoop Export command will be appended to already existing MySQL records, making 5 records to increase to 10.

After this, the last test we want to do is to check what happens if the MySQL table does not exist.

Let us drop the MySQL table and re-run Sqoop Export command.

mysql> drop table weather;

mysql> show tables;

After Dropping MySQL Table
After Dropping MySQL Table

As can be seen from above screenshot, weather table is dropped. Now, let us see how Sqoop Export command works.

Sqoop Export command Without MySQL Table
Sqoop Export command Without MySQL Table

As it is evident, the Sqoop Export command fails to run successfully. It gives out an error saying “Table ‘retail_db.weather’ doesn’t exist”. This confirms that, before running Sqoop Export command, MySQL table must exist in the mentioned database.

This ends the Sqoop Export tutorial, the last objective in Hive for HDPCD.

I hope, this clears all concepts related to this task.

Please stay tuned for the further updates. In the next tutorial, we are going to start a Flume Agent.

You can subscribe to my YouTube channel here for video tutorials.




One thought on “Post 6 | HDPCD | Sqoop Export

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s