Hi everyone, hope you are finding these tutorials quite helpful.
Today, we are going to target the 4th objective in data ingestion category of the HDPCD certification.
We are going to perform the Sqoop Export operation. In this tutorial, we saw the sqoop import operation, which is the reverse of the sqoop export operation.
So, let us begin.
The sqoop export operation helps us to transfer data from an HDFS location to the MySQL table. There are few conditions which must be satisfied before performing the sqoop export operation.
- MySQL table should already exist.
- MySQL table must follow same schema (columns and datatypes) as the HDFS data.
- The user, which is used to perform sqoop export operation must have sufficient privileges to perform the sqoop export task.
Once above three conditions are satisfied, we can perform the sqoop export operation.
We will consider that these conditions are satisfied in our case and start off with the sqoop export operation.
Let us first create the input data in HDFS.
You can download this file by clicking here. It looks something like this.
You can use the following commands to load this file to HDFS.
hadoop fs -mkdir /user/horton/weather
hadoop fs -put sfo_weather.csv /user/horton/weather
Once the data is pushed to HDFS successfully, you can use the following commands to get the sense of how the data looks like.
hadoop fs -ls /user/horton/weather
hadoop fs -cat /user/horton/weather/sfo_weather.csv
Following screenshot shows the output given by above commands.
As can be seen from above screenshot, our HDFS data contains 7 columns. The first column is the string column and the rest of the columns are integer columns, separated by a comma(,).
As the output of our sqoop export command, our expectation should be that the target MySQL table should also contain 7 columns with their correct respective datatypes.
So, let us first check whether the MySQL table exists or not. We will run show tables; command to see the weather table exists or not.
mysql> show tables;
Following screenshot shows the table existence.
As can be seen from above screenshot, you can see that weather table does not exist.
Let us create the MySQL table with the required schema. We will log into MySQL table and use the following schema to create weather table.
create table weather (
The output of above command looks as shown in the following screenshot.
Now, MySQL table is created. Let us export some data into it using Sqoop Export command.
The command looks something like this.
sqoop export \
–connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db \
–username retail_dba \
–password hadoop \
–export-dir /user/horton/weather \
Let us see how this command executes in below screenshot.
This is the initialization part of the sqoop export command. Here is the output part of this command.
As you can see from above screenshot, a total of 5 records are exported into a MySQL table called weather. Let us log into the MySQL database server and check the total number of records in weather table in the retail_db database.
mysql> select * from weather;
This is the output of above command.
As you can see, 5 records were successfully loaded into weather table. This confirms the successful execution of Sqoop Export operation.
Now, it is time to test what happens if we run Sqoop Export command one more time.
Let us check that out, then.
Let us check MySQL table. The weather should have 10 records. Following screenshot explains the outcome.
From above step, we can conclude that exported records by Sqoop Export command will be appended to already existing MySQL records, making 5 records to increase to 10.
After this, the last test we want to do is to check what happens if the MySQL table does not exist.
Let us drop the MySQL table and re-run Sqoop Export command.
mysql> drop table weather;
mysql> show tables;
As can be seen from above screenshot, weather table is dropped. Now, let us see how Sqoop Export command works.
As it is evident, the Sqoop Export command fails to run successfully. It gives out an error saying “Table ‘retail_db.weather’ doesn’t exist”. This confirms that, before running Sqoop Export command, MySQL table must exist in the mentioned database.
This ends the Sqoop Export tutorial, the last objective in Hive for HDPCD.
I hope, this clears all concepts related to this task.
Please stay tuned for the further updates. In the next tutorial, we are going to start a Flume Agent.
You can subscribe to my YouTube channel here for video tutorials.