June 5, 2017 by milindjagre
Hi, everyone. Thank you for returning to this certification series.
In the last tutorial, we saw how to register a jar file in an Apache Pig session. This tutorial extends the previous one: we are going to see how to define an alias for a UDF present in that jar file.
The main reason for assigning an alias to a UDF is that you can call it by a short name instead of typing the fully qualified class name every time, which saves both time and effort.
Let me show you how it works. We are going to follow the steps mentioned in the following infographic.
As you can see, we already performed the first step as part of the last tutorial. For the sake of this tutorial, we will execute all the steps one by one in the following way.
- FINDING JAR FILE LOCATION
We can use the following command to find out the location of the jar file piggybank.jar, as we have already seen in the previous tutorial in this certification series.
find / | grep "piggybank.jar"
The output of the above command looks as follows.
As you can see from the above screenshot, the jar file is stored in /usr/hdp/22.214.171.124-2557/pig/piggybank.jar. Therefore, we will use this path to register the piggybank.jar file in the pig session.
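As a minimal sketch of the same search, the snippet below wraps find in a small helper. The function name find_jar is hypothetical and not part of the original tutorial; it matches on the exact file name and hides the "Permission denied" messages that a scan from / usually produces.

```shell
# find_jar ROOT NAME: print the path of every regular file called NAME under ROOT.
# (find_jar is a hypothetical helper name used only for this illustration.)
find_jar() {
    # -type f skips directories; 2>/dev/null hides "Permission denied" noise.
    find "$1" -type f -name "$2" 2>/dev/null
}

# Example (scans the whole filesystem, so it can take a while):
#   find_jar / piggybank.jar
```

Compared with piping find into grep, -name matches only the file name itself, so paths that merely contain the text piggybank.jar somewhere in a directory name are not reported.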
- EXTRACTING JAR FILE CONTENTS
Now that we have the jar file location, it is time to list the contents of the jar file to make sure that the required Java class exists in it. We are going to use the following command, in which the t option prints the table of contents of the jar file without unpacking anything, v makes the listing verbose, and f names the jar file to read.
jar tvf /usr/hdp/126.96.36.199-2557/pig/piggybank.jar
The execution of the above command is shown in the below video. Please have a look.
Jar File Extraction Process from Milind Jagre on Vimeo.
As you can see in the above video, the class org.apache.pig.piggybank.evaluation.string.UPPER exists in the piggybank.jar file. This is the class for which we want to define the ALIAS; it converts any string to all UPPERCASE letters. You will see this in a while.
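Instead of scanning the listing by eye, the check can be scripted. The following is only a sketch: the helper names class_to_entry and jar_has_class are hypothetical, and it assumes the jar tool from the JDK is on the PATH. It relies on the fact that a class is stored in a jar under its package path with a .class suffix.

```shell
# class_to_entry CLASS: turn a fully qualified class name into its jar entry path.
# (Both function names here are hypothetical helpers for illustration.)
class_to_entry() {
    # Replace the dots of the package name with slashes and append .class.
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# jar_has_class JAR CLASS: succeed (exit status 0) if JAR lists an entry for CLASS.
jar_has_class() {
    # `jar tf` prints one entry name per line; grep -Fxq matches a whole line literally.
    jar tf "$1" | grep -Fxq "$(class_to_entry "$2")"
}

# Example, with the jar path passed in as a placeholder:
#   jar_has_class /path/to/piggybank.jar org.apache.pig.piggybank.evaluation.string.UPPER && echo "class found"
```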
Now, let us jump to the next step.
- DEFINING ALIAS NAME FOR FULLY QUALIFIED CLASS NAME
You can use the following commands to register the jar file and define an ALIAS for the fully qualified class name of the UPPER class.
REGISTER /usr/hdp/22.214.171.124-2557/pig/piggybank.jar;
DEFINE upper org.apache.pig.piggybank.evaluation.string.UPPER;
The REGISTER command registers the piggybank.jar file in the pig session.
The DEFINE command defines the ALIAS name for the fully qualified class name present in the piggybank.jar file. In simple words, instead of typing org.apache.pig.piggybank.evaluation.string.UPPER every time, you can call the same function through the ALIAS upper named in the DEFINE command.
Hope this explanation helps.
The following screenshot shows you the execution of these commands.
The screenshot shows no error messages after the execution of these commands. Therefore, we can say that the commands ran successfully and that the ALIAS upper was defined for the Java class org.apache.pig.piggybank.evaluation.string.UPPER present in the piggybank.jar file.
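To keep the two statements together for reuse, they can also be written to a script file. The sketch below is illustrative only: the helper name write_alias_demo, the input file names.txt, and the relation names are all assumptions that do not appear in the original tutorial, and the jar path is passed in as an argument rather than hard-coded.

```shell
# write_alias_demo JAR_PATH OUT_FILE: write a small Pig script that registers
# the jar, defines the alias, and calls it once.
# (write_alias_demo, names.txt, names and shouted are hypothetical names.)
write_alias_demo() {
    jar_path=$1
    out_file=$2
    cat > "$out_file" <<EOF
-- Make the UDF classes in the jar visible to this Pig session.
REGISTER $jar_path;
-- Bind the short alias to the fully qualified class name.
DEFINE upper org.apache.pig.piggybank.evaluation.string.UPPER;
-- names.txt is a hypothetical input file with one name per line.
names = LOAD 'names.txt' AS (name:chararray);
-- With the alias, the call is upper(name) instead of the full class name.
shouted = FOREACH names GENERATE upper(name);
DUMP shouted;
EOF
}

# Example: generate the script, then run it in local mode.
#   write_alias_demo /path/to/piggybank.jar alias_demo.pig
#   pig -x local alias_demo.pig
```

Without the DEFINE statement, the FOREACH line would have to spell out org.apache.pig.piggybank.evaluation.string.UPPER(name) in full.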
We can conclude this tutorial here. In the next tutorial, we are going to build on this one and see how to invoke the defined ALIAS on pig tuples and strings.
Hope you are enjoying the content.
I have recently launched my website. You can check it out at www.milindjagre.com