Single Node Hadoop Installation on Ubuntu 14.04

Following are the steps to install Hadoop 2.6.0 on Ubuntu 14.04.

open a terminal window
——————————————————————————————————————————–
step 1 : install java
run following commands
command 1 : sudo apt-get update
command 2 : sudo apt-get install default-jdk
this will install Java on your system
command 3 : javac -version
command 4 : which java
above command should give this output => /usr/bin/java
command 5 : which javac
above command should give this output => /usr/bin/javac
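you can also print the Java version as a quick sanity check; on Ubuntu 14.04 the default-jdk package installs OpenJDK 7, so the output should look roughly like this (xx is a placeholder; the exact update number varies):
java -version
above command should give output like => java version "1.7.0_xx"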
——————————————————————————————————————————–
step 2 : add a new user named hduser for hadoop operations
run following commands
command 1 : sudo addgroup hadoop
command 2 : sudo adduser --ingroup hadoop hduser
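to confirm the user and group were created, you can check hduser's group membership (a quick sanity check, not part of the original commands; the numeric ids will differ on your system):
id hduser
above command should give output like => uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)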
——————————————————————————————————————————–
step 3 : install ssh
command 1 : sudo apt-get install ssh
——————————————————————————————————————————–
step 4 : verify ssh installed correctly
command 1 : which ssh
above command should give this output => /usr/bin/ssh
command 2 : which sshd
above command should give this output => /usr/sbin/sshd
——————————————————————————————————————————–
step 5 : switch to the hduser account
command 1 : su hduser
——————————————————————————————————————————–
step 6 : generate ssh key for hduser
command 1 : cd
command 2 : ssh-keygen -t rsa -P ""
do not give any file name; just press Enter to accept the default location
command 3 : cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
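if passwordless login fails in the next step, the usual culprit is file permissions; OpenSSH requires the .ssh directory and authorized_keys file to be private (a standard fix, not part of the original steps):
chmod 0700 $HOME/.ssh
chmod 0600 $HOME/.ssh/authorized_keys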
——————————————————————————————————————————–
step 7 : verify passwordless ssh login works
command 1 : ssh localhost
the first time, the above command asks you to confirm the host key (yes/no)
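type yes; you should then get a shell on localhost without being asked for a password (this is what the key from step 6 provides). leave the remote shell afterwards:
command 2 : exit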
——————————————————————————————————————————–
step 8 : add hduser as sudo user
command 1 : su <username>
(switch back to your original sudo-capable account, since hduser cannot run sudo yet)
command 2 : sudo adduser hduser sudo
command 3 : sudo su hduser
command 4 : cd
——————————————————————————————————————————–
step 9 : install hadoop
command 1 : wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
command 2 : sudo mv hadoop-2.6.0.tar.gz /usr/local/
command 3 : cd /usr/local/
command 4 : sudo tar xvzf hadoop-2.6.0.tar.gz
command 5 : sudo mv hadoop-2.6.0 hadoop
command 6 : sudo chown -R hduser:hadoop /usr/local/hadoop
command 7 : update-alternatives --config java
above command lists the installed Java paths; note the path, as it is needed for JAVA_HOME in the next step
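to confirm that the ownership change from command 6 took effect (a quick check, not in the original steps):
ls -ld /usr/local/hadoop
above command should show hduser and hadoop as the owner and group of the directory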
——————————————————————————————————————————–
step 10 : configure environment variables in the .bashrc file
command 1 : cd

before pasting, verify that the JAVA_HOME path exists:
sudo ls /usr/lib/jvm/java-7-openjdk-amd64
if the above command lists files, go ahead
otherwise, check sudo ls /usr/lib/jvm/ and put the appropriate path in JAVA_HOME below

command 2 : sudo nano ~/.bashrc
paste following content in .bashrc file

#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END

ctrl+X
Y
command 3 : source ~/.bashrc
——————————————————————————————————————————–
step 11 : verify configuration done properly
command 1 : javac -version
command 2 : which java
command 3 : which javac
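since .bashrc now puts the Hadoop binaries on the PATH, you can also verify the Hadoop installation itself:
command 4 : hadoop version
above command should print Hadoop 2.6.0 along with build information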
——————————————————————————————————————————–
step 12 : configure hadoop
command 1 : readlink -f /usr/bin/javac
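the above command resolves the javac symlink to its real location; with OpenJDK 7 on 64-bit Ubuntu it should print the following, and JAVA_HOME is that path without the trailing /bin/javac:
above command should give this output => /usr/lib/jvm/java-7-openjdk-amd64/bin/javac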

command 2 : sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh
paste the following line in hadoop-env.sh (replacing any existing JAVA_HOME line)
if your JAVA_HOME path is different, use that path here as well
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

ctrl+X
Y

command 3 : sudo mkdir -p /app/hadoop/tmp
command 4 : sudo chown hduser:hadoop /app/hadoop/tmp

command 5 : sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml
paste the following code between <configuration> and </configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

ctrl+X
Y

command 6 : cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

command 7 : sudo nano /usr/local/hadoop/etc/hadoop/mapred-site.xml
paste the following code between <configuration> and </configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>

ctrl+X
Y

command 8 : sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
command 9 : sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
command 10 : sudo chown -R hduser:hadoop /usr/local/hadoop_store

command 11 : sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
paste the following code between <configuration> and </configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>

ctrl+X
Y

command 12 : cd /usr/local/hadoop_store/hdfs/namenode/
command 13 : hadoop namenode -format
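the format command prints a lot of log output (including a notice that hdfs namenode -format is the newer form of this command); near the end you should see a line confirming success, like:
=> Storage directory /usr/local/hadoop_store/hdfs/namenode has been successfully formatted.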
——————————————————————————————————————————–
step 13 : start hadoop cluster
command 1 : cd /usr/local/hadoop/sbin
command 2 : sudo su hduser
command 3 : start-all.sh
——————————————————————————————————————————–
step 14 : verify that the hadoop cluster started
command 1 : jps
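above command should list the running Hadoop daemons (jps prints a process id before each name, which will differ on your system):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps
if any daemon is missing, check the log files under /usr/local/hadoop/logs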
command 2 : open below addresses in browser window
URL 1 : localhost:50070 => NameNode information
URL 2 : localhost:50090 => secondary NameNode information
URL 3 : localhost:8088 => resource manager
——————————————————————————————————————————–
step 15 : stop hadoop cluster
command 1 : stop-all.sh
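to confirm everything stopped, run jps again; only the Jps process itself should remain in the list:
command 2 : jps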
