Sunday 18 September 2016

Hadoop Cluster Practice Commands

How to Start / Stop Hadoop Processes:

Name Node: 

hadoop-daemon.sh start namenode
hadoop-daemon.sh stop namenode

Data Node: 

hadoop-daemon.sh start datanode
hadoop-daemon.sh stop datanode

Secondary Name Node: 

hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh stop secondarynamenode

Job Tracker (Hadoop 1.x / MRv1 only; replaced by the Resource Manager in YARN): 

hadoop-daemon.sh start jobtracker
hadoop-daemon.sh stop jobtracker

Task Tracker (Hadoop 1.x / MRv1 only; replaced by the Node Manager in YARN):

hadoop-daemon.sh start tasktracker
hadoop-daemon.sh stop tasktracker

Resource Manager: 

yarn-daemon.sh start resourcemanager
yarn-daemon.sh stop resourcemanager

Node Manager: 

yarn-daemon.sh start nodemanager
yarn-daemon.sh stop nodemanager


Job History Server:
mr-jobhistory-daemon.sh start historyserver
mr-jobhistory-daemon.sh stop historyserver
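
On Hadoop 2.x, the per-daemon commands above can also be batched with the cluster-wide helper scripts (assuming passwordless SSH and the slaves file are already configured), and jps is a quick way to confirm which daemons are actually running on the current machine:

```shell
# Start HDFS daemons (NameNode, DataNodes, SecondaryNameNode) across the cluster
start-dfs.sh

# Start YARN daemons (ResourceManager, NodeManagers) across the cluster
start-yarn.sh

# Start the MapReduce Job History Server
mr-jobhistory-daemon.sh start historyserver

# Verify which daemons are running on this machine:
# jps lists JVM processes by class name (NameNode, DataNode, ResourceManager, ...)
jps

# Stop everything in reverse order
mr-jobhistory-daemon.sh stop historyserver
stop-yarn.sh
stop-dfs.sh
```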






Hadoop Required URLs

Name Node:        http://orienit1:50070
Resource Manager: http://orienit2:8088
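
Both web UIs can also be checked from the shell (using the same orienit1/orienit2 hostnames as above); an HTTP 200 means the daemon's UI is up:

```shell
# Print only the HTTP status code for each UI
curl -s -o /dev/null -w "%{http_code}\n" http://orienit1:50070
curl -s -o /dev/null -w "%{http_code}\n" http://orienit2:8088
```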


1. Create a new folder (here /orienit; on your machine, name it after your hostname, e.g. orienit1 / orienit2 / orienit3) using the command below.

hadoop fs -mkdir /orienit


2. Put some files from the local file system into HDFS using the command below

hadoop fs -put <local file system path> <hdfs path>

hadoop fs -put /etc/hosts /orienit/hosts


3. Read the data from HDFS using the command below

hadoop fs -cat /orienit/hosts


4. Change the replication factor using the commands below

Increase the replication factor:

hadoop fs -setrep 5 /orienit/hosts


Decrease the replication factor:

hadoop fs -setrep 3 /orienit/hosts
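
The effective replication factor can be verified after each change, either with the %r format specifier of -stat or from the second column of the -ls listing:

```shell
# Print just the replication factor of the file
hadoop fs -stat %r /orienit/hosts

# Or inspect the full listing (replication factor is the second column)
hadoop fs -ls /orienit/hosts
```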


5. Transfer data from one cluster to another using the command below

hadoop distcp hdfs://nn1:8020/<src path> hdfs://nn2:8020/<dst path>

where nn1 is the NameNode hostname or IP of the first (source) cluster, and nn2 is that of the second (destination) cluster
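
For example, to copy the /orienit folder between two clusters (with orienit1 and orienit2 standing in for nn1 and nn2):

```shell
# Copy /orienit from the cluster whose NameNode is orienit1
# to the cluster whose NameNode is orienit2
hadoop distcp hdfs://orienit1:8020/orienit hdfs://orienit2:8020/orienit

# Re-running with -update skips files that already exist
# with the same size and checksum on the destination
hadoop distcp -update hdfs://orienit1:8020/orienit hdfs://orienit2:8020/orienit
```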


6. Commissioning and Decommissioning Nodes in a Hadoop Cluster

On the Name Node machine, make the changes below


1. Create an include file in the /home/kalyan/work folder

2. Create an exclude file in the /home/kalyan/work folder

3. Update hdfs-site.xml with the configurations below


<property>
<name>dfs.hosts</name>
<value>/home/kalyan/work/include</value>
</property>

<property>
<name>dfs.hosts.exclude</name>
<value>/home/kalyan/work/exclude</value>
</property>
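
The include and exclude files are plain text with one hostname (or IP) per line: include lists every node allowed to connect, and exclude lists the nodes to decommission. For example, to keep orienit1-orienit3 registered while decommissioning orienit3 (hypothetical hostnames):

```shell
# include: all nodes allowed to connect to the NameNode
printf 'orienit1\norienit2\norienit3\n' > /home/kalyan/work/include

# exclude: nodes to be decommissioned (must also appear in include
# so they can connect long enough to drain their blocks)
printf 'orienit3\n' > /home/kalyan/work/exclude
```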

4. Execute the command below to apply the HDFS changes

hdfs dfsadmin -refreshNodes

5. Update yarn-site.xml with the configurations below



<property>
<name>yarn.resourcemanager.nodes.include-path</name>
<value>/home/kalyan/work/include</value>
</property>


<property>
<name>yarn.resourcemanager.nodes.exclude-path</name>
<value>/home/kalyan/work/exclude</value>
</property>

6. Execute the command below to apply the YARN changes

yarn rmadmin -refreshNodes

7. Verify the changes in the Name Node and Resource Manager web UIs
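
Decommission progress can also be watched from the command line: a DataNode moves from "Decommission in progress" to "Decommissioned" in the dfsadmin report (assuming Hadoop 2.x command names):

```shell
# Show the state of every DataNode, including decommissioning status
hdfs dfsadmin -report

# Show all NodeManagers (in every state) as seen by the ResourceManager
yarn node -list -all
```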




