What is the full form of YARN?

admin

7/2/2023


YARN stands for "Yet Another Resource Negotiator." It is the resource management and job scheduling framework of the Apache Hadoop ecosystem, introduced in Hadoop 2.x to replace the earlier resource management that was specific to MapReduce.
 
Because YARN manages cluster resources and schedules jobs independently of any one processing model, it gives a Hadoop cluster a flexible and scalable platform for various data processing applications. The steps below install Hadoop 2.2.0 with YARN on a single node.
 
 

Step 1: Download Apache Hadoop

 
Download the latest distribution from the Hadoop website (http://hadoop.apache.org/). For example, as root do the following:
 
# cd /root
# wget http://mirrors.ibiblio.org/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
 
Next, create /opt/yarn and extract the package there:
 
# mkdir -p /opt/yarn
# cd /opt/yarn
# tar xvzf /root/hadoop-2.2.0.tar.gz
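
After extraction, it may be worth confirming that the files used in the later steps are where they are expected. The following is a minimal sketch, assuming a POSIX shell; the function name check_hadoop_tree is hypothetical and not part of Hadoop, and the three paths are simply the ones this article refers to later.

```shell
# Hypothetical helper: confirm that files used in later steps exist under
# the extracted Hadoop tree. Prints each missing path, returns 1 if any.
check_hadoop_tree() {
    root=${1%/}          # drop one trailing slash, if present
    rc=0
    for f in bin/hdfs sbin/yarn-daemon.sh etc/hadoop/core-site.xml; do
        if [ ! -e "$root/$f" ]; then
            echo "missing: $root/$f"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_hadoop_tree /opt/yarn/hadoop-2.2.0 && echo "tree looks complete"
```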
 
 

Step 2: Set JAVA_HOME

 
# echo "export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/" > /etc/profile.d/java.sh
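
The JVM path differs between systems, so it can help to check that the directory really contains a java binary before continuing. This is a small sketch under that assumption; check_java_home is a hypothetical helper name, not a Hadoop command.

```shell
# Hypothetical helper: succeed only if the given directory looks like a
# Java installation root, i.e. it contains an executable bin/java.
check_java_home() {
    dir=${1%/}                       # drop one trailing slash, if present
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -f "$dir/bin/java" ] || [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $dir"
}

# Usage: . /etc/profile.d/java.sh && check_java_home "$JAVA_HOME"
```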
 

Step 3: Create Users and Groups

 
It is best to run the various daemons with separate accounts. Three accounts (yarn, hdfs, mapred) in the group hadoop can be created as follows:
 
# groupadd hadoop
# useradd -g hadoop yarn
# useradd -g hadoop hdfs
# useradd -g hadoop mapred
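
To confirm that the accounts were created, one option is to ask id for each name and list the ones it cannot find. The sketch below assumes a POSIX shell; missing_accounts is a hypothetical helper name.

```shell
# Hypothetical helper: print each account name that does not exist on
# this host. Prints nothing when every account is present.
missing_accounts() {
    for name in "$@"; do
        id -u "$name" >/dev/null 2>&1 || echo "$name"
    done
}

# Usage: missing_accounts yarn hdfs mapred
```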
 

Step 4: Make Data and Log Directories

 
# mkdir -p /var/data/hadoop/hdfs/nn
# mkdir -p /var/data/hadoop/hdfs/snn
# mkdir -p /var/data/hadoop/hdfs/dn
# chown hdfs:hadoop /var/data/hadoop/hdfs -R
# mkdir -p /var/log/hadoop/yarn
# chown yarn:hadoop /var/log/hadoop/yarn -R
 
Next, move to the YARN installation root and create the log directory and set the owner and group as follows:
 
# cd /opt/yarn/hadoop-2.2.0
# mkdir logs
# chmod g+w logs
# chown yarn:hadoop . -R
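
Since the daemons run under different accounts, wrong ownership is a common reason for a service failing to start. The following sketch prints the owner and group of each directory using only ls and awk; show_owner is a hypothetical helper name, and it assumes directory paths without spaces.

```shell
# Hypothetical helper: print "owner:group path" for each directory given.
# Fields 3 and 4 of `ls -ld` output are the owner and the group.
show_owner() {
    for d in "$@"; do
        ls -ld "$d" | awk -v dir="$d" '{ print $3 ":" $4, dir }'
    done
}

# Usage:
#   show_owner /var/data/hadoop/hdfs/nn /var/log/hadoop/yarn /opt/yarn/hadoop-2.2.0/logs
# With the commands above, the first directory is expected to show
# hdfs:hadoop and the other two yarn:hadoop.
```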
 

Step 5: Configure core-site.xml

 
From the base of the Hadoop installation path (e.g., /opt/yarn/hadoop-2.2.0), edit the etc/hadoop/core-site.xml file. 
 
<configuration>
       <property>
               <name>fs.default.name</name>
               <value>hdfs://localhost:9000</value>
       </property>
       <property>
               <name>hadoop.http.staticuser.user</name>
               <value>hdfs</value>
       </property>
</configuration>
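
A typo in a property name or value is easy to miss in these XML files. As a quick check, the sketch below prints the value that follows a given name. It is a hypothetical helper (xml_property), not a Hadoop tool, and it assumes one tag per line as in the listing above; it is not a general XML parser.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop-style *-site.xml file. Usage: xml_property FILE PROPERTY_NAME
xml_property() {
    awk -v want="$2" '
        /<name>/ {
            line = $0
            sub(/.*<name>/, "", line)
            sub(/<\/name>.*/, "", line)
            found = (line == want)      # remember whether this is the property we want
        }
        found && /<value>/ {
            line = $0
            sub(/.*<value>/, "", line)
            sub(/<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$1"
}

# Usage: xml_property etc/hadoop/core-site.xml fs.default.name
```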
 

Step 6: Configure hdfs-site.xml

 
From the base of the Hadoop installation path, edit the etc/hadoop/hdfs-site.xml file. 
 
<configuration>
 <property>
   <name>dfs.replication</name>
   <value>1</value>
 </property>
 <property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/var/data/hadoop/hdfs/nn</value>
 </property>
 <property>
   <name>fs.checkpoint.dir</name>
   <value>file:/var/data/hadoop/hdfs/snn</value>
 </property>
 <property>
   <name>fs.checkpoint.edits.dir</name>
   <value>file:/var/data/hadoop/hdfs/snn</value>
 </property>
 <property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/var/data/hadoop/hdfs/dn</value>
 </property>
</configuration>
 

Step 7: Configure mapred-site.xml

 
From the base of the Hadoop installation, edit the etc/hadoop/mapred-site.xml file. If the file does not exist yet, copy it from etc/hadoop/mapred-site.xml.template first.
 
<configuration>
 <property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>
</configuration>
 

Step 8: Configure yarn-site.xml

 
From the base of the Hadoop installation, edit the etc/hadoop/yarn-site.xml file.
 
<configuration>
 <property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
 </property>
 <property>
   <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
 </property>
</configuration>
 

Step 9: Modify Java Heap Sizes

 
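The Hadoop installation uses several environment variables that determine the heap sizes for each Hadoop process. They are set in the env scripts under etc/hadoop. In etc/hadoop/hadoop-env.sh, set:
 
HADOOP_HEAPSIZE="500"
HADOOP_NAMENODE_INIT_HEAPSIZE="500"
 
In etc/hadoop/mapred-env.sh, set:
 
HADOOP_JOB_HISTORYSERVER_HEAPSIZE=250
 
In etc/hadoop/yarn-env.sh, set:
 
JAVA_HEAP_MAX=-Xmx500m
 
and add the following line to the same file:
 
YARN_HEAPSIZE=500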
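
To get a feel for what these values add up to on one machine, the arithmetic can be sketched as below. This is only a rough estimate: it assumes that each daemon runs in its own JVM capped at the value set above, that the three HDFS daemons use HADOOP_HEAPSIZE and the two YARN daemons use YARN_HEAPSIZE, and it counts heap only, not other JVM memory.

```shell
# Rough heap budget in MB for a single-node installation (heap only).
# Assumption: one JVM per daemon, each capped at the configured maximum.
hdfs_daemons=3      # NameNode, SecondaryNameNode, DataNode (HADOOP_HEAPSIZE=500)
yarn_daemons=2      # ResourceManager, NodeManager (YARN_HEAPSIZE=500)
history_heap=250    # HADOOP_JOB_HISTORYSERVER_HEAPSIZE
total=$(( hdfs_daemons * 500 + yarn_daemons * 500 + history_heap ))
echo "approximate maximum heap across daemons: ${total} MB"
```

Under these assumptions the total comes to 2750 MB, which gives a lower bound on the memory the machine should have free.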
 

Step 10: Format HDFS

 
Before the HDFS NameNode can start for the first time, it needs to initialize the directory where it will hold its data. As root, switch to the hdfs user and format that directory:
 
# su - hdfs
$ cd /opt/yarn/hadoop-2.2.0/bin
$ ./hdfs namenode -format
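
Whether the format step succeeded can be checked from the contents of the NameNode directory. The sketch below assumes that a formatted name directory contains a current/VERSION file; namenode_is_formatted is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the NameNode directory appears formatted.
# Assumption: formatting creates current/VERSION inside the name directory.
namenode_is_formatted() {
    [ -f "${1%/}/current/VERSION" ]
}

# Usage: namenode_is_formatted /var/data/hadoop/hdfs/nn && echo "formatted"
```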
 

Step 11: Start the HDFS Services

 
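As user hdfs, start the NameNode, SecondaryNameNode, and DataNode with the hadoop-daemon.sh script in the sbin directory:
 
$ cd /opt/yarn/hadoop-2.2.0/sbin
$ ./hadoop-daemon.sh start namenode
$ ./hadoop-daemon.sh start secondarynamenode
$ ./hadoop-daemon.sh start datanode
 
If the daemons started successfully, jps lists them:
 
$ jps
15140 SecondaryNameNode
15015 NameNode
15335 Jps
15214 DataNode
 
A service can be stopped with the same script, for example ./hadoop-daemon.sh stop datanode. The same can be done for the NameNode and SecondaryNameNode.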
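
The jps listing above can also be checked mechanically. The following sketch reads jps output on standard input and reports any expected daemon that is not listed; missing_daemons is a hypothetical helper name, and it assumes the two-column "PID Name" output shown above.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# daemon (given as arguments) that is not listed. Returns 1 if any is missing.
missing_daemons() {
    running=$(awk '{ print $2 }')     # second column holds the process name
    rc=0
    for d in "$@"; do
        # -F: literal match, -x: whole line, so NameNode does not
        # match SecondaryNameNode
        if ! printf '%s\n' "$running" | grep -Fqx "$d"; then
            echo "not running: $d"
            rc=1
        fi
    done
    return $rc
}

# Usage: jps | missing_daemons NameNode SecondaryNameNode DataNode
```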
 

Step 12: Start YARN Services

 
 
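As with HDFS, the YARN daemons are started one at a time, here with the yarn-daemon.sh script. As root, switch to the yarn user and start the ResourceManager and the NodeManager:
 
# su - yarn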
$ cd /opt/yarn/hadoop-2.2.0/sbin
$ ./yarn-daemon.sh start resourcemanager
$ ./yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/yarn/hadoop-2.2.0/logs/yarn-yarn-nodemanager-limulus.out
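
If a daemon does not stay up, the .out file named in the message above is a natural first place to look. The sketch below builds that path from its parts. It assumes the naming pattern visible in the message, yarn-<user>-<daemon>-<hostname>.out; yarn_out_file is a hypothetical helper name.

```shell
# Hypothetical helper: build the .out path reported by yarn-daemon.sh.
# Assumption: the file is named yarn-<user>-<daemon>-<hostname>.out.
# Usage: yarn_out_file LOG_DIR USER DAEMON HOSTNAME
yarn_out_file() {
    log_dir=${1%/}                    # drop one trailing slash, if present
    printf '%s/yarn-%s-%s-%s.out\n' "$log_dir" "$2" "$3" "$4"
}

# Usage:
#   tail -n 20 "$(yarn_out_file /opt/yarn/hadoop-2.2.0/logs yarn nodemanager "$(hostname)")"
```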