Hadoop High Availability – FATAL ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist


Recently, on a working Hadoop (version 2.5.1) cluster, we hit this issue while starting the ZKFailoverController (zkfc).

After debugging we found it was due to a missing/corrupted parent znode in the ZooKeeper cluster.

To fix this we used the following command:

$ bin/hdfs zkfc -formatZK


We ran this command before starting HDFS.

After formatting, zkfc started and everything worked smoothly.
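For reference, the sequence could look roughly like this (a sketch assuming the cluster is managed with the start/stop scripts bundled under sbin; paths are relative to the Hadoop install directory):

$ sbin/stop-dfs.sh
$ bin/hdfs zkfc -formatZK
$ sbin/start-dfs.sh

With automatic failover configured, start-dfs.sh also starts the failover controllers; otherwise zkfc can be started by hand with sbin/hadoop-daemon.sh start zkfc.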

Hadoop – Decommission datanode


Decommissioning is the process of removing one or more datanodes from a Hadoop cluster.

To decommission a datanode, use the following process:

Log in to the namenode host.

Add the following configuration in /home/guest/hadoop-2.5.1/etc/hadoop/hdfs-site.xml:

<property>
 <name>dfs.hosts.exclude</name>
 <value>/home/guest/hadoop-2.5.1/etc/hadoop/decommission-nodes</value>
</property>

After adding the dfs.hosts.exclude property you need to restart HDFS.
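Assuming the cluster is managed with the bundled sbin scripts, the restart could look like this:

$ hadoop-2.5.1/sbin/stop-dfs.sh
$ hadoop-2.5.1/sbin/start-dfs.sh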

Then add the hostname of the datanode you want to remove to /home/guest/hadoop-2.5.1/etc/hadoop/decommission-nodes.
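The exclude file simply lists one hostname per line; for example, with a hypothetical datanode name:

$ echo "datanode03.example.com" >> /home/guest/hadoop-2.5.1/etc/hadoop/decommission-nodes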

Now run the following command to start decommissioning the datanode:

$ hadoop-2.5.1/bin/hdfs dfsadmin -refreshNodes

This process can take anywhere from a few minutes upward, depending on the amount of data on the datanode; keep monitoring the decommissioning status at http://namenode:50070.
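Besides the web UI, the decommission status can also be checked from the command line; the report shows a Decommission Status field for each datanode (Normal, Decommission in progress, or Decommissioned):

$ hadoop-2.5.1/bin/hdfs dfsadmin -report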

After decommissioning finishes, you can remove that datanode.
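Once the node is reported as Decommissioned, the datanode daemon can be stopped on that host before taking the machine out of the cluster (run on the datanode host, assuming the same install path):

$ hadoop-2.5.1/sbin/hadoop-daemon.sh stop datanode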


Configure Hadoop job failure percent


For some applications it is undesirable to abort the job when a few tasks fail, as it may still be possible to use the results of the job despite some failures.

In this case the maximum percentage of tasks that are allowed to fail without triggering job failure can be set for the job.

Map tasks are controlled by the mapred.max.map.failures.percent property. If we set this value to 50, the job will not fail as long as no more than 50% of the map tasks fail.

Reduce tasks are controlled by the mapred.max.reduce.failures.percent property. If we set this value to 30, the job will not fail as long as no more than 30% of the reduce tasks fail.
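For example, these thresholds could be set in the job configuration (or in mapred-site.xml) using the usual Hadoop property format; the values here are illustrative, and these are the older property names used above, so check your version's mapred-default.xml for the current equivalents (in newer MapReduce releases they are mapreduce.map.failures.maxpercent and mapreduce.reduce.failures.maxpercent):

<property>
 <name>mapred.max.map.failures.percent</name>
 <value>50</value>
</property>

<property>
 <name>mapred.max.reduce.failures.percent</name>
 <value>30</value>
</property>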

-Sany


Linux SSH without password


Whenever we log in to a remote server with ssh, it asks for password authentication.

Let’s first check that you can ssh to the local host:

$ ssh localhost

By default it should ask for a password.

To log in without a password we need to generate ssh keys. To generate them, use the following command:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

The above command will create two files, id_dsa and id_dsa.pub, in the .ssh directory located in your home directory.

Now append id_dsa.pub to authorized_keys:

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Then try to ssh to your local host; it shouldn’t ask for a password.

Similarly, to log in to a remote server, copy the id_dsa.pub content into the remote server’s ~/.ssh/authorized_keys file. Use the following command to copy it:

$ cat id_dsa.pub | ssh user@serverName/IP 'cat >> .ssh/authorized_keys'

Some ssh setups also require the following permissions and an authorized_keys2 file:

  • Put the public key in .ssh/authorized_keys2 
    • $ cp .ssh/authorized_keys .ssh/authorized_keys2
  • Change permission of .ssh directory to 700
    • $ chmod 700 .ssh
  • Change permission of .ssh/authorized_keys2 to 640
    • $ chmod 640 .ssh/authorized_keys2
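If passwordless login still does not work after these steps, running ssh in verbose mode shows which keys and files are being tried and why they may be rejected:

$ ssh -v user@serverName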

-Sany

Hadoop No Class Definition Found Exception

What Is Apache Hadoop?

The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

No class definition found exception (NoClassDefFoundError) in Hadoop while running a MapReduce job:

Hadoop is used to process big data sets.

To process big data sets we need to write MapReduce classes using the Hadoop API.

To process data, Hadoop needs a job file containing the MapReduce classes and their dependent libraries, plus a running Hadoop cluster.

Even though we include all the libraries and can successfully execute our class from the command line without a Hadoop cluster, Hadoop may throw a no class definition found exception when the job runs on the cluster.

Some tips to avoid this problem:

Never package your MapReduce class files as a jar inside the lib directory of the job file if your MapReduce classes depend on other libraries in lib. Instead of putting your MapReduce classes in lib, keep them unpacked in their package directory structure at the top level of the job file.

If we make a jar of the MapReduce class files and put it inside lib of the job file, the problem is that Hadoop won’t load the other dependent jars from the job. Since only the required MapReduce class is invoked from the job, Hadoop loads only the jar containing the MapReduce code. The other jars are not loaded while the MapReduce job runs on the cluster, which gives us a no class definition found exception even though all the required jars were included in lib of our job file.

Structure of a Hadoop job file and creating a job file:

For example, let’s consider a job file named Test.job.

Copy Test.job into a temporary directory and extract its contents there using the jar command:

$ mkdir temporary

$ cd temporary

$ jar xf Test.job

$ ls 

Output:

com    lib

where com is the package directory structure containing all the required MapReduce classes, and lib contains all the library files required by those classes.

After making any changes, rebuild the job file with the jar command as shown below:

$ jar cf Test.job * (This will create a new Test.job file in your temporary directory)

Let’s say our MapReduce class is TestMapReduce in the package com.test.abc.

To run the TestMapReduce class on the Hadoop cluster:

$ $HADOOP_HOME/bin/hadoop jar Test.job com.test.abc.TestMapReduce <InputPath> <OutputPath>

-Sany
