Hadoop MapReduce Architecture

Before looking at the architecture, it helps to clarify what Hadoop MapReduce is. Hadoop MapReduce is a framework for processing very large data sets in a distributed, parallel manner on a Hadoop cluster. In terms of architecture, Hadoop MapReduce consists of one computer that serves as the JobTracker and a number of computers that serve as TaskTrackers. The working relationship between the JobTracker and the TaskTrackers is a master and slave relationship. As the master, the JobTracker holds information about the parallel processing of the data blocks stored on HDFS, so it knows exactly which computer is processing each data block. It oversees and controls the computers that process those blocks in parallel.

Meanwhile, the TaskTrackers are the worker computers that process the data blocks assigned to them as part of a job. In this context, a Task is a process running on a worker computer that handles one data block. At any given time, a TaskTracker manages and controls a number of Tasks and regularly reports to the JobTracker. This periodic report from a TaskTracker to the JobTracker is called a heartbeat. Based on the heartbeats, the JobTracker can determine the condition of each TaskTracker and control it.
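The heartbeat relationship described above can be illustrated with a small toy model. This is only a sketch of the idea, not Hadoop's actual code or API: the class name `JobTracker`, its methods, the tracker ids, and the 30-second timeout are all assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class JobTracker:
    """Toy master that remembers when each TaskTracker last reported."""

    # Assumed value for illustration only, not a Hadoop default.
    timeout_seconds: float = 30.0
    last_seen: dict = field(default_factory=dict)
    running: dict = field(default_factory=dict)

    def receive_heartbeat(self, tracker_id, running_tasks, now=None):
        """Record a periodic report from one TaskTracker."""
        now = time.time() if now is None else now
        self.last_seen[tracker_id] = now
        self.running[tracker_id] = list(running_tasks)

    def alive_trackers(self, now=None):
        """Trackers whose last heartbeat is within the timeout."""
        now = time.time() if now is None else now
        return sorted(
            t for t, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )

    def lost_trackers(self, now=None):
        """Trackers that have been silent for longer than the timeout."""
        now = time.time() if now is None else now
        return sorted(
            t for t, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


# Hypothetical trackers and task ids; timestamps are passed in explicitly
# so the example does not depend on the wall clock.
jt = JobTracker(timeout_seconds=30.0)
jt.receive_heartbeat("tracker-1", ["map_0001"], now=0.0)
jt.receive_heartbeat("tracker-2", ["map_0002", "map_0003"], now=0.0)
jt.receive_heartbeat("tracker-1", ["map_0001"], now=25.0)

print(jt.alive_trackers(now=40.0))  # tracker-1 reported recently
print(jt.lost_trackers(now=40.0))   # tracker-2 has been silent too long
```

In this model the master never polls the workers. It only looks at how long ago each worker last reported, which is how a heartbeat lets the JobTracker judge the condition of a TaskTracker.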

In a Hadoop cluster, the NameNode and the JobTracker do not have to run on the same computer, whereas a DataNode and a TaskTracker are placed on the same computer to preserve data locality. That concludes this brief explanation of the Hadoop MapReduce architecture, which may be helpful as you develop with or learn more about Hadoop.
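To see why co-locating a DataNode and a TaskTracker matters, here is a minimal sketch of locality-aware task assignment. It is a simplified greedy illustration and not Hadoop's real scheduler: the function `assign_tasks`, the block ids, the host names, and the slot counts are all hypothetical.

```python
def assign_tasks(block_locations, free_slots):
    """Greedy, simplified placement of one task per data block.

    block_locations: block id -> list of hosts that store a copy of the block
    free_slots:      host -> number of free task slots on that host
    Returns:         block id -> (host, is_local)
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, hosts in block_locations.items():
        # Prefer a host that already stores the block (data locality).
        local = [h for h in hosts if slots.get(h, 0) > 0]
        if local:
            host, is_local = local[0], True
        else:
            # Otherwise fall back to any host with capacity; the block
            # would then have to be read over the network.
            candidates = [h for h, n in slots.items() if n > 0]
            if not candidates:
                break  # no capacity left; remaining blocks have to wait
            host, is_local = candidates[0], False
        slots[host] -= 1
        assignment[block] = (host, is_local)
    return assignment


# Hypothetical cluster: three workers with one free slot each.
blocks = {
    "blk_1": ["node-a"],
    "blk_2": ["node-a", "node-b"],
    "blk_3": ["node-d"],  # stored on a host that has no free slot here
}
slots = {"node-a": 1, "node-b": 1, "node-c": 1}

for block, (host, is_local) in assign_tasks(blocks, slots).items():
    print(block, host, "local" if is_local else "remote")
```

When the worker that stores a block also runs the task for it, the task reads from local disk. Only `blk_3` in this example has to be processed on a host that does not hold the data.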
