Hadoop Distributed Framework is designed to handle large data sets. It can scale out to several thousands of nodes and process enormous amount of data in Parallel Distributed Approach. Apache Hadoop consists of two components. First one is HDFS (Hadoop Distributed File System) and the second component is Map Reduce (MR). Hadoop is write once and read many times.
HDFS is a scalable distributed storage file system and MapReduce is designed for parallel processing of data. If we look at the High Level Architecture of Hadoop, HDFS and Map Reduce components present inside each layer. The Map Reduce layer consists of job tracker and task tracker. HDFS layer consists of Name Node and Data Nodes.

 
No comments:
Post a Comment