Hadoop MapReduce

Hadoop MapReduce is a software framework for easily writing applications that process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner. A MapReduce job usually splits the input data-set into independent chunks which are processed…
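The split–map–shuffle–reduce flow described above can be sketched in plain Python. This is a toy simulation of the idea, not the Hadoop API; the function names and the word-count task are assumptions for illustration:

```python
from collections import defaultdict

def map_phase(chunk):
    # Mapper: emit a (word, 1) pair for every word in one input chunk.
    return [(word, 1) for word in chunk.split()]

def shuffle(mapped_pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

# The input data-set is split into independent chunks, each chunk is
# mapped (in parallel on a real cluster), then the shuffled results
# are reduced.
chunks = ["big data big clusters", "data processing"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
result = reduce_phase(shuffle(mapped))
print(result)  # {'big': 2, 'data': 2, 'clusters': 1, 'processing': 1}
```

In real Hadoop the mapper and reducer run on different machines and the shuffle moves data over the network; the toy version only shows the data flow.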

Continue reading

Hadoop-HDFS

HDFS: Hadoop Distributed File System. It is a file system specially designed for storing huge data on a cluster of commodity hardware with a streaming access pattern. Now let's understand the terms cluster and streaming access pattern. Cluster – a special type of computational cluster designed specifically for storing and analyzing huge…
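The storage model behind HDFS can be sketched as splitting a file into fixed-size blocks and replicating each block across several DataNodes. This is a toy simulation under stated assumptions: the block size, node names, and placement rule here are illustrative (real HDFS defaults to 128 MB blocks and a replication factor of 3, with rack-aware placement):

```python
def split_into_blocks(data, block_size):
    # HDFS splits a file into fixed-size blocks (tiny size here for the demo).
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes, replication=3):
    # Each block is stored on several DataNodes for fault tolerance.
    # Round-robin placement is an assumption; real HDFS is rack-aware.
    placement = {}
    for i, _ in enumerate(blocks):
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement

file_data = b"x" * 1000                 # a 1000-byte "file"
blocks = split_into_blocks(file_data, 256)
nodes = ["node1", "node2", "node3", "node4"]
print(len(blocks))                       # 4 blocks (256 + 256 + 256 + 232 bytes)
print(place_replicas(blocks, nodes)[0])  # ['node1', 'node2', 'node3']
```

Losing one node therefore costs no data: every block it held still has two other replicas in the cluster.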

Continue reading

Introduction of Hadoop

What is Big Data? Data that is beyond our storage capacity and beyond our processing capability is Big Data. For example, data generated by sensors, CCTV, social networks like Facebook and LinkedIn, online shopping, and many others… Currently, of the total amount of data that we have today, 90% was generated…

Continue reading

Apache-Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware…

Continue reading