S. Pandey, M. Supriya, Abhilash Shrivastava
Jun 2, 2017
2017 International Conference on Computational Intelligence in Data Science (ICCIDS)
MapReduce is one of the most widely used programming models among developers and researchers working with “big data”. It typically runs on the Hadoop distributed framework and processes large data sets efficiently. It uses two functions, map and reduce, to process chunks of data. The input is read from the Hadoop Distributed File System (HDFS) and divided into a number of small chunks, which the map function processes in parallel. The shuffle phase sorts the intermediate results and sends the key-value pairs to the reduce phase. Because all pairs with the same key are sent by the shuffler to the same reducer, a high volume of network traffic results, which in turn imposes a severe constraint on the performance of data processing applications. This paper proposes an aggregation algorithm to reduce this traffic in MapReduce.
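The abstract does not describe the proposed algorithm itself, so the following is only a minimal sketch of the general idea of aggregating intermediate pairs before the shuffle, in the spirit of a MapReduce combiner. It is plain Python with no Hadoop involved, and every function name (`map_words`, `aggregate_locally`, `shuffle`, `reduce_counts`, `run`) is hypothetical and chosen for illustration. The word-count task and the sample chunks are likewise assumptions.

```python
from collections import Counter, defaultdict


def map_words(chunk):
    """Map step: emit one (word, 1) pair per word in a chunk of text."""
    return [(word, 1) for word in chunk.split()]


def aggregate_locally(pairs):
    """Sum the values of pairs that share a key before they leave the mapper."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def shuffle(mapper_outputs):
    """Group values by key so each key ends up at a single reducer."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run(chunks, aggregate):
    """Run the toy pipeline and count the pairs handed to the shuffle."""
    outputs = [map_words(chunk) for chunk in chunks]
    if aggregate:
        outputs = [aggregate_locally(pairs) for pairs in outputs]
    pairs_sent = sum(len(pairs) for pairs in outputs)
    return reduce_counts(shuffle(outputs)), pairs_sent


# Two illustrative input chunks (assumed data, not from the paper).
chunks = ["a b a a", "b b a c"]
plain_result, plain_sent = run(chunks, aggregate=False)
agg_result, agg_sent = run(chunks, aggregate=True)
print(plain_sent, agg_sent)
```

In this toy run both variants produce the same word counts, but the aggregated variant hands fewer pairs to the shuffle, which is the kind of traffic reduction the abstract refers to. Local summing is valid here because addition is associative and commutative; an aggregation step for other reduce functions would need the same property.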