Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.3906/elk-2001-106
Abstract
As the data-driven paradigm for intelligent systems design gains prominence, performance requirements have become very stringent, leading to numerous fine-tuned versions of Hadoop and its MapReduce programming model. However, very few researchers have investigated the effect of intelligent reducer placement on Hadoop's performance. This paper examines the much-ignored reducer placement phase as a means of improving Hadoop's performance and proposes spawning the reduce phase of Hadoop tasks asynchronously across the nodes of a Hadoop cluster. The main contributions of this paper are (i) tracking when the map phase of each task completes, (ii) counting the number of completed maps, and (iii) assigning reducers to Hadoop nodes based on these map counts so that run-time data copying is minimized. To this end, this paper presents a novel counter based reducer placement (CBRP) algorithm that relies on counter values maintained by the JobTracker at the rack and node levels. Experiments demonstrate the merit of the proposed reducer placement, with average improvements of between 5% and 17% across different benchmarks under both late shuffle and early shuffle.
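The idea of placing reducers according to rack-level and node-level map counts can be illustrated with a minimal sketch. This is not the paper's CBRP algorithm, whose details are not given in the abstract. The class `MapCompletionCounters`, its methods, and the ranking rule (prefer the rack with the most completed maps, then the node with the most completed maps within that rack) are all assumptions made for illustration, and nothing here uses the real Hadoop JobTracker API.

```python
from collections import Counter


class MapCompletionCounters:
    """Hypothetical stand-in for rack- and node-level map counters."""

    def __init__(self):
        self.rack_counts = Counter()  # completed maps per rack
        self.node_counts = Counter()  # completed maps per (rack, node)

    def record_map_completed(self, rack, node):
        """Update both counters when a map task finishes on `node`."""
        self.rack_counts[rack] += 1
        self.node_counts[(rack, node)] += 1

    def place_reducers(self, num_reducers):
        """Return one (rack, node) per reducer.

        Assumed heuristic: nodes holding the most map output, on the
        racks holding the most map output, are tried first, so less
        intermediate data has to be copied at run time. Reducers wrap
        around the ranked list if they outnumber the candidate nodes.
        """
        ranked = sorted(
            self.node_counts,
            key=lambda rn: (-self.rack_counts[rn[0]], -self.node_counts[rn], rn),
        )
        if not ranked:
            return []
        return [ranked[i % len(ranked)] for i in range(num_reducers)]


counters = MapCompletionCounters()
for rack, node in [("r1", "n1"), ("r1", "n1"), ("r1", "n1"),
                   ("r1", "n2"), ("r2", "n3"), ("r2", "n3")]:
    counters.record_map_completed(rack, node)

placement = counters.place_reducers(2)
```

In this toy run, rack `r1` has four completed maps against two on `r2`, so both reducers land on `r1`, with node `n1` (three maps) ranked ahead of `n2` (one map).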
Keywords
MapReduce, HDFS, Hadoop rack awareness, reducer placement
First Page
437
Last Page
453
Recommended Citation
HUSSAIN, MIR WAJAHAT; REDDY, K HEMANT; and ROY, DIPTENDU SINHA (2021) "A counter based approach for reducer placement with augmented Hadoop rackawareness," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 29: No. 1, Article 28.
https://doi.org/10.3906/elk-2001-106
Available at:
https://journals.tubitak.gov.tr/elektrik/vol29/iss1/28
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons