Improving Energy Efficiency of Hadoop Clusters using Approximate Computing

Title: Improving Energy Efficiency of Hadoop Clusters using Approximate Computing
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Taneja, Shubbhi; Zhou, Yi; Chavan, Ajit; Qin, Xiao
Conference Name: 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC), and IEEE Intl Conference on Intelligent Data and Security (IDS)
Keywords: Apache Hadoop, approximate computing, data handling, energy conservation, Energy-efficiency, energy-efficient solutions, Hadoop, Hadoop clusters, MapReduce, Metrics, multicore computing platforms, multicore computing security, multiprocessing systems, parallel processing, pattern clustering, Pi, Pi MapReduce application, power aware computing, pubcrawl, Resiliency, Scalability, tHadoop2, thermal-aware approximate computing, thermal-aware data placement, Thermal-efficiency
Abstract: There is an ongoing search for energy-efficient solutions in multi-core computing platforms. Approximate computing is one such solution: it leverages the forgiving nature of applications to improve energy efficiency at different layers of the computing platform, ranging from applications to hardware. We are interested in understanding the benefits of approximate computing in the realm of Apache Hadoop and its applications. Mechanisms for introducing approximation in programming models include sampling input data, selectively skipping computations, relaxing synchronization, and user-defined quality levels. We believe that it is straightforward to apply these mechanisms to conserve energy in Hadoop clusters as well. The emerging trend of approximate computing motivates us to systematically investigate the thermal profiles of approximate computing strategies in this research. In particular, we design a thermal-aware approximate computing framework called tHadoop2, which is an extension of tHadoop proposed by Chavan et al. We investigate the thermal behavior of a MapReduce application called Pi running on Hadoop clusters by varying two input parameters: the number of maps and the number of sampling points per map. Our profiling results show that Pi exhibits inherent resilience in terms of the number of precision digits present in its value.
DOI: 10.1109/BigDataSecurity-HPSC-IDS.2019.00048
Citation Key: taneja_improving_2019
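
The abstract describes varying two inputs of the Pi application, the number of maps and the number of sampling points per map, and observing how many digits of precision survive. The following is a minimal single-process sketch of that trade-off, not the paper's tHadoop2 framework and not the Hadoop example job itself. It assumes plain pseudo-random Monte Carlo sampling, and the names `map_task`, `estimate_pi`, and `matching_digits` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def map_task(samples, seed):
    """One hypothetical 'map': count random points that fall inside the unit quarter circle."""
    rng = random.Random(seed)  # per-map seed keeps the sketch deterministic
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(num_maps, samples_per_map, base_seed=0):
    """'Reduce' step: sum the per-map counts and scale them to an estimate of pi."""
    inside = sum(map_task(samples_per_map, base_seed + i) for i in range(num_maps))
    return 4.0 * inside / (num_maps * samples_per_map)


def matching_digits(estimate):
    """Count leading decimal digits of the estimate that agree with math.pi."""
    a, b = f"{estimate:.15f}", f"{math.pi:.15f}"
    if a[:2] != b[:2]:  # integer part differs, so no decimal digit counts
        return 0
    count = 0
    for ca, cb in zip(a[2:], b[2:]):
        if ca != cb:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Fewer maps or fewer samples means less work, at the cost of precision.
    for num_maps, samples in [(2, 100), (4, 1_000), (8, 10_000)]:
        value = estimate_pi(num_maps, samples)
        print(num_maps, samples, value, matching_digits(value))
```

Running the script prints one line per configuration, which gives a rough feel for how the total sample count (maps times samples per map) relates to the digits that match. Energy and temperature, the quantities the paper profiles, are outside the scope of this sketch.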