AEGEUS++: an energy-aware online partition skew mitigation algorithm for MapReduce in cloud

Vimalkumar Kumaresan¹, R. Baskaran¹, P. Dhavachelvan²
¹ College of Engineering Guindy, Anna University, Chennai, India
² Department of Computer Science, Pondicherry University, Pondicherry, India

Abstract

Keywords

