Multi-prediction based scheduling for hybrid workloads in the cloud data center
Abstract
Traditional data centers that host production applications over-provision resources, and much of that capacity is wasted. Cloud computing can reclaim it by consolidating tasks with lower QoS and SLA requirements alongside the production applications. However, the load from these lower-priority tasks fluctuates dramatically and may degrade the performance of production applications. Frequent task eviction, killing, and rescheduling also waste CPU cycles and add overhead. This paper schedules hybrid workloads in the cloud data center so as to reduce task failures and increase resource utilization. A multi-prediction model, which combines an ARMA model with a feedback-based online AR model, predicts current and future resource availability. The decision to accept or reject a new task is based on the predicted available resources and the task's properties. Evaluations show that the scheduler reduces host overload and failed tasks by nearly 70% and increases effective resource utilization by more than 65%. The accompanying degradation in task delay remains acceptable.
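The prediction-then-admission idea described above can be illustrated with a minimal sketch. It is not the paper's scheduler: it assumes a single resource expressed as a utilization fraction, substitutes a plain AR(1) fit for the ARMA and online AR models, and treats the feedback as a simple additive correction. The names fit_ar1, predict_next, admit, safety_margin, and error_feedback are all hypothetical.

```python
from typing import List, Tuple


def fit_ar1(history: List[float]) -> Tuple[float, float]:
    """Estimate the mean and lag-1 coefficient of an AR(1) model by least squares."""
    mean = sum(history) / len(history)
    centered = [x - mean for x in history]
    num = sum(a * b for a, b in zip(centered[:-1], centered[1:]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den > 0 else 0.0  # flat history: fall back to the mean
    return mean, phi


def predict_next(history: List[float], error_feedback: float = 0.0) -> float:
    """One-step-ahead utilization forecast, clamped to [0, 1].

    error_feedback is an assumed additive correction (for example, a smoothed
    value of recent forecast errors); the paper's feedback rule may differ.
    """
    mean, phi = fit_ar1(history)
    forecast = mean + phi * (history[-1] - mean) + error_feedback
    return min(1.0, max(0.0, forecast))


def admit(history: List[float], task_demand: float, capacity: float = 1.0,
          safety_margin: float = 0.1, error_feedback: float = 0.0) -> bool:
    """Accept the task only if predicted load plus its demand fits under capacity.

    safety_margin is an assumed headroom kept free for production applications.
    """
    predicted = predict_next(history, error_feedback)
    return predicted + task_demand <= capacity * (1 - safety_margin)


if __name__ == "__main__":
    usage = [0.5, 0.5, 0.5, 0.5]  # hypothetical host utilization samples
    print(admit(usage, task_demand=0.3))
    print(admit(usage, task_demand=0.5))
```

In this sketch a task is rejected up front when the forecast leaves too little headroom, which is one way to avoid the eviction and rescheduling overhead the abstract mentions.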