Data-Efficient Performance Modeling for Configurable Big Data Frameworks by Reducing Information Overlap Between Training Examples

Big Data Research - Volume 30 - Page 100358 - 2022
Zhiqiang Liu1, Xuanhua Shi1, Hai Jin1
1Services Computing Technology and System Lab, Cluster and Grid Computing Lab, National Engineering Research Center for Big Data Technology and System, Huazhong University of Science and Technology, Wuhan 430074, China
