Towards Resilient Method: An exhaustive survey of fault tolerance methods in the cloud computing environment

Computer Science Review - Volume 40 - Page 100398 - 2021
Muhammad Asim Shahid1,2, Noman Islam1,3, Muhammad Mansoor Alam1,4, M.S. Mazliham1, Shahrulniza Musa1
1Universiti Kuala Lumpur - Malaysian Institute of Information Technology, Malaysia
2Sir Syed University of Engineering and Technology, Karachi, Pakistan
3Iqra University, Pakistan
4Institute of Business Management, Karachi, Pakistan
