Tuning lock-based multicore program based on sliding windows to tolerate data race

Springer Science and Business Media LLC - Volume 75 - Pages 7872-7894 - 2019
Suxia Zhu (1), Zhigang Chen (2), Guanglu Sun (1)
(1) School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
(2) School of Mechatronics Engineering, Harbin Institute of Technology, Harbin, China

Abstract

Because in-house debugging and testing can hardly uncover all potential data races in multicore programs, it is necessary to tolerate potential data races during production runs to ensure correct execution. However, existing toleration methods handle only certain kinds of data races. This paper proposes a new data-race toleration approach that detects and adjusts data races whether or not the racing accesses are protected by a critical section, improving the correctness of multicore programs. It uses sliding windows to hold the memory instructions in a critical section, or the most recent memory instructions that lack protection, and detects the potential data races that are more likely to cause errors. Then, by delaying the critical reversion points, it adjusts the data races to reduce the probability of software failure. Implementing the approach requires no change to a current multicore processor's original cache coherence protocol and adds very little hardware. Simulation results show that it incurs low hardware overhead, low bandwidth overhead, and negligible slowdown.
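The sliding-window idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `Access`, `SlidingWindow`, and `should_delay`, the window size, and the "delay while conflicting" policy are all hypothetical and chosen for illustration. The model keeps a thread's most recent memory accesses in a fixed-size window and flags a remote access as a candidate for delay when it touches an address in the window and at least one of the two accesses is a write.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    thread: int      # id of the thread issuing the access
    address: int     # memory address touched
    is_write: bool   # True for a store, False for a load


class SlidingWindow:
    """Holds the most recent accesses of one thread (hypothetical model)."""

    def __init__(self, owner: int, size: int = 4):
        self.owner = owner
        # A bounded deque evicts the oldest entry once the window is full.
        self.entries = deque(maxlen=size)

    def record(self, access: Access) -> None:
        """Slide the window forward by one local access."""
        assert access.thread == self.owner
        self.entries.append(access)

    def conflicts_with(self, remote: Access) -> bool:
        """A remote access conflicts if it hits an address still in the
        window and at least one of the two accesses is a write."""
        if remote.thread == self.owner:
            return False
        return any(
            local.address == remote.address
            and (local.is_write or remote.is_write)
            for local in self.entries
        )


def should_delay(window: SlidingWindow, remote: Access) -> bool:
    """Toy toleration policy (an assumption): delay a conflicting remote access."""
    return window.conflicts_with(remote)
```

In this model, a remote access to an address that has already slid out of the window is no longer treated as a likely race, which mirrors the abstract's focus on recent accesses that are more likely to cause errors.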
