Combining Loop Transformations Considering Caches and Scheduling
Abstract
The performance of modern microprocessors is greatly affected by cache behavior, instruction scheduling, register allocation, and loop overhead. High-level loop transformations such as fission, fusion, tiling, interchange, and outer loop unrolling (e.g., unroll-and-jam) are well known to improve all of these aspects of performance. Difficulties arise because these machine characteristics and these optimizations are highly interdependent. Interchanging two loops might, for example, improve cache behavior but make it impossible to allocate registers in the inner loop. Similarly, unrolling or interchanging a loop might individually hurt performance, yet applying both together might help. Little work has been published on how to combine these transformations into an efficient and effective compiler algorithm. In this paper, we present a model that estimates total machine cycle time, taking into account cache misses, software pipelining, register pressure, and loop overhead. We then develop an algorithm that intelligently searches through the various possible transformations, using our machine model to select the set of transformations leading to the best overall performance. We have implemented this algorithm as part of the MIPSPro commercial compiler system. We give experimental results showing that our approach is both effective and efficient in optimizing numerical programs.
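To make the transformation interplay concrete, the following C sketch (our illustration, not code from the paper or the MIPSPro compiler) shows unroll-and-jam applied to a small matrix-multiply kernel: the outer i-loop is unrolled by two and the copies are fused ("jammed") into the inner loop, so each load of b[k][j] feeds two register-resident accumulators, trading extra register pressure for fewer memory operations per floating-point operation. This is exactly the kind of trade-off the abstract says a combined model must weigh.

```c
/* Minimal sketch of unroll-and-jam on matrix multiply (illustrative only). */
#include <stdio.h>

#define N 4  /* small, even size so the unrolled loop needs no cleanup code */

static double a[N][N], b[N][N], c[N][N];

/* Baseline triple loop. */
static void matmul_base(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Same computation after unroll-and-jam of the i-loop by 2. */
static void matmul_unroll_jam(void) {
    for (int i = 0; i < N; i += 2)
        for (int j = 0; j < N; j++) {
            double c0 = c[i][j], c1 = c[i + 1][j];  /* accumulators kept in registers */
            for (int k = 0; k < N; k++) {
                double bkj = b[k][j];               /* loaded once, used twice */
                c0 += a[i][k] * bkj;
                c1 += a[i + 1][k] * bkj;
            }
            c[i][j] = c0;
            c[i + 1][j] = c1;
        }
}

int main(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = i + j;
            b[i][j] = i - j;
        }
    matmul_base();
    double reference = c[1][2];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = 0.0;  /* reset before rerunning with the transformed kernel */
    matmul_unroll_jam();
    printf("baseline c[1][2] = %g, unroll-and-jam c[1][2] = %g\n", reference, c[1][2]);
    return 0;
}
```

Both kernels compute the same result; the transformed version simply exposes more register reuse to the scheduler, which is beneficial only while enough registers remain available, illustrating why such transformations cannot be chosen in isolation.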