Elastic computing: A portable optimization framework for hybrid computers

Parallel Computing - Volume 38 - Pages 438-464 - 2012

John R. Wernsing¹, Greg Stitt¹

¹Department of Electrical & Computer Engineering, University of Florida, Gainesville, FL, United States

