Nội dung được dịch bởi AI, chỉ mang tính chất tham khảo

Khám Phá Khả Năng Song Song Của GPU: Nghiên Cứu Trường Hợp Thuật Toán Berlekamp-Massey

Springer Science and Business Media LLC - Tập 23 - Trang 1007-1024 - 2019

Hanan Ali¹, Ghada M. Fathy¹, Zeinab Fayez¹, Walaa Sheta¹

¹Informatic Research Institute, City for Scientific Research, Alexandria, Egypt

Tóm tắt

Các kiến trúc bộ xử lý đồ họa (GPU) đang ngày càng trở nên có thể lập trình, mang lại tiềm năng gia tăng tốc độ đáng kể cho nhiều ứng dụng mục đích tổng quát so với các bộ xử lý mục đích tổng quát (CPU) hiện tại. Tuy nhiên, có một số kỹ thuật tối ưu hóa được sử dụng để tối đa hóa lợi ích của tài nguyên GPU. Nghiên cứu này khai thác các kỹ thuật tối ưu hóa cho kiến trúc GPU hỗ trợ CUDA nhằm đạt được hiệu suất tốt nhất cho Thuật Toán Berlekamp-Massey (BMA) như một ví dụ nghiên cứu. Thuật Toán Berlekamp-Massey (BMA) là một trong những giải pháp tốt nhất để tìm kiếm bộ dịch chuyển hồi tiếp tuyến tính ngắn nhất, điều này rất quan trọng cho một số ứng dụng như xử lý số và mật mã. Kết quả thử nghiệm cho thấy cài đặt BMA tối ưu hóa nhanh hơn gần 160 lần so với cài đặt tuần tự không bit trên CPU, nhanh hơn 7 lần so với cài đặt tuần tự bit và nhanh hơn 4 lần so với cài đặt song song bit ban đầu.

Từ khóa

#Bộ xử lý đồ họa #GPU #CUDA #Thuật toán Berlekamp-Massey #Tối ưu hóa #Hiệu suất tính toán.

Tài liệu tham khảo

Ali, H., Ouyang, M., Soliman, A., Sheta, W.: Parallelizing the Berlekamp-Massey algorithm. Int. J. Comput. Sci. Inf. Secur. 13(11), 42 (2015) Berlekamp, E.: Algebraic Coding Theory. World Scientific Publishing, Singapore (2015) Bradley, T.: Assess, parallelize, optimize, deploy. https://devblogs.nvidia.com/assess-parallelize-optimize-deploy/ (2012) Chen, N., Yan, Z.: Complexity analysis of Reed-Solomon decoding over GF (2 m) without using syndromes. EURASIP J. Wirel. Commun. Netw. 2008(1), 843634 (2008) Didier, F.: Efficient erasure decoding of Reed-Solomon codes. http://arXiv.org/arXiv:0901.1886 (2009) Elsaid, H.A.E.A.: Design and Implementation of Reed-Solomon Decoder Using Decomposed Inversion less Berlekamp-Massey Algorithm. Faculty of Engineering, Cairo University, Giza (2010) Greenberg, S., Feldblum, N., Melamed, G.: Implementation of the Berlekamp-Massey algorithm using a DSP. In: Proceedings of the 2004 11th IEEE International Conference on Electronics, Circuits and Systems. ICECS, pp. 358–361. IEEE (2004) Harris, M.: Optimizing CUDA. SC07: High performance computing with CUDA (2007) Henkel, W.: Another description of the Berlekamp-Massey algorithm. IEE Proc. I-Commun. Speech Vision 136(3), 197–200 (1989) Katz, J., Shacham, H.: Advances in cryptology–CRYPTO 2017. In: Proceedings of the 37th Annual International Cryptology Conference, Santa Barbara, CA, USA, vol. 10401, 20–24 Aug 2017, Springer (2017) Kotter, R.: A fast parallel implementation of a Berlekamp-Massey algorithm for algebraic-geometric codes. IEEE Trans. Inf. Theory 44(4), 1353–1368 (1998) Mark, H.: Optimizing parallel reduction in CUDA. NVIDIA CUDA SDK 2, 15 (2008) Massey, J.: Shift-register synthesis and bch decoding. IEEE Trans. Inf. Theory 15(1), 122–127 (1969) Mohebbi, H.: Parallel SIMD CPU and GPU implementations of Berlekamp–Massey algorithm and its error correction application. Int. J. Parallel Program. 47(1), 137–160 (2018) Nvidia, C.: Programming guide (2010) Nvidia, W.: Whitepaper NVIDIAS next generation CUDA compute architecture. ReVision, pp. 1–22 (2009) Sarwate, D.V., Shanbhag, N.R.: High-speed architectures for Reed-Solomon decoders. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 9(5), 641–655 (2001) Schmidt, G., Sidorenko, V.R., Bossert, M.: Syndrome decoding of Reed–Solomon codes beyond half the minimum distance based on shift-register synthesis. IEEE Trans. Inf. Theory 56(10), 5245–5252 (2010) Spinner, J., Freudenberger, J.: A decoder with soft decoding capability for high-rate generalized concatenated codes with applications in non-volatile flash memories. In: Proceedings of the 30th Symposium on Integrated Circuits and Systems Design (SBCCI), pp. 185–190. IEEE (2017) Tilavat, V., Shukla, Y.: Simplification of procedure for decoding reed-solomon codes using various algorithms: an introductory survey. Int. J. Eng. Dev. Res. 2(1) (2014) Xiao, S., Feng, W.C.: Inter-block GPU communication via fast barrier synchronization. In: Proceedings of the IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–12. IEEE (2010)

Scholar Hub - Công cụ hỗ trợ trích dẫn và phân tích khoa học Việt Nam

Về chúng tôi

Scholar Hub là công cụ hỗ trợ trích dẫn và phân tích các bài báo, công bố khoa học Việt Nam. Công cụ trợ giúp người nghiên cứu, tạp chí, đơn vị nghiên cứu tra cứu, phân tích và thống kê dữ liệu nghiên cứu khoa học tại Việt Nam và quốc tế.
ScholarHub KHÔNG đăng thông tin tổng hợp, KHÔNG đăng lại nội dung từ các trang báo chí Việt Nam hoặc trang thông tin điện tử khác tại Việt Nam.

Thông tin, cập nhật

Đăng ký Tạp chí tham gia vào Scholar Hub

Phản hồi ý kiến về Scholar Hub

Bài viết, nội dung cập nhật

Chủ đề khoa học

Website liên kết

Hệ thống CSDL Khoa học & Công nghệ

Phần mềm kiểm tra trùng lặp Kiểm Tra Tài Liệu

Phần mềm xuất bản tạp chí điện tử VOJS

Nền tảng trắc nghiệm và đề thi đa lĩnh vực LetQA