Optimizing DSP and media benchmarks for Pentium 4: hardware and software issues

D. Eliemble¹
¹Dept. of Electr. & Comput. Eng., Univ. of Toronto, Ont., Canada

Abstract

By examining the speed-up obtained from Pentium 4 SIMD instructions on DSP kernels (FFT) and two multimedia programs (the MPEG-2 codec and a matching pursuit video codec), we discuss the hardware and software issues that limit performance. On the hardware side, the cost of unaligned memory accesses and the lack of instructions that sum the individual elements of an XMM register in the present implementation of the Intel SIMD instructions limit the efficiency of dot products. On the software side, C programmers' habits often prevent compiler vectorization or complicate the in-lining of assembly code in many DSP and multimedia applications.
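To make the dot-product limitation concrete, the following is a minimal sketch, not code from the paper. It assumes single-precision data and the standard SSE intrinsics from `<xmmintrin.h>`; the function name `dot_sse` is hypothetical. The loop uses unaligned loads (`_mm_loadu_ps`), and because there is no single instruction that sums the four lanes of an XMM register, the final reduction has to be built from a move, a shuffle and two additions. A scalar fallback is used when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Use SSE intrinsics when the compiler targets them; otherwise fall back
   to plain scalar code so the example still builds. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: dot product of two float arrays of length n.
   Neither pointer needs to be 16-byte aligned. */
float dot_sse(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);

    size_t i = 0;
    float sum = 0.0f;

#if HAVE_SSE
    /* Four partial sums, one per 32-bit lane of the XMM register. */
    __m128 acc = _mm_setzero_ps();

    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads: valid for any address, but typically slower
           than aligned loads on this class of processor. */
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }

    /* Horizontal sum emulated with data movement plus adds:
       acc = [s0, s1, s2, s3]                                   */
    __m128 hi = _mm_movehl_ps(acc, acc);          /* [s2, s3, s2, s3]      */
    __m128 s2 = _mm_add_ps(acc, hi);              /* [s0+s2, s1+s3, .., ..] */
    __m128 s1 = _mm_shuffle_ps(s2, s2, _MM_SHUFFLE(1, 1, 1, 1));
    sum = _mm_cvtss_f32(_mm_add_ss(s2, s1));      /* s0+s2+s1+s3           */
#endif

    /* Scalar tail (or the whole array when SSE is unavailable). */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot_sse(x, y, n)` on short vectors shows why the reduction matters: the four-instruction horizontal sum is a fixed cost paid once per dot product, so it weighs most when many small dot products are computed, as in block-based video kernels.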

Keywords

Digital signal processing, Hardware, Kernel, Matching pursuit algorithms, Video codecs, Software performance, Costs, Registers, Program processors, Assembly
