On augmenting trace cache for high-bandwidth value prediction

IEEE Transactions on Computers - Volume 51, Issue 9 - Pages 1074-1088 - 2002
Sang-Jeong Lee (1), Pen-Chung Yew (2)
(1) Division of Information Technology Engineering, Soonchunhyang University, ChungNam, South Korea
(2) Department of Computer Science and Engineering, University of Minnesota, MN, USA

Abstract

Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction and speculatively executing its data-dependent instructions based on the predicted outcome. As the instruction fetch rate and issue rate of processors increase, the potential data dependences among instructions issued in the same cycle also increase. Value prediction and speculative execution therefore become critical to keeping the issue rate high. Unfortunately, most proposed value prediction schemes have focused only on prediction accuracy and have yet to consider the bandwidth required to access the value prediction tables. In this paper, we focus on the bandwidth issues of value prediction. We propose augmenting the trace cache (which was proposed to provide the required fetch bandwidth for wide-issue ILP processors) with a copy of the predicted values, and moving the generation of those predicted values (which requires accessing the value prediction tables) from the instruction fetch stage to a later stage, e.g., the writeback stage. Such a change allows "selective value prediction," i.e., only those instructions that require value prediction access the value prediction tables. This can significantly reduce the bandwidth requirement of the value prediction tables. We also use a dynamic classification scheme to steer predictor updates to behavior-specific tables (such as last-value, stride, two-level, etc.). A relatively even split among such table accesses further moderates the bandwidth requirement of those tables.
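
The mechanism described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the class `WritebackPredictor`, the function `classify`, the three-result history window, and the two tables shown (last-value and stride) are all hypothetical simplifications. It shows the two ideas the abstract names: predicted values are generated at writeback and stored alongside the trace, so fetch reads them without touching the prediction tables, and a classifier steers each update to a single behavior-specific table.

```python
from collections import defaultdict, deque


def classify(history):
    """Hypothetical classifier: choose a table from the last three results."""
    if len(history) < 3:
        return "none"
    a, b, c = history
    if a == b == c:
        return "last"      # value repeats: last-value table
    if c - b == b - a:
        return "stride"    # constant difference: stride table
    return "none"          # not predictable by this toy model: skip the tables


class WritebackPredictor:
    """Toy model (an assumption, not the paper's hardware organization)."""

    def __init__(self):
        self.history = defaultdict(lambda: deque(maxlen=3))
        self.tables = {"last": {}, "stride": {}}
        self.accesses = {"last": 0, "stride": 0}  # per-table access counts
        self.trace_cache_values = {}              # pc -> value stored with the trace

    def writeback(self, pc, result):
        """Generate the next predicted value at writeback, touching at most one table."""
        hist = self.history[pc]
        hist.append(result)
        kind = classify(hist)
        if kind == "none":
            self.trace_cache_values.pop(pc, None)
            return None
        self.accesses[kind] += 1
        if kind == "last":
            self.tables["last"][pc] = result
            predicted = result
        else:
            stride = hist[-1] - hist[-2]
            self.tables["stride"][pc] = (result, stride)
            predicted = result + stride
        self.trace_cache_values[pc] = predicted
        return predicted

    def fetch(self, pc):
        """Fetch reads the stored prediction without accessing any prediction table."""
        return self.trace_cache_values.get(pc)


predictor = WritebackPredictor()
for value in (4, 8, 12):       # stride pattern at a hypothetical pc 0x10
    predictor.writeback(0x10, value)
for value in (7, 7, 7):        # repeating value at a hypothetical pc 0x20
    predictor.writeback(0x20, value)
for value in (1, 5, 2):        # irregular values: never accesses a table
    predictor.writeback(0x30, value)
```

In this sketch the irregular instruction never accesses either table, and the two predictable instructions each access a different one, which is the kind of access split the abstract argues moderates table bandwidth.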

Keywords

#Bandwidth #Hardware #Performance gain #Accuracy #Registers #Clocks #Prediction algorithms #Decoding
