ACM Transactions on Architecture and Code Optimization (TACO)



ACM Transactions on Architecture and Code Optimization (TACO), Volume 8 Issue 3, October 2011

Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
Xi E. Chen, Tor M. Aamodt
Article No.: 10
DOI: 10.1145/2019608.2019609

This article proposes techniques to predict the performance impact of pending cache hits, hardware prefetching, and miss status holding register resources on superscalar microprocessors using hybrid analytical models. The proposed models focus on...

CATCH: A mechanism for dynamically detecting cache-content-duplication in instruction caches
Marios Kleanthous, Yiannakis Sazeides
Article No.: 11
DOI: 10.1145/2019608.2019610

Cache-content-duplication (CCD) occurs when there is a miss for a block in a cache and the entire content of the missed block is already in the cache in a block with a different tag. Caches aware of content-duplication can have lower miss penalty...

Managing SMT resource usage through speculative instruction window weighting
Hans Vandierendonck, André Seznec
Article No.: 12
DOI: 10.1145/2019608.2019611

Simultaneous multithreading processors dynamically share processor resources between multiple threads. In general, shared SMT resources may be managed explicitly, for instance, by dynamically setting queue occupation bounds for each thread as in...

Power gating strategies on GPUs
Po-Han Wang, Chia-Lin Yang, Yen-Ming Chen, Yu-Jung Cheng
Article No.: 13
DOI: 10.1145/2019608.2019612

As technology continues to shrink, reducing leakage is critical to achieving energy efficiency. Previous studies on low-power GPUs (Graphics Processing Units) focused on techniques for dynamic power reduction, such as DVFS (Dynamic Voltage and...

Dynamic access distance driven cache replacement
Min Feng, Chen Tian, Changhui Lin, Rajiv Gupta
Article No.: 14
DOI: 10.1145/2019608.2019613

In this article, we propose a new cache replacement policy that makes the replacement decision based on the reuse information of the cache lines and the requested data. We present the architectural support and evaluate the performance of our...

Evaluating placement policies for managing capacity sharing in CMP architectures with private caches
Ahmad Samih, Yan Solihin, Anil Krishna
Article No.: 15
DOI: 10.1145/2019608.2019614

Chip Multiprocessors (CMP) with distributed L2 caches suffer from a cache fragmentation problem; some caches may be overutilized while others may be underutilized. To avoid such fragmentation, researchers have proposed capacity sharing mechanisms...

Maintaining performance on power gating of microprocessor functional units by using a predictive pre-wakeup strategy
Chang-Ching Yeh, Kuei-Chung Chang, Tien-Fu Chen, Chingwei Yeh
Article No.: 16
DOI: 10.1145/2019608.2019615

Power gating is an effective technique for reducing leakage power in deep submicron CMOS technology. Microarchitectural techniques for power gating of functional units have been developed by detecting suitable idle regions and turning them off to...

DEFCAM: A design and evaluation framework for defect-tolerant cache memories
Hyunjin Lee, Sangyeun Cho, Bruce R. Childers
Article No.: 17
DOI: 10.1145/2019608.2019616

Advances in deep submicron technology call for a careful review of existing cache designs and design practices in terms of yield, area, and performance. This article presents a Design and Evaluation Framework for defect-tolerant Cache Memories...