ACM Transactions on Architecture and Code Optimization (TACO), Volume 1 Issue 2, June 2004

Removing communications in clustered microarchitectures through instruction replication
Alex Aletà, Josep M. Codina, Antonio González, David Kaeli
Pages: 127-151
DOI: 10.1145/1011528.1011529
The need to communicate values between clusters can result in a significant performance loss for clustered microarchitectures. In this work, we describe an optimization technique that removes communications by selectively replicating an appropriate...

A low-power in-order/out-of-order issue queue
Yu Bai, R. Iris Bahar
Pages: 152-179
DOI: 10.1145/1011528.1011530
To better address power concerns, a good design strategy should be flexible enough to dynamically reconfigure available resources according to the application's needs such that extra power is dissipated only when it is really needed. In this work, we...

Implementing branch-predictor decay using quasi-static memory cells
Philo Juang, Kevin Skadron, Margaret Martonosi, Zhigang Hu, Douglas W. Clark, Philip W. Diodato, Stefanos Kaxiras
Pages: 180-219
DOI: 10.1145/1011528.1011531
With semiconductor technology advancing toward deep submicron, leakage energy is of increasing concern, especially for large on-chip array structures such as caches and branch predictors. Recent work has suggested that larger, aggressive branch...

A low-complexity fetch architecture for high-performance superscalar processors
Oliverio J. Santana, Alex Ramirez, Josep L. Larriba-Pey, Mateo Valero
Pages: 220-245
DOI: 10.1145/1011528.1011532
Fetch engine performance is a key topic in superscalar processors, since it limits the instruction-level parallelism that can be exploited by the execution core. In the search of high performance, the fetch engine has evolved toward more efficient...