Removing communications in clustered microarchitectures through instruction replication
Alex Aletà, Josep M. Codina, Antonio González, David Kaeli
The need to communicate values between clusters can result in a significant performance loss for clustered microarchitectures. In this work, we describe an optimization technique that removes communications by selectively replicating an appropriate...
A low-power in-order/out-of-order issue queue
Yu Bai, R. Iris Bahar
To better address power concerns, a good design strategy should be flexible enough to dynamically reconfigure available resources according to the application's needs such that extra power is dissipated only when it is really needed. In this work, we...
Implementing branch-predictor decay using quasi-static memory cells
Philo Juang, Kevin Skadron, Margaret Martonosi, Zhigang Hu, Douglas W. Clark, Philip W. Diodato, Stefanos Kaxiras
With semiconductor technology advancing toward deep submicron, leakage energy is of increasing concern, especially for large on-chip array structures such as caches and branch predictors. Recent work has suggested that larger, aggressive branch...
A low-complexity fetch architecture for high-performance superscalar processors
Oliverio J. Santana, Alex Ramirez, Josep L. Larriba-Pey, Mateo Valero
Fetch engine performance is a key topic in superscalar processors, since it limits the instruction-level parallelism that can be exploited by the execution core. In the search of high performance, the fetch engine has evolved toward more efficient...