Architecture and Code Optimization (TACO)



ACM Transactions on Architecture and Code Optimization (TACO), Volume 12 Issue 1, April 2015

NoCMsg: A Scalable Message-Passing Abstraction for Network-on-Chips
Christopher Zimmer, Frank Mueller
Article No.: 1
DOI: 10.1145/2701426

The number of cores of contemporary processors is constantly increasing and thus continues to deliver ever higher peak performance (following Moore’s transistor law). Yet high core counts present a challenge to hardware and software alike....

Accelerating Divergent Applications on SIMD Architectures Using Neural Networks
Beayna Grigorian, Glenn Reinman
Article No.: 2
DOI: 10.1145/2717311

The purpose of this research is to find a neural-network-based solution to the well-known problem of branch divergence in Single Instruction Multiple Data (SIMD) architectures. Our approach differs from existing techniques that handle branch (or...

Performance-Energy Considerations for Shared Cache Management in a Heterogeneous Multicore Processor
Anup Holey, Vineeth Mekkat, Pen-Chung Yew, Antonia Zhai
Article No.: 3
DOI: 10.1145/2710019

Heterogeneous multicore processors that integrate CPU cores and data-parallel accelerators such as graphic processing unit (GPU) cores onto the same die raise several new issues for sharing various on-chip resources. The shared last-level cache...

Dynamic MIPS Rate Stabilization for Complex Processors
Jinho Suh, Chieh-Ting Huang, Michel Dubois
Article No.: 4
DOI: 10.1145/2714575

Modern microprocessor cores reach their high performance levels with the help of high clock rates, parallel and speculative execution of a large number of instructions, and vast cache hierarchies. Modern cores also have adaptive features to...

MAGIC: Malicious Aging in Circuits/Cores
Naghmeh Karimi, Arun Karthik Kanuparthi, Xueyang Wang, Ozgur Sinanoglu, Ramesh Karri
Article No.: 5
DOI: 10.1145/2724718

The performance of an IC degrades over its lifetime, ultimately resulting in IC failure. In this article, we present a hardware attack (called MAGIC) to maliciously accelerate NBTI aging effects in cores. In this attack, we identify the input...

CERE: LLVM-Based Codelet Extractor and REplayer for Piecewise Benchmarking and Optimization
Pablo De Oliveira Castro, Chadi Akel, Eric Petit, Mihail Popov, William Jalby
Article No.: 6
DOI: 10.1145/2724717

This article presents Codelet Extractor and REplayer (CERE), an open-source framework for code isolation. CERE finds and extracts the hotspots of an application as isolated fragments of code, called codelets. Codelets can be modified,...

HRF-Relaxed: Adapting HRF to the Complexities of Industrial Heterogeneous Memory Models
Benedict R. Gaster, Derek Hower, Lee Howes
Article No.: 7
DOI: 10.1145/2701618

Memory consistency models, or memory models, allow both programmers and program language implementers to reason about concurrent accesses to one or more memory locations. Memory model specifications balance the often conflicting needs for precise...

Generalized Task Parallelism
Kevin Streit, Johannes Doerfert, Clemens Hammacher, Andreas Zeller, Sebastian Hack
Article No.: 8
DOI: 10.1145/2723164

Existing approaches to automatic parallelization produce good results in specific domains. Yet, it is unclear how to integrate their individual strengths to match the demands and opportunities of complex software. This lack of integration has both...