ACM Transactions on Architecture and Code Optimization (TACO), Volume 11 Issue 2, June 2014

Virtual Ways: Low-Cost Coherence for Instruction Set Extensions with Architecturally Visible Storage
Theo Kluter, Samuel Burri, Philip Brisk, Edoardo Charbon, Paolo Ienne
Article No.: 15
DOI: 10.1145/2576877

Instruction set extensions (ISEs) improve the performance and energy consumption of application-specific processors. ISEs can use architecturally visible storage (AVS), localized compiler-controlled memories, to provide higher I/O bandwidth than...

A Portable Optimization Engine for Accelerating Irregular Data-Traversal Applications on SIMD Architectures
Bin Ren, Todd Mytkowicz, Gagan Agrawal
Article No.: 16
DOI: 10.1145/2632215

Fine-grained data parallelism is increasingly common in the form of longer vectors integrated with mainstream processors (SSE, AVX) and various GPU architectures. This article develops support for exploiting such data parallelism for a class of...

VGRIS: Virtualized GPU Resource Isolation and Scheduling in Cloud Gaming
Zhengwei Qi, Jianguo Yao, Chao Zhang, Miao Yu, Zhizhou Yang, Haibing Guan
Article No.: 17
DOI: 10.1145/2632216

To achieve efficient resource management on a graphics processing unit (GPU), there is a demand to develop a framework for scheduling virtualized resources in cloud gaming. In this article, we propose VGRIS, a resource management framework for...

A Retargetable Static Binary Translator for the ARM Architecture
Bor-Yeh Shen, Wei-Chung Hsu, Wuu Yang
Article No.: 18
DOI: 10.1145/2629335

Machines designed with new but incompatible Instruction Set Architecture (ISA) may lack proper applications. Binary translation can address this incompatibility by migrating applications from one legacy ISA to a new one, although binary...

Revisiting LP-NUCA Energy Consumption: Cache Access Policies and Adaptive Block Dropping
Darío Suárez Gracia, Alexandra Ferrerón, Luis Montesano Del Campo, Teresa Monreal Arnal, Víctor Viñals Yúfera
Article No.: 19
DOI: 10.1145/2632217

Cache working-set adaptation is key as embedded systems move to multiprocessor and Simultaneous Multithreaded Architectures (SMT) because interthread pollution harms system performance and battery life. Light-Power NUCA (LP-NUCA) is a working-set...

Deadline-Constrained Clustered Scheduling for VLIW Architectures using Power-Gated Register Files
Zhibin Liang, Wei Zhang, Yung-Cheng Ma
Article No.: 20
DOI: 10.1145/2632218

Designing energy-efficient Digital Signal Processor (DSP) cores has become a key concern in embedded systems development. This paper proposes an energy-proportional computing scheme for Very Long Instruction Word (VLIW) architectures. To make the...

Performance Portability Across Heterogeneous SoCs Using a Generalized Library-Based Approach
Shuangde Fang, Zidong Du, Yuntan Fang, Yuanjie Huang, Yang Chen, Lieven Eeckhout, Olivier Temam, Huawei Li, Yunji Chen, Chengyong Wu
Article No.: 21
DOI: 10.1145/2608253

Because of tight power and energy constraints, industry is progressively shifting toward heterogeneous system-on-chip (SoC) architectures composed of a mix of general-purpose cores along with a number of accelerators. However, such SoC...

Hadoop Extensions for Distributed Computing on Reconfigurable Active SSD Clusters
Abdulrahman Kaitoua, Hazem Hajj, Mazen A. R. Saghir, Hassan Artail, Haitham Akkary, Mariette Awad, Mageda Sharafeddine, Khaleel Mershad
Article No.: 22
DOI: 10.1145/2608199

In this article, we propose new extensions to Hadoop to enable clusters of reconfigurable active solid-state drives (RASSDs) to process streaming data from SSDs using FPGAs. We also develop an analytical model to estimate the performance of RASSD...