ALEA: A Fine-Grained Energy Profiling Tool
Lev Mukhanov, Pavlos Petoumenos, Zheng Wang, Nikos Parasyris, Dimitrios S. Nikolopoulos, Bronis R. de Supinski, Hugh Leather
Article No.: 1
Energy efficiency is becoming increasingly important, yet few developers understand how source code changes affect the energy and power consumption of their programs. To enable them to achieve energy savings, we must associate energy consumption...
Defragmentation of Tasks in Many-Core Architecture
Anuj Pathania, Vanchinathan Venkataramani, Muhammad Shafique, Tulika Mitra, Jörg Henkel
Article No.: 2
Many-cores can execute multiple multithreaded tasks in parallel. A task performs most efficiently when it is executed over a spatially connected and compact subset of cores so that performance loss due to communication overhead imposed by the...
Main Memory in HPC: Do We Need More or Could We Live with Less?
Darko Zivanovic, Milan Pavlovic, Milan Radulovic, Hyunsung Shin, Jongpil Son, Sally A. McKee, Paul M. Carpenter, Petar Radojković, Eduard Ayguadé
Article No.: 3
An important aspect of High-Performance Computing (HPC) system design is the choice of main memory capacity. This choice becomes increasingly important now that 3D-stacked memories are entering the market. Compared with conventional Dual In-line...
WCET-Aware Dynamic I-Cache Locking for a Single Task
Wenguang Zheng, Hui Wu, Qing Yang
Article No.: 4
Caches are widely used in embedded systems to bridge the increasing speed gap between processors and off-chip memory. However, caches make it significantly harder to compute the worst-case execution time (WCET) of a task. To alleviate this...
Pareto Governors for Energy-Optimal Computing
Rathijit Sen, David A. Wood
Article No.: 6
The original definition of energy-proportional computing does not characterize the energy efficiency of recent reconfigurable computers, resulting in nonintuitive “super-proportional” behavior. This article introduces a new definition...
Micro-Sector Cache: Improving Space Utilization in Sectored DRAM Caches
Mainak Chaudhuri, Mukesh Agrawal, Jayesh Gaur, Sreenivas Subramoney
Article No.: 7
Recent research proposals on DRAM caches with conventional allocation units (64 or 128 bytes) as well as large allocation units (512 bytes to 4KB) have explored ways to minimize the space/latency impact of the tag store and maximize the effective...
Energy Transparency for Deeply Embedded Programs
Kyriakos Georgiou, Steve Kerrison, Zbigniew Chamski, Kerstin Eder
Article No.: 8
Energy transparency is a concept that makes a program’s energy consumption visible, from hardware up to software, through the different system layers. Such transparency can enable energy optimizations at each layer and between layers, as...
LD: Low-Overhead GPU Race Detection Without Access Monitoring
Pengcheng Li, Xiaoyu Hu, Dong Chen, Jacob Brock, Hao Luo, Eddy Z. Zhang, Chen Ding
Article No.: 9
Data race detection has become an important problem in GPU programming. Previous designs of CPU race-checking tools are mainly task parallel and incur high overhead on GPUs due to access instrumentation, especially when monitoring many thousands...