Debugging parallel programs is a well-known difficult problem. A promising method to facilitate debugging parallel programs is using hardware support to achieve deterministic replay on a Chip Multi-Processor (CMP). As a Design-For-Debug (DFD)...
TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs
Daniel Lustig, Abhishek Bhattacharjee, Margaret Martonosi
Article No.: 2
Translation Lookaside Buffers (TLBs) are critical to overall system performance. Much past research has addressed uniprocessor TLBs, lowering access times and miss rates. However, as Chip MultiProcessors (CMPs) become ubiquitous, TLB design and...
Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling
Rong Chen, Haibo Chen
Article No.: 3
The prevalence of chip multiprocessors opens opportunities of running data-parallel applications originally in clusters on a single machine with many cores. MapReduce, a simple and elegant programming model to program large-scale clusters, has...
A-DFA: A Time- and Space-Efficient DFA Compression Algorithm for Fast Regular Expression Evaluation
Michela Becchi, Patrick Crowley
Article No.: 4
Modern network intrusion detection systems need to perform regular expression matching at line rate in order to detect the occurrence of critical patterns in packet payloads. While Deterministic Finite Automata (DFAs) allow this operation to be...
The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing
Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi
Article No.: 5
This article introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At...