HPCA Test of Time Award 2021: Eligible Papers

HPCA 1999
HPCA 2000
HPCA 2001
HPCA 2002
HPCA 2003

HPCA 1999

Article Title A study of control independence in superscalar processors
Authors E. Rotenberg; Q. Jacobson; J. Smith
Article Title Impulse: building a smarter memory controller
Authors J. Carter; W. Hsieh; L. Stoller; M. Swanson; Lixin Zhang; E. Brunvand; A. Davis; Chen-Chi Kuo; R. Kuramkote; M. Parker; L. Schaelicke; T. Tateyama
Article Title Sensitivity of parallel applications to large differences in bandwidth and latency in two-layer interconnects
Authors A. Plaat; H. E. Bal; R. F. H. Hofman
Article Title Improving CC-NUMA performance using Instruction-based Prediction
Authors S. Kaxiras; J. R. Goodman
Article Title Distributed modulo scheduling
Authors M. M. Fernandes; J. Llosa; N. Topham
Article Title A performance comparison of homeless and home-based lazy release consistency protocols in software shared memory
Authors A. L. Cox; E. de Lara; C. Hu; W. Zwaenepoel
Article Title The synergy of multithreading and access/execute decoupling
Authors J.-M. Parcerisa; A. Gonzalez
Article Title Supporting fine-grained synchronization on a simultaneous multithreading processor
Authors D. M. Tullsen; J. L. Lo; S. J. Eggers; H. M. Levy
Article Title LAPSES: a recipe for high performance adaptive router design
Authors A. S. Vaidya; A. Sivasubramaniam; C. R. Das
Article Title Using Lamport clocks to reason about relaxed memory models
Authors A. E. Condon; M. D. Hill; M. Plakal; D. J. Sorin
Article Title Memory hierarchy considerations for fast transpose and bit-reversals
Authors K. S. Gatlin; L. Carter
Article Title The impact of link arbitration on switch performance
Authors M. Pirvu; L. Bhuyan; N. Ni
Article Title Limits to the performance of software shared memory: a layered approach
Authors A. Bilas; Dongming Jiang; Yuanyuan Zhou; J. P. Singh
Article Title Dynamically variable line-size cache exploiting high on-chip memory bandwidth of merged DRAM/logic LSIs
Authors K. Inoue; K. Kai; K. Murakami
Article Title Impact of buffer size on the efficiency of deadlock detection
Authors J. M. Martinez; P. Lopez; J. Duato
Article Title Switch cache: a framework for improving the remote memory access latency of CC-NUMA multiprocessors
Authors R. Iyer; L. N. Bhuyan
Article Title Exploiting basic block value locality with block reuse
Authors Jian Huang; D. J. Lilja
Article Title Instruction pre-processing in trace processors
Authors Q. Jacobson; J. E. Smith
Article Title Out-of-order execution may not be cost-effective on processors featuring simultaneous multithreading
Authors S. Hily; A. Seznec
Article Title Comparative evaluation of fine- and coarse-grain approaches for software distributed shared memory
Authors S. Dwarkadas; K. Gharachorloo; L. Kontothanassis; D. J. Scales; M. L. Scott; R. Stets
Article Title WildFire: a scalable path for SMPs
Authors E. Hagersten; M. Koster
Article Title MP-LOCKs: replacing H/W synchronization primitives with message passing
Authors Chen-Chi Kuo; J. Carter; R. Kuramkote
Article Title Dynamically exploiting narrow width operands to improve processor power and performance
Authors D. Brooks; M. Martonosi
Article Title Access order and effective bandwidth for streams on a Direct Rambus memory
Authors S. I. Hong; S. A. McKee; M. H. Salinas; R. H. Klenke; J. H. Aylor; W. A. Wulf
Article Title RAPID-Cache - a reliable and inexpensive write cache for disk I/O systems
Authors Yiming Hu; Qing Yang; T. Nightingale
Article Title Improving the accuracy vs. speed tradeoff for simulating shared-memory multiprocessors with ILP processors
Authors M. Durbhakula; V. S. Pai; S. Adve
Article Title A scalable cache coherent scheme exploiting wormhole routing networks
Authors Yunseok Rhee; Joonwon Lee
Article Title Global context-based value prediction
Authors T. Nakra; R. Gupta; M. L. Soffa
Article Title Parallel Dispatch Queue: a queue-based programming abstraction to parallelize fine-grain communication protocols
Authors B. Falsafi; D. A. Wood
Article Title Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors
Authors Ye Zhang; L. Rauchwerger; J. Torrellas
Article Title Efficient all-to-all broadcast in all-port mesh and torus networks
Authors Yuanyuan Yang; Jianchao Wang
Article Title Instruction recycling on a multiple-path processor
Authors S. Wallace; D. M. Tullsen; B. Calder
Article Title Permutation development data layout (PDDL)
Authors T. J. E. Schwarz; J. Steinberg; W. A. Burkhard
Article Title Design and performance of directory caches for scalable shared memory multiprocessors
Authors M. M. Michael; A. K. Nanda
Article Title MMR: a high-performance MultiMedia Router - architecture and design trade-offs
Authors J. Duato; S. Yalamanchili; M. B. Caminero; D. Love; F. J. Quiles
Article Title Lightweight hardware distributed shared memory supported by generalized combining
Authors K. Tanaka; T. Matsumoto; K. Hiraki
Article Title Communication studies of single-threaded and multithreaded distributed-memory machines
Authors A. Sohn; Yunheung Paek; Jui-Yuan Ku; Y. Kodama; Y. Yamaguchi

HPCA 2000

Article Title A prefetching technique for irregular accesses to linked data structures
Authors M. Karlsson; F. Dahlgren; P. Stenstrom
Article Title Cache-efficient matrix transposition
Authors S. Chatterjee; S. Sen
Article Title On the performance of hand vs. automatically optimized numerical codes
Authors M. Jimenez; J. M. Llaberia; A. Fernandez
Article Title Evaluation of active disks for decision support databases
Authors M. Uysal; A. Acharya; J. Saltz
Article Title Improving the throughput of synchronization by insertion of delays
Authors R. Rajwar; A. Kagi; J. R. Goodman
Article Title The effect of network total order, broadcast, and remote-write capability on network-based shared memory computing
Authors R. Stets; S. Dwarkadas; L. Kontothanassis; U. Rencuzogullari; M. L. Scott
Article Title Coherence communication prediction in shared-memory multiprocessors
Authors S. Kaxiras; C. Young
Article Title Register organization for media processing
Authors S. Rixner; W. J. Dally; B. Khailany; P. Mattson; U. J. Kapasi; J. D. Owens
Article Title Cache memory design for network processors
Authors Tzi-Cker Chiueh; P. Pradhan
Article Title Flit-reservation flow control
Authors Li-Shiuan Peh; W. J. Dally
Article Title Combining static and dynamic branch prediction to reduce destructive aliasing
Authors H. Patil; J. Emer
Article Title Trace cache redundancy: red and blue traces
Authors A. Ramirez; J. Ll. Larriba-Pey; M. Valero
Article Title Design of a parallel vector access unit for SDRAM memory systems
Authors B. K. Mathew; S. A. McKee; J. B. Carter; A. Davis
Article Title High-throughput coherence controllers
Authors A. K. Nanda; A.-T. Nguyen; M. M. Michael; D. J. Joseph
Article Title Reducing code size with run-time decompression
Authors C. Lefurgy; E. Piccininni; T. Mudge
Article Title Investigating the performance of two programming models for clusters of SMP PCs
Authors F. Cappello; O. Richard; D. Etiemble
Article Title PowerMANNA: a parallel architecture based on the PowerPC MPC620
Authors P. M. Behr; S. Pletner; A. C. Sodan
Article Title Architectural issues in Java runtime systems
Authors R. Radhakrishnan; N. Vijaykrishnan; L. K. John; A. Sivasubramaniam
Article Title Performance evaluation of dynamic reconfiguration in high-speed local area networks
Authors R. Casado; A. Bermudez; F. J. Quiles; J. L. Sanchez; J. Duato
Article Title Modified LRU policies for improving second-level cache behavior
Authors W. A. Wong; J.-L. Baer
Article Title Decoupled value prediction on trace processors
Authors Sang-Jeong Lee; Yuan Wang; Pen-Chung Yew
Article Title Performance analysis and visualization of parallel systems using SimOS and Rivet: a case study
Authors R. Bosch; C. Stolte; G. Stoll; M. Rosenblum; P. Hanrahan
Article Title A DSM architecture for a parallel computer Cenju-4
Authors T. Hosomi; Y. Kanoh; M. Nakamura; T. Hirose
Article Title The best distribution for a parallel OpenGL 3D engine with texture caches
Authors A. Vartanian; J.-L. Bechennec; N. Drach-Temam
Article Title Investigating QoS support for traffic mixes with the MediaWorm router
Authors Ki Hwan Yum; A. Vaidya; C. R. Das; A. Sivasubramaniam
Article Title eXtended block cache
Authors S. Jourdan; L. Rappoport; Y. Almog; M. Erez; A. Yoaz; R. Ronen
Article Title Branch transition rate: a new metric for improved branch classification analysis
Authors M. Haungs; P. Sallee; M. Farrens
Article Title Impact of chip-level integration on performance of OLTP workloads
Authors L. A. Barroso; K. Gharachorloo; A. Nowatzyk; B. Verghese
Article Title Memory dependence speculation tradeoffs in centralized, continuous-window superscalar processors
Authors A. Moshovos; G. S. Sohi
Article Title Quantifying the SMT layout overhead - does SMT pull its weight?
Authors J. Burns; J.-L. Gaudiot
Article Title Toward a cost-effective DSM organization that exploits processor-memory integration
Authors J. Torrellas; Liuxi Yang; A.-T. Nguyen
Article Title A technique for high bandwidth and deterministic low latency load/store accesses to multiple cache banks
Authors H. Neefs; H. Vandierendonck; K. De Bosschere
Article Title Software-controlled multithreading using informing memory operations
Authors T. C. Mowry; S. R. Ramkissoon
Article Title Impact of heterogeneity on DSM performance
Authors R. J. O. Figueiredo; J. A. B. Fortes
Article Title Dynamic cluster assignment mechanisms
Authors R. Canal; J. M. Parcerisa; A. Gonzalez

HPCA 2001

Article Title Stack Value File: Custom Microarchitecture for the Stack
Authors H.-H. S. Lee; M. Smelyanskiy; C. J. Newburn; G. S. Tyson
Article Title Register Renaming and Scheduling for Dynamic Execution of Predicated Code
Authors P. H. Wang; H. Wang; R. M. Kling; K. Ramakrishnan; J. P. Shen
Article Title Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
Authors P. Michaud; A. Seznec
Article Title Speculative Data-Driven Multithreading
Authors A. Roth; G. S. Sohi
Article Title Towards Virtually-Addressed Memory Hierarchies
Authors X. Qiu; M. Dubois
Article Title Reevaluating Online Superpage Promotion with Hardware Support
Authors Z. Fang; L. Zhang; J. B. Carter; W. C. Hsieh; S. A. McKee
Article Title Performance of Hardware Compressed Main Memory
Authors B. Abali; H. Franke; X. Shen; D. E. Poff; T. B. Smith
Article Title JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers
Authors A. Moshovos; G. Memik; B. Falsafi; A. Choudhary
Article Title A New Scalable Directory Architecture for Large-scale Multiprocessors
Authors M. E. Acacio; J. Gonzalez; J. M. Garcia; J. Duato
Article Title Self-Tuned Congestion Control for Multiprocessor Networks
Authors M. Thottethodi; A. R. Lebeck; S. S. Mukherjee
Article Title Automatically Mapping Code on an Intelligent Memory Architecture
Authors J. Lee; Y. Solihin; J. Torrellas
Article Title An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches
Authors S.-H. Yang; M. D. Powell; B. Falsafi; K. Roy; T. N. Vijaykumar
Article Title DRAM Energy Management Using Software and Hardware Directed Power Mode Control
Authors V. Delaluz; M. Kandemir; N. Vijaykrishnan; A. Sivasubramaniam; M. J. Irwin
Article Title Dynamic Thermal Management for High-Performance Microprocessors
Authors D. Brooks; M. Martonosi
Article Title Dynamic Prediction of Critical Path Instructions
Authors E. Tune; D. Liang; D. M. Tullsen; B. Calder
Article Title Dynamic Branch Prediction with Perceptrons
Authors D. A. Jimenez; C. Lin
Article Title Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency
Authors B. Goeman; H. Vandierendonck; K. De Bosschere
Article Title DLP + TLP Processors for the Next Generation of Media Workloads
Authors J. Corbal; R. Espasa; M. Valero
Article Title An Architectural Evaluation of Java TPC-W
Authors H. W. Cain; R. Rajwar; M. Marden; M. H. Lipasti
Article Title A Programmable Co-Processor for Profiling
Authors C. B. Zilles; G. S. Sohi
Article Title A Delay Model and Speculative Architecture for Pipelined Routers
Authors L. Peh; W. J. Dally
Article Title Quantifying the Impact of Architectural Scaling on Communication
Authors T. Heath; S. Kaw; R. P. Martin; T. D. Nguyen
Article Title Call Graph Prefetching for Database Applications
Authors M. Annavaram; J. M. Patel; E. S. Davidson
Article Title Branch History Guided Instruction Prefetching
Authors V. Srinivasan; E. S. Davidson; G. S. Tyson; M. J. Charney; T. R. Puzak
Article Title Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
Authors W. Lin; S. K. Reinhardt; D. Burger

HPCA 2002

Article Title Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized Dynamic Thermal Management
Authors Kevin Skadron; Tarek Abdelzaher; Mircea R. Stan
Article Title Energy-Efficient Processor Design Using Multiple Clock Domains with Dynamic Voltage and Frequency Scaling
Authors Greg Semeraro; Grigorios Magklis; Rajeev Balasubramonian; David H. Albonesi; Sandhya Dwarkadas; Michael L. Scott
Article Title A New Memory Monitoring Scheme for Memory-Aware Scheduling and Partitioning
Authors G. Edward Suh; Srinivas Devadas; Larry Rudolph
Article Title Using Complete Machine Simulation for Software Power Estimation: The SoftWatt Approach
Authors Sudhanva Gurumurthi; Anand Sivasubramaniam; Mary Jane Irwin; N. Vijaykrishnan; Mahmut Kandemir; Tao Li; Lizy Kurian John
Article Title Loose Loops Sink Chips
Authors Eric Borch; Srilatha Manne; Joel Emer; Eric Tune
Article Title Exploiting Choice in Resizable Cache Design to Optimize Deep-Submicron Processor Energy-Delay
Authors Se-Hyun Yang; Babak Falsafi; Michael D. Powell; T. N. Vijaykumar
Article Title Power Issues Related to Branch Prediction
Authors Dharmesh Parikh; Kevin Skadron; Yan Zhang; Marco Barcella; Mircea R. Stan
Article Title Improving Value Communication for Thread-Level Speculation
Authors J. Gregory Steffan; Christopher B. Colohan; Antonia Zhai; Todd C. Mowry
Article Title Thread-Spawning Schemes for Speculative Multithreading
Authors Pedro Marcuello; Antonio González
Article Title Eliminating Squashes Through Learning Cross-Thread Violations in Speculative Parallelization for Multiprocessors
Authors Marcelo Cintra; Josep Torrellas
Article Title Microarchitectural Simulation and Control of di/dt-induced Power Supply Voltage Variation
Authors Ed Grochowski; Dave Ayers; Vivek Tiwari
Article Title Let’s Study Whole-Program Cache Behaviour Analytically
Authors Xavier Vera; Jingling Xue
Article Title Bandwidth Adaptive Snooping
Authors Milo M. K. Martin; Daniel J. Sorin; Mark D. Hill; David A. Wood
Article Title Tuning Garbage Collection in an Embedded Java Environment
Authors G. Chen; R. Shetty; M. Kandemir; N. Vijaykrishnan; M. J. Irwin; M. Wolczko
Article Title Evaluation of a Multithreaded Architecture for Cellular Computing
Authors Calin Cascaval; Jose G. Castanos; Luis Ceze; Monty Denneau; Manish Gupta; Derek Lieber; Jose E. Moreira; Karin Strauss; Henry S. Warren Jr
Article Title Memory Latency-Tolerance Approaches for Itanium Processors: Out-of-Order Execution vs. Speculative Precomputation
Authors Perry H. Wang; Hong Wang; Jamison D. Collins; Ed Grochowski; Ralph M. Kling; John P. Shen
Article Title User-Level Communication in Cluster-Based Servers
Authors Enrique V. Carrera; Srinath Rao; Liviu Iftode; Ricardo Bianchini
Article Title The Minimax Cache: An Energy-Efficient Framework for Media Processors
Authors Osman S. Unsal; Israel Koren; C. Mani Krishna; Csaba Andras Moritz
Article Title Fine-grain Priority Scheduling on Multi-channel Memory Systems
Authors Zhichun Zhu; Zhao Zhang; Xiaodong Zhang
Article Title Non-vital Loads
Authors Ryan Rakvic; Bryan Black; Deepak Limaye; John P. Shen
Article Title Modeling Value Speculation
Authors Yiannakis Sazeides
Article Title Quantifying Load Stream Behavior
Authors Suleyman Sair; Timothy Sherwood; Brad Calder
Article Title Using Internal Redundant Representations and Limited Bypass to Support Pipelined Adders and Register Files
Authors Mary D. Brown; Yale N. Patt
Article Title The FAB Predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches
Authors Martin Kampe; Per Stenstrom; Michel Dubois
Article Title Reverse Tracer: A Software Tool for Generating Realistic Performance Test Programs
Authors Larry Brisson; Mariko Sakamoto; Akira Katsuno; Aiichiro Inoue; Yasunori Kimura
Article Title CableS: Thread Control and Memory Management Extensions for Shared Virtual Memory Clusters
Authors Peter Jamieson; Angelos Bilas
Article Title CARS: A New Code Generation Framework for Clustered ILP Processors
Authors K. Kailas; K. Ebcioglu; A. Agrawala

HPCA 2003

Article Title Variability in Architectural Simulations of Multi-Threaded Workloads
Authors Alaa R. Alameldeen; David A. Wood
Article Title Front-End Policies for Improved Issue Efficiency in SMT Processors
Authors Ali El-Moursy; David H. Albonesi
Article Title Control Techniques to Eliminate Voltage Emergencies in High Performance Processors
Authors Russ Joseph; David M. Brooks; Margaret Martonosi
Article Title Dynamic Voltage Scaling with Links for Power Optimization of Interconnection Networks
Authors Li Shang; Li-Shiuan Peh; Niraj K. Jha
Article Title Deterministic Clock Gating for Microprocessor Power Reduction
Authors Hai Li; Swarup Bhunia; Yiran Chen; T. N. Vijaykumar; Kaushik Roy
Article Title Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors
Authors Onur Mutlu; Jared Stark; Chris Wilkerson; Yale N. Patt
Article Title A Statistically Rigorous Approach for Improving Simulation Methodology
Authors Joshua J. Yi; David J. Lilja; Douglas M. Hawkins
Article Title Caches and Hash Trees for Efficient Memory Integrity Verification
Authors Blaise Gassend; G. Edward Suh; Dwaine E. Clarke; Marten van Dijk; Srinivas Devadas
Article Title TCP: Tag Correlating Prefetchers
Authors Zhigang Hu; Margaret Martonosi; Stefanos Kaxiras
Article Title Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures
Authors Michael Bedford Taylor; Walter Lee; Saman P. Amarasinghe; Anant Agarwal
Article Title Mini-Threads: Increasing TLP on Small-Scale SMT Processors
Authors Joshua Redstone; Susan Eggers; Henry Levy
Article Title Reconsidering Complex Branch Predictors
Authors Daniel A. Jiménez
Article Title Incorporating Predicate Information into Branch Predictors
Authors Beth Simon; Brad Calder; Jeanne Ferrante
Article Title Dynamic Data Dependence Tracking and its Application to Branch Prediction
Authors Lei Chen; Steve Dropsho; David H. Albonesi
Article Title Power-Aware Control Speculation through Selective Throttling
Authors Juan L. Aragón; José González; Antonio González
Article Title Microarchitecture and Performance Analysis of a SPARC-V9 Microprocessor for Enterprise Server Systems
Authors Mariko Sakamoto; Akira Katsuno; Aiichiro Inoue; Takeo Asakawa; Haruhiko Ueno; Kuniki Morita; Yasunori Kimura
Article Title Exploring the VLSI Scalability of Stream Processors
Authors Brucek Khailany; William J. Dally; Scott Rixner; Ujval J. Kapasi; John D. Owens; Brian Towles
Article Title Dynamic Optimization of Micro-Operations
Authors Brian Slechta; David Crowe; Brian Fahs; Michael Fertig; Gregory Muthler; Justin Quek; Francesco Spadini; Sanjay J. Patel; Steven S. Lumetta
Article Title Slipstream Execution Mode for CMP-Based Multiprocessors
Authors Khaled Z. Ibrahim; Gregory T. Byrd; Eric Rotenberg
Article Title Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors
Authors María Jesús Garzarán; Milos Prvulovic; José María Llabería; Víctor Viñals; Lawrence Rauchwerger; Josep Torrellas
Article Title Dynamic Data Replication: An Approach to Providing Fault-Tolerant Shared Memory Clusters
Authors Rosalia Christodoulopoulou; Reza Azimi; Angelos Bilas
Article Title Memory System Behavior of Java-Based Middleware
Authors Martin Karlsson; Kevin E. Moore; Erik Hagersten; David A. Wood
Article Title Evaluating the Impact of Communication Architecture on the Performability of Cluster-Based Services
Authors Kiran Nagaraja; Neeraj Krishnan; Ricardo Bianchini; Richard P. Martin; Thu D. Nguyen
Article Title Hierarchical Backoff Locks for Nonuniform Communication Architectures
Authors Zoran Radović; Erik Hagersten
Article Title Performance Enhancement Techniques for InfiniBand™ Architecture
Authors Eun Jung Kim; Ki Hwan Yum; Chita R. Das; Mazin Yousif; José Duato
Article Title Catching Accurate Profiles in Hardware
Authors Satish Narayanasamy; Timothy Sherwood; Suleyman Sair; Brad Calder; George Varghese
Article Title Just Say No: Benefits of Early Cache Miss Determination
Authors Gokhan Memik; Glenn Reinman; William H. Mangione-Smith
Article Title Cost-Sensitive Cache Replacement Algorithms
Authors Jaeheon Jeong; Michel Dubois
Article Title Inter-Cluster Communication Models for Clustered VLIW Processors
Authors Andrei Terechko; Erwan Le Thenaff; Manish Garg; Jos van Eijndhoven; Henk Corporaal
Article Title Active I/O Switches in System Area Networks
Authors Ming Hao; Mark Heinrich