HPCA Test of Time Award 2019: Eligible Papers

HPCA 1995
HPCA 1996
HPCA 1997
HPCA 1998
HPCA 1999
HPCA 2000
HPCA 2001

HPCA 1995

Article Title	Non-consistent dual register files to reduce register pressure
Authors	J. Llosa; M. Valero; E. Ayguade

Article Title	How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors?
Authors	K. I. Farkas; N. P. Jouppi; P. Chow

Article Title	Reducing communication latency with path multiplexing in optically interconnected multiprocessor systems
Authors	Chunming Qiao; R. Melhem

Article Title	Creating a wider bus using caching techniques
Authors	D. Citron; L. Rudolph

Article Title	Massively parallel array processor for logic, fault, and design error simulation
Authors	Y. Hur; S. A. Szygenda; E. Scott Fehr; G. E. Ott; Sungho Kang

Article Title	Toward high communication performance through compiled communications on a circuit switched interconnection network
Authors	F. Cappello; C. Germain

Article Title	Thread prioritization: a thread scheduling mechanism for multiple-context parallel processors
Authors	S. Fiske; W. J. Dally

Article Title	Two techniques for improving performance on bus-based multiprocessors
Authors	C. Anderson; J. -L. Baer

Article Title	Origin-based fault-tolerant routing in the mesh
Authors	R. Libeskind-Hadas; E. Brandt

Article Title	Architectural support for inter-stream communication in a MSIMD system
Authors	V. Garg; D. E. Schimmel

Article Title	The Named-State Register File: implementation and performance
Authors	P. R. Nuth; W. J. Dally

Article Title	Abstracting network characteristics and locality properties of parallel systems
Authors	A. Sivasubramaniam; M. Singla; U. Ramachandran; H. Venkateswaran

Article Title	The effects of STEF in finely parallel multithreaded processors
Authors	Yamin Li; Wanming Chu

Article Title	Simulation study of cached RAID5 designs
Authors	R. Treiber; J. Menon

Article Title	U-cache: a cost-effective solution to synonym problem
Authors	Jesung Kim; Sang Lyul Min; Sanghoon Jeon; Byoungchu Ahn; Deog-Kyoon Jeong; Chong Sang Kim

Article Title	Design and performance evaluation of a multithreaded architecture
Authors	R. Govindarajan; S. S. Nemawarkar; P. LeNir

Article Title	Modeling virtual channel flow control in hypercubes
Authors	Y. M. Boura; C. R. Das

Article Title	Implementation of atomic primitives on distributed shared memory multiprocessors
Authors	M. M. Michael; M. L. Scott

Article Title	An argument for simple COMA
Authors	A. Saulsbury; T. Wilkinson; J. Carter; A. Landin

Article Title	Efficient and balanced adaptive routing in two-dimensional meshes
Authors	J. H. Upadhyay; V. Varavithya; P. Mohapatra

Article Title	DASC cache
Authors	A. Seznec

Article Title	Optimizing instruction cache performance for operating system intensive workloads
Authors	J. Torrellas; Chun Xia; R. Daigle

Article Title	Implementing register interlocks in parallel-pipeline, multiple instruction queue, superscalar processors
Authors	S. Weiss

Article Title	Effectiveness of hardware-based stride and sequential prefetching in shared-memory multiprocessors
Authors	F. Dahlgren; P. Stenstrom

Article Title	A VLSI architecture for computing the tree-to-tree distance
Authors	R. Sastry; N. Ranganathan

Article Title	Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms
Authors	D. K. Panda

Article Title	Access ordering and memory-conscious cache utilization
Authors	S. A. McKee; W. A. Wulf

Article Title	Fine-grain multi-thread processor architecture for massively parallel processing
Authors	T. Kawano; S. Kusakabe; R. -I. Taniguchi; M. Amamiya

Article Title	An initial evaluation of the Convex SPP-1000 for earth and space science applications
Authors	T. L. Sterling; D. F. Savarese; P. R. Merkey; J. P. Gardner

Article Title	Improving performance by cache driven memory management
Authors	K. Westerholz; S. Honal; J. Plankl; C. Hafer

Article Title	Software cache coherence for large scale multiprocessors
Authors	L. I. Kontothanassis; M. L. Scott

Article Title	Software assistance for data caches
Authors	O. Temam; N. Drach

Article Title	Fault-tolerant adaptive routing for two-dimensional meshes
Authors	C. M. Cunningham; D. R. Avresky

Article Title	Memory access reordering in vector processors
Authors	De-Lei Lee

Article Title	A design framework for hybrid-access caches
Authors	K. B. Theobald; H. H. J. Hum; G. R. Gao

Article Title	Program balance and its impact on high performance RISC architectures
Authors	L. K. John; V. Reddy; P. T. Hulina; L. D. Coraor

HPCA 1996

Article Title	Shuffle-Ring: overcoming the increasing degree of hypercube
Authors	Guihai Chen; F. C. M. Lau

Article Title	On the multiplexing degree required to embed permutations in a class of networks with direct interconnects
Authors	Chunming Qiao; Yousong Mei

Article Title	RMB - a reconfigurable multiple bus network
Authors	H. ElGindy; H. Schroder; A. Spray; A. K. Somani; H. Schmeck

Article Title	Bus-based COMA - reducing traffic in shared-bus multiprocessors
Authors	A. Landin; F. Dahlgren

Article Title	Improving the data cache performance of multiprocessor operating systems
Authors	Chun Xia; J. Torrellas

Article Title	Parallel intersecting compressed bit vectors in a high speed query server for processing postal addresses
Authors	Wen-Jann Yang; R. Sridhar; V. Demjanenko

Article Title	Using memory-mapped network interfaces to improve the performance of distributed shared memory
Authors	L. I. Kontothanassis; M. L. Scott

Article Title	Distance-adaptive update protocols for scalable shared-memory multiprocessors
Authors	A. Raynaud; Zheng Zhang; J. Torrellas

Article Title	Performance study of a multithreaded superscalar microprocessor
Authors	M. Gulati; N. Bagherzadeh

Article Title	The impact of shared-cache clustering in small-scale shared-memory multiprocessors
Authors	B. A. Nayfeh; K. Olukotun; J. P. Singh

Article Title	A cache coherency protocol for optically connected parallel computer systems
Authors	J. A. Reisner; T. S. Wailes

Article Title	Protected, user-level DMA for the SHRIMP network interface
Authors	M. A. Blumrich; C. Dubnicki; E. W. Felten; Kai Li

Article Title	Distributed prefetch-buffer/cache design for high performance memory systems
Authors	T. Alexander; G. Kedem

Article Title	A shared-bus control mechanism and a cache coherence protocol for a high-performance on-chip multiprocessor
Authors	M. Takahashi; H. Takano; E. Kaneko; S. Suzuki

Article Title	Decoupled vector architectures
Authors	R. Espasa; M. Valero

Article Title	Representative traces for processor models with infinite cache
Authors	V. S. Iyengar; L. H. Trevillyan; P. Bose

Article Title	Multitasking and multithreading on a multiprocessor with virtual shared memory
Authors	H. L. Muller; P. W. A. Stallard; D. H. D. Warren

Article Title	A comparison of entry consistency and lazy release consistency implementations
Authors	S. V. Adve; A. L. Cox; S. Dwarkadas; R. Rajamony; W. Zwaenepoel

Article Title	Telegraphos: high-performance networking for parallel processing on workstation clusters
Authors	E. P. Markatos; M. G. H. Katevenis

Article Title	Predictive sequential associative cache
Authors	B. Calder; D. Grunwald; J. Emer

Article Title	Fault-tolerant multicast routing in the mesh with no virtual channels
Authors	R. Libeskind-Hadas; K. Watkins; T. Hehre

Article Title	Two adaptive hybrid cache coherency protocols
Authors	C. Anderson; A. R. Karlin

Article Title	Performance characterization of the Alpha 21164 microprocessor using TP and SPEC workloads
Authors	Z. Cvetanovic; D. Bhandarkar

Article Title	Register file design considerations in dynamically scheduled processors
Authors	K. I. Farkas; N. P. Jouppi; P. Chow

Article Title	A topology-independent generic methodology for deadlock-free wormhole routing
Authors	H. Park; D. P. Agrawal

Article Title	Co-scheduling hardware and software pipelines
Authors	R. Govindarajan; E. R. Altman; G. R. Gao

Article Title	Fault-tolerance with multimodule routers
Authors	S. Chalasani; R. V. Boppana

Article Title	Performance evaluation of a cluster-based multiprocessor built from ATM switches and bus-based multiprocessor servers
Authors	M. Karlsson; P. Stenstrom

Article Title	Improving release-consistent shared virtual memory using automatic update
Authors	L. Iftode; C. Dubnicki; E. W. Felten; Kai Li

HPCA 1997

Article Title	Software-managed address translation
Authors	B. Jacob; T. Mudge

Article Title	Towards a communication characterization methodology for parallel applications
Authors	S. Chodnekar; V. Srinivasan; A. S. Vaidya; A. Sivasubramaniam; C. R. Das

Article Title	Distributed path reservation algorithms for multiplexed all-optical interconnection networks
Authors	X. Yuan; R. Melhem; R. Gupta

Article Title	Design issues and tradeoffs for write buffers
Authors	K. Skadron; D. W. Clark

Article Title	An evaluation of fine-grain producer-initiated communication in cache-coherent multiprocessors
Authors	H. Abdel-Shafi; J. Hall; S. V. Adve; V. S. Adve

Article Title	On the use and performance of explicit communication primitives in cache-coherent multiprocessor systems
Authors	X. Qin; J. -L. Baer

Article Title	Software DSM protocols that adapt between single writer and multiple writer
Authors	C. Amza; A. L. Cox; S. Dwarkadas; W. Zwaenepoel

Article Title	ATM and fast Ethernet network interfaces for user-level communication
Authors	M. Welsh; A. Basu; T. von Eicken

Article Title	Message proxies for efficient, protected communication on SMP clusters
Authors	B. -H. Lim; P. Heidelberger; P. Pattnaik; M. Snir

Article Title	A framework for statistical modeling of superscalar processor performance
Authors	D. B. Noonburg; J. P. Shen

Article Title	Speeding up the memory hierarchy in Flat COMA multiprocessors
Authors	L. Yang; J. Torrellas

Article Title	Architectural support for compiler-synthesized dynamic branch prediction strategies: Rationale and initial results
Authors	D. I. August; D. A. Connors; J. C. Gyllenhaal; W. -M. W. Hwu

Article Title	Multithreaded vector architectures
Authors	R. Espasa; M. Valero

Article Title	Reducing the replacement overhead in bus-based COMA multiprocessors
Authors	F. Dahlgren; A. Landin

Article Title	Scheduling communication on an SMP node parallel machine
Authors	B. Falsafi; D. A. Wood

Article Title	Multicast on irregular switch-based networks with wormhole routing
Authors	R. Kesavan; K. Bondalapati; D. K. Panda

Article Title	Reducing remote conflict misses: NUMA with remote cache versus COMA
Authors	Z. Zhang; J. Torrellas

Article Title	A performance comparison of hierarchical ring- and mesh-connected multiprocessor networks
Authors	G. Ravindran; M. Stumm

Article Title	Reducing the communication overhead of dynamic applications on shared memory multiprocessors
Authors	A. Sivasubramaniam

Article Title	Global address space, non-uniform bandwidth: a memory system performance characterization of parallel systems
Authors	T. Stricker; T. Cross

Article Title	The impact of instruction-level parallelism on multiprocessor performance and simulation methodology
Authors	V. S. Pai; P. Ranganathan; S. V. Adve

Article Title	Multiple branch and block prediction
Authors	S. Wallace; N. Bagherzadeh

Article Title	The memory performance of DSS commercial workloads in shared-memory multiprocessors
Authors	P. Trancoso; J. -L. Larriba-Pey; Z. Zhang; J. Torrellas

Article Title	Performance characterization of the Pentium Pro processor
Authors	D. Bhandarkar; J. Ding

Article Title	Architectural support for reducing communication overhead in multiprocessor interconnection networks
Authors	B. Vien Dao; S. Yalamanchili; J. Duato

Article Title	User-level DMA without operating system kernel modification
Authors	E. P. Markatos; M. G. H. Katevenis

Article Title	Evaluating MPI collective communication on the SP2, T3D, and Paragon multicomputers
Authors	Kai Hwang; Choming Wang; Cho-Li Wang

Article Title	Datapath design for a VLIW video signal processor
Authors	A. Wolfe; J. Fritts; S. Dutta; E. S. T. Fernandes

Article Title	Control flow speculation in multiscalar processors
Authors	Q. Jacobson; S. Bennett; N. Sharma; J. E. Smith

Article Title	Advances of the counterflow pipeline microarchitecture
Authors	K. J. Janik; S. -L. Lu; M. F. Miller

HPCA 1998

Article Title	Supporting highly speculative execution via adaptive branch trees
Authors	Tien-Fu Chen

Article Title	Virtual-physical registers
Authors	A. Gonzalez; J. Gonzalez; M. Valero

Article Title	Hardware for speculative run-time parallelization in distributed shared-memory multiprocessors
Authors	Ye Zhang; L. Rauchwerger; J. Torrellas

Article Title	Enhancing memory use in Simple COMA: Multiplexed Simple COMA
Authors	S. Basu; J. Torrellas

Article Title	PRISM: an integrated architecture for scalable shared memory
Authors	K. Ekanadham; Beng-Hong Lim; P. Pattnaik; M. Snir

Article Title	Architectural implications of a family of irregular applications
Authors	D. O’Hallaron; J. R. Shewchuk; T. Gross

Article Title	The emergence of workstation clusters: Should we continue to build MPPs? [panel session]
Authors	D. K. Panda

Article Title	Challenging applications on fast networks
Authors	K. Langendoen; R. Hofman; H. Bal

Article Title	Fine-grain software distributed shared memory on SMP clusters
Authors	D. J. Scales; K. Gharachorloo; A. Aggarwal

Article Title	A very efficient distributed deadlock detection mechanism for wormhole networks
Authors	P. Lopez; J. M. Martinez; J. Duato

Article Title	Home-based SVM protocols for SMP clusters: Design and performance
Authors	R. Samanta; A. Bilas; L. Iftode; J. P. Singh

Article Title	Credit-flow-controlled ATM for MP interconnection: The ATLAS I single-chip ATM switch
Authors	M. Katevenis; D. Serpanos; E. Spyridakis

Article Title	The impact of data transfer and buffering alternatives on network interface design
Authors	S. S. Mukherjee; M. D. Hill

Article Title	The effectiveness of SRAM network caches in clustered DSMs
Authors	A. Moga; M. Dubois

Article Title	Control speculation in multithreaded processors through dynamic loop detection
Authors	J. Tubella; A. Gonzalez

Article Title	The sensitivity of communication mechanisms to bandwidth and latency
Authors	F. T. Chong; R. Barua; F. Dahlgren; J. D. Kubiatowicz; A. Agarwal

Article Title	Partial sampling with reverse state reconstruction: A new technique for branch predictor performance estimation
Authors	D. E. Vengroff; G. R. Gao

Article Title	Using multicast and multithreading to reduce communication in software DSM systems
Authors	E. Speight; J. K. Bennett

Article Title	Speculative versioning cache
Authors	S. Gopal; T. N. Vijaykumar; J. E. Smith; G. S. Sohi

Article Title	The architectural costs of streaming I/O: A comparison of workstations, clusters, and SMPs
Authors	R. H. Arpaci-Dusseau; A. C. Arpaci-Dusseau; D. E. Culler; J. M. Hellerstein; D. A. Patterson

Article Title	Address translation mechanisms in network interfaces
Authors	I. Schoinas; M. D. Hill

Article Title	The potential for using thread-level data speculation to facilitate automatic parallelization
Authors	J. G. Steffan; T. C. Mowry

Article Title	Performance study of a concurrent multithreaded processor
Authors	Jenn-Yuan Tsai; Zhenzhen Jiang; E. Ness; Pen-Chung Yew

Article Title	FPGA based custom computing machines for irregular problems
Authors	D. Abramson; P. Logothetis; A. Postula; M. Randall

Article Title	Exploiting two-case delivery for fast protected messaging
Authors	K. Mackenzie; J. Kubiatowicz; M. Frank; W. Lee; V. Lee; A. Agarwal; M. F. Kaashoek

Article Title	Non-stalling counterflow architecture
Authors	M. F. Miller; K. J. Janik; Shih-Lien Lu

Article Title	Temporal-based procedure reordering for improved instruction cache performance
Authors	J. Kalamatianos; D. R. Kaeli

Article Title	Performance evaluation of tiling for the register level
Authors	M. Jimenez; J. M. Llaberia; A. Fernandez

Article Title	Treegion scheduling for wide issue processors
Authors	W. A. Havanki; S. Banerjia; T. M. Conte

Article Title	Communication across fault-containment firewalls on the SGI Origin
Authors	K. Ghosh; A. J. Christie

Article Title	Efficiently adapting to sharing patterns in software DSMs
Authors	L. R. Monnerat; R. Bianchini

Article Title	Comparative evaluation of latency tolerance techniques for software distributed shared memory
Authors	T. C. Mowry; C. Q. C. Chan; A. K. W. Lo

HPCA 1999

Article Title	A study of control independence in superscalar processors
Authors	E. Rotenberg; Q. Jacobson; J. Smith

Article Title	Impulse: building a smarter memory controller
Authors	J. Carter; W. Hsieh; L. Stoller; M. Swanson; Lixin Zhang; E. Brunvand; A. Davis; Chen-Chi Kuo; R. Kuramkote; M. Parker; L. Schaelicke; T. Tateyama

Article Title	Sensitivity of parallel applications to large differences in bandwidth and latency in two-layer interconnects
Authors	A. Plaat; H. E. Bal; R. F. H. Hofman

Article Title	Improving CC-NUMA performance using Instruction-based Prediction
Authors	S. Kaxiras; J. R. Goodman

Article Title	Distributed modulo scheduling
Authors	M. M. Fernandes; J. Llosa; N. Topham

Article Title	A performance comparison of homeless and home-based lazy release consistency protocols in software shared memory
Authors	A. L. Cox; E. de Lara; C. Hu; W. Zwaenepoel

Article Title	The synergy of multithreading and access/execute decoupling
Authors	J. -M. Parcerisa; A. Gonzalez

Article Title	Supporting fine-grained synchronization on a simultaneous multithreading processor
Authors	D. M. Tullsen; J. L. Lo; S. J. Eggers; H. M. Levy

Article Title	LAPSES: a recipe for high performance adaptive router design
Authors	A. S. Vaidya; A. Sivasubramaniam; C. R. Das

Article Title	Using Lamport clocks to reason about relaxed memory models
Authors	A. E. Condon; M. D. Hill; M. Plakal; D. J. Sorin

Article Title	Memory hierarchy considerations for fast transpose and bit-reversals
Authors	K. S. Gatlin; L. Carter

Article Title	The impact of link arbitration on switch performance
Authors	M. Pirvu; L. Bhuyan; N. Ni

Article Title	Limits to the performance of software shared memory: a layered approach
Authors	A. Bilas; Dongming Jiang; Yuanyuan Zhou; J. P. Singh

Article Title	Dynamically variable line-size cache exploiting high on-chip memory bandwidth of merged DRAM/logic LSIs
Authors	K. Inoue; K. Kai; K. Murakami

Article Title	Impact of buffer size on the efficiency of deadlock detection
Authors	J. M. Martinez; P. Lopez; J. Duato

Article Title	Switch cache: a framework for improving the remote memory access latency of CC-NUMA multiprocessors
Authors	R. Iyer; L. N. Bhuyan

Article Title	Exploiting basic block value locality with block reuse
Authors	Jian Huang; D. J. Lilja

Article Title	Instruction pre-processing in trace processors
Authors	Q. Jacobson; J. E. Smith

Article Title	Out-of-order execution may not be cost-effective on processors featuring simultaneous multithreading
Authors	S. Hily; A. Seznec

Article Title	Comparative evaluation of fine- and coarse-grain approaches for software distributed shared memory
Authors	S. Dwarkadas; K. Gharachorloo; L. Kontothanassis; D. J. Scales; M. L. Scott; R. Stets

Article Title	WildFire: a scalable path for SMPs
Authors	E. Hagersten; M. Koster

Article Title	MP-LOCKs: replacing H/W synchronization primitives with message passing
Authors	Chen-Chi Kuo; J. Carter; R. Kuramkote

Article Title	Dynamically exploiting narrow width operands to improve processor power and performance
Authors	D. Brooks; M. Martonosi

Article Title	Access order and effective bandwidth for streams on a Direct Rambus memory
Authors	S. I. Hong; S. A. McKee; M. H. Salinas; R. H. Klenke; J. H. Aylor; W. A. Wulf

Article Title	RAPID-Cache - a reliable and inexpensive write cache for disk I/O systems
Authors	Yiming Hu; Qing Yang; T. Nightingale

Article Title	Improving the accuracy vs. speed tradeoff for simulating shared-memory multiprocessors with ILP processors
Authors	M. Durbhakula; V. S. Pai; S. Adve

Article Title	A scalable cache coherent scheme exploiting wormhole routing networks
Authors	Yunseok Rhee; Joonwon Lee

Article Title	Global context-based value prediction
Authors	T. Nakra; R. Gupta; M. L. Soffa

Article Title	Parallel Dispatch Queue: a queue-based programming abstraction to parallelize fine-grain communication protocols
Authors	B. Falsafi; D. A. Wood

Article Title	Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors
Authors	Ye Zhang; L. Rauchwerger; J. Torrellas

Article Title	Efficient all-to-all broadcast in all-port mesh and torus networks
Authors	Yuanyuan Yang; Jianchao Wang

Article Title	Instruction recycling on a multiple-path processor
Authors	S. Wallace; D. M. Tullsen; B. Calder

Article Title	Permutation development data layout (PDDL)
Authors	T. J. E. Schwarz; J. Steinberg; W. A. Burkhard

Article Title	Design and performance of directory caches for scalable shared memory multiprocessors
Authors	M. M. Michael; A. K. Nanda

Article Title	MMR: a high-performance MultiMedia Router - architecture and design trade-offs
Authors	J. Duato; S. Yalamanchili; M. B. Caminero; D. Love; F. J. Quiles

Article Title	Lightweight hardware distributed shared memory supported by generalized combining
Authors	K. Tanaka; T. Matsumoto; K. Hiraki

Article Title	Communication studies of single-threaded and multithreaded distributed-memory machines
Authors	A. Sohn; Yunheung Paek; Jui-Yuan Ku; Y. Kodama; Y. Yamaguchi

HPCA 2000

Article Title	A prefetching technique for irregular accesses to linked data structures
Authors	M. Karlsson; F. Dahlgren; P. Stenstrom

Article Title	Cache-efficient matrix transposition
Authors	S. Chatterjee; S. Sen

Article Title	On the performance of hand vs. automatically optimized numerical codes
Authors	M. Jimenez; J. M. Llaberia; A. Fernandez

Article Title	Evaluation of active disks for decision support databases
Authors	M. Uysal; A. Acharya; J. Saltz

Article Title	Improving the throughput of synchronization by insertion of delays
Authors	R. Rajwar; A. Kagi; J. R. Goodman

Article Title	The effect of network total order, broadcast, and remote-write capability on network-based shared memory computing
Authors	R. Stets; S. Dwarkadas; L. Kontothanassis; U. Rencuzogullari; M. L. Scott

Article Title	Coherence communication prediction in shared-memory multiprocessors
Authors	S. Kaxiras; C. Young

Article Title	Register organization for media processing
Authors	S. Rixner; W. J. Dally; B. Khailany; P. Mattson; U. J. Kapasi; J. D. Owens

Article Title	Cache memory design for network processors
Authors	Tzi-Cker Chiueh; P. Pradhan

Article Title	Flit-reservation flow control
Authors	Li-Shiuan Peh; W. J. Dally

Article Title	Combining static and dynamic branch prediction to reduce destructive aliasing
Authors	H. Patil; J. Emer

Article Title	Trace cache redundancy: red and blue traces
Authors	A. Ramirez; J. Ll. Larriba-Pey; M. Valero

Article Title	Design of a parallel vector access unit for SDRAM memory systems
Authors	B. K. Mathew; S. A. McKee; J. B. Carter; A. Davis

Article Title	High-throughput coherence controllers
Authors	A. K. Nanda; A. -T. Nguyen; M. M. Michael; D. J. Joseph

Article Title	Reducing code size with run-time decompression
Authors	C. Lefurgy; E. Piccininni; T. Mudge

Article Title	Investigating the performance of two programming models for clusters of SMP PCs
Authors	F. Cappello; O. Richard; D. Etiemble

Article Title	PowerMANNA: a parallel architecture based on the PowerPC MPC620
Authors	P. M. Behr; S. Pletner; A. C. Sodan

Article Title	Architectural issues in Java runtime systems
Authors	R. Radhakrishnan; N. Vijaykrishnan; L. K. John; A. Sivasubramaniam

Article Title	Performance evaluation of dynamic reconfiguration in high-speed local area networks
Authors	R. Casado; A. Bermudez; F. J. Quiles; J. L. Sanchez; J. Duato

Article Title	Modified LRU policies for improving second-level cache behavior
Authors	W. A. Wong; J. -L. Baer

Article Title	Decoupled value prediction on trace processors
Authors	Sang-Jeong Lee; Yuan Wang; Pen-Chung Yew

Article Title	Performance analysis and visualization of parallel systems using SimOS and Rivet: a case study
Authors	R. Bosch; C. Stolte; G. Stoll; M. Rosenblum; P. Hanrahan

Article Title	A DSM architecture for a parallel computer Cenju-4
Authors	T. Hosomi; Y. Kanoh; M. Nakamura; T. Hirose

Article Title	The best distribution for a parallel OpenGL 3D engine with texture caches
Authors	A. Vartanian; J. -L. Bechennec; N. Drach-Temam

Article Title	Investigating QoS support for traffic mixes with the MediaWorm router
Authors	Ki Hwan Yum; A. Vaidya; C. R. Das; A. Sivasubramaniam

Article Title	eXtended block cache
Authors	S. Jourdan; L. Rappoport; Y. Almog; M. Erez; A. Yoaz; R. Ronen

Article Title	Branch transition rate: a new metric for improved branch classification analysis
Authors	M. Haungs; P. Sallee; M. Farrens

Article Title	Impact of chip-level integration on performance of OLTP workloads
Authors	L. A. Barroso; K. Gharachorloo; A. Nowatzyk; B. Verghese

Article Title	Memory dependence speculation tradeoffs in centralized, continuous-window superscalar processors
Authors	A. Moshovos; G. S. Sohi

Article Title	Quantifying the SMT layout overhead - does SMT pull its weight?
Authors	J. Burns; J. -L. Gaudiot

Article Title	Toward a cost-effective DSM organization that exploits processor-memory integration
Authors	J. Torrellas; Liuxi Yang; A. -T. Nguyen

Article Title	A technique for high bandwidth and deterministic low latency load/store accesses to multiple cache banks
Authors	H. Neefs; H. Vandierendonck; K. De Bosschere

Article Title	Software-controlled multithreading using informing memory operations
Authors	T. C. Mowry; S. R. Ramkissoon

Article Title	Impact of heterogeneity on DSM performance
Authors	R. J. O. Figueiredo; J. A. B. Fortes

Article Title	Dynamic cluster assignment mechanisms
Authors	R. Canal; J. M. Parcerisa; A. Gonzalez

HPCA 2001

Article Title	Stack Value File: Custom Microarchitecture for the Stack
Authors	H.-H. S. Lee; M. Smelyanskiy; C. J. Newburn; G. S. Tyson

Article Title	Register Renaming and Scheduling for Dynamic Execution of Predicated Code
Authors	P. H. Wang; H. Wang; R. M. Kling; K. Ramakrishnan; J. P. Shen

Article Title	Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
Authors	P. Michaud; A. Seznec

Article Title	Speculative Data-Driven Multithreading
Authors	A. Roth; G. S. Sohi

Article Title	Towards Virtually-Addressed Memory Hierarchies
Authors	X. Qiu; M. Dubois

Article Title	Reevaluating Online Superpage Promotion with Hardware Support
Authors	Z. Fang; L. Zhang; J. B. Carter; W. C. Hsieh; S. A. McKee

Article Title	Performance of Hardware Compressed Main Memory
Authors	B. Abali; H. Franke; X. Shen; D. E. Poff; T. B. Smith

Article Title	JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers
Authors	A. Moshovos; G. Memik; B. Falsafi; A. Choudhary

Article Title	A New Scalable Directory Architecture for Large-scale Multiprocessors
Authors	M. E. Acacio; J. Gonzalez; J. M. Garcia; J. Duato

Article Title	Self-Tuned Congestion Control for Multiprocessor Networks
Authors	M. Thottethodi; A. R. Lebeck; S. S. Mukherjee

Article Title	Automatically Mapping Code on an Intelligent Memory Architecture
Authors	J. Lee; Y. Solihin; J. Torrellas

Article Title	CARS: A New Code Generation Framework for Clustered ILP Processors
Authors	K. Kailas; K. Ebcioglu; A. Agrawala

Article Title	An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches
Authors	S.-H. Yang; M. D. Powell; B. Falsafi; K. Roy; T. N. Vijaykumar

Article Title	DRAM Energy Management Using Software and Hardware Directed Power Mode Control
Authors	V. Delaluz; M. Kandemir; N. Vijaykrishnan; A. Sivasubramaniam; M. J. Irwin

Article Title	Dynamic Thermal Management for High-Performance Microprocessors
Authors	D. Brooks; M. Martonosi

Article Title	Dynamic Prediction of Critical Path Instructions
Authors	E. Tune; D. Liang; D. M. Tullsen; B. Calder

Article Title	Dynamic Branch Prediction with Perceptrons
Authors	D. A. Jimenez; C. Lin

Article Title	Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency
Authors	B. Goeman; H. Vandierendonck; K. De Bosschere

Article Title	DLP+TLP Processors for the Next Generation of Media Workloads
Authors	J. Corbal; R. Espasa; M. Valero

Article Title	An Architectural Evaluation of Java TPC-W
Authors	H. W. Cain; R. Rajwar; M. Marden; M. H. Lipasti

Article Title	A Programmable Co-Processor for Profiling
Authors	C. B. Zilles; G. S. Sohi

Article Title	A Delay Model and Speculative Architecture for Pipelined Routers
Authors	L. Peh; W. J. Dally

Article Title	Quantifying the Impact of Architectural Scaling on Communication
Authors	T. Heath; S. Kaw; R. P. Martin; T. D. Nguyen

Article Title	Call Graph Prefetching for Database Applications
Authors	M. Annavaram; J. M. Patel; E. S. Davidson

Article Title	Branch History Guided Instruction Prefetching
Authors	V. Srinivasan; E. S. Davidson; G. S. Tyson; M. J. Charney; T. R. Puzak

Article Title	Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
Authors	W. Lin; S. K. Reinhardt; D. Burger