HPCA Test of Time Award 2019: Eligible Papers

HPCA 1995
HPCA 1996
HPCA 1997
HPCA 1998
HPCA 1999
HPCA 2000
HPCA 2001


HPCA 1995

Article Title Non-consistent dual register files to reduce register pressure
Authors J. Llosa; M. Valero; E. Ayguade
Article Title How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors?
Authors K. I. Farkas; N. P. Jouppi; P. Chow
Article Title Reducing communication latency with path multiplexing in optically interconnected multiprocessor systems
Authors Chunming Qiao; R. Melhem
Article Title Creating a wider bus using caching techniques
Authors D. Citron; L. Rudolph
Article Title Massively parallel array processor for logic, fault, and design error simulation
Authors Y. Hur; S. A. Szygenda; E. Scott Fehr; G. E. Ott; Sungho Kang
Article Title Toward high communication performance through compiled communications on a circuit switched interconnection network
Authors F. Cappello; C. Germain
Article Title Thread prioritization: a thread scheduling mechanism for multiple-context parallel processors
Authors S. Fiske; W. J. Dally
Article Title Two techniques for improving performance on bus-based multiprocessors
Authors C. Anderson; J. -L. Baer
Article Title Origin-based fault-tolerant routing in the mesh
Authors R. Libeskind-Hadas; E. Brandt
Article Title Architectural support for inter-stream communication in a MSIMD system
Authors V. Garg; D. E. Schimmel
Article Title The Named-State Register File: implementation and performance
Authors P. R. Nuth; W. J. Dally
Article Title Abstracting network characteristics and locality properties of parallel systems
Authors A. Sivasubramaniam; M. Singla; U. Ramachandran; H. Venkateswaran
Article Title The effects of STEF in finely parallel multithreaded processors
Authors Yamin Li; Wanming Chu
Article Title Simulation study of cached RAID5 designs
Authors R. Treiber; J. Menon
Article Title U-cache: a cost-effective solution to synonym problem
Authors Jesung Kim; Sang Lyul Min; Sanghoon Jeon; Byoungchu Ahn; Deog-Kyoon Jeong; Chong Sang Kim
Article Title Design and performance evaluation of a multithreaded architecture
Authors R. Govindarajan; S. S. Nemawarkar; P. LeNir
Article Title Modeling virtual channel flow control in hypercubes
Authors Y. M. Boura; C. R. Das
Article Title Implementation of atomic primitives on distributed shared memory multiprocessors
Authors M. M. Michael; M. L. Scott
Article Title An argument for simple COMA
Authors A. Saulsbury; T. Wilkinson; J. Carter; A. Landin
Article Title Efficient and balanced adaptive routing in two-dimensional meshes
Authors J. H. Upadhyay; V. Varavithya; P. Mohapatra
Article Title DASC cache
Authors A. Seznec
Article Title Optimizing instruction cache performance for operating system intensive workloads
Authors J. Torrellas; Chun Xia; R. Daigle
Article Title Implementing register interlocks in parallel-pipeline, multiple instruction queue, superscalar processors
Authors S. Weiss
Article Title Effectiveness of hardware-based stride and sequential prefetching in shared-memory multiprocessors
Authors F. Dahlgren; P. Stenstrom
Article Title A VLSI architecture for computing the tree-to-tree distance
Authors R. Sastry; N. Ranganathan
Article Title Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms
Authors D. K. Panda
Article Title Access ordering and memory-conscious cache utilization
Authors S. A. McKee; W. A. Wulf
Article Title Fine-grain multi-thread processor architecture for massively parallel processing
Authors T. Kawano; S. Kusakabe; R. -I. Taniguchi; M. Amamiya
Article Title An initial evaluation of the Convex SPP-1000 for earth and space science applications
Authors T. L. Sterling; D. F. Savarese; P. R. Merkey; J. P. Gardner
Article Title Improving performance by cache driven memory management
Authors K. Westerholz; S. Honal; J. Plankl; C. Hafer
Article Title Software cache coherence for large scale multiprocessors
Authors L. I. Kontothanassis; M. L. Scott
Article Title Software assistance for data caches
Authors O. Temam; N. Drach
Article Title Fault-tolerant adaptive routing for two-dimensional meshes
Authors C. M. Cunningham; D. R. Avresky
Article Title Memory access reordering in vector processors
Authors De-Lei Lee
Article Title A design framework for hybrid-access caches
Authors K. B. Theobald; H. H. J. Hum; G. R. Gao
Article Title Program balance and its impact on high performance RISC architectures
Authors L. K. John; V. Reddy; P. T. Hulina; L. D. Coraor

HPCA 1996

Article Title Shuffle-Ring: overcoming the increasing degree of hypercube
Authors Guihai Chen; F. C. M. Lau
Article Title On the multiplexing degree required to embed permutations in a class of networks with direct interconnects
Authors Chunming Qiao; Yousong Mei
Article Title RMB: a reconfigurable multiple bus network
Authors H. ElGindy; H. Schroder; A. Spray; A. K. Somani; H. Schmeck
Article Title Bus-based COMA: reducing traffic in shared-bus multiprocessors
Authors A. Landin; F. Dahlgren
Article Title Improving the data cache performance of multiprocessor operating systems
Authors Chun Xia; J. Torrellas
Article Title Parallel intersecting compressed bit vectors in a high speed query server for processing postal addresses
Authors Wen-Jann Yang; R. Sridhar; V. Demjanenko
Article Title Using memory-mapped network interfaces to improve the performance of distributed shared memory
Authors L. I. Kontothanassis; M. L. Scott
Article Title Distance-adaptive update protocols for scalable shared-memory multiprocessors
Authors A. Raynaud; Zheng Zhang; J. Torrellas
Article Title Performance study of a multithreaded superscalar microprocessor
Authors M. Gulati; N. Bagherzadeh
Article Title The impact of shared-cache clustering in small-scale shared-memory multiprocessors
Authors B. A. Nayfeh; K. Olukotun; J. P. Singh
Article Title A cache coherency protocol for optically connected parallel computer systems
Authors J. A. Reisner; T. S. Wailes
Article Title Protected, user-level DMA for the SHRIMP network interface
Authors M. A. Blumrich; C. Dubnicki; E. W. Felten; Kai Li
Article Title Distributed prefetch-buffer/cache design for high performance memory systems
Authors T. Alexander; G. Kedem
Article Title A shared-bus control mechanism and a cache coherence protocol for a high-performance on-chip multiprocessor
Authors M. Takahashi; H. Takano; E. Kaneko; S. Suzuki
Article Title Decoupled vector architectures
Authors R. Espasa; M. Valero
Article Title Representative traces for processor models with infinite cache
Authors V. S. Iyengar; L. H. Trevillyan; P. Bose
Article Title Multitasking and multithreading on a multiprocessor with virtual shared memory
Authors H. L. Muller; P. W. A. Stallard; D. H. D. Warren
Article Title A comparison of entry consistency and lazy release consistency implementations
Authors S. V. Adve; A. L. Cox; S. Dwarkadas; R. Rajamony; W. Zwaenepoel
Article Title Telegraphos: high-performance networking for parallel processing on workstation clusters
Authors E. P. Markatos; M. G. H. Katevenis
Article Title Predictive sequential associative cache
Authors B. Calder; D. Grunwald; J. Emer
Article Title Fault-tolerant multicast routing in the mesh with no virtual channels
Authors R. Libeskind-Hadas; K. Watkins; T. Hehre
Article Title Two adaptive hybrid cache coherency protocols
Authors C. Anderson; A. R. Karlin
Article Title Performance characterization of the Alpha 21164 microprocessor using TP and SPEC workloads
Authors Z. Cvetanovic; D. Bhandarkar
Article Title Register file design considerations in dynamically scheduled processors
Authors K. I. Farkas; N. P. Jouppi; P. Chow
Article Title A topology-independent generic methodology for deadlock-free wormhole routing
Authors H. Park; D. P. Agrawal
Article Title Co-scheduling hardware and software pipelines
Authors R. Govindarajan; E. R. Altman; G. R. Gao
Article Title Fault-tolerance with multimodule routers
Authors S. Chalasani; R. V. Boppana
Article Title Performance evaluation of a cluster-based multiprocessor built from ATM switches and bus-based multiprocessor servers
Authors M. Karlsson; P. Stenstrom
Article Title Improving release-consistent shared virtual memory using automatic update
Authors L. Iftode; C. Dubnicki; E. W. Felten; Kai Li

HPCA 1997

Article Title Software-managed address translation
Authors B. Jacob; T. Mudge
Article Title Towards a communication characterization methodology for parallel applications
Authors S. Chodnekar; V. Srinivasan; A. S. Vaidya; A. Sivasubramaniam; C. R. Das
Article Title Distributed path reservation algorithms for multiplexed all-optical interconnection networks
Authors X. Yuan; R. Melhem; R. Gupta
Article Title Design issues and tradeoffs for write buffers
Authors K. Skadron; D. W. Clark
Article Title An evaluation of fine-grain producer-initiated communication in cache-coherent multiprocessors
Authors H. Abdel-Shafi; J. Hall; S. V. Adve; V. S. Adve
Article Title On the use and performance of explicit communication primitives in cache-coherent multiprocessor systems
Authors X. Qin; J. -L. Baer
Article Title Software DSM protocols that adapt between single writer and multiple writer
Authors C. Amza; A. L. Cox; S. Dwarkadas; W. Zwaenepoel
Article Title ATM and fast Ethernet network interfaces for user-level communication
Authors M. Welsh; A. Basu; T. von Eicken
Article Title Message proxies for efficient, protected communication on SMP clusters
Authors B. -H. Lim; P. Heidelberger; P. Pattnaik; M. Snir
Article Title A framework for statistical modeling of superscalar processor performance
Authors D. B. Noonburg; J. P. Shen
Article Title Speeding up the memory hierarchy in Flat COMA multiprocessors
Authors L. Yang; J. Torrellas
Article Title Architectural support for compiler-synthesized dynamic branch prediction strategies: Rationale and initial results
Authors D. I. August; D. A. Connors; J. C. Gyllenhaal; W. -M. W. Hwu
Article Title Multithreaded vector architectures
Authors R. Espasa; M. Valero
Article Title Reducing the replacement overhead in bus-based COMA multiprocessors
Authors F. Dahlgren; A. Landin
Article Title Scheduling communication on an SMP node parallel machine
Authors B. Falsafi; D. A. Wood
Article Title Multicast on irregular switch-based networks with wormhole routing
Authors R. Kesavan; K. Bondalapati; D. K. Panda
Article Title Reducing remote conflict misses: NUMA with remote cache versus COMA
Authors Z. Zhang; J. Torrellas
Article Title A performance comparison of hierarchical ring- and mesh-connected multiprocessor networks
Authors G. Ravindran; M. Stumm
Article Title Reducing the communication overhead of dynamic applications on shared memory multiprocessors
Authors A. Sivasubramaniam
Article Title Global address space, non-uniform bandwidth: a memory system performance characterization of parallel systems
Authors T. Stricker; T. Cross
Article Title The impact of instruction-level parallelism on multiprocessor performance and simulation methodology
Authors V. S. Pai; P. Ranganathan; S. V. Adve
Article Title Multiple branch and block prediction
Authors S. Wallace; N. Bagherzadeh
Article Title The memory performance of DSS commercial workloads in shared-memory multiprocessors
Authors P. Trancoso; J. -L. Larriba-Pey; Z. Zhang; J. Torrellas
Article Title Performance characterization of the Pentium Pro processor
Authors D. Bhandarkar; J. Ding
Article Title Architectural support for reducing communication overhead in multiprocessor interconnection networks
Authors B. Vien Dao; S. Yalamanchili; J. Duato
Article Title User-level DMA without operating system kernel modification
Authors E. P. Markatos; M. G. H. Katevenis
Article Title Evaluating MPI collective communication on the SP2, T3D, and Paragon multicomputers
Authors Kai Hwang; Choming Wang; Cho-Li Wang
Article Title Datapath design for a VLIW video signal processor
Authors A. Wolfe; J. Fritts; S. Dutta; E. S. T. Fernandes
Article Title Control flow speculation in multiscalar processors
Authors Q. Jacobson; S. Bennett; N. Sharma; J. E. Smith
Article Title Advances of the counterflow pipeline microarchitecture
Authors K. J. Janik; S. -L. Lu; M. F. Miller

HPCA 1998

Article Title Supporting highly speculative execution via adaptive branch trees
Authors Tien-Fu Chen
Article Title Virtual-physical registers
Authors A. Gonzalez; J. Gonzalez; M. Valero
Article Title Hardware for speculative run-time parallelization in distributed shared-memory multiprocessors
Authors Ye Zhang; L. Rauchwerger; J. Torrellas
Article Title Enhancing memory use in Simple Coma: Multiplexed Simple Coma
Authors S. Basu; J. Torrellas
Article Title PRISM: an integrated architecture for scalable shared memory
Authors K. Ekanadham; Beng-Hong Lim; P. Pattnaik; M. Snir
Article Title Architectural implications of a family of irregular applications
Authors D. O’Hallaron; J. R. Shewchuk; T. Gross
Article Title The emergence of workstation clusters: Should we continue to build MPPs? [panel session]
Authors D. K. Panda
Article Title Challenging applications on fast networks
Authors K. Langendoen; R. Hofman; H. Bal
Article Title Fine-grain software distributed shared memory on SMP clusters
Authors D. J. Scales; K. Gharachorloo; A. Aggarwal
Article Title A very efficient distributed deadlock detection mechanism for wormhole networks
Authors P. Lopez; J. M. Martinez; J. Duato
Article Title Home-based SVM protocols for SMP clusters: Design and performance
Authors R. Samanta; A. Bilas; L. Iftode; J. P. Singh
Article Title Credit-flow-controlled ATM for MP interconnection: The ATLAS I single-chip ATM switch
Authors M. Katevenis; D. Serpanos; E. Spyridakis
Article Title The impact of data transfer and buffering alternatives on network interface design
Authors S. S. Mukherjee; M. D. Hill
Article Title The effectiveness of SRAM network caches in clustered DSMs
Authors A. Moga; M. Dubois
Article Title Control speculation in multithreaded processors through dynamic loop detection
Authors J. Tubella; A. Gonzalez
Article Title The sensitivity of communication mechanisms to bandwidth and latency
Authors F. T. Chong; R. Barua; F. Dahlgren; J. D. Kubiatowicz; A. Agarwal
Article Title Partial sampling with reverse state reconstruction: A new technique for branch predictor performance estimation
Authors D. E. Vengroff; G. R. Gao
Article Title Using multicast and multithreading to reduce communication in software DSM systems
Authors E. Speight; J. K. Bennett
Article Title Speculative versioning cache
Authors S. Gopal; T. N. Vijaykumar; J. E. Smith; G. S. Sohi
Article Title The architectural costs of streaming I/O: A comparison of workstations, clusters, and SMPs
Authors R. H. Arpaci-Dusseau; A. C. Arpaci-Dusseau; D. E. Culler; J. M. Hellerstein; D. A. Patterson
Article Title Address translation mechanisms in network interfaces
Authors I. Schoinas; M. D. Hill
Article Title The potential for using thread-level data speculation to facilitate automatic parallelization
Authors J. G. Steffan; T. C. Mowry
Article Title Performance study of a concurrent multithreaded processor
Authors Jenn-Yuan Tsai; Zhenzhen Jiang; E. Ness; Pen-Chung Yew
Article Title FPGA based custom computing machines for irregular problems
Authors D. Abramson; P. Logothetis; A. Postula; M. Randall
Article Title Exploiting two-case delivery for fast protected messaging
Authors K. Mackenzie; J. Kubiatowicz; M. Frank; W. Lee; V. Lee; A. Agarwal; M. F. Kaashoek
Article Title Non-stalling counterflow architecture
Authors M. F. Miller; K. J. Janik; Shih-Lien Lu
Article Title Temporal-based procedure reordering for improved instruction cache performance
Authors J. Kalamatianos; D. R. Kaeli
Article Title Performance evaluation of tiling for the register level
Authors M. Jimenez; J. M. Llaberia; A. Fernandez
Article Title Treegion scheduling for wide issue processors
Authors W. A. Havanki; S. Banerjia; T. M. Conte
Article Title Communication across fault-containment firewalls on the SGI Origin
Authors K. Ghosh; A. J. Christie
Article Title Efficiently adapting to sharing patterns in software DSMs
Authors L. R. Monnerat; R. Bianchini
Article Title Comparative evaluation of latency tolerance techniques for software distributed shared memory
Authors T. C. Mowry; C. Q. C. Chan; A. K. W. Lo

HPCA 1999

Article Title A study of control independence in superscalar processors
Authors E. Rotenberg; Q. Jacobson; J. Smith
Article Title Impulse: building a smarter memory controller
Authors J. Carter; W. Hsieh; L. Stoller; M. Swanson; Lixin Zhang; E. Brunvand; A. Davis; Chen-Chi Kuo; R. Kuramkote; M. Parker; L. Schaelicke; T. Tateyama
Article Title Sensitivity of parallel applications to large differences in bandwidth and latency in two-layer interconnects
Authors A. Plaat; H. E. Bal; R. F. H. Hofman
Article Title Improving CC-NUMA performance using Instruction-based Prediction
Authors S. Kaxiras; J. R. Goodman
Article Title Distributed modulo scheduling
Authors M. M. Fernandes; J. Llosa; N. Topham
Article Title A performance comparison of homeless and home-based lazy release consistency protocols in software shared memory
Authors A. L. Cox; E. de Lara; C. Hu; W. Zwaenepoel
Article Title The synergy of multithreading and access/execute decoupling
Authors J. -M. Parcerisa; A. Gonzalez
Article Title Supporting fine-grained synchronization on a simultaneous multithreading processor
Authors D. M. Tullsen; J. L. Lo; S. J. Eggers; H. M. Levy
Article Title LAPSES: a recipe for high performance adaptive router design
Authors A. S. Vaidya; A. Sivasubramaniam; C. R. Das
Article Title Using Lamport clocks to reason about relaxed memory models
Authors A. E. Condon; M. D. Hill; M. Plakal; D. J. Sorin
Article Title Memory hierarchy considerations for fast transpose and bit-reversals
Authors K. S. Gatlin; L. Carter
Article Title The impact of link arbitration on switch performance
Authors M. Pirvu; L. Bhuyan; N. Ni
Article Title Limits to the performance of software shared memory: a layered approach
Authors A. Bilas; Dongming Jiang; Yuanyuan Zhou; J. P. Singh
Article Title Dynamically variable line-size cache exploiting high on-chip memory bandwidth of merged DRAM/logic LSIs
Authors K. Inoue; K. Kai; K. Murakami
Article Title Impact of buffer size on the efficiency of deadlock detection
Authors J. M. Martinez; P. Lopez; J. Duato
Article Title Switch cache: a framework for improving the remote memory access latency of CC-NUMA multiprocessors
Authors R. Iyer; L. N. Bhuyan
Article Title Exploiting basic block value locality with block reuse
Authors Jian Huang; D. J. Lilja
Article Title Instruction pre-processing in trace processors
Authors Q. Jacobson; J. E. Smith
Article Title Out-of-order execution may not be cost-effective on processors featuring simultaneous multithreading
Authors S. Hily; A. Seznec
Article Title Comparative evaluation of fine- and coarse-grain approaches for software distributed shared memory
Authors S. Dwarkadas; K. Gharachorloo; L. Kontothanassis; D. J. Scales; M. L. Scott; R. Stets
Article Title WildFire: a scalable path for SMPs
Authors E. Hagersten; M. Koster
Article Title MP-LOCKs: replacing H/W synchronization primitives with message passing
Authors Chen-Chi Kuo; J. Carter; R. Kuramkote
Article Title Dynamically exploiting narrow width operands to improve processor power and performance
Authors D. Brooks; M. Martonosi
Article Title Access order and effective bandwidth for streams on a Direct Rambus memory
Authors S. I. Hong; S. A. McKee; M. H. Salinas; R. H. Klenke; J. H. Aylor; W. A. Wulf
Article Title RAPID-Cache: a reliable and inexpensive write cache for disk I/O systems
Authors Yiming Hu; Qing Yang; T. Nightingale
Article Title Improving the accuracy vs. speed tradeoff for simulating shared-memory multiprocessors with ILP processors
Authors M. Durbhakula; V. S. Pai; S. Adve
Article Title A scalable cache coherent scheme exploiting wormhole routing networks
Authors Yunseok Rhee; Joonwon Lee
Article Title Global context-based value prediction
Authors T. Nakra; R. Gupta; M. L. Soffa
Article Title Parallel Dispatch Queue: a queue-based programming abstraction to parallelize fine-grain communication protocols
Authors B. Falsafi; D. A. Wood
Article Title Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors
Authors Ye Zhang; L. Rauchwerger; J. Torrellas
Article Title Efficient all-to-all broadcast in all-port mesh and torus networks
Authors Yuanyuan Yang; Jianchao Wang
Article Title Instruction recycling on a multiple-path processor
Authors S. Wallace; D. M. Tullsen; B. Calder
Article Title Permutation development data layout (PDDL)
Authors T. J. E. Schwarz; J. Steinberg; W. A. Burkhard
Article Title Design and performance of directory caches for scalable shared memory multiprocessors
Authors M. M. Michael; A. K. Nanda
Article Title MMR: a high-performance MultiMedia Router - architecture and design trade-offs
Authors J. Duato; S. Yalamanchili; M. B. Caminero; D. Love; F. J. Quiles
Article Title Lightweight hardware distributed shared memory supported by generalized combining
Authors K. Tanaka; T. Matsumoto; K. Hiraki
Article Title Communication studies of single-threaded and multithreaded distributed-memory machines
Authors A. Sohn; Yunheung Paek; Jui-Yuan Ku; Y. Kodama; Y. Yamaguchi

HPCA 2000

Article Title A prefetching technique for irregular accesses to linked data structures
Authors M. Karlsson; F. Dahlgren; P. Stenstrom
Article Title Cache-efficient matrix transposition
Authors S. Chatterjee; S. Sen
Article Title On the performance of hand vs. automatically optimized numerical codes
Authors M. Jimenez; J. M. Llaberia; A. Fernandez
Article Title Evaluation of active disks for decision support databases
Authors M. Uysal; A. Acharya; J. Saltz
Article Title Improving the throughput of synchronization by insertion of delays
Authors R. Rajwar; A. Kagi; J. R. Goodman
Article Title The effect of network total order, broadcast, and remote-write capability on network-based shared memory computing
Authors R. Stets; S. Dwarkadas; L. Kontothanassis; U. Rencuzogullari; M. L. Scott
Article Title Coherence communication prediction in shared-memory multiprocessors
Authors S. Kaxiras; C. Young
Article Title Register organization for media processing
Authors S. Rixner; W. J. Dally; B. Khailany; P. Mattson; U. J. Kapasi; J. D. Owens
Article Title Cache memory design for network processors
Authors Tzi-Cker Chiueh; P. Pradhan
Article Title Flit-reservation flow control
Authors Li-Shiuan Peh; W. J. Dally
Article Title Combining static and dynamic branch prediction to reduce destructive aliasing
Authors H. Patil; J. Emer
Article Title Trace cache redundancy: red and blue traces
Authors A. Ramirez; J. Ll. Larriba-Pey; M. Valero
Article Title Design of a parallel vector access unit for SDRAM memory systems
Authors B. K. Mathew; S. A. McKee; J. B. Carter; A. Davis
Article Title High-throughput coherence controllers
Authors A. K. Nanda; A. -T. Nguyen; M. M. Michael; D. J. Joseph
Article Title Reducing code size with run-time decompression
Authors C. Lefurgy; E. Piccininni; T. Mudge
Article Title Investigating the performance of two programming models for clusters of SMP PCs
Authors F. Cappello; O. Richard; D. Etiemble
Article Title PowerMANNA: a parallel architecture based on the PowerPC MPC620
Authors P. M. Behr; S. Pletner; A. C. Sodan
Article Title Architectural issues in Java runtime systems
Authors R. Radhakrishnan; N. Vijaykrishnan; L. K. John; A. Sivasubramaniam
Article Title Performance evaluation of dynamic reconfiguration in high-speed local area networks
Authors R. Casado; A. Bermudez; F. J. Quiles; J. L. Sanchez; J. Duato
Article Title Modified LRU policies for improving second-level cache behavior
Authors W. A. Wong; J. -L. Baer
Article Title Decoupled value prediction on trace processors
Authors Sang-Jeong Lee; Yuan Wang; Pen-Chung Yew
Article Title Performance analysis and visualization of parallel systems using SimOS and Rivet: a case study
Authors R. Bosch; C. Stolte; G. Stoll; M. Rosenblum; P. Hanrahan
Article Title A DSM architecture for a parallel computer Cenju-4
Authors T. Hosomi; Y. Kanoh; M. Nakamura; T. Hirose
Article Title The best distribution for a parallel OpenGL 3D engine with texture caches
Authors A. Vartanian; J. -L. Bechennec; N. Drach-Temam
Article Title Investigating QoS support for traffic mixes with the MediaWorm router
Authors Ki Hwan Yum; A. Vaidya; C. R. Das; A. Sivasubramaniam
Article Title eXtended block cache
Authors S. Jourdan; L. Rappoport; Y. Almog; M. Erez; A. Yoaz; R. Ronen
Article Title Branch transition rate: a new metric for improved branch classification analysis
Authors M. Haungs; P. Sallee; M. Farrens
Article Title Impact of chip-level integration on performance of OLTP workloads
Authors L. A. Barroso; K. Gharachorloo; A. Nowatzyk; B. Verghese
Article Title Memory dependence speculation tradeoffs in centralized, continuous-window superscalar processors
Authors A. Moshovos; G. S. Sohi
Article Title Quantifying the SMT layout overhead: does SMT pull its weight?
Authors J. Burns; J. -L. Gaudiot
Article Title Toward a cost-effective DSM organization that exploits processor-memory integration
Authors J. Torrellas; Liuxi Yang; A. -T. Nguyen
Article Title A technique for high bandwidth and deterministic low latency load/store accesses to multiple cache banks
Authors H. Neefs; H. Vandierendonck; K. De Bosschere
Article Title Software-controlled multithreading using informing memory operations
Authors T. C. Mowry; S. R. Ramkissoon
Article Title Impact of heterogeneity on DSM performance
Authors R. J. O. Figueiredo; J. A. B. Fortes
Article Title Dynamic cluster assignment mechanisms
Authors R. Canal; J. M. Parcerisa; A. Gonzalez

HPCA 2001

Article Title Stack Value File: Custom Microarchitecture for the Stack
Authors H.-H. S. Lee; M. Smelyanskiy; C. J. Newburn; G. S. Tyson
Article Title Register Renaming and Scheduling for Dynamic Execution of Predicated Code
Authors P. H. Wang; H. Wang; R. M. Kling; K. Ramakrishnan; J. P. Shen
Article Title Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
Authors P. Michaud; A. Seznec
Article Title Speculative Data-Driven Multithreading
Authors A. Roth; G. S. Sohi
Article Title Towards Virtually-Addressed Memory Hierarchies
Authors X. Qiu; M. Dubois
Article Title Reevaluating Online Superpage Promotion with Hardware Support
Authors Z. Fang; L. Zhang; J. B. Carter; W. C. Hsieh; S. A. McKee
Article Title Performance of Hardware Compressed Main Memory
Authors B. Abali; H. Franke; X. Shen; D. E. Poff; T. B. Smith
Article Title JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers
Authors A. Moshovos; G. Memik; B. Falsafi; A. Choudhary
Article Title A New Scalable Directory Architecture for Large-scale Multiprocessors
Authors M. E. Acacio; J. Gonzalez; J. M. Garcia; J. Duato
Article Title Self-Tuned Congestion Control for Multiprocessor Networks
Authors M. Thottethodi; A. R. Lebeck; S. S. Mukherjee
Article Title Automatically Mapping Code on an Intelligent Memory Architecture
Authors J. Lee; Y. Solihin; J. Torrellas
Article Title CARS: A New Code Generation Framework for Clustered ILP Processors
Authors K. Kailas; K. Ebcioglu; A. Agrawala
Article Title An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches
Authors S.-H. Yang; M. D. Powell; B. Falsafi; K. Roy; T. N. Vijaykumar
Article Title DRAM Energy Management Using Software and Hardware Directed Power Mode Control
Authors V. Delaluz; M. Kandemir; N. Vijaykrishnan; A. Sivasubramaniam; M. J. Irwin
Article Title Dynamic Thermal Management for High-Performance Microprocessors
Authors D. Brooks; M. Martonosi
Article Title Dynamic Prediction of Critical Path Instructions
Authors E. Tune; D. Liang; D. M. Tullsen; B. Calder
Article Title Dynamic Branch Prediction with Perceptrons
Authors D. A. Jimenez; C. Lin
Article Title Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency
Authors B. Goeman; H. Vandierendonck; K. De Bosschere
Article Title DLP + TLP Processors for the Next Generation of Media Workloads
Authors J. Corbal; R. Espasa; M. Valero
Article Title An Architectural Evaluation of Java TPC-W
Authors H. W. Cain; R. Rajwar; M. Marden; M. H. Lipasti
Article Title A Programmable Co-Processor for Profiling
Authors C. B. Zilles; G. S. Sohi
Article Title A Delay Model and Speculative Architecture for Pipelined Routers
Authors L. Peh; W. J. Dally
Article Title Quantifying the Impact of Architectural Scaling on Communication
Authors T. Heath; S. Kaw; R. P. Martin; T. D. Nguyen
Article Title Call Graph Prefetching for Database Applications
Authors M. Annavaram; J. M. Patel; E. S. Davidson
Article Title Branch History Guided Instruction Prefetching
Authors V. Srinivasan; E. S. Davidson; G. S. Tyson; M. J. Charney; T. R. Puzak
Article Title Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
Authors W. Lin; S. K. Reinhardt; D. Burger