Richard Vuduc
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
S Williams, L Oliker, R Vuduc, J Shalf, K Yelick, J Demmel
Supercomputing, 2007. SC'07. Proceedings of the 2007 ACM/IEEE Conference on …, 2007
OSKI: A library of automatically tuned sparse matrix kernels
R Vuduc, JW Demmel, KA Yelick
Journal of Physics: Conference Series 16 (1), 521, 2005
Model-driven autotuning of sparse matrix-vector multiply on GPUs
JW Choi, A Singh, RW Vuduc
ACM SIGPLAN Notices 45 (5), 115-126, 2010
Sparsity: Optimization framework for sparse matrix kernels
EJ Im, K Yelick, R Vuduc
The International Journal of High Performance Computing Applications 18 (1 …, 2004
Automatic performance tuning of sparse matrix kernels
RW Vuduc, JW Demmel
University of California, Berkeley, 2003
Self-adapting linear algebra algorithms and software
J Demmel, J Dongarra, V Eijkhout, E Fuentes, A Petitet, R Vuduc, ...
Proceedings of the IEEE 93 (2), 293-312, 2005
A massively parallel adaptive fast-multipole method on heterogeneous architectures
I Lashuk, A Chandramowlishwaran, H Langston, TA Nguyen, R Sampath, ...
Proceedings of the Conference on High Performance Computing Networking …, 2009
Performance optimizations and bounds for sparse matrix-vector multiply
R Vuduc, JW Demmel, KA Yelick, S Kamil, R Nishtala, B Lee
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 1-35, 2002
Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures
A Rahimian, I Lashuk, S Veerapaneni, A Chandramowlishwaran, ...
Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
A performance analysis framework for identifying potential benefits in GPGPU applications
J Sim, A Dasgupta, H Kim, R Vuduc
ACM SIGPLAN Notices 47 (8), 11-22, 2012
Fast sparse matrix-vector multiplication by exploiting variable block structure
R Vuduc, HJ Moon
High Performance Computing and Communications, 807-816, 2005
On the limits of GPU acceleration
R Vuduc, A Chandramowlishwaran, J Choi, M Guney, A Shringarpure
Proceedings of the 2nd USENIX conference on Hot topics in parallelism 13, 2010
Falcon: fault localization in concurrent programs
S Park, RW Vuduc, MJ Harrold
Proceedings of the 32nd ACM/IEEE International Conference on Software …, 2010
Many-thread aware prefetching mechanisms for GPGPU applications
J Lee, NB Lakshminarayana, H Kim, R Vuduc
Microarchitecture (MICRO), 2010 43rd Annual IEEE/ACM International Symposium …, 2010
Statistical models for empirical search-based performance tuning
R Vuduc, JW Demmel, JA Bilmes
International Journal of High Performance Computing Applications 18 (1), 65-94, 2004
POET: Parameterized optimizations for empirical tuning
Q Yi, K Seymour, H You, R Vuduc, D Quinlan
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE …, 2007
A roofline model of energy
JW Choi, D Bedard, R Fowler, R Vuduc
Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International …, 2013
When cache blocking of sparse matrix vector multiply works and why
R Nishtala, RW Vuduc, JW Demmel, KA Yelick
Applicable Algebra in Engineering, Communication and Computing 18 (3), 297-311, 2007
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
S Venkatasubramanian, RW Vuduc
Proceedings of the 23rd international conference on Supercomputing, 244-255, 2009
Self-stabilizing iterative solvers
P Sao, R Vuduc
Proceedings of the Workshop on Latest Advances in Scalable Algorithms for …, 2013