Kiran Kumar Matam
Research Scientist, Facebook
Verified email at usc.edu
Title
Cited by
Year
Summarizer: trading communication with computing near storage
G Koo*, KK Matam*, I Te, HVKG Narra, J Li, HW Tseng, S Swanson, ...
2017 50th Annual IEEE/ACM International Symposium on Microarchitecture …, 2017
Cited by 153 · 2017
Sparse matrix-matrix multiplication on modern architectures
K Matam, SRKB Indarapu, K Kothapalli
2012 19th International Conference on High Performance Computing, 1-10, 2012
Cited by 92* · 2012
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by 69 · 2022
Accelerating sparse matrix vector multiplication in iterative methods using GPU
KK Matam, K Kothapalli
2011 International Conference on Parallel Processing, 612-621, 2011
Cited by 69 · 2011
GraphSSD: graph semantics aware SSD
KK Matam, G Koo, H Zha, HW Tseng, M Annavaram
Proceedings of the 46th international symposium on computer architecture …, 2019
Cited by 60 · 2019
High throughput and programmable online traffic classifier on FPGA
D Tong, L Sun, K Matam, V Prasanna
Proceedings of the ACM/SIGDA international symposium on Field programmable …, 2013
Cited by 43 · 2013
M. khorashadi, P
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Bhattacharya, P. Lapukhov, M. Naumov, L. Qiao, M. Smelyanskiy, B. Jia, and V …, 2021
Cited by 36 · 2021
Check-N-Run: A checkpointing system for training deep learning recommendation models
A Eisenman, KK Matam, S Ingram, D Mudigere, R Krishnamoorthi, K Nair, ...
19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022
Cited by 35 · 2022
First-generation inference accelerator deployment at Facebook
M Anderson, B Chen, S Chen, S Deng, J Fix, M Gschwind, A Kalaiah, ...
arXiv preprint arXiv:2107.04140, 2021
Cited by 31 · 2021
High-performance, distributed training of large-scale deep learning recommendation models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by 30 · 2021
CPU and/or GPU: Revisiting the GPU vs. CPU myth
K Kothapalli, DS Banerjee, PJ Narayanan, S Sood, AK Bahl, S Sharma, ...
arXiv preprint arXiv:1303.2171, 2013
Cited by 16 · 2013
GPU accelerated Lanczos algorithm with applications
KK Matam, K Kothapalli
2011 IEEE Workshops of International Conference on Advanced Information …, 2011
Cited by 15 · 2011
Energy-efficient large-scale matrix multiplication on FPGAs
KK Matam, VK Prasanna
2013 International Conference on Reconfigurable Computing and FPGAs …, 2013
Cited by 11 · 2013
Efficient Discrete Range Searching primitives on the GPU with applications
J Soman, MK Kumar, K Kothapalli, PJ Narayanan
High Performance Computing (HiPC), 2010 International Conference on, 1-10, 2010
Cited by 11 · 2010
Evaluating energy efficiency of floating point matrix multiplication on FPGAs
KK Matam, H Le, VK Prasanna
2013 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2013
Cited by 8 · 2013
Check-N-Run: A checkpointing system for training recommendation models
A Eisenman, KK Matam, S Ingram, D Mudigere, R Krishnamoorthi, ...
arXiv preprint arXiv:2010.08679 5, 2020
Cited by 5 · 2020
Energy efficient architecture for matrix multiplication on FPGAs
KK Matam, H Le, VK Prasanna
2013 23rd International Conference on Field programmable Logic and …, 2013
Cited by 5 · 2013
MultiLogVC: efficient out-of-core graph processing framework for flash storage
KK Matam, H Hashemi, M Annavaram
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021
Cited by 4 · 2021
PartitionedVC: Partitioned external memory graph analytics framework for SSDs
KK Matam, H Hashemi, M Annavaram
arXiv preprint arXiv:1905.04264, 2019
Cited by 4 · 2019
Efficient automatic parallelization of a single GPU program for a multiple GPU system
MK Kumar, MR Abdel-Majeed, M Annavaram
Integration 66, 35-43, 2019
Cited by 4 · 2019