Distributed deep learning using synchronous stochastic gradient descent D Das, S Avancha, D Mudigere, K Vaidyanathan, S Sridharan, D Kalamkar, ... arXiv preprint arXiv:1602.06709, 2016 | 174 | 2016 |
Mixed precision training of convolutional neural networks using integer operations D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ... arXiv preprint arXiv:1802.00930, 2018 | 125 | 2018 |
Deep learning at 15PF: supervised and semi-supervised classification for scientific data T Kurth, J Zhang, N Satish, E Racah, I Mitliagkas, MMA Patwary, T Malas, ... Proceedings of the International Conference for High Performance Computing …, 2017 | 84 | 2017 |
Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints S Sridharan, J Dinan, DD Kalamkar SC'14: Proceedings of the International Conference for High Performance …, 2014 | 48 | 2014 |
Thread migration to improve synchronization performance S Sridharan, B Keck, R Murphy, S Chandra, P Kogge Workshop on Operating System Interference in High Performance Applications, 2006 | 39 | 2006 |
Deep learning training in Facebook data centers: Design of scale-up and scale-out systems M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ... arXiv preprint arXiv:2003.09518, 2020 | 31 | 2020 |
On scale-out deep learning training for cloud and HPC S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ... arXiv preprint arXiv:1801.08030, 2018 | 28 | 2018 |
Memory in processor: A novel design paradigm for supercomputing architectures N Venkateswaran, WR Foundation, A Krishnan, SN Kumar, A Shriraman, ... ACM SIGARCH Computer Architecture News 32 (3), 19-26, 2003 | 27 | 2003 |
Comparing runtime systems with exascale ambitions using the parallel research kernels RF Van der Wijngaart, A Kayi, JR Hammond, G Jost, T St John, S Sridharan, ... International Conference on High Performance Computing, 321-339, 2016 | 19 | 2016 |
Fine-grain compute communication execution for deep learning frameworks S Sridharan, D Mudigere US Patent App. 15/869,502, 2018 | 18 | 2018 |
Exploring shared-memory optimizations for an unstructured mesh CFD application on modern parallel systems D Mudigere, S Sridharan, A Deshpande, J Park, A Heinecke, ... 2015 IEEE International Parallel and Distributed Processing Symposium, 723-732, 2015 | 18 | 2015 |
Communication optimizations for distributed machine learning S Sridharan, K Vaidyanathan, D Das, C Sakthivel, ME Smorkalov US Patent 11,270,201, 2022 | 15 | 2022 |
Extending the BT NAS parallel benchmark to exascale computing RF Van der Wijngaart, S Sridharan, VW Lee SC'12: Proceedings of the International Conference on High Performance …, 2012 | 15 | 2012 |
Evaluating synchronization techniques for light-weight multithreaded/multicore architectures S Sridharan, A Rodrigues, P Kogge Proceedings of the nineteenth annual ACM symposium on Parallel algorithms …, 2007 | 15 | 2007 |
Dynamic precision management for integer deep learning primitives N Mellempudi, D Mudigere, D Das, S Sridharan US Patent 10,643,297, 2020 | 14 | 2020 |
TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML T Kurth, M Smorkalov, P Mendygral, S Sridharan, A Mathuriya Concurrency and Computation: Practice and Experience 31 (16), e4989, 2019 | 14 | 2019 |
Planning for performance: Enhancing achievable performance for MPI through persistent collective operations DJ Holmes, B Morgan, A Skjellum, PV Bangalore, S Sridharan Parallel Computing 81, 32-57, 2019 | 14 | 2019 |
Data parallelism and halo exchange for distributed machine learning D Das, K Vaidyanathan, S Sridharan US Patent App. 15/869,551, 2018 | 14 | 2018 |
Planning for performance: persistent collective operations for MPI B Morgan, DJ Holmes, A Skjellum, P Bangalore, S Sridharan Proceedings of the 24th European MPI Users' Group Meeting, 1-11, 2017 | 14 | 2017 |
High-performance, distributed training of large-scale deep learning recommendation models D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ... arXiv e-prints, arXiv: 2104.05158, 2021 | 13 | 2021 |