Naveen Mellempudi
Verified email at amd.com
Title
Cited by
Year
A study of BFLOAT16 for deep learning training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 300; Year: 2019
Mixed precision training of convolutional neural networks using integer operations
D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...
arXiv preprint arXiv:1802.00930, 2018
Cited by: 187; Year: 2018
Ternary neural networks with fine-grained quantization
N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey
arXiv preprint arXiv:1705.01462, 2017
Cited by: 130; Year: 2017
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,234,930, 2019
Cited by: 124; Year: 2019
Mixed precision training with 8-bit floating point
N Mellempudi, S Srinivasan, D Das, B Kaul
arXiv preprint arXiv:1905.12334, 2019
Cited by: 68; Year: 2019
FP8 formats for deep learning
P Micikevicius, D Stosic, N Burgess, M Cornea, P Dubey, R Grisenthwaite, ...
arXiv preprint arXiv:2209.05433, 2022
Cited by: 65; Year: 2022
Optimized compute hardware for machine learning operations
D Das, R Gramunt, M Smelyanskiy, J Corbal, D Mudigere, NK Mellempudi, ...
US Patent 10,776,699, 2020
Cited by: 43; Year: 2020
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 10,643,297, 2020
Cited by: 43; Year: 2020
Scaling half-precision floating point tensors for training deep neural networks
N Mellempudi, D Das
US Patent 11,501,139, 2022
Cited by: 40; Year: 2022
On scale-out deep learning training for cloud and HPC
S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ...
arXiv preprint arXiv:1801.08030, 2018
Cited by: 35; Year: 2018
Mixed low-precision deep learning inference using dynamic fixed point
N Mellempudi, A Kundu, D Das, D Mudigere, B Kaul
arXiv preprint arXiv:1701.08978, 2017
Cited by: 28; Year: 2017
Performing power management in a multicore processor
VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi
US Patent 9,910,481, 2018
Cited by: 22; Year: 2018
Incremental precision networks using residual inference and fine-grain quantization
A Kundu, N Mellempudi, D Mudigere, D Das
US Patent 11,556,772, 2023
Cited by: 16; Year: 2023
Ternary residual networks
A Kundu, K Banerjee, N Mellempudi, D Mudigere, D Das, B Kaul, ...
arXiv preprint arXiv:1707.04679, 2017
Cited by: 14; Year: 2017
Conversion hardware mechanism
N Mellempudi, D Das, MEI Chunhui, K Wong, DD Kalamkar, HH Jiang, ...
US Patent 11,494,163, 2022
Cited by: 13; Year: 2022
High performance scalable FPGA accelerator for deep neural networks
S Srinivasan, P Janedula, S Dhoble, S Avancha, D Das, N Mellempudi, ...
arXiv preprint arXiv:1908.11809, 2019
Cited by: 5; Year: 2019
Technologies for scaling deep learning training
NK Mellempudi, S Sridharan, D Mudigere, D Das
US Patent 11,068,780, 2021
Cited by: 4; Year: 2021
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,775,873, 2020
Cited by: 4; Year: 2020
K-tanh: Hardware efficient activations for deep learning
A Kundu, S Srinivasan, EC Qin, D Kalamkar, NK Mellempudi, D Das, ...
arXiv preprint arXiv:1909.07729, 2019
Cited by: 4; Year: 2019
Hardware apparatuses and methods relating to elemental register accesses
V Lee, U Echeruo, G Chrysos, N Mellempudi
US Patent 9,996,347, 2018
Cited by: 3; Year: 2018