Maxim Naumov
Meta (Sr. Manager & Research Scientist)
Verified email at fb.com
Title
Cited by
Year
The Llama 3 Herd of Models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by: 1375; Year: 2024
Deep Learning Recommendation Model for Personalization and Recommendation Systems
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by: 799; Year: 2019
The architectural implications of Facebook's DNN-based personalized recommendation
U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ...
IEEE International Symposium on High Performance Computer Architecture (HPCA …, 2020
Cited by: 332; Year: 2020
Atomistic simulation of realistically sized nanodevices using NEMO 3-D—Part I: Models and benchmarks
G Klimeck, SS Ahmed, H Bae, N Kharche, S Clark, B Haley, S Lee, ...
IEEE Transactions on Electron Devices 54 (9), 2079-2089, 2007
Cited by: 294; Year: 2007
RecNMP: Accelerating personalized recommendation with near-memory processing
L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ...
2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020
Cited by: 242; Year: 2020
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications
J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ...
arXiv preprint arXiv:1811.09886, 2018
Cited by: 227; Year: 2018
CUSPARSE Library: A Set of Basic Linear Algebra Subroutines for Sparse Matrices
M Naumov, LS Chien, P Vandermersch, U Kapasi
GPU Technology Conference (GTC), 2010
Cited by: 206*; Year: 2010
Parallel solution of sparse triangular linear systems in the preconditioned iterative methods on the GPU
M Naumov
Nvidia Technical Report NVR-2011-001, 2011
Cited by: 198*; Year: 2011
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
A Devarakonda, M Naumov, M Garland
arXiv preprint arXiv:1712.02029, 2017
Cited by: 193; Year: 2017
AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods
M Naumov, M Arsaev, P Castonguay, J Cohen, J Demouth, J Eaton, ...
SIAM Journal on Scientific Computing 37 (5), S602-S626, 2015
Cited by: 177; Year: 2015
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 146*; Year: 2022
Compositional embeddings using complementary partitions for memory-efficient recommendation systems
HJM Shi, D Mudigere, M Naumov, J Yang
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Cited by: 126; Year: 2020
Mixed dimension embeddings with application to memory-efficient recommendation systems
AA Ginart, M Naumov, D Mudigere, J Yang, J Zou
2021 IEEE International Symposium on Information Theory (ISIT), 2786-2791, 2021
Cited by: 108; Year: 2021
Incomplete-LU and Cholesky preconditioned iterative methods using CUSPARSE and CUBLAS
M Naumov
Nvidia White Paper, 2011
Cited by: 108; Year: 2011
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
A Eisenman, M Naumov, D Gardner, M Smelyanskiy, S Pupyrev, ...
Conference on Machine Learning and Systems (MLSys), 2019
Cited by: 95; Year: 2019
Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems
M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ...
arXiv preprint arXiv:2003.09518, 2020
Cited by: 91; Year: 2020
Multimillion Atom Simulation of Electronic and Optical Properties of Nanoscale Devices Using NEMO 3-D
S Ahmed, N Kharche, R Rahman, M Usman, S Lee, H Ryu, H Bae, ...
Encyclopedia of Complexity and Systems Science, 1-69, 2015
Cited by: 75*; Year: 2015
Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU
M Naumov, P Castonguay, J Cohen
Nvidia Technical Report NVR-2015-001, 2015
Cited by: 63; Year: 2015
Microscaling Data Formats for Deep Learning
BD Rouhani, R Zhao, A More, M Hall, A Khodamoradi, S Deng, ...
arXiv preprint arXiv:2310.10537, 2023
Cited by: 41; Year: 2023
With Shared Microexponents, A Little Shifting Goes a Long Way
B Darvish Rouhani, R Zhao, V Elango, R Shafipour, M Hall, ...
Proceedings of the 50th Annual International Symposium on Computer …, 2023
Cited by: 39; Year: 2023