Sneha Kudugunta
Verified email at google.com
Title
Cited by
Year
Investigating Multilingual NMT Representations at Scale
SR Kudugunta, A Bapna, I Caswell, N Arivazhagan, O Firat
arXiv preprint arXiv:1909.02197, 2019
Cited by 70, 2019
Leveraging monolingual data with self-supervision for multilingual neural machine translation
A Siddhant, A Bapna, Y Cao, O Firat, M Chen, S Kudugunta, ...
arXiv preprint arXiv:2005.04816, 2020
Cited by 51, 2020
Quality at a glance: An audit of web-crawled multilingual datasets
J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ...
Transactions of the Association for Computational Linguistics 10, 50-72, 2022
Cited by 35, 2022
MURAL: multimodal, multitask retrieval across languages
A Jain, M Guo, K Srinivasan, T Chen, S Kudugunta, C Jia, Y Yang, ...
arXiv preprint arXiv:2109.05125, 2021
Cited by 14, 2021
Nisansa de Silva
J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ...
Sakine Çabuk Ballı, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur …, 2022
Cited by 8, 2022
Beyond distillation: Task-level mixture-of-experts for efficient inference
S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, MT Luong, O Firat
arXiv preprint arXiv:2110.03742, 2021
Cited by 7, 2021
MURAL: Multimodal, multitask representations across languages
A Jain, M Guo, K Srinivasan, T Chen, S Kudugunta, C Jia, Y Yang, ...
Findings of the Association for Computational Linguistics: EMNLP 2021, 3449-3463, 2021
Cited by 5, 2021
A Loss Curvature Perspective on Training Instability in Deep Learning
J Gilmer, B Ghorbani, A Garg, S Kudugunta, B Neyshabur, D Cardoze, ...
arXiv preprint arXiv:2110.04369, 2021
Cited by 4, 2021
A loss curvature perspective on training instabilities of deep learning models
J Gilmer, B Ghorbani, A Garg, S Kudugunta, B Neyshabur, D Cardoze, ...
International Conference on Learning Representations, 2021
Cited by 4, 2021
Exploring routing strategies for multilingual mixture-of-experts models
S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, T Luong, O Firat
Cited by 2, 2020
Systems and methods for routing within multitask mixture-of-experts models
Y Huang, D Lepikhin, M Krikun, O Firat, A Bapna, T Luong, S Kudugunta
US Patent App. 17/159,437, 2022
2022
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
A Bapna, DD Lepikhin, M Krikun, O Firat, SR Kudugunta, T Luong, ...
2021