PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 296 | 2023 |
Investigating Multilingual NMT Representations at Scale SR Kudugunta, A Bapna, I Caswell, N Arivazhagan, O Firat arXiv preprint arXiv:1909.02197, 2019 | 99 | 2019 |
Quality at a glance: An audit of web-crawled multilingual datasets J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... Transactions of the Association for Computational Linguistics 10, 50-72, 2022 | 96 | 2022 |
Leveraging monolingual data with self-supervision for multilingual neural machine translation A Siddhant, A Bapna, Y Cao, O Firat, M Chen, S Kudugunta, ... arXiv preprint arXiv:2005.04816, 2020 | 73 | 2020 |
MURAL: multimodal, multitask retrieval across languages A Jain, M Guo, K Srinivasan, T Chen, S Kudugunta, C Jia, Y Yang, ... arXiv preprint arXiv:2109.05125, 2021 | 58* | 2021 |
Beyond distillation: Task-level mixture-of-experts for efficient inference S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, MT Luong, O Firat arXiv preprint arXiv:2110.03742, 2021 | 51 | 2021 |
A loss curvature perspective on training instabilities of deep learning models J Gilmer, B Ghorbani, A Garg, S Kudugunta, B Neyshabur, D Cardoze, ... International Conference on Learning Representations, 2021 | 40* | 2021 |
MADLAD-400: A multilingual and document-level large audited dataset S Kudugunta, I Caswell, B Zhang, X Garcia, CA Choquette-Choo, K Lee, ... arXiv preprint arXiv:2309.04662, 2023 | 4 | 2023 |
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer A Asai, S Kudugunta, XV Yu, T Blevins, H Gonen, M Reid, Y Tsvetkov, ... arXiv preprint arXiv:2305.14857, 2023 | 2 | 2023 |
MatFormer: Nested Transformer for Elastic Inference Devvrit*, S Kudugunta*, A Kusupati*, T Dettmers, K Chen, I Dhillon, ... arXiv preprint arXiv:2310.07707, 2023 | 1 | 2023 |
Systems and methods for routing within multitask mixture-of-experts models Y Huang, D Lepikhin, M Krikun, O Firat, A Bapna, T Luong, S Kudugunta US Patent App. 17/159,437, 2022 | | 2022 |