GPT-4 technical report J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 5168 | 2023 |
PaLM: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023 | 4880 | 2023 |
Unsupervised pixel-level domain adaptation with generative adversarial networks K Bousmalis, N Silberman, D Dohan, D Erhan, D Krishnan Proceedings of the IEEE conference on computer vision and pattern …, 2017 | 1949 | 2017 |
Rethinking attention with performers K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ... International Conference on Learning Representations, 2021 | 1628 | 2021 |
QANet: Combining local convolution with global self-attention for reading comprehension AW Yu, D Dohan, MT Luong, R Zhao, K Chen, M Norouzi, QV Le International Conference on Learning Representations, 2018 | 1354* | 2018 |
Program synthesis with large language models J Austin, A Odena, M Nye, M Bosma, H Michalewski, D Dohan, E Jiang, ... arXiv preprint arXiv:2108.07732, 2021 | 1152 | 2021 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1092 | 2022 |
Solving quantitative reasoning problems with language models A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ... Advances in Neural Information Processing Systems 35, 3843-3857, 2022 | 565 | 2022 |
Show your work: Scratchpads for intermediate computation with language models M Nye, AJ Andreassen, G Gur-Ari, H Michalewski, J Austin, D Bieber, ... arXiv preprint arXiv:2112.00114, 2021 | 527 | 2021 |
Large language models can be easily distracted by irrelevant context F Shi, X Chen, K Misra, N Scales, D Dohan, EH Chi, N Schärli, D Zhou International Conference on Machine Learning, 31210-31227, 2023 | 292 | 2023 |
Model-based reinforcement learning for biological sequence design C Angermueller, D Dohan, D Belanger, R Deshpande, K Murphy, ... International Conference on Learning Representations, 2019 | 135 | 2019 |
PaLM: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311 10, 2022 | 112 | 2022 |
Masked language modeling for proteins via linearly scalable long-context transformers K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ... arXiv preprint arXiv:2006.03555, 2020 | 94 | 2020 |
Language model cascades D Dohan, W Xu, A Lewkowycz, J Austin, D Bieber, RG Lopes, Y Wu, ... arXiv preprint arXiv:2207.10342, 2022 | 78 | 2022 |
Population-based black-box optimization for biological sequence design C Angermueller, D Belanger, A Gane, Z Mariet, D Dohan, K Murphy, ... International Conference on Machine Learning, 324-334, 2020 | 61 | 2020 |
Large language models can be easily distracted by irrelevant context F Shi, X Chen, K Misra, N Scales, D Dohan arXiv preprint arXiv:2302.00093 12, 28, 2023 | 60 | 2023 |
EvoPrompting: language models for code-level neural architecture search A Chen, D Dohan, D So Advances in Neural Information Processing Systems 36, 2024 | 58 | 2024 |
Is transfer learning necessary for protein landscape prediction? A Shanehsazzadeh, D Belanger, D Dohan NeurIPS workshop on Machine Learning in Structural Biology, 2020 | 55 | 2020 |
Towards learning universal hyperparameter optimizers with transformers Y Chen, X Song, C Lee, Z Wang, R Zhang, D Dohan, K Kawakami, ... Advances in Neural Information Processing Systems 35, 32053-32068, 2022 | 54 | 2022 |
Program synthesis with large language models J Austin, A Odena, MI Nye, M Bosma, H Michalewski, D Dohan, E Jiang, ... arXiv preprint arXiv:2108.07732, 2021 | 54 | 2021 |