Tri Dao

Cited by

	All	Since 2019
Citations	3118	3114
h-index	24	24
i10-index	32	32

Citations per year

Year	Citations
2019	49
2020	82
2021	154
2022	272
2023	1363
2024	1189

Public access

22 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Christopher Ré - Computer Science, Stanford University - Verified email at cs.stanford.edu
Atri Rudra - Katherine Johnson Chair in AI, Professor, CSE, University at Buffalo - Verified email at buffalo.edu
Albert Gu - Carnegie Mellon University - Verified email at andrew.cmu.edu
Stefano Ermon - Stanford University - Verified email at cs.stanford.edu
Beidi Chen - Carnegie Mellon University - Verified email at andrew.cmu.edu
Daniel Y Fu - Graduate Student, Stanford University - Verified email at cs.stanford.edu
Zhao Song - Adobe Research - Verified email at ias.edu
Karan Goel - Stanford University - Verified email at stanford.edu
Michael Poli - Stanford University - Verified email at stanford.edu
Eric Nguyen - Stanford University - Verified email at stanford.edu
Khaled Kamal Saab - Google, Stanford University - Verified email at google.com
Ce Zhang - Together AI - Verified email at together.xyz
Binhang Yuan（袁彬航） - Hong Kong University of Science and Technology - Verified email at ust.hk
Stephen Baccus - Professor of Neurobiology, Stanford University - Verified email at stanford.edu
Christopher De Sa - Assistant Professor of Computer Science, Cornell University - Verified email at cs.cornell.edu
Armin W. Thomas - Stanford University - Verified email at stanford.edu
Jue WANG - Together AI; ZJU - Verified email at zju.edu.cn
Matthew Eichhorn - Cornell University - Verified email at cornell.edu
Zichang Liu - Rice University - Verified email at rice.edu
Stefano Massaroli - RIKEN - Verified email at riken.jp

Tri Dao

Princeton University, Together AI

Verified email at princeton.edu

Machine learning, Systems


Title	Cited by	Year
FlashAttention: Fast and memory-efficient exact attention with IO-awareness. T Dao, D Fu, S Ermon, A Rudra, C Ré. Advances in Neural Information Processing Systems 35, 16344-16359, 2022	690	2022
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... Transactions on Machine Learning Research (TMLR), 2023	368*	2023
A kernel theory of modern data augmentation. T Dao, A Gu, A Ratner, V Smith, CD Sa, C Ré. Proceedings of the 36th International Conference on Machine Learning, ICML, 9-15, 2019	196	2019
HiPPO: Recurrent memory with optimal polynomial projections. A Gu, T Dao, S Ermon, A Rudra, C Ré. Advances in Neural Information Processing Systems 33, 1474-1487, 2020	180	2020
Mamba: Linear-time sequence modeling with selective state spaces. A Gu, T Dao. arXiv preprint arXiv:2312.00752, 2023	173	2023
FlashAttention-2: Faster attention with better parallelism and work partitioning. T Dao. International Conference on Learning Representations, 2023	162	2023
Combining recurrent, convolutional, and continuous-time models with linear state space layers. A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré. Advances in Neural Information Processing Systems 34, 572-585, 2021	148	2021
Hungry Hungry Hippos: Towards Language Modeling with State Space Models. DY Fu, T Dao, KK Saab, AW Thomas, A Rudra, C Ré. The Eleventh International Conference on Learning Representations, 2023	136	2023
Hyena Hierarchy: Towards Larger Convolutional Language Models. M Poli, S Massaroli, E Nguyen, DY Fu, T Dao, S Baccus, Y Bengio, ... International Conference on Machine Learning, 2023	118	2023
Learning fast algorithms for linear transforms using butterfly factorizations. T Dao, A Gu, M Eichhorn, A Rudra, C Ré. International Conference on Machine Learning, 1517-1527, 2019	97	2019
Scatterbrain: Unifying sparse and low-rank attention. B Chen, T Dao, E Winsor, Z Song, A Rudra, C Ré. Advances in Neural Information Processing Systems 34, 17413-17426, 2021	79*	2021
Mongoose: A learnable LSH framework for efficient neural network training. B Chen, Z Liu, B Peng, Z Xu, JL Li, T Dao, Z Song, A Shrivastava, C Ré. International Conference on Learning Representations, 2020	71	2020
Deja vu: Contextual sparsity for efficient LLMs at inference time. Z Liu, J Wang, T Dao, T Zhou, B Yuan, Z Song, A Shrivastava, C Zhang, ... International Conference on Machine Learning, 22137-22176, 2023	70	2023
Gaussian quadrature for kernel features. T Dao, CM De Sa, C Ré. Advances in Neural Information Processing Systems 30, 2017	57	2017
Monarch: Expressive structured matrices for efficient and accurate training. T Dao, B Chen, NS Sohoni, A Desai, M Poli, J Grogan, A Liu, A Rao, ... International Conference on Machine Learning, 4690-4721, 2022	55	2022
Pixelated butterfly: Simple and efficient sparse training for neural network models. T Dao, B Chen, K Liang, J Yang, Z Song, A Rudra, C Ré. International Conference on Learning Representations, 2021	53	2021
S4ND: Modeling images and videos as multidimensional signals with state spaces. E Nguyen, K Goel, A Gu, G Downs, P Shah, T Dao, S Baccus, C Ré. Advances in Neural Information Processing Systems 35, 2846-2861, 2022	50	2022
Learning compressed transforms with low displacement rank. A Thomas, A Gu, T Dao, A Rudra, C Ré. Advances in Neural Information Processing Systems 31, 2018	49	2018
Decentralized training of foundation models in heterogeneous environments. B Yuan, Y He, J Davis, T Zhang, T Dao, B Chen, PS Liang, C Ré, C Zhang. Advances in Neural Information Processing Systems 35, 25464-25477, 2022	47	2022
Kaleidoscope: An efficient, learnable representation for all structured linear maps. T Dao, NS Sohoni, A Gu, M Eichhorn, A Blonder, M Leszczynski, A Rudra, ... International Conference on Learning Representations, 2020	47	2020
