Quentin Anthony

Cited by

	All	Since 2019
Citations	1929	1928
h-index	14	14
i10-index	15	15

Citations per year: 2020: 13, 2021: 28, 2022: 131, 2023: 714, 2024: 1035

Public access

16 articles available
0 articles not available

Based on funding mandates

Co-authors

Stella Biderman - Booz Allen Hamilton, EleutherAI - Verified email at bah.com
Hari Subramoni - The Ohio State University - Verified email at cse.ohio-state.edu
Dhabaleswar K. Panda - Professor of Computer Science, The Ohio State University - Verified email at cse.ohio-state.edu
Hailey Schoelkopf - Researcher, EleutherAI - Verified email at eleuther.ai
Aamir Shafi - Research Scientist, Ohio State University - Verified email at osu.edu
Ammar Ahmad Awan - Microsoft - Verified email at osu.edu

Quentin Anthony

PhD Student, Ohio State University

Verified email at osu.edu - Homepage

HPC, Deep Learning, Parallel Computing


Title	Cited by	Year
GPT-NeoX-20B: An open-source autoregressive language model - S Black, S Biderman, E Hallahan, Q Anthony, L Gao, L Golding, H He, ... - Proceedings of the ACL Workshop on Challenges & Perspectives in Creating …, 2022	661	2022
Pythia: A suite for analyzing large language models across training and scaling - S Biderman, H Schoelkopf, Q Anthony, H Bradley, K O'Brien, E Hallahan, ... - International Conference on Machine Learning (ICML), 2023	596	2023
RWKV: Reinventing RNNs for the transformer era - B Peng, E Alcaide, Q Anthony, A Albalak, S Arcadinho, S Biderman, ... - arXiv preprint arXiv:2305.13048, 2023	270*	2023
Emergent and Predictable Memorization in Large Language Models - S Biderman, US Prashanth, L Sutawika, H Schoelkopf, Q Anthony, ... - https://arxiv.org/pdf/2304.11158.pdf, 2023	86	2023
GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training - A Jain, AA Awan, AM Aljuhani, JM Hashmi, QG Anthony, H Subramoni, ... - SC20: International Conference for High Performance Computing, Networking …, 2020	49	2020
Performance characterization of DNN training using TensorFlow and PyTorch on modern clusters - A Jain, AA Awan, Q Anthony, H Subramoni, DK Panda - 2019 IEEE International Conference on Cluster Computing (CLUSTER), 1-11, 2019	41	2019
Continual Pre-Training of Large Language Models: How to (re)warm your model? - K Gupta, B Thérien, A Ibrahim, ML Richter, Q Anthony, E Belilovsky, I Rish, ...	35	2023
GPT-NeoX: Large scale autoregressive language modeling in PyTorch - A Andonian, Q Anthony, S Biderman, S Black, P Gali, L Gao, E Hallahan, ...	29*	2021
BlackMamba: Mixture of experts for state-space models - Q Anthony, Y Tokpanov, P Glorioso, B Millidge - arXiv preprint arXiv:2402.01771, 2024	17	2024
trlX: A framework for large scale reinforcement learning from human feedback - A Havrilla, M Zhuravinskyi, D Phung, A Tiwari, J Tow, S Biderman, ... - Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023	17	2023
Eagle and Finch: RWKV with matrix-valued states and dynamic recurrence - B Peng, D Goldstein, Q Anthony, A Albalak, E Alcaide, S Biderman, ... - arXiv preprint arXiv:2404.05892, 2024	15	2024
Accelerating MPI all-to-all communication with online compression on modern GPU clusters - Q Zhou, P Kousha, Q Anthony, K Shafie Khorassani, A Shafi, ... - International Conference on High Performance Computing, 3-25, 2022	15	2022
Simple and scalable strategies to continually pre-train large language models - A Ibrahim, B Thérien, K Gupta, ML Richter, Q Anthony, T Lesort, ... - arXiv preprint arXiv:2403.08763, 2024	14	2024
Adaptive and hierarchical large message all-to-all communication algorithms for large-scale dense GPU systems - KS Khorassani, CH Chu, QG Anthony, H Subramoni, DK Panda - 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet …, 2021	14	2021
HyPar-Flow: Exploiting MPI and Keras for scalable hybrid-parallel DNN training using TensorFlow - AA Awan, A Jain, Q Anthony, H Subramoni, DK Panda - arXiv preprint arXiv:1911.05146, 2019	14*	2019
Accelerating distributed deep learning training with compression assisted allgather and reduce-scatter communication - Q Zhou, Q Anthony, L Xu, A Shafi, M Abduljabbar, H Subramoni, ... - 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023	7	2023
Efficient training of semantic image segmentation on Summit using Horovod and MVAPICH2-GDR - Q Anthony, AA Awan, A Jain, H Subramoni, DK Panda - 2020 IEEE International Parallel and Distributed Processing Symposium …, 2020	7	2020
Zamba: A Compact 7B SSM Hybrid Model - P Glorioso, Q Anthony, Y Tokpanov, J Whittington, J Pilault, A Ibrahim, ... - arXiv preprint arXiv:2405.16712, 2024	5	2024
Highly efficient alltoall and alltoallv communication algorithms for GPU systems - CC Chen, KS Khorassani, QG Anthony, A Shafi, H Subramoni, DK Panda - 2022 IEEE International Parallel and Distributed Processing Symposium …, 2022	5	2022
Hy-Fi: Hybrid Five-Dimensional Parallel DNN Training on High-Performance GPU Clusters - A Jain, A Shafi, Q Anthony, P Kousha, H Subramoni, DK Panda - International Conference on High Performance Computing, 109-130, 2022	5	2022
