Nitin Kedia
Research Fellow, Microsoft Research India
Verified email at microsoft.com
Title
Cited by
Year
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
A Agrawal, N Kedia, A Panwar, J Mohan, N Kwatra, BS Gulavani, ...
arXiv preprint arXiv:2403.02310, 2024
3
2024