Zhen Zhang
Applied Scientist, Amazon Web Services
Verified email at amazon.com
Title
Cited by
Year
PipeSwitch: Fast pipelined context switching for deep learning applications
Z Bai, Z Zhang, Y Zhu, X Jin
14th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2020
Cited by: 85. Year: 2020
Is network the bottleneck of distributed training?
Z Zhang, C Chang, H Lin, Y Wang, R Arora, X Jin
Proceedings of the Workshop on Network Meets AI & ML, 8-13, 2020
Cited by: 65. Year: 2020
MiCS: near-linear scaling for training gigantic model on public cloud
Z Zhang, S Zheng, Y Wang, J Chiu, G Karypis, T Chilimbi, M Li, X Jin
arXiv preprint arXiv:2205.00119, 2022
Cited by: 24. Year: 2022
TKPERM: cross-platform permission knowledge transfer to detect overprivileged third-party applications
FH Shezan, K Cheng, Z Zhang, Y Cao, Y Tian
Network and Distributed Systems Security (NDSS) Symposium, 2020
Cited by: 11. Year: 2020
Oobleck: Resilient distributed training of large models using pipeline templates
I Jang, Z Yang, Z Zhang, X Jin, M Chowdhury
Proceedings of the 29th Symposium on Operating Systems Principles, 382-395, 2023
Cited by: 10. Year: 2023
Gemini: Fast failure recovery in distributed training with in-memory checkpoints
Z Wang, Z Jia, S Zheng, Z Zhang, X Fu, TSE Ng, Y Wang
Proceedings of the 29th Symposium on Operating Systems Principles, 364-381, 2023
Cited by: 7. Year: 2023
Towards a secure zero-rating framework with three parties
Z Liu, Z Zhang, Y Cao, Z Xi, S Jing, H La Roche
27th USENIX Security Symposium (USENIX Security 18), 711-728, 2018
Cited by: 2. Year: 2018
Decoupled Model Schedule for Deep Learning Training
H Chen, C Hao Yu, S Zheng, Z Zhang, Z Zhang, Y Wang
arXiv e-prints, arXiv:2302.08005, 2023
Cited by: 1. Year: 2023
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
Z Zhang, S Zheng, Y Wang, J Chiu, G Karypis, T Chilimbi, M Li, X Jin
Proceedings of the VLDB Endowment, 0