Hong Zhang

Assistant Professor, Cheriton School of Computer Science, University of Waterloo

Office: DC3530

I develop high-performance, scalable systems for big data and ML applications. My research advocates an application-oriented design principle for big data and ML systems: fully exploiting application-specific structures --- communication patterns, execution dependencies, ML model structures, etc. --- to suit application-specific performance demands. This principle has led to several scalable systems with theoretically sound scheduling algorithms tailored for different big data and ML applications. I received a Google Ph.D. Fellowship in systems and networking.

Biography: I was a postdoc in the RISELab at UC Berkeley, working with Prof. Ion Stoica. I received my Ph.D. from the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology, where I worked with Prof. Kai Chen in the System networkING (SING) research group.

Interests:

  • Large-scale Data Analytics

  • Distributed ML Training & Serving Systems

  • Application and Network Scheduling

  • Data Center Networking

  • Serverless Computing and Cloud Computing

news

Sep 17, 2024

I am looking for self-motivated PhD students to work with me on distributed systems (ML systems in particular) starting Spring/Fall 2025. Drop me an email if you are interested!

Sep 17, 2024

I am serving on the program committees for OSDI'25, NSDI'25, and EuroSys'25.

selected publications

  1. ___ASPLOS___
    RainbowCake: Mitigating Cold-starts in Serverless with Layer-wise Container Caching and Sharing
    Yu, Hanfei, Roy, Rohan Basu, Fontenot, Christian, Tiwari, Devesh, Li, Jian, Zhang, Hong, Wang, Hao, and Park, Seung-Jong
    In ASPLOS 2024
  2. ___EuroSys___
    Accelerating Privacy-Preserving Machine Learning with GeniBatch
    Huang, Xinyang, Zhang, Junxue, Cheng, Xiaodian, Zhang, Hong, Jin, Yilun, Hu, Shuihai, Tian, Han, and Chen, Kai
    In EuroSys 2024
  3. _____NSDI_____
    SHEPHERD: Serving DNNs in the Wild
    Zhang, Hong, Tang, Yupeng, Khandelwal, Anurag, and Stoica, Ion
    In Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023
  4. _____NSDI_____
    NetHint: White-Box Networking for Multi-Tenant Data Centers
    Chen, Jingrong, Zhang, Hong, Zhang, Wei, Luo, Liang, Chase, Jeffrey, Stoica, Ion, and Zhuo, Danyang
    In Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
  5. __SIGCOMM__
    LiteFlow: Towards High-performance Adaptive Neural Networks for Kernel Datapath
    Zhang, Junxue, Zeng, Chaoliang, Zhang, Hong, Hu, Shuihai, and Chen, Kai
    In Proceedings of the ACM SIGCOMM 2022 Conference, 2022
  6. _____NSDI_____
    Caerus: NIMBLE Task Scheduling for Serverless Analytics
    Zhang, Hong, Tang, Yupeng, Khandelwal, Anurag, Chen, Jingrong, and Stoica, Ion
    In Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021
  7. __SIGCOMM__
    Resilient Datacenter Load Balancing in the Wild
    Zhang, Hong, Zhang, Junxue, Bai, Wei, Chen, Kai, and Chowdhury, Mosharaf
    In Proceedings of the ACM SIGCOMM 2017 Conference, 2017
  8. __SIGCOMM__
    CODA: Toward Automatically Identifying and Scheduling Coflows in the Dark
    Zhang, Hong, Chen, Li, Yi, Bairen, Chen, Kai, Chowdhury, Mosharaf, and Geng, Yanhui
    In Proceedings of the ACM SIGCOMM 2016 Conference, 2016
  9. ___EuroSys___
    Guaranteeing Deadlines for Inter-Datacenter Transfers
    Zhang, Hong, Chen, Kai, Bai, Wei, Han, Dongsu, Tian, Chen, Wang, Hao, Guan, Haibing, and Zhang, Ming
    In Proceedings of the 10th European Conference on Computer Systems, 2015