Projects
My research has followed a clear vision: to mature the application-oriented design principle step by step. My thesis work focused on designing networked systems for big data applications at scale. My postdoctoral research extended the depth and breadth of my thesis work in three directions: (1) from networked system design to joint system design that considers both compute and communication, (2) from traditional server-centric deployments to emerging serverless environments, and (3) from data analytics to ML applications.
[2020.11-2022.4] SerFlex: Online adaptive ML serving system against bursty and unpredictable workloads. Developing an adaptive ML serving system that can react to bursty and unpredictable workloads in a timely manner to meet per-request latency requirements [NSDI'23].
[2020.10-2021.9] NetHint: Cooperative network optimization for big data and ML applications in the public cloud. Developed a network abstraction and an interactive mechanism between the cloud provider and its tenants to cooperatively enhance the performance of big data and ML applications [NSDI'22].
[2020.2-2022.2] LiteFlow: High-performance Adaptive Neural Networks for Kernel Datapath. Developed an adaptive neural network (NN)-based solution to optimize OS kernel datapath functions. It decouples the control path of adaptive NNs into (1) a kernel-space fast path for efficient inference execution and (2) a userspace slow path for efficient model tuning [SIGCOMM'22].
[2019.6-2020.9] Caerus: Timely task scheduling for serverless analytics. Developed a task execution framework for serverless analytics. It optimizes both execution cost and job completion time by fully exploiting task execution dependencies [NSDI'21].
[2017.9-Present] DeepScheduler: Optimizing parameter synchronization for ML training systems. Developing a scheduling framework for model training systems. It fully exploits the allreduce communication pattern to speed up distributed ML training [APNet'20, Under submission].
[2016.2-2017.2] Hermes: Network load balancing system for big data applications. Developed a resilient load balancing system that can gracefully handle uncertainties (e.g., congestion and failures) for big data applications in a practical, readily deployable fashion [SIGCOMM'17].
[2015.6-2016.2] CODA: Automatic network optimization for big data applications. Developed a network scheduler that can automatically identify and exploit application semantics (e.g., communication and execution dependencies) without manually updating applications [SIGCOMM'16].
[2013.10-2015.2] Amoeba: Deadline-aware networked system for inter-data center data transfers. Developed a deadline-based network abstraction and a deadline-aware networked system to guarantee deadlines for inter-data center data transfers [EuroSys'15, ToN'17].