research theme
Towards Application-Oriented Big Data and ML Systems
Big data and ML applications have application-specific internal structures: they are executed as designated sequences of computation and communication operations, following different execution dependencies and specific communication patterns. Moreover, applications have various resource and performance demands, exposing different trade-offs between cost, latency, and throughput. However, most existing system designs either focus on network- and system-level metrics (e.g., network latency and system utilization) or treat all applications alike without recognizing application-specific features. This gap between application-level insights and system-level design results in significant losses in application performance. In contrast, my research makes applications first-class citizens, aiming to answer one fundamental question in designing big data and ML systems: how can we fully exploit application-specific structures to better meet application-specific demands?
My research addresses this problem by developing scalable systems with theoretically sound scheduling algorithms tailored for different applications. On the one hand, to bring demand awareness to system design, I have proposed APIs and algorithm designs to express and satisfy application demands \cite{amoeba,amoebaj}. On the other hand, to integrate application structures into system design, I have designed a wide spectrum of solutions that identify, model, and exploit various application structures, including communication patterns \cite{coda,deepscheduler,rat,pas}, execution dependencies \cite{caerus}, and ML model structures \cite{deepscheduler,serflex}, to optimize application performance.