Hi, I’m Yiyan Zhai :)

I recently graduated from Carnegie Mellon University (B.S. in CS, minor in ML) and am applying to PhD programs for Fall 2026! My research interests lie in building efficient and scalable ML systems.

I have been working with Prof. Tianqi Chen in the CMU Catalyst Group on:

  • FlashInfer-Bench, a kernel benchmarking loop that goes from kernel generation → evaluation → drop-in replacement in serving stacks (FlashInfer/SGLang/vLLM).
  • WebLLM Assistant, which integrates Overleaf and Google Workspace with in-browser agents using WebLLM.

I am also fortunate to collaborate with Prof. Juncheng Yang at Harvard SEAS on:

  • Cache replacement algorithms for real-world enterprise storage systems (VMware vSAN).
  • Resilient routing for LLM inference.