Hi there, hello.
I am a PhD student at the University of Washington, where I explore how to make computer systems more efficient. I’m especially interested in approaches that could meaningfully impact society.
Lately, my focus has been on making generative large language models more efficient and sustainable by leveraging their structural execution properties. This has led me to analyze power-management opportunities, design phase-separated clusters, and, most recently, optimize mixture-of-experts models.
Before that, I wandered through a mix of topics, dipping my toes into energy-efficient scheduling, deep learning systems, human-computer interaction, virtual memory design, kernel-bypass stacks, and real-time systems. You can find more details on my publications page.
I love discussing ideas and learning from others, so if any of this piques your curiosity, reach out!
Presented our vision for carbon-aware clouds at HotCarbon'23!
Started my summer internship at Azure Systems Research! I’ll be working on improving LLM inference efficiency in large-scale GPU deployments.
Our paper on cloud GPU power management has been published in IEEE CAL!
The course I helped create is now an official seminar at UW CSE! If you are a new student, please consider attending CSE 590X: How to PhD to learn how to navigate the ups and downs of graduate school!
[ older news ]