Ph.D. Student
Paul G. Allen School of Computer Science and Engineering
University of Washington
cyulin [at] cs.washington.edu
I’m a final-year Ph.D. student in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. I'm broadly interested in machine learning systems and particularly passionate about acceleration and efficiency. I like to tackle inefficiency from a vertical perspective: from algorithms at the top, through software and systems in the middle, down to hardware architecture. I have worked on sparse CNNs (computer vision), GNNs (graph analysis), and NeRFs (neural rendering), and I currently focus on LLMs. I’m fortunate to be advised by Prof. Luis Ceze and to be a member of the awesome Sampl group.
Prior to UW, I obtained my Bachelor's and Master's degrees from the Department of Electronics Engineering at National Chiao-Tung University, where I also completed an undergraduate minor in Computer Science. During my Master's, I was fortunate to work with Prof. Bo-Cheng Lai.
Besides doing research, I enjoy tennis, hiking and (backcountry) skiing.
NanoFlow: Towards Optimal Large Language Model Serving Throughput
Kan Zhu, Yilong Zhao, Liangyu Zhao, Gefei Zuo, Yile Gu, Dedong Xie, Yufei Gao, Qinyu Xu, Tian Tang, Zihao Ye, Keisuke Kamahori, Chien-Yu Lin, Stephanie Wang, Arvind Krishnamurthy, Baris Kasikci, Preprint, ArXiv 2024
[Paper]
[Code]
Palu: Compressing KV-Cache with Low-Rank Projection
Chi-Chih Chang*, Wei-Cheng Lin*, Chien-Yu Lin*, Chong-Yan Chen, Yu-Fang Hu, Pei-Shuo Wang, Ning-Chi Huang, Luis Ceze, Kai-Chiang Wu, Preprint, ArXiv 2024
(*equal contribution).
[Paper]
[Code]
Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks
Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf, Preprint, ArXiv 2024.
[Paper]
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci, in Conference on Machine Learning and Systems (MLSys) 2024 (acceptance rate 22%).
[Paper]
[Code]
FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline
Chien-Yu Lin, Qichen Fu, Thomas Merth, Karren Yang, Anurag Ranjan, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, Oral (Top 2.6%).
SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
Chien-Yu Lin*, Anish Prabhu*, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton and Mohammad Rastegari, in the 17th European Conference on Computer Vision (ECCV), 2022 (*equal contribution).
[Paper]
[Code]
[Video]
Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks
Chien-Yu Lin, Liang Luo, and Luis Ceze, Preprint, 2021.
[Code]
Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
Bo-Cheng Lai, Jyun-Wei Pan and Chien-Yu Lin, in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019.
[Paper]
Supporting compressed-sparse activations and weights on SIMD-like accelerator for sparse convolutional neural networks
Chien-Yu Lin and Bo-Cheng Lai, in the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), 2018.
[Paper]
[Slides]
I'm fortunate to work with and mentor some amazing undergraduate and master's students!