Zhihan Xiong

I am currently a fifth-year PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Prof. Maryam Fazel.

My research interests lie in the theory and application of reinforcement learning and bandit problems.

Prior to UW, I received my Master's Degree in Statistics from Stanford University in 2020 and my Bachelor's Degree in Mathematics and Engineering Physics from the University of Illinois at Urbana-Champaign in 2018, where I was fortunate to be advised by Prof. Pierre Moulin.

Email  /  CV  /  Google Scholar  /  LinkedIn

Publications / Preprints

(* indicates equal contributions)

Language Model Preference Evaluation with Multiple Weak Evaluators [arXiv]
Zhengyu Hu, Jieyu Zhang, Zhihan Xiong, Alexander Ratner, Hui Xiong, Ranjay Krishna
Preprint

Policy Mirror Descent with Dual Function Approximation [arXiv]
Zhihan Xiong, Maryam Fazel, Lin Xiao
Preprint

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity [arXiv] [paper]
Zhihan Xiong*, Romain Camilleri*, Maryam Fazel, Lalit Jain, Kevin Jamieson
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Conference on Digital Experimentation @ MIT (CODE@MIT), 2023

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning [arXiv]
Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
International Conference on Learning Representations (ICLR), 2024

Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement [arXiv]
Haozhe Jiang*, Qiwen Cui*, Zhihan Xiong, Maryam Fazel, Simon S. Du
International Conference on Learning Representations (ICLR), 2023

Learning in Congestion Games with Bandit Feedback [arXiv] [paper]
Qiwen Cui*, Zhihan Xiong*, Maryam Fazel, Simon S. Du
Advances in Neural Information Processing Systems (NeurIPS), 2022

Near-Optimal Randomized Exploration for Tabular Markov Decision Processes [arXiv] [paper]
Zhihan Xiong*, Ruoqi Shen*, Qiwen Cui*, Maryam Fazel, Simon S. Du
Advances in Neural Information Processing Systems (NeurIPS), 2022

Fourier Learning with Cyclical Data [paper]
Yingxiang Yang*, Zhihan Xiong*, Tianyi Liu*, Taiqing Wang, Chong Wang
International Conference on Machine Learning (ICML), 2022

Selective Sampling for Online Best-arm Identification [arXiv] [paper]
Romain Camilleri*, Zhihan Xiong*, Maryam Fazel, Lalit Jain, Kevin Jamieson
Advances in Neural Information Processing Systems (NeurIPS), 2021

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning [arXiv] [paper]
Tian Tan*, Zhihan Xiong*, Vikranth R. Dwaracherla
Association for the Advancement of Artificial Intelligence (AAAI, Oral), 2020

Professional Experiences

Visiting Researcher, Meta (FAIR Labs) Oct 2022 -- Present
Research Intern, ByteDance (AML Group) Jun 2021 -- Sep 2021
Applied Scientist Intern, Zillow (Personalization Team) Jun 2019 -- Sep 2019

Reviewer for: ICML (2021, 2022, 2023, 2024), NeurIPS (2021, 2022, 2023) and ICLR (2022, 2023, 2024).

Teaching Experiences

CS 229: Machine Learning, Teaching Assistant, Autumn 2019, Stanford University, CA
CS 234: Reinforcement Learning, Teaching Assistant, Winter 2020, Stanford University, CA
CS 229: Machine Learning, Teaching Assistant, Spring 2020, Stanford University, CA