Yizhong Wang

PhD student
Paul G. Allen School of Computer Science & Engineering
University of Washington, Seattle, WA

Email: yizhongw [at] cs.washington.edu

Hello! I am a final-year PhD student at the Paul G. Allen School of Computer Science & Engineering, University of Washington. I am fortunate to be co-advised by Hannaneh Hajishirzi and Noah Smith. I am also a student researcher at AI2, and I previously interned at Meta AI and Microsoft Research Asia. Prior to UW, I completed a master's degree at Peking University and an undergraduate degree at Shanghai Jiao Tong University.

These days, I am excited about data-centric approaches for understanding and advancing modern artificial intelligence systems. I believe that data can serve as an effective, sustainable, auditable, and beneficial ground for future human-AI collaboration and dual improvement. Here are some topics I have been thinking about recently:

Feel free to drop me an email if you would like to collaborate on these topics or just chat :)

News

  • July 10, 2024
  • Two papers about proxy tuning and hallucination detection were accepted to the first COLM conference!
  • June 13, 2024
  • Tülu has grown to 2.5, which explores RLHF data and techniques systematically. Check out all open artifacts!
  • May 16, 2024
  • OLMo 1 was accepted to the ACL 2024 main conference, and temporal alignment was accepted to Findings. See you in Bangkok!
  • Feb. 12, 2024
  • I have passed my PhD general exam! 🏃
  • Feb. 1, 2024
  • I am excited to be part of the first OLMo release. Check out the blog post and tech report.
  • Jan. 16, 2024
  • Self-RAG and BTR got accepted to ICLR 2024!
  • Nov. 18, 2023
  • We released Tülu 2, which tops open models on several benchmarks (e.g. AlpacaEval and Chatbot Arena)!
  • Sep. 22, 2023
  • 📢 We are organizing a Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023. Please consider submitting your paper or joining us at the conference!

    Selected Publications

    * indicates equal contribution. For a full list of publications, please refer to my Google Scholar page.

    Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
    Hamish Ivison*, Yizhong Wang*, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi
    arXiv preprint

    How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources (Spotlight)
    Yizhong Wang*, Hamish Ivison*, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi
    NeurIPS 2023

    Self-Instruct: Aligning Language Models with Self-Generated Instructions
    Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
    ACL 2023

    Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
    Yizhong Wang*, Swaroop Mishra*, Pegah Alipoormolabashi, Yeganeh Kordi et al.
    EMNLP 2022

    Probing Across Time: What Does RoBERTa Know and When?
    Leo Z. Liu*, Yizhong Wang*, Jungo Kasai, Hannaneh Hajishirzi, Noah A. Smith
    EMNLP 2021 Findings

    Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
    Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith and Yejin Choi
    EMNLP 2020

    Do Neural NLP Models Know Numbers? Probing Numeracy in Embeddings
    Eric Wallace*, Yizhong Wang*, Sujian Li, Sameer Singh and Matt Gardner
    EMNLP-IJCNLP 2019

    DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
    Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh and Matt Gardner
    NAACL 2019

    Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
    Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li and Haifeng Wang
    ACL 2018

    DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications
    Wei He, Kai Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang
    ACL 2018 MRQA Workshop

    A Two-Stage Parsing Method for Text-level Discourse Analysis (Outstanding Paper Award)
    Yizhong Wang, Sujian Li and Houfeng Wang
    ACL 2017