Yizhong Wang

PhD candidate
Paul G. Allen School of Computer Science & Engineering
University of Washington, Seattle, WA

Email: yizhongw [at] cs.washington.edu
Short Bio CV Research Statement X GitHub Google Scholar

Hello! I am a final-year PhD candidate at the Paul G. Allen School of Computer Science & Engineering, University of Washington. I am fortunate to be co-advised by Hannaneh Hajishirzi and Noah Smith. I am also a student researcher at Ai2. I have previously interned at Meta AI and Microsoft Research Asia. Prior to UW, I earned a master's degree at Peking University and a bachelor's degree at Shanghai Jiao Tong University.

I work broadly on natural language processing, machine learning, and artificial intelligence. In particular, I enjoy studying fundamental data challenges in AI development and how to advance AI through data creation. My work has led to the unification of NLP tasks with instruction tuning (e.g., Super-NaturalInstructions), pioneered the use of synthetic data for language model training (e.g., Self-Instruct), and advanced the scientific development of fully open language models (e.g., OLMo and Tülu). I believe that data serves as the foundation that defines the behavior and upper limits of AI models. It also provides an effective, interpretable, and beneficial ground to advance AI as a community. I have been thinking about these topics recently:

Data for generality. The focus of AI is shifting from single-purpose applications toward building better general-purpose models. Optimizing towards this goal requires data that goes beyond traditional paradigms and takes a unified view of tasks, modalities, and domains. This creates many opportunities for data creation and calls for data-centric algorithms that strike the right balance among diversity, speciality, quantity, quality, efficiency, and many other factors.
Synthetic data. One promising direction for creating data is to automatically synthesize it using models, rules, or environments. Synthetic data not only enables easy adaptation of models to specific scenarios, but also opens up the potential for model self-improvement. Many active topics are closely related, such as data augmentation, AI feedback, and the exploration of actions in RL. Research in this area is still in full swing and has many open questions to be answered.
Open science of language models. Openness is critical for collaboration, innovation, and accountability, yet today's frontier models are trending in the opposite direction. I am passionate about building fully open alternatives, to inform the public about the science behind these models and their potential risks. This also drives studies of model training methods, their interplay with data, and how we can improve them together with the open research community.

I am on the job market now! Feel free to reach out if you would like to share opportunities, collaborate, or just chat :)

News

Nov. 21, 2024

Tülu has evolved to v3, with SoTA performance & fully open post-training recipes & a playground!

Oct. 29, 2024

We released Hybrid Preferences, a framework for combining human and AI feedback for better RLHF.

Sep. 25, 2024

Tülu 2.5 got accepted to NeurIPS 2024!

July 10, 2024

OLMo won the Best Theme Paper Award at ACL 2024!

July 10, 2024

Two papers about proxy tuning and hallucination detection were accepted to the first COLM conference!

May 16, 2024

OLMo was accepted to the ACL 2024 main conference, and temporal alignment was accepted to Findings. See people in Bangkok!

Feb. 12, 2024

I have passed my PhD general exam! 🏃

Feb. 1, 2024

I am excited to be part of the OLMo first release. Check out the blog post and tech report.

Jan. 16, 2024

Self-RAG and BTR got accepted to ICLR 2024!

Sep. 22, 2023

📢 I am co-organizing the Instruction Tuning and Instruction Following Workshop at NeurIPS 2023. Please consider submitting your paper or joining us at the conference!

Nov. 18, 2023

We released Tülu 2, which tops open models on several benchmarks (e.g. AlpacaEval and Chatbot Arena)!

Sep. 22, 2023

Tülu got accepted into NeurIPS 2023 Datasets and Benchmarks Track. See people in New Orleans!

June 9, 2023

We arxived a paper that systematically studies instruction tuning resources and released Tülu, a suite of full-parameter instruction-tuned models from 7B to 65B! [Tweets]

May 2, 2023

We have three papers accepted by ACL 2023. Looking forward to meeting people in Toronto!

Apr. 18, 2023

I gave a guest lecture about instruction tuning of large language models at JHU. [Slides][Video]

Jan. 23, 2023

I started a part-time research internship at AI2.

Dec. 20, 2022

We arxived Self-Instruct, a new way to align language models with little human annotation. [Tweets]

Apr. 16, 2022

We released Natural Instructions V2 that covers 1600+ NLP tasks together with their instructions!

Aug. 12, 2021

I will be interning at FAIR London starting from September.

Jan. 13, 2021

Our MultiModalQA paper, a collaboration with AI2 Israel, was accepted to ICLR 2021!

Dec. 7, 2020

I will be interning part-time at AI2 in the next few months, mostly with the Aristo team.

Dec. 1, 2020

Our paper on plain language summarization of medical reviews was accepted to AAAI 2021!

Nov. 1, 2020

Our LiveQA paper was selected as the CCL 2020 best paper. Congrats to Qianying and Sicong!

Sep. 15, 2020

Our paper on dataset analysis was accepted to EMNLP 2020!

Mar. 1, 2020

I will co-organize the ACL 2020 Student Research Workshop [Website].

Sep. 5, 2019

I arrived in Seattle and started my PhD journey at UW.

Aug. 13, 2019

Our paper on numeracy probing was accepted to EMNLP 2019!

July 4, 2019

I graduated from Peking University! Thanks to my advisor Sujian Li and all my labmates. Farewell!

Apr. 15, 2019

Super excited to announce that I'll be joining the UW NLP group as a CSE PhD student in Sep. 2019!

Feb. 22, 2019

Our DROP paper was accepted to NAACL 2019! [Details], [Matt's tweets].

Oct. 5, 2018

I started a new internship at the Allen Institute for Artificial Intelligence, working with Matt and Sameer!

Sep. 13, 2018

Our MRC system (nlnet) is the first to achieve human parity w.r.t. F1 on SQuAD 1.1 and also tops SQuAD 2.0! [Leaderboard]

May 24, 2018

One paper on discourse segmentation was accepted to EMNLP 2018! [Preprint] [Code]

Apr. 26, 2018

I started my internship at Microsoft Research Asia, working with Furu Wei.

Apr. 21, 2018

Two papers were accepted to ACL 2018!

Feb. 25, 2018

Our reading comprehension system (V-Net) won first place on the MS-MARCO leaderboard! [Report]

Nov. 16, 2017

We released a large-scale dataset for Chinese Machine Reading Comprehension.

July 08, 2017

One paper on discourse relation classification was accepted to IJCNLP 2017.

June 14, 2017

I started my internship with the Baidu NLP team.

June 03, 2017

Our paper on discourse parsing was selected as ACL 2017 outstanding paper! [Full List]

Mar. 31, 2017

One short paper on RST discourse parsing was accepted to ACL 2017.

Sep. 30, 2016

One demo paper on dependency parsing was accepted to COLING 2016.

July 4, 2016

I graduated (with honours) from Shanghai Jiao Tong University. Thanks and Bye!

Selected Publications

* indicates equal contribution. For a full list of publications, please refer to my Google Scholar page.

Tülu 3: Pushing Frontiers in Open Language Model Post-Training

Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lj Miranda, ..., Luca Soldaini, Noah A. Smith, Yizhong Wang, Pradeep Dasigi, Hannaneh Hajishirzi

Arxiv

Media coverage: Geekwire, TechCrunch, VentureBeat, MSN, and more.

Paper Blog Playground

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Lj Miranda*, Yizhong Wang*, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi

Arxiv

Paper Data Code

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi

NeurIPS 2024

Paper Code Models

Set the Clock: Temporal Alignment of Pretrained Language Models

Bowen Zhao*, Zander Brumbaugh*, Yizhong Wang*, Hannaneh Hajishirzi, Noah A. Smith

ACL 2024 Findings

Paper Code

OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, et al.

ACL 2024 (Best Theme Paper)

Media coverage: Forbes, GeekWire, TechCrunch, Axios, and more.

Paper Blog Site

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Yizhong Wang*, Hamish Ivison*, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

NeurIPS 2023 (Spotlight)

Paper Code Models

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi

ACL 2023 (Most Influential Paper #1 by Paper Digest)

Paper Data Code

Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Yizhong Wang*, Swaroop Mishra*, Pegah Alipoormolabashi, Yeganeh Kordi et al.

EMNLP 2022 (Most Influential Paper #2 by Paper Digest)

Paper Data Code Site

Probing Across Time: What Does RoBERTa Know and When?

Leo Z. Liu*, Yizhong Wang*, Jungo Kasai, Hannaneh Hajishirzi, Noah A. Smith

EMNLP 2021 Findings

Paper Slides

Do Neural NLP Models Know Numbers? Probing Numeracy in Embeddings

Eric Wallace*, Yizhong Wang*, Sujian Li, Sameer Singh and Matt Gardner

EMNLP-IJCNLP 2019

Paper Code

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh and Matt Gardner

NAACL 2019 (Most Influential Paper #8 by Paper Digest)

Paper Code

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li and Haifeng Wang

ACL 2018

Paper Slides

A Two-Stage Parsing Method for Text-level Discourse Analysis

Yizhong Wang, Sujian Li and Houfeng Wang

ACL 2017 (Outstanding Paper Award)

Paper Code Slides