 I am an associate professor in the Paul G. Allen School of Computer Science & Engineering, at the University of Washington. I work on natural language processing, and I'm particularly interested in hybrid solutions at the intersection of machine learning and theoretical or social linguistics, i.e., solutions that combine interesting learning/modeling methods and insights about human languages or about people speaking these languages.
Much of my research group's work focuses on understanding and advancing large language models, AI ethics, multilingual learning, and machine learning for NLP. This research is motivated by a unified goal: to extend the capabilities of human language technology beyond individual populations and across language and culture boundaries, thereby enabling NLP for all users. 
Here are my CV and Google Scholar page.
Previously, I was an assistant professor in the Language Technologies Institute, School of Computer Science at Carnegie Mellon University (I'm currently an adjunct professor at LTI), and before that I was a postdoc in the Stanford NLP Group. I got my PhD from CMU. 
Publications
(this list is more likely to be updated)
2025
Biased AI can Influence Political Decision-Making. In Sub, 2025. PDF
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence. In Sub, 2025. PDF
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations. In Sub, 2025. PDF
Explore Theory of Mind: Program-guided Adversarial Data Generation for Theory of Mind Reasoning. Proc. ICLR, 2025. PDF
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only. Proc. ICLR, 2025. PDF
ComPO: Community Preferences for Language Model Personalization. Proc. NAACL, 2025. PDF
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs. Proc. NAACL, 2025. PDF
Know Your Limits: A Survey of Abstention in Large Language Models. TACL, 2025. PDF
Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically. TACL. PDF
2024
MEDIQ: Question-Asking LLMs for Adaptive and Reliable Clinical Reasoning. Proc. NeurIPS 2024. PDF
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization. Proc. NeurIPS 2024. PDF
MatFormer: Nested Transformer for Elastic Inference. Proc. NeurIPS 2024. PDF
The Art of Saying No: Contextual Noncompliance in Language Models. Proc. NeurIPS 2024, Datasets and Benchmarks Track. PDF
Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia. Proc. EMNLP 2024. PDF
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects. Proc. EMNLP 2024. PDF
Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration. Proc. EMNLP 2024. PDF
Teaching LLMs to Abstain across Languages via Multilingual Feedback. Proc. EMNLP 2024. PDF
Can LLM Graph Reasoning Generalize beyond Pattern Memorization? Proc. EMNLP 2024, findings. PDF
Can Machines Learn Morality? The Delphi Experiment. Nature Machine Intelligence. PDF
Resolving Knowledge Conflicts in Large Language Models. Proc. COLM 2024. PDF
Tuning Language Models by Proxy.
(Spotlight Paper) Fine-grained Hallucination Detection and Editing for Language Models. Proc. COLM 2024. PDF
Do Membership Inference Attacks Work on Large Language Models? Proc. COLM 2024. PDF
DIALECTBENCH: A NLP Benchmark for Dialects, Varieties, and Closely-Related Languages. (Best Social Impact Paper Award) Proc. ACL 2024. PDF
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration. (Outstanding Paper Award & Area Chair Award, QA track) Proc. ACL 2024. PDF
What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection. Proc. ACL 2024. PDF
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks. Proc. ACL 2024. PDF
Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models. Proc. ACL 2024, findings. PDF
DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection. Proc. ACL 2024, findings. PDF
David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs. Proc. NAACL. PDF
P3Sum: Preserving Author's Perspective in News Summarization with Diffusion Language Models. Proc. NAACL. PDF
Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers. Proc. NAACL. PDF
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer. Proc. NAACL. PDF
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding. Proc. NAACL. PDF
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. Proc. NAACL. PDF
LatticeGen: Hiding Generated Text in a Lattice for Privacy-Aware Large Language Model Generation on Cloud. Proc. NAACL Findings. PDF
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models. (Oral) Proc. WebConf. PDF
Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions. Proc. ICLR. PDF
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models. (Oral) Proc. ICLR. PDF
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory. (Spotlight paper) Proc. ICLR. PDF
Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I Learned to Start Worrying about Prompt Formatting. Proc. ICLR. PDF
Can Language Models Solve Graph Problems in Natural Language? (Spotlight paper) Proc. NeurIPS. PDF
MatFormer: Nested Transformer for Elastic Inference. (Best paper award) Proc. ENLSP @ NeurIPS 2023. PDF
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. Proc. EMNLP. PDF
FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge. Proc. EMNLP. PDF
GlobalBench: A Benchmark for Global Progress in Natural Language Processing. Proc. EMNLP. PDF
BotPercent: Estimating Twitter Bot Populations from Groups to Crowds. Proc. EMNLP Findings. PDF
Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too? Proc. EMNLP Findings. PDF
On the Zero-Shot Generalization of Machine-Generated Text Detectors. Proc. EMNLP Findings. PDF
TalkUp: A Novel Dataset Paving the Way for Understanding Empowering Language. Proc. EMNLP Findings. PDF
LEXPLAIN: Improving Model Explanations via Lexicon Supervision. Proc. StarSEM. PDF
Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker. (Outstanding paper award) Proc. ACL. PDF
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models. (Best paper award) Proc. ACL. PDF
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control. Proc. ACL. PDF
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding. Proc. ACL. PDF
Understanding In-Context Learning via Supportive Pretraining Data. Proc. ACL. PDF
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. Proc. ACL. PDF
Examining Risks of Racial Biases in NLP Tools for Child Protective Services. Proc. FAccT. PDF
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey. Proc. EACL. PDF
Unsupervised Keyphrase Extraction via Interpretable Neural Networks. Proc. EACL. PDF
An Analysis of Emotions and the Prominence of Positivity in #BlackLivesMatter Tweets. PNAS. PDF
Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling. Proc. EMNLP. PDF
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation. Proc. EMNLP. PDF
Gradient-based Constrained Sampling from Language Models. Proc. EMNLP. PDF
Gendered Mental Health Stigma in Masked Language Models. Proc. EMNLP. PDF
Challenges and Opportunities in Information Manipulation Detection: An Examination of Wartime Russian Media. Proc. Findings of EMNLP. PDF
Threat Scenarios and Best Practices to Detect Neural Fake News. Proc. COLING. PDF
Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching. Proc. ACL'22. PDF
Controlled Analyses of Social Biases in Wikipedia Bios. (Wikimedia Foundation research award of the year) Proc. TheWebConf'22. PDF
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision. Proc. ICLR'22. PDF
Controlled Text Generation as Continuous Optimization with Multiple Constraints. Proc. NeurIPS'21. PDF
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers. Proc. EMNLP'21. PDF
Evaluating the Morphosyntactic Well-formedness of Generated Texts. Proc. EMNLP'21. PDF
Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates. Proc. Findings of EMNLP'21. PDF
Detecting Community Sensitive Norm Violations in Online Conversations. Proc. Findings of EMNLP'21. PDF
Efficient Test Time Adapter Ensembling for Low-resource Language Varieties. Proc. Findings of EMNLP'21. PDF
Simple and Efficient ways to Improve REALM. Proc. MRQA'21. PDF
Improving the Diversity of Unsupervised Paraphrasing with Embedding Outputs. Proc. MRL'21. PDF
Improving Span Representation for Domain-adapted Coreference Resolution. Proc. CRAC'21. PDF
A Survey of Race, Racism, and Anti-Racism in NLP. Proc. ACL'21. PDF
Machine Translation into Low-resource Language Varieties. Proc. ACL'21. PDF
Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation. Proc. Findings of ACL'21. PDF
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics. Proc. NAACL'21. PDF
Controlling Dialogue Generation with Semantic Exemplars. Proc. NAACL'21. PDF
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues. Proc. ICLR'21. PDF
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models. (Spotlight paper) Proc. ICLR'21. PDF
StructSum: Incorporating Latent and Explicit Sentence Dependencies for Single Document Summarization. Proc. EACL'21. PDF
Ranking Transfer Languages with Pragmatically-Motivated Features for Multilingual Sentiment Analysis. Proc. EACL'21. PDF
Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia. Proc. ICWSM'21. PDF
An Exploration of Data Augmentation Techniques for Improving English to Tigrinya Translation. Proc. AfricaNLP'21. PDF
Unsupervised Discovery of Implicit Gender Bias. Proc. EMNLP'20. PDF
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment. Proc. EMNLP'20. PDF
Fortifying Toxic Speech Detectors Against Veiled Toxicity. Proc. EMNLP'20. PDF
Automatic Extraction of Rules Governing Morphological Agreement. Proc. EMNLP'20. PDF
Understanding Linguistic Accommodation in Code-Switched Human-Machine Dialogues. Proc. CoNLL'20. PDF
LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification. Proc. SemEval'20. PDF
A Computational Analysis of Polarization on Indian and Pakistani Social Media. (Best paper runner-up) Proc. SocInfo'20. PDF
A framework for the computational linguistic analysis of dehumanization. Frontiers in Artificial Intelligence. PDF
Demoting Racial Bias in Hate Speech Detection. Proc. SocialNLP'20. PDF
A Deep Reinforced Model for Cross-Lingual Summarization with Bilingual Semantic Similarity Reward. Proc. WNGT'20. PDF
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions. Proc. ACL'20. PDF
Balancing Training for Multilingual Neural Machine Translation. Proc. ACL'20. PDF
Stress and Burnout in Open Source: Toward Finding, Understanding, and Mitigating Unhealthy Interactions. Proc. of International Conference on Software Engineering, New Ideas Track (ICSE-NIER). PDF
Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History. Proc. ICLR'20. PDF
What Code-Switching Strategies are Effective in Dialog Systems? Proc. SCiL'20. PDF
Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods. Proc. SCiL'20. PDF
Topics to Avoid: Demoting Latent Confounds in Text Classification. Proc. EMNLP'19. PDF
Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts. Proc. EMNLP'19. PDF
Learning to Generate Word- and Phrase-Embeddings for Efficient Phrase-Based Neural Machine Translation. Proc. WNGT'19. PDF
A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation. Proc. WNGT'19. PDF
A Dynamic Strategy Coach for Effective Negotiation. Proc. SIGdial'19. PDF
Entity-Centric Contextual Affective Analysis. Proc. ACL'19. PDF
CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology. (Interpretability Prize) Proc. SIGMORPHON'19. PDF
Quantifying Social Biases in Contextual Word Representations. Proc. of Workshop on Gender Bias for NLP. PDF
Contextual Affective Analysis: A Case Study of People Portrayals in Online #MeToo Stories. Proc. ICWSM'19. PDF
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings. Proc. NAACL'19. PDF
Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs. Proc. ICLR'19. PDF
Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies. Proc. EMNLP'18. PDF
Style Transfer Through Back-Translation. Proc. ACL'18. PDF
Native Language Cognate Effects on Second Language Lexical Choice. Transactions of the Association for Computational Linguistics (TACL), 2018. PDF DATA
RtGender: A Corpus for Studying Differential Responses to Gender. Proc. LREC'18. PDF DATA
Incorporating Dialectal Variability for Socially Equitable Language Identification. Proc. ACL'17. PDF CODE
Writer Profiling Without the Writer's Text. Proc. SocInfo'17. PDF
Linguistic Knowledge in Data-Driven Natural Language Processing. PhD thesis, September 2016. PDF
Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning. Proc. ACL'16. PDF
Correlation-based Intrinsic Evaluation of Word Vector Representations. In RepEval'16. PDF CODE
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks. In RepEval'16. PDF
Polyglot Neural Language Models: Case Study in Cross-Lingual Phonetic Representation Learning. Proc. NAACL'16. PDF
Morphological Inflection Generation Using Character Sequence to Sequence Learning. Proc. NAACL'16. PDF
Massively Multilingual Word Embeddings. arXiv preprint. PDF
Cross-Lingual Bridges with Models of Lexical Borrowing. Journal of Artificial Intelligence Research (JAIR). 2016. PDF
Evaluation of Word Vector Representations by Subspace Alignment. In Proc. EMNLP'15. PDF CODE
Not All Contexts Are Created Equal: Better Word Representations with Variable Attention. In Proc. EMNLP'15. PDF
Lexicon Stratification for Translating Out-of-Vocabulary Words. In Proc. ACL'15. PDF
Sparse Overcomplete Word Vector Representations. In Proc. ACL'15. PDF
A Bottom Up Approach to Category Mapping and Meaning Change. In Proc. NetWordS'15. PDF
Constraint-Based Models of Lexical Borrowing. In Proc. NAACL'15. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. Computational Linguistics, 40(2):449-468, 2014. PDF
Metaphor Detection with Cross-Lingual Model Transfer. In Proc. ACL'14. PDF CODE DATA
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation. In Proc. EACL'14. PDF
Augmenting English Adjective Senses with Supersenses. In Proc. LREC'14. PDF CODE DATA
Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness. In Proc. LREC'14. PDF DATA
Automatic Classification of Communicative Functions of Definiteness. In Proc. COLING'14. PDF
The CMU Machine Translation Systems at WMT 2014. In Proc. WMT'14. PDF
Generating English Determiners in Phrase-Based Translation with Synthetic Translation Options. In Proc. WMT'13. PDF
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References. In Proc. WMT'13. PDF
Identifying the L1 of non-native writers: the CMU-Haifa system. In Proc. the 8th Workshop on Innovative Use of NLP for Building Educational Applications, 2013. PDF
Cross-Lingual Metaphor Detection Using Common Semantic Features. In Proc. Meta4NLP Workshop, 2013. PDF
Identification and Modeling of Word Fragments in Spontaneous Speech. In Proc. ICASSP'13. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Natural Language Engineering 18(4):549-573, 2012. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. In Proc. EMNLP'11. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. University of Haifa M.Sc. thesis, September 2010. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Proc. COLING'10. PDF
Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content. In Proc. LREC'10. PDF
Teaching
 
Algorithms for NLP (undergraduate IITP course; co-teaching with David Mortensen)
 