Noah Smith

نوح سميث
ノア スミス
Νώε Σμιθ
Ной Смит
노아 스미스
Photo by Dennis Wise/UW.
Interview by UW student Neil Flodin.
Noah Smith is a computer scientist working at the junction of natural language processing, machine learning, and computational social science. He has worked on:

Core problems in NLP, like parsing sentences in different languages into syntactic representations (EGS '05; MSX '09; AMBDS '16) and semantic representations (DSCS '10; FTDCS '14; SBDS '16), as well as cross-cutting techniques for unsupervised language learning (SE '05; CS '09). His 2011 book, Linguistic Structure Prediction, synthesizes many statistical modeling techniques for language.

Methods applicable to a range of problems in NLP, such as conditional random field autoencoders (ADS '14), linguistic regularizers (YS '14), alternating directions dual decomposition (AD3; MFASX '15), retrofitting (FDJDHS '15), recurrent neural network grammars (DKBS '16), entity language models (JTMCS '17), scaffolds (STLZDS '18), rational recurrences (PSTS '18), deep weighted averaging classifiers (CZS '19), and knowledge-enhanced contextual word vectors (PNLSJSS '19).

Methodology challenges in NLP, including quantifying artifacts in data (GSLSBS '18), bias (SZS '19), interpretability (SS '19), and, when models aren't interpretable, other analysis methods (LGBPS '19).

Applications of NLP like automatic translation (ACJKLMOPSY '99; GS '11), summarization (LFTSS '15), question answering (WSM '07), empirical work in the social sciences (KLRSS '09; YCS '09; SAGS '13) and humanities (BUS '14), education (HS '10), and other next-generation language technologies.

Smith is Professor of Computer Science & Engineering at the University of Washington, Adjunct in Linguistics, Affiliate of the Center for Statistics and the Social Sciences, and Senior Data Science Fellow at the eScience Institute. He is also Senior Research Manager for the AllenNLP team at the Allen Institute for Artificial Intelligence. Previously, he was Finmeccanica Associate Professor in the School of Computer Science at Carnegie Mellon University, completed his Ph.D. as a Hertz Foundation Fellow at Johns Hopkins University, and studied at the Universities of Maryland and Edinburgh and Western Maryland College. His undergraduate, graduate, and postdoctoral advisees have earned positions at leading organizations all over the world, where they make wide-ranging and high-impact research contributions. Smith is an amateur clarinetist, tanguero, swimmer, runner, and cocktail enthusiast, and he serves on the staff of two felines. For more details, see his biographical blurb or academic c.v.


Tutorials and Public Presentations

Invited talk at ACL, August 1, 2017
Lisbon Machine Learning Summer School, July 2011–17
WSDM winter school, January 31, 2015
EACL, April 27, 2014, with André Martins, Mário Figueiredo, and Dani Yogatama
NAACL, June 3, 2012, with André Martins and Mário Figueiredo
NSF SoCS PI meeting, June 28, 2013
Invited course at the University of Heidelberg, November 2014
International Summer School in Language and Speech Technologies, July–August 2012
IBM's T. J. Watson Research Center, May 2011, with Shay Cohen
SXSW, March 13, 2011, with Philip Resnik [Philip's slides]
ICML, June 14, 2009


At CMU, I taught courses on NLP at the undergraduate and graduate levels, including a course originally called "Language and Statistics II" and later "Structured Prediction for Language and Other Discrete Data." I also taught the graduate course "Probabilistic Graphical Models," and I regularly led advanced seminars and lab courses on NLP. In 2013, students in the lab developed open-source morphology tools for Akkadian (Assyrian) and Babylonian, Farsi, French, German, Hindi, Japanese, Russian, Slovene, and Spanish (the author of each is credited at the GitHub or Bitbucket site). At various times, I co-taught with William Cohen, Chris Dyer, Bob Frederking, and Alon Lavie.

At JHU, I designed and taught short courses titled "Empirical Research Methods in Computer Science" (with David Smith) and "Computational Genomics: Sequence Modeling" (with Roy Tromble). Other teaching materials include the hands-on exercise Predicting English (with Jason Eisner; read the paper), and brief tutorials on hidden Markov models and log-linear models.


Some research activities and events in which I am or have been involved: