Noah Smith

Photo by Dennis Wise/UW.
Interview by UW student Neil Flodin.
Noah Smith designs algorithms for automated analysis of human language. He often exploits the web to this end, including mining the web for translations (RS '03), measuring public opinion from social messages (OBRS '10), and inferring geographic linguistic variation (EOSX '10).

Smith has also contributed algorithms tackling the core problems of natural language processing: parsing sentences in different languages into syntactic representations (EGS '05; MSX '09, AMBDS '16) and semantic representations (DSCS '10; FTDCS '14, SBDS '16), as well as cross-cutting techniques for unsupervised language learning (SE '05; CS '09). His 2011 book, Linguistic Structure Prediction, synthesizes many statistical modeling techniques for language.

Some of the methods he has contributed recently include conditional random field autoencoders (ADS '14), linguistic regularizers (YS '14), alternating directions dual decomposition (AD3; MFASX '15), retrofitting (FDJDHS '15), and recurrent neural network grammars (DKBS '16).

Such methods advance applications for automatic translation (ACJKLMOPSY '99; GS '11), summarization (LFTSS '15), question answering (WSM '07), empirical work in the social sciences (KLRSS '09; YCS '09, SAGS '13) and humanities (BUS '14), education (HS '10), and other next-generation language technologies.

Smith is Associate Professor of Computer Science & Engineering at the University of Washington, Adjunct in Linguistics, Affiliate of the Center for Statistics and the Social Sciences, and Senior Data Science Fellow at the eScience Institute. Previously, he was Finmeccanica Associate Professor in the School of Computer Science at Carnegie Mellon University, completed his Ph.D. as a Hertz Foundation Fellow at Johns Hopkins University, and studied at the Universities of Maryland and Edinburgh and Western Maryland College. He is an amateur clarinetist, tanguero, swimmer, and cocktail enthusiast, and he serves on the staff of two felines. For more details, see his biographical blurb or academic c.v.


Tutorials and Public Presentations

Lisbon Machine Learning Summer School, July 2011–2016
WSDM winter school, January 31, 2015
EACL, April 27, 2014, with André Martins, Mário Figueiredo, and Dani Yogatama
NAACL, June 3, 2012, with André Martins and Mário Figueiredo
NSF SoCS PI meeting, June 28, 2013
Invited course at the University of Heidelberg, November 2014
International Summer School in Language and Speech Technologies, July–August 2012
IBM's T. J. Watson Research Center, May 2011, with Shay Cohen
SxSW, March 13, 2011, with Philip Resnik [Philip's slides]
ICML, June 14, 2009


At CMU, I taught courses on NLP at the undergraduate and graduate levels, including a course originally called "Language and Statistics II" and later "Structured Prediction for Language and Other Discrete Data." I once taught the graduate course "Probabilistic Graphical Models." I regularly led advanced seminars and lab courses on NLP. In 2013, students in the lab developed open-source morphology tools for Akkadian (Assyrian) and Babylonian, Farsi, French, German, Hindi, Japanese, Russian, Slovene, and Spanish (the author of each is credited at the GitHub or Bitbucket site). At various times, I co-taught with William Cohen, Chris Dyer, Bob Frederking, and Alon Lavie.

At JHU, I designed and taught short courses titled "Empirical Research Methods in Computer Science" (with David Smith) and "Computational Genomics: Sequence Modeling" (with Roy Tromble). Other teaching materials include the hands-on exercise Predicting English (with Jason Eisner; read the paper), and brief tutorials on hidden Markov models and log-linear models.


