University of Washington Department of Computer Science & Engineering
CSE 599 - Advanced NLP - Spring 2015
Mon, Wed 3:00-4:20 in EEB 042

Instructor: Yejin Choi (yejin at cs dot washington dot edu)

Class Objectives

This course aims to deepen participants' knowledge of current techniques, challenges, directions, and developments across NLP; to hone students' critical technical reading, oral presentation, and written communication skills; and to generate discussion among students across research groups that inspires new research. The class is organized as a collaborative learning experience in which students read, think, talk, and write intensively. Classwork consists largely of reading and writing; there is no midterm or final. Students may pursue an optional class project if desired.

Topic: Representation Learning and Formalisms

The topical focus this quarter is representation of all types: neural and non-neural, symbolic and statistical, language and non-language. The goal is to develop comprehensive knowledge of, and insight into, emerging developments and challenges in semantic representation; to draw connections among seemingly disparate methods and formalisms; and to develop future research directions and original research ideas. The tentative allocation of class time is as follows, to be adjusted based on students' interests and feedback: neural approaches (35%), non-neural approaches (20%), symbolic and graph representations (20%), language-based representations (10%), language grounding (15%).

Schedule (subject to change)

Dates Topic Leader Required Readings Supplementary Readings
Apr 1 Memory & Reasoning Yejin "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks" [pdf] [Apr 1]
Apr 6 Neural LM Antoine "A Neural Probabilistic Language Model." [pdf] [Apr 6]
Apr 8 Spectral LM Luke "A Spectral Algorithm for Learning Class-Based n-gram Models of Natural Language." [pdf] [Apr 8]
Apr 13 Skip-gram LM Nicholas "Efficient Estimation of Word Representations in Vector Space." [pdf] [Apr 13]
Apr 15 Embeddings Hao "Improving Distributional Similarity with Lessons Learned from Word Embeddings" [pdf] [Apr 15]
Apr 20 Natural Logic Chloe "An extended model of natural logic." [pdf] [Apr 20]
Apr 22 AMR Ben "Unsupervised Entity Linking with Abstract Meaning Representation" [pdf] [Apr 22]
Apr 27 Embeddings Luheng "Dependency Based Word Embeddings" [pdf]; "Retrofitting Word Vectors to Semantic Lexicons" [pdf] [Apr 27]
Apr 29 Skip-gram Mark "Combining Language and Vision with a Multimodal Skip-gram Model" [pdf] [Apr 29]
May 6 AMR Kenton "Toward Abstractive Summarization Using Semantic Representations." [pdf] [May 6]
May 11 LSTMs David "Sequence to Sequence Learning with Neural Networks" [pdf] [May 11]
May 13 Generation Leila "A Global Model for Concept-to-Text Generation" [pdf]
May 18 Compositional Victoria "Experimental Support for a Categorical Compositional Distributional Model of Meaning" [pdf] [May 18]
May 20 Visual Hamid "Phrase-based Image Captioning" [pdf] [May 20]
May 28 Social Hannah "You're Mr. Lebowski, I'm The Dude" [pdf]
Jun 1 Eunsol
Jun 3 Gabe

Syllabus (subject to change)

Bengio et al.'s survey on representation learning
+ Yoshua Bengio, Aaron Courville and Pascal Vincent. "Representation Learning: A Review and New Perspectives." [pdf] TPAMI 35(8):1798-1828

Embeddings & Language Models

Skip-gram embeddings [Apr 13, 15]
+ Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. "Efficient Estimation of Word Representations in Vector Space." [pdf] ICLR, 2013.
+ Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. "Distributed Representations of Words and Phrases and their Compositionality." [pdf] NIPS, 2013.
+ [king-man+woman=queen] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. "Linguistic Regularities in Continuous Space Word Representations." [pdf] NAACL, 2013.
+ [technical note] Yoav Goldberg and Omer Levy "word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method" [pdf] Tech-report 2013
+ [buzz-busting] Omer Levy and Yoav Goldberg "Linguistic Regularities in Sparse and Explicit Word Representations" [pdf] CoNLL-2014 Best Paper Award
+ [lessons learned] Omer Levy, Yoav Goldberg, Ido Dagan "Improving Distributional Similarity with Lessons Learned from Word Embeddings" [pdf], TACL 2015
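The "king - man + woman = queen" regularity cited above can be sketched with plain vector arithmetic. This is a toy illustration, not the papers' method: the hand-picked 3-d vectors below are an assumption chosen so the gender offset is consistent, whereas real skip-gram vectors are learned from corpora.

```python
import numpy as np

# Hand-picked toy vectors (an assumption for illustration); real embeddings
# are learned, e.g. by word2vec, and live in hundreds of dimensions.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.8, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.1, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vocab):
    """Return the word d maximizing cos(d, a - b + c), excluding a, b, c."""
    target = vocab[a] - vocab[b] + vocab[c]
    scores = {w: cosine(vocab[w], target) for w in vocab if w not in (a, b, c)}
    return max(scores, key=scores.get)

print(analogy("king", "man", "woman", vectors))  # → queen
```

The Levy & Goldberg CoNLL 2014 paper above shows the same query also works surprisingly well with sparse, explicit count-based vectors, which is why it is tagged "buzz-busting".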
Embedding enhancement: Syntax, Retrofitting, etc [Apr 27, 29]
+ [dependency embeddings] Omer Levy and Yoav Goldberg "Dependency Based Word Embeddings" [pdf] ACL-2014 (Short)
+ [retrofitting with lexical knowledge] Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy and Noah A. Smith. "Retrofitting Word Vectors to Semantic Lexicons" [pdf], NAACL 2015
+ [contrastive estimation] Mnih and Kavukcuoglu, "Learning Word Embeddings Efficiently with Noise-Contrastive Estimation." [pdf] NIPS 2013
+ [embedding documents] Quoc V Le, Tomas Mikolov. "Distributed representations of sentences and documents" [pdf] ICML 2014
+ [multimodal] Angeliki Lazaridou, Nghia The Pham and Marco Baroni. "Combining Language and Vision with a Multimodal Skip-gram Model" [pdf] NAACL 2015
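The retrofitting idea in the Faruqui et al. entry above admits a short sketch: each word vector is repeatedly averaged with its original value and its lexicon neighbors. This is a simplified reading with global `alpha`/`beta` weights (the paper allows per-edge weights), and the toy lexicon below is an assumption for illustration.

```python
import numpy as np

def retrofit(vectors, lexicon, alpha=1.0, beta=1.0, iters=10):
    """Retrofitting update in the spirit of Faruqui et al. (NAACL 2015):
    pull each vector toward its original value (weight alpha) and toward
    its lexicon neighbors (weight beta per edge), iterated to convergence."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for w, neighbors in lexicon.items():
            if w not in vectors:
                continue
            nbrs = [n for n in neighbors if n in new]
            if not nbrs:
                continue
            num = alpha * vectors[w] + beta * sum(new[n] for n in nbrs)
            new[w] = num / (alpha + beta * len(nbrs))
    return new

# Hypothetical toy data: "couch" and "sofa" are synonyms in the lexicon,
# so retrofitting moves their vectors closer together.
vecs = {"couch": np.array([1.0, 0.0]), "sofa": np.array([0.0, 1.0])}
lex = {"couch": ["sofa"], "sofa": ["couch"]}
out = retrofit(vecs, lex)
```

After retrofitting, the synonym pair's distance shrinks while both vectors stay anchored near their distributional originals.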
Embeddings as matrix factorization [May 18]
Classic(!)
+ Brown et al., "Class-Based n-Gram Models of Natural Language." [pdf] Computational Linguistics 1992
Spectral language models [Apr 8]
+ [spectral LM] Karl Stratos, Do-kyum Kim, Michael Collins, and Daniel Hsu. "A Spectral Algorithm for Learning Class-Based n-gram Models of Natural Language." [pdf] UAI 2014.
+ [tutorial] Shay Cohen, Michael Collins, Dean Foster, Karl Stratos and Lyle Ungar. Spectral Learning Algorithms for Natural Language Processing, [tutorial] NAACL 2013
+ [from 2003] "Spectral Learning", [pdf] Sepandar Kamvar, Dan Klein, and Chris Manning, IJCAI 2003.
+ [unsupervised parsing] A. Parikh, S. Cohen and E. P. Xing, "Spectral Unsupervised Parsing with Additive Tree Metrics", [pdf] ACL 2014.
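The "embeddings as matrix factorization" view in this section can be sketched end to end: build a co-occurrence matrix, reweight by positive PMI, and take a truncated SVD. The tiny corpus and window size below are assumptions for illustration, not from any paper listed here.

```python
import numpy as np

# Toy corpus (an assumption for illustration).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric window-1 co-occurrence counts.
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):
            if 0 <= j < len(sent):
                C[idx[w], idx[sent[j]]] += 1

# Positive pointwise mutual information.
total = C.sum()
pw = C.sum(axis=1) / total
pc = C.sum(axis=0) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / np.outer(pw, pc))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Rank-2 factorization: rows of U * S are the word embeddings.
U, S, _ = np.linalg.svd(ppmi)
emb = U[:, :2] * S[:2]
```

Levy & Goldberg's TACL 2015 "lessons learned" paper listed earlier compares exactly this family of count-based factorizations against skip-gram.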

Deep & Neural

Long short-term memory (LSTMs) [May 11]
+ [parsing] Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton, "Grammar as a Foreign Language" [pdf] arXiv 2014
+ [program] Wojciech Zaremba, Ilya Sutskever, "Learning to Execute" [pdf] arXiv 2014
+ [translation] Ilya Sutskever, Oriol Vinyals, Quoc Le, "Sequence to Sequence Learning with Neural Networks" [pdf] NIPS 2014
+ [more stuff at naacl 2015]
CNNs: convolutional neural networks for language
+ [convoluting from character-level to doc-level] Xiang Zhang, Yann LeCun. "Text Understanding from Scratch" [pdf]
+ [character LM for doc-level] Peng, F., Schuurmans, D., Keselj, V. and Wang, S. "Language independent authorship attribution using character level language models." [pdf] EACL 2004.
+ [convnet for sentences] Nal Kalchbrenner, Edward Grefenstette and Phil Blunsom. "A Convolutional Neural Network for Modelling Sentences" [pdf] ACL 2014.
+ [convnet for paraphrasing] Wenpeng Yin and Hinrich Schutze. "Convolutional Neural Network for Paraphrase Identification." NAACL 2015
+ [convolute better with word order] Rie Johnson and Tong Zhang. "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks" [pdf]
QA with commonsense reasoning [Apr 1]
+ [nlp for AI] Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov. "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks" [pdf] 2015
+ [memory networks] Jason Weston, Sumit Chopra, Antoine Bordes "Memory Networks" [pdf] ICLR 2015
+ [winograd schema] Hector J. Levesque. "The Winograd Schema Challenge" [pdf] AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning 2011
+ [textual entailment] Ion Androutsopoulos, Prodromos Malakasiotis "A Survey of Paraphrasing and Textual Entailment Methods" [pdf] Journal of Artificial Intelligence Research 38 (2010) 135-187
Neural language models [Apr 6]
+ [neural LM] Bengio et al., "A Neural Probabilistic Language Model." [pdf] Journal of Machine Learning Research 2003
+ [bi-loglinear LM]
+ [discriminative LM] Brian Roark, Murat Saraclar, and Michael Collins. "Discriminative n-gram language modeling." [pdf] Computer Speech and Language, 21(2):373-392. 2007
Compositional
+ Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeff Dean, "Distributed Representations of Words and Phrases and their Compositionality," [pdf] NIPS 2013
+ [socher's]
+ [cutting RNN trees] Christian Scheible, Hinrich Schutze. "Cutting Recursive Autoencoder Trees" [pdf] CoRR abs/1301.2811 (2013)

Abstract Meaning Representation (AMR) & Graph Grammars

Representation [Apr 22]
+ [8 pages] L. Banarescu, C. Bonial, S. Cai, M. Georgescu, K. Griffitt, U. Hermjakob, K. Knight, P. Koehn, M. Palmer, and N. Schneider. "Abstract Meaning Representation for Sembanking" [pdf] Proc. Linguistic Annotation Workshop, 2013.
+ [57 pages] AMR 1.1 Specification [pdf]
+ [parsing] Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. "A Discriminative Graph-Based Parser for the Abstract Meaning Representation." [pdf] ACL 2014
Do something with AMR [May 6]
+ [summarization with AMR] Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. "Toward Abstractive Summarization Using Semantic Representations." [pdf] NAACL 2015
+ [IE with AMR] Xiaoman Pan, Taylor Cassidy, Ulf Hermjakob, Heng Ji and Kevin Knight. "Unsupervised Entity Linking with Abstract Meaning Representation" [pdf] NAACL 2015
Hyperedge Replacement Grammars (HRG)
+ [graphworld] D. Chiang, J. Andreas, D. Bauer, K. M. Hermann, B. Jones, and K. Knight. "Parsing Graphs with Hyperedge Replacement Grammars," [pdf] ACL 2013.

Natural Language is Representation

Natural logic: knowledge representation and reasoning with language [Apr 20]
+ [naturally] Gabor Angeli, Chris Manning "NaturalLI: Natural Logic Inference for Common Sense Reasoning" [pdf] EMNLP 2014
+ [maccartney] Bill MacCartney and Christopher D. Manning "An extended model of natural logic." [pdf] The Eighth International Conference on Computational Semantics (IWCS-8) 2009

Language and X (X != language)

Image Captioning [May 20]
+ Mao, J., Xu, W., Yang, Y., Wang, J., and Yuille, A. L. Explain Images with Multimodal Recurrent Neural Networks. ICLR 2015.
+ Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. Show and Tell: A Neural Image Caption Generator. CVPR 2015.
+ Karpathy, A. and Fei-Fei, L. Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR 2015.
+ Chen, X. and Zitnick, C. L. Learning a Recurrent Visual Representation for Image Caption Generation. CVPR 2015.
+ Fang, H., Gupta, S., Iandola, F. N., Srivastava, R., Deng, L., Dollar, P., Gao, J., He, X., Mitchell, M., Platt, J. C., Zitnick, C. L., and Zweig, G. From Captions to Visual Concepts and Back. CVPR 2015.
+ Donahue, J., Hendricks, L. A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. CVPR 2015.
Language and vision with storyline
+ G. Kim and E. P. Xing, "Reconstructing Storyline Graphs for Image Recommendation from Web Community Photos", [pdf] CVPR 2014.
+ G. Kim, L. Sigal and E. P. Xing, "Jointly Summarizing Large-Scale Web Images and Videos for the Storyline Reconstruction", [pdf] CVPR 2014.
Multimodal skip-gram models
+ Angeliki Lazaridou, Nghia The Pham and Marco Baroni. "Combining Language and Vision with a Multimodal Skip-gram Model" [pdf] NAACL 2015
Aligning instructions and videos
+ Iftekhar Naim, Young C. Song, Qiguang Liu, Liang Huang, Henry Kautz, Jiebo Luo and Daniel Gildea. "Discriminative Unsupervised Alignment of Natural Language Instructions with Corresponding Video Segments" [pdf] AAAI 2014

And More

Compositional Semantics [May 18]
+ Edward Grefenstette, Mehrnoosh Sadrzadeh. "Experimental Support for a Categorical Compositional Distributional Model of Meaning." EMNLP 2011.
+ Stephen Clark. "Type-Driven Syntax and Semantics for Composing Meaning Vectors." Quantum Physics and Linguistics: A Compositional, Diagrammatic Discourse, pp.359-377. Chris Heunen, Mehrnoosh Sadrzadeh, and Edward Grefenstette (eds), Oxford University Press, 2013.
+ Bob Coecke, Mehrnoosh Sadrzadeh, Stephen Clark. "Mathematical Foundations for a Compositional Distributional Model of Meaning." Linguistic Analysis, 36(1-4): A Festschrift for Joachim Lambek, pp. 345-384, van Bentham and Moortgat (eds), 2011.
Unsupervised SRL with reconstruction minimization
+ Ivan Titov. Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework [pdf] NAACL 2015
Not yet categorized
- A. P. Parikh, A. Saluja, C. Dyer and E. P. Xing, Language Modeling with Power Low Rank Ensembles, EMNLP 2014 (Best paper runner-up)
- Socher et al., 2013, "Parsing with Compositional Vector Grammars."
- Collobert et al., 2011, "Natural Language Processing (Almost) from Scratch."
- Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jianfeng Gao, Bill Dolan and Jian-Yun Nie. "A Neural Network Approach to Context-Sensitive Generation of Conversational Responses" NAACL 2015
- Tao Lei, Fan Long, Regina Barzilay and Martin C. Rinard "From Natural Language Specifications to Program Input Parsers", ACL 2013.
- S. R. K. Branavan, David Silver and Regina Barzilay. "Learning to Win by Reading Manuals in a Monte-Carlo Framework", Journal of Artificial Intelligence Research, 43, 2012.

Example Notes, Surveys, Mini-Tutorials, Technical Reports

+ Yoav Goldberg. "A note on Latent Semantic Analysis" [pdf] Tech-report
+ Yoav Goldberg and Omer Levy "word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method" [pdf] Tech-report 2013

Background Material

There are no official prerequisites, but familiarity with NLP and machine learning is assumed.

Class Activities & Grading

Class activities consist of four components (% of final grade):
(i) leading class discussions (25%),
(ii) participation in the discussions (25%),
(iii) collaborative bibliography writing (25%),
(iv) an individual writing project, which can be a technical note, a mini-tutorial [examples], or a research proposal (25%).
An optional final research project contributes an additional 25%. The total is rescaled so that final grades are comparable with or without the optional project.
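One way to read the rescaling rule is that the four required components sum to 100, an optional project adds 25 more, and the total is mapped back to a 0-100 scale. This is an interpretive sketch, not official policy.

```python
# An interpretive sketch of the grading arithmetic (an assumption, not policy):
# four components each out of 25; an optional project adds up to 25 more,
# and the combined total is rescaled back to 0-100.
def final_grade(components, project=None):
    base = sum(components)          # four components, each out of 25
    if project is None:
        return base                 # already on a 0-100 scale
    return (base + project) * 100.0 / 125.0

print(final_grade([25, 25, 25, 25]))      # → 100.0
print(final_grade([20, 20, 20, 20], 25))  # → 84.0
```

Under this reading, a student who skips the project is neither rewarded nor penalized relative to one who completes it at the same level.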

Contact

Course Administration and Policies


Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX