Victoria Lin (林曦)

Paul G. Allen Center, 486
xilin@cs.washington.edu

I am a PhD student in Computer Science at the University of Washington, advised by Luke Zettlemoyer. My research area is natural language processing and AI. I work on problems regarding representation and extraction of structured knowledge from natural language.

Before coming to UW, I was a PhD student in the CIS Department at the University of Pennsylvania, advised by Ben Taskar. Before that, I completed a one-year MSc program at Oxford University, advised by Stephen Pulman. I obtained my Bachelor's degree in Electronic and Information Engineering from the Hong Kong Polytechnic University.

 

Publications

2016
June
Compositional Learning of Embeddings for Relation Paths in Knowledge Bases and Text.
Kristina Toutanova, Xi Victoria Lin, Scott Wen-tau Yih, Hoifung Poon and Chris Quirk.
ACL - Association for Computational Linguistics.
Modeling relation paths has offered significant gains in embedding models for knowledge base (KB) completion. However, enumerating paths between two entities is very expensive, and existing approaches typically resort to approximation with a sampled subset. This problem is particularly acute when text is jointly modeled with KB relations and used to provide direct evidence for facts mentioned in it. In this paper, we propose the first exact dynamic programming algorithm which enables efficient incorporation of all relation paths of bounded length, while modeling both relation types and intermediate nodes in the compositional path representations. We conduct a theoretical analysis of the efficiency gain from the approach. Experiments on two datasets show that it addresses representational limitations in prior approaches and improves accuracy in KB completion.
@InProceedings{lin16_pathcomp,
author = {Kristina Toutanova and Xi Victoria Lin and Scott Wen-tau Yih and Hoifung Poon and Chris Quirk},
title = {Compositional Learning of Embeddings for Relation Paths in Knowledge Bases and Text},
booktitle = {ACL - Association for Computational Linguistics},
year = 2016,
month = 8,
address={Berlin, Germany}
}
2014
December
Multi-label Learning with Posterior Regularization.
Xi Victoria Lin, Sameer Singh, Luheng He, Ben Taskar, and Luke Zettlemoyer.
NIPS Workshop on Modern Machine Learning and NLP.
In many multi-label learning problems, especially as the number of labels grows, it is challenging to gather completely annotated data. This work presents a new approach for multi-label learning from incomplete annotations. The main assumption is that, because of label correlation, the true label matrix and the soft predictions of the classifiers should be approximately low rank. We introduce a posterior regularization technique that enforces soft constraints on the classifiers, regularizing them to prefer sparse and low-rank predictions. Avoiding strict low-rank constraints results in classifiers that better fit the real data. The model can be trained efficiently using EM and stochastic gradient descent. Experiments in both the image and text domains demonstrate the contributions of each modeling assumption and show that the proposed approach achieves state-of-the-art performance on a number of challenging datasets.
@InProceedings{lin14_prlr,
author = {Xi Victoria Lin and Sameer Singh and Luheng He and Ben Taskar and Luke Zettlemoyer},
title = {Multi-label Learning with Posterior Regularization},
booktitle = {NIPS Workshop on Modern Machine Learning and Natural Language Processing},
year = 2014,
month = 12,
address={Montreal, Quebec, Canada},
url={http://homes.cs.washington.edu/~xilin/pubs/mlnlp2014.pdf}
}
2011
September
Fine-grained Named Entity Recognition in Machine Reading.
Xi Victoria Lin.
Master's thesis, Oxford University.
Fine-grained named entity classification (FG-NEC) refers to the process of classifying a set of named entities from naturally occurring texts to the maximum granularity. It differs fundamentally from traditional coarse-grained NEC (PER, LOC, ORG) in that it requires deep semantic analysis and the fine-grained semantic classes are highly ambiguous. While research has been conducted in an application-oriented manner, few works have addressed this problem per se. This thesis addresses the problem, with a special focus on the person category. Our methodology is to first extract the key property of each candidate instance and then automatically classify the instances according to a reference taxonomy. The classification takes into account the non-uniformity and insufficiency of context clues and uses a cascade framework in which named entities with different kinds of context clues are resolved at different stages. The cascade framework is highly efficient because simple instances are filtered out at early stages, so the system can focus on the more difficult ones. We also developed a joint-inference-based property extraction algorithm for entities whose target properties are explicitly specified in the texts. Evaluated on the Wall Street Journal corpus, the extractor achieves an F1 score of 91.91, which is quite competitive. Although trained on newswire texts, the framework can be easily tuned to apply to texts in other styles.
@MastersThesis{lin11-fine_grained_ner,
author = {Xi Victoria Lin},
title = {Fine-grained Named Entity Recognition in Machine Reading},
school = {Oxford University},
year = 2011,
month = 9,
url={http://homes.cs.washington.edu/~xilin/pubs/msc_thesis.pdf}
}
 

Resume

[download]

 

Miscellaneous


Design inspired by Sameer Singh. Last Updated On: September 27, 2015