Thomas Lin "Leveraging Knowledge Bases in Web Text Processing" (2012, Microsoft)
Dr. Stefan Schoenmackers "Inference over the Web" (2011, Decide.com)
Dr. Michele Banko (2009, Startups). Michele's dissertation,
Open Information Extraction for the Web, investigated the problem of
extracting information from arbitrary Web text in a scalable,
domain-independent manner.
Dr. Mike Cafarella (2009, University of Michigan). Mike's
dissertation, Extracting and Managing Structured Web Data, bridged the
gap between information extraction and databases. Co-advisors:
Dan Suciu and Alon Halevy.
Dr. Doug Downey (2008, Northwestern University). Doug's dissertation,
Redundancy in Web-scale Information Extraction: Probabilistic Model and
Experimental Results, investigated in depth what we can learn from
finding extractions repeatedly in the Web corpus.
Dr. Alex Yates (2007, Temple University). Alex's dissertation,
Information Extraction from the Web: Techniques and Applications,
investigated the problem of unsupervised synonym resolution on the
Web.
Dr. Ana-Maria Popescu (2007, Yahoo Research). Ana-Maria's
dissertation, Information Extraction from Unstructured Web Text,
investigated how to extract high-quality information from Web
text. Her most impressive demonstration was the Opine system,
which extracted product attributes and their associated opinions from
reviews found online.
Dr. Luke McDowell (2004, U.S. Naval Academy). Luke's
dissertation, Bringing Meaning to the Masses, investigated how to make
the Semantic Web a reality and how to generalize the vision to
encompass email as well. Co-advisor: Alon Halevy.
Dr. Mike Perkowitz (2000, Amazon, Intel Research, Startups). Mike's dissertation,
Adaptive Web Sites, investigated web sites that automatically
reconfigure their layout and presentation by analyzing user access
patterns recorded in their server logs.
Dr. Oren Zamir (1999, Google). Oren's dissertation, Clustering
Web Documents: A Phrase-Based Method for Grouping Search Engine
Results, investigated the use of a novel and fast clustering algorithm
to group the results of Web search engines into easily-browsed
clusters. The most distinctive aspect of the algorithm was its
treatment of documents as strings of words, represented by a suffix
tree, in contrast with the standard vector-based representation.
Dr. Erik Selberg (1999, Microsoft). Erik's dissertation, Towards Comprehensive Web
Search, explored meta-search as embodied in MetaCrawler. This
work was the first to show (at WWW4, back in 1995) that the
fraction of the Web covered by individual search engines
such as Alta Vista and Lycos was very limited, demonstrating the need
for meta-search engines.
Dr. Keith Golden (1997, NASA Ames, Google). Keith's
dissertation, Planning Support for Softbots, investigated novel
planning and knowledge representation techniques to support
softbots. Primary advisor: Dan Weld.
Dr. Neal Lesh (1997, MERL, Harvard MPH, D-Tree International). Neal's
dissertation, Scalable and Adaptive Goal Recognition, focused on
automating the construction of plan libraries by adapting techniques from
planning and concept learning. His objective was to scale up goal
recognition to domains containing millions of
plans and goals.
Dr. Richard Segal (1996, IBM Watson Research Center). Richard's
dissertation, Machine Learning as Massive Search, focused on data
mining using massive search: our BRUTE data mining software can analyze
over 100,000 hypotheses per second when run on a SPARC-10.
Master's Students Advised
Tessa Lau.
Master's thesis: Privacy in a Collaborative Web Browsing Environment,
1997. (UW PhD with Weld and Domingos, now at IBM).
Marc Langheinrich. Master's thesis: A domain independent architecture
for efficient information retrieval on the World Wide Web, 1997.
(University of Lugano in Switzerland).
Jonathan Shakes. Master's thesis: Dynamic Reference Sifting: a
Case Study in the Homepage Domain, 1996. (Amazon).