Ben Taskar

Ben Taskar, Boeing Associate Professor
Computer Science and Engineering
Office:  532 Allen Center
Email:  lastname@cs

In spring of 2013, I moved to the University of Washington as Boeing Associate Professor. My primary research interests are machine learning, computational linguistics and computer vision.

I received my bachelor's and doctoral degrees in Computer Science from Stanford University. After a postdoc at the University of California at Berkeley, I joined the faculty at the University of Pennsylvania's Computer and Information Science Department in 2007.

I've been awarded the Sloan Research Fellowship and the NSF CAREER Award, and I was selected for the Young Investigator Program by the Office of Naval Research and for the DARPA Computer Science Study Group. I've also served as a Distinguished Research Fellow at the Annenberg Center for Public Policy. My work on structured prediction has received best paper awards at the NIPS and EMNLP conferences.


 News

This summer, I'm giving an invited talk at CoNLL on determinantal processes and a tutorial at CVPR on weakly-supervised structured learning with Matthew Blaschko and M. Pawan Kumar.

Alex Kulesza and I gave a tutorial about determinantal processes at UAI 2012.

Our survey paper on determinantal processes was just published by Foundations and Trends (arXiv version).

Recent tutorial at ACL and Interspeech about Posterior Regularization and Generalized Expectation Constraints by Greg Druck and my students Kuzman Ganchev and Joao Graca.

 Teaching

Spring 2013 - CSE 515 - Statistical Methods in Computer Science
Fall 2012 - CIS 520 - Machine Learning
Fall 2011 - CIS 520 - Machine Learning
Spring 2011 - CIS 121 - Data Structures and Algorithms
Spring 2011 - CIS 800 - Green Buildings: Optimization & Adaptation
Fall 2010 - CIS 520 - Machine Learning
Fall 2009 - CIS 520 - Machine Learning
Spring 2009 - CIS 620 - Advanced Topics in AI - Probabilistic Graphical Models
Fall 2008 - CIS 520 - Machine Learning
Spring 2008 - CIS 700 - Advanced Topics in Machine Learning
Fall 2007 - CIS 521 - Fundamentals of Artificial Intelligence
Spring 2007 - CIS 620 - Advanced Topics in Artificial Intelligence

 Research Group

Postdocs
Ofir Pele
PhD
Brian Dolhansky
Jennifer Gillenwater
Victoria Lin
David Weiss
Alumni
Kayhan N. Batmanghelich (co-advised with Christos Davatzikos, now a postdoc at MIT)
Timothee Cour (now at Google)
Kuzman Ganchev (co-advised with Fernando Pereira, now at Google Research)
Joao Graca (now at INESC-ID Lisboa)
Alex Kulesza (now a postdoc at U. Michigan)
Philippos Mordohai (postdoc co-advised with Kostas Daniilidis, now faculty at Stevens Institute of Technology)
Ben Sapp (now at Google)
Ben Snyder (postdoc, now faculty at University of Wisconsin-Madison)
Alex Toshev (co-advised with Kostas Daniilidis and Jianbo Shi, now at Google Research)
Umar Syed (postdoc, co-advised with Michael Kearns, now at Google Research New York)

 Recent Projects

Geometry of Diversity and Determinantal Point Processes

Determinantal point processes (DPPs) arise in random matrix theory and quantum physics as models of random variables with negative correlations. Among many remarkable properties, they offer tractable algorithms for exact inference, including computing marginals, computing certain conditional probabilities, and sampling. DPPs are a natural model for subset selection problems where diversity is preferred. For example, they can be used to select diverse sets of sentences to form document summaries, or to return relevant but varied text and image search results, or to detect non-overlapping multiple object trajectories in video. In our recent work, we discovered a novel factorization and dual representation of DPPs that enables efficient inference for exponentially-sized structured sets. We developed a new inference algorithm based on Newton identities for DPPs conditioned on subset size. We also derived efficient parameter estimation for DPPs from several types of observations. We demonstrated the advantages of the model on several natural language and vision tasks: extractive document summarization, diversifying image search results and multi-person articulated pose estimation problems in images.
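
To make the diversity property concrete, here is a minimal sketch of the standard L-ensemble form of a DPP, in which a subset Y has probability det(L_Y) / det(L + I). It is an illustration only, not code from the DPP toolkit: the helper names (`det`, `submatrix`, `dpp_prob`) and the 3-item kernel are assumptions made up for this example, and the brute-force determinant is only suitable for tiny matrices.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 0:
        return 1.0  # determinant of the empty matrix, used for the empty subset
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def submatrix(L, idx):
    """Rows and columns of L restricted to the indices in idx."""
    return [[L[i][j] for j in idx] for i in idx]

def dpp_prob(L, subset):
    """P(Y = subset) = det(L_subset) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(submatrix(L, subset)) / det(L_plus_I)

# Hypothetical kernel: items 0 and 1 are very similar, item 2 is unrelated to both.
L = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# Probability of every subset of {0, 1, 2}; these sum to one.
probs = {s: dpp_prob(L, s) for k in range(4) for s in combinations(range(3), k)}

for s, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(s, round(p, 4))
```

In this toy kernel the pair of similar items (0, 1) is much less likely to be selected together than the dissimilar pair (0, 2), which is the negative correlation that makes DPPs useful for diverse subset selection.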


Relevant Materials:
UAI12 tutorial, Long DPP survey on arXiv, Discovering Diverse and Salient Threads in Document Collections, Learning Determinantal Point Processes, k-DPPs: Fixed-Size Determinantal Point Processes and Structured Determinantal Point Processes
Code: DPP toolkit

Computation and Approximation in Structured Prediction

Structured prediction tasks pose a fundamental bias-computation trade-off: the need for complex models to increase predictive power on the one hand, and the limited computational resources for inference in exponentially-sized output spaces on the other. We formulate and develop structured prediction cascades to address this trade-off: a sequence of increasingly complex models that progressively filter the space of possible outputs. We represent an exponentially large set of filtered outputs using max marginals and propose a novel convex loss for learning cascades that balances filtering error with filtering efficiency. We derive generalization bounds for error and efficiency losses and evaluate our approach on several natural language and vision problems: handwriting recognition, part-of-speech tagging and articulated pose estimation in images and videos. We find that the learned cascades are capable of reducing the complexity of inference by up to several orders of magnitude, enabling the use of models which incorporate higher order dependencies and features and yield significantly higher accuracy.
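
The filtering step can be sketched for a simple chain model as follows. This is a simplified illustration, not the released cascade code: the function names, the toy scores, and the specific threshold (a convex combination, weighted by `alpha`, of the best sequence score and the mean max marginal) are assumptions chosen for the example. A max marginal of a state is the score of the best full sequence that passes through that state, so pruning on max marginals never removes the top-scoring sequence when `alpha` is between 0 and 1.

```python
def max_marginals(unary, pairwise):
    """unary[i][s]: score of state s at position i.
    pairwise[s][t]: score of the transition s -> t.
    Returns mm[i][s], the best total score of any sequence with state s at position i."""
    n, k = len(unary), len(unary[0])
    fwd = [[0.0] * k for _ in range(n)]  # best score of a prefix ending in (i, s)
    bwd = [[0.0] * k for _ in range(n)]  # best score of a suffix starting after (i, s)
    fwd[0] = list(unary[0])
    for i in range(1, n):
        for t in range(k):
            fwd[i][t] = unary[i][t] + max(fwd[i - 1][s] + pairwise[s][t]
                                          for s in range(k))
    for i in range(n - 2, -1, -1):
        for s in range(k):
            bwd[i][s] = max(pairwise[s][t] + unary[i + 1][t] + bwd[i + 1][t]
                            for t in range(k))
    return [[fwd[i][s] + bwd[i][s] for s in range(k)] for i in range(n)]

def filter_states(unary, pairwise, alpha):
    """Keep the states whose max marginal reaches the threshold."""
    mm = max_marginals(unary, pairwise)
    flat = [v for row in mm for v in row]
    best = max(flat)                      # score of the best full sequence
    threshold = alpha * best + (1 - alpha) * sum(flat) / len(flat)
    return [[s for s, v in enumerate(row) if v >= threshold] for row in mm]

# Hypothetical 3-position, 3-state chain that rewards staying in the same state.
unary = [[2.0, 0.0, 1.0],
         [0.0, 1.0, 0.5],
         [1.0, 0.0, 3.0]]
pairwise = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

kept = filter_states(unary, pairwise, alpha=0.5)
print(kept)
```

A more expensive model in the next level of the cascade would then only need to score sequences built from the surviving states.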


Relevant Papers: Long report on arXiv, Structured Prediction Cascades, Cascaded Models for Articulated Pose Estimation, Sidestepping Intractable Inference with Structured Ensemble Cascades.

Posterior Regularization for Structured Latent Variable Models

Posterior regularization is a probabilistic framework for structured, weakly supervised learning. Our framework efficiently incorporates indirect supervision via constraints on posterior distributions of probabilistic models with latent variables. Posterior regularization separates model complexity from the complexity of the structural constraints the model is meant to satisfy. By directly imposing decomposable regularization on the posterior moments of latent variables during learning, we retain the computational efficiency of the unconstrained model while ensuring desired constraints hold in expectation. We present an efficient algorithm for learning with posterior regularization and illustrate its versatility on a diverse set of structural constraints such as bijectivity, symmetry and group sparsity in several large scale experiments, including multi-view learning, cross-lingual dependency grammar induction, unsupervised part-of-speech induction, and bitext word alignment.
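
As a rough illustration of what "constraints hold in expectation" means, the sketch below projects a small discrete posterior p onto the set of distributions q with E_q[f] <= b, in the KL(q || p) sense. With a single constraint, the projection has the form q(z) proportional to p(z) exp(-lambda f(z)) for some lambda >= 0, which the code finds by bisection. This is a toy version and not the PR Toolkit: the function names, the three-outcome posterior, the feature values, and the bound are all assumptions for the example, and it assumes non-negative features and a feasible constraint.

```python
import math

def tilt(p, f, lam):
    """q(z) proportional to p(z) * exp(-lam * f(z))."""
    w = [pi * math.exp(-lam * fi) for pi, fi in zip(p, f)]
    z = sum(w)
    return [wi / z for wi in w]

def expectation(q, f):
    return sum(qi * fi for qi, fi in zip(q, f))

def project(p, f, b, iters=100):
    """KL projection of p onto {q : E_q[f] <= b} for one constraint.
    Assumes f >= 0 and that some outcome has f(z) < b (feasibility)."""
    if expectation(p, f) <= b:
        return list(p)                      # constraint already satisfied, lambda = 0
    lo, hi = 0.0, 1.0
    while expectation(tilt(p, f, hi), f) > b and hi < 1e6:
        hi *= 2.0                           # grow the bracket until the constraint holds
    for _ in range(iters):                  # E_q[f] decreases as lambda grows
        mid = (lo + hi) / 2.0
        if expectation(tilt(p, f, mid), f) > b:
            lo = mid
        else:
            hi = mid
    return tilt(p, f, hi)

# Hypothetical posterior over three latent configurations and a count-like feature.
p = [0.1, 0.2, 0.7]
f = [0.0, 1.0, 2.0]
b = 1.0

q = project(p, f, b)
print([round(qi, 4) for qi in q], round(expectation(q, f), 6))
```

The projected distribution stays as close to the model posterior as the constraint allows, which is why the model and the constraints can be specified separately.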


Relevant Papers: Graph-Based Posterior Regularization for Semi-Supervised Structured Prediction, Posterior Regularization for Structured Latent Variable Models, Learning Tractable Word Alignment Models with Complex Constraints, Controlling Complexity in Part-of-Speech Induction.
Tutorials and code: http://sideinfo.wikkii.com/

Learning from Partial Labels

In partially-labeled multiclass classification, instead of a single label per instance, the algorithm is given a candidate set of labels, only one of which is correct. Our setting is motivated by a common scenario in many image and video collections, where only partial access to labels is available. The goal is to learn a classifier that can disambiguate the partially-labeled training instances, and generalize to unseen data. We define an intuitive property of the data distribution that sharply characterizes the ability to learn in this setting and show that effective learning is possible even when all the data is only partially labeled. Exploiting this property of the data, we propose a convex learning formulation based on minimization of a loss function appropriate for the partial label setting. We analyze the conditions under which our loss function is asymptotically consistent, as well as its generalization and transductive performance. We apply our framework to identifying faces culled from web news sources and to naming characters in TV series and movies; in particular, we annotated and experimented on a very large video data set and achieved very accurate character naming on over a dozen episodes of the TV series Lost.
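
One way to picture a convex loss for this setting is sketched below: the average score of the candidate labels is pushed up, and the score of every non-candidate label is pushed down, each through a convex surrogate such as the hinge. This is a simplified illustration under our own assumptions (the function names, the choice of hinge, and the toy scores are made up for the example); see the Learning from Partial Labels paper for the exact formulation and its analysis.

```python
def hinge(u):
    """Convex surrogate that penalizes margins below 1."""
    return max(0.0, 1.0 - u)

def partial_label_loss(scores, candidates, psi=hinge):
    """scores[a]: classifier score for label a.
    candidates: indices of the candidate labels (exactly one is correct)."""
    cand = set(candidates)
    avg_candidate_score = sum(scores[a] for a in cand) / len(cand)
    loss = psi(avg_candidate_score)          # candidates should score high on average
    loss += sum(psi(-s) for a, s in enumerate(scores) if a not in cand)
    return loss                              # non-candidates should score low

# Hypothetical scores for three labels.
scores = [2.0, -1.0, -2.0]

ambiguous = partial_label_loss(scores, candidates=[0, 1])  # label 0 or label 1
fully_labeled = partial_label_loss(scores, candidates=[0])  # ordinary supervised case
print(ambiguous, fully_labeled)
```

When the candidate set contains a single label, the loss reduces to a standard one-versus-all multiclass surrogate, so the fully supervised case is a special case of the same objective.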


Relevant Papers: Learning from Partial Labels, Semi-Supervised Learning with Adversarially Missing Label Information, Learning from Ambiguously Labeled Images.

 Publications

2013

Graph-Based Posterior Regularization for Semi-Supervised Structured Prediction, L. He, J. Gillenwater and B. Taskar. Computational Natural Language Learning (CoNLL), Sofia, Bulgaria, Aug 2013.
[Supplemental Material]

Collective Stability in Structured Prediction: Generalization from One Example, B. London, B. Huang, B. Taskar and L. Getoor. International Conference on Machine Learning (ICML), Atlanta, GA, June 2013.
[Supplemental Material]

MODEC: Multimodal Decomposable Models for Human Pose Estimation, B. Sapp, and B. Taskar. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013.
[Data and Code]

SCALPEL: Segmentation CAscades with Localized Priors and Efficient Learning, D. Weiss and B. Taskar. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013.
[Code]

The Pairwise Piecewise-Linear Embedding for Efficient Non-Linear Classification, O. Pele, B. Taskar, A. Globerson and M. Werman. International Conference on Machine Learning (ICML), Atlanta, GA, June 2013.

Approximating Determinantal Point Processes Using the Nystrom Method, R. Affandi, A. Kulesza, E. Fox and B. Taskar. International Conference on Artificial Intelligence and Statistics (AISTATS), Scottsdale, AZ, April 2013.

2012

Near-Optimal MAP Inference for Determinantal Point Processes, J. Gillenwater, A. Kulesza, and B. Taskar. Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, December 2012. [Supplemental Material] [Code and Data]

Determinantal Point Processes for Machine Learning, A. Kulesza and B. Taskar. Foundations and Trends in Machine Learning: Vol. 5, No 2-3, December 2012. (arXiv version).

Tutorial: Determinantal Point Processes with A. Kulesza, at UAI, Catalina, California, August 2012.

Structured Prediction Cascades, D. Weiss, B. Sapp, and B. Taskar. arXiv, August 2012.

Discovering Diverse and Salient Threads in Document Collections, J. Gillenwater, A. Kulesza, and B. Taskar. Conference on Empirical Methods in Natural Language Processing (EMNLP), Jeju, Korea, July 2012. [Supplemental Material]

Wiki-ly Supervised Part-of-Speech Tagging, S. Li, J. Graca, and B. Taskar. Conference on Empirical Methods in Natural Language Processing (EMNLP), Jeju, Korea, July 2012. [Code]

Shape-based Object Detection via Boundary Structure Segmentation, A. Toshev, B. Taskar, and K. Daniilidis, International Journal of Computer Vision (IJCV), Volume 99, Issue 2, pp 123-146, September 2012.

2011

Generative-Discriminative Basis Learning for Medical Imaging, N. K. Batmanghelich, B. Taskar and C. Davatzikos, IEEE Transactions on Medical Imaging Journal (T-MI).

Learning Determinantal Point Processes, A. Kulesza, and B. Taskar. Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, July 2011.
For updated results on the summarization task (DUC04), see the long arXiv report.

Controlling Complexity in Part-of-Speech Induction, J. Graca, K. Ganchev, F. Pereira, L. Coheur and B. Taskar. Journal of Artificial Intelligence Research (JAIR).

k-DPPs: Fixed-Size Determinantal Point Processes, A. Kulesza, and B. Taskar. International Conference on Machine Learning (ICML), Bellevue, WA, June 2011.

Learning from Partial Labels, T. Cour, B. Sapp, and B. Taskar. Journal of Machine Learning Research (JMLR), volume 12, May 2011.
[Data and Code]

Parsing Human Motion with Stretchable Models, B. Sapp, D. Weiss, and B. Taskar. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, Colorado, June 2011.
[Code]

Posterior Sparsity in Dependency Grammar Induction, J. Gillenwater, K. Ganchev, J. Graca, F. Pereira, and B. Taskar. Journal of Machine Learning Research (JMLR), Volume 12, Feb 2011.
[Tech Report] [PR Toolkit]

2010

Structured Determinantal Point Processes, A. Kulesza, and B. Taskar. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2010. [Supplemental Materials]
[Note: eigenvector normalization factor for dual representation fixed in this version.]

Semi-Supervised Learning with Adversarially Missing Label Information, U. Syed and B. Taskar. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2010. [Supplemental Materials]

Sidestepping Intractable Inference with Structured Ensemble Cascades, D. Weiss, B. Sapp, and B. Taskar. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2010. [Supplemental Materials]

Cascaded Models for Articulated Pose Estimation, B. Sapp, A. Toshev and B. Taskar. European Conference on Computer Vision (ECCV), Crete, Greece, September 2010. [Code]

Sparsity in Dependency Grammar Induction, J. Gillenwater, K. Ganchev, J. Graca, F. Pereira, and B. Taskar. Association for Computational Linguistics (ACL), Uppsala, Sweden, July 2010.

Posterior Regularization for Structured Latent Variable Models, K. Ganchev, J. Graca, J. Gillenwater and B. Taskar, Journal of Machine Learning Research (JMLR), Volume 11, July 2010.
[Earlier Tech Report] [PR Toolkit]

Learning Tractable Word Alignment Models with Complex Constraints, J. Graca, K. Ganchev, and B. Taskar, The Computational Linguistics Journal (CL), September 2010.

Adaptive Pose Priors for Pictorial Structures, B. Sapp, C. Jordan, and B. Taskar. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010.

Object Detection via Boundary Structure Segmentation, A. Toshev, B. Taskar, and K. Daniilidis, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010.

Talking Pictures: Temporal Grouping and Dialog-Supervised Person Recognition, T. Cour, B. Sapp, A. Nagle, and B. Taskar, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010.

Detecting and Parsing Architecture at City Scale, A. Toshev, P. Mordohai, and B. Taskar, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010.

Structured Prediction Cascades, D. Weiss and B. Taskar. International Conference on Artificial Intelligence and Statistics (AISTATS), Sardinia, Italy, May 2010. [Cascade learning code available.]

2009

Posterior vs. Parameter Sparsity in Latent Variable Models, J. Graca, K. Ganchev, B. Taskar and F. Pereira. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2009.
Supplementary Materials

Dependency Grammar Induction via Bitext Projection Constraints, K. Ganchev, J. Gillenwater and B. Taskar. Association for Computational Linguistics (ACL), Singapore, August 2009.

Learning from Ambiguously Labeled Images, T. Cour, B. Sapp, C. Jordan and B. Taskar. Computer Vision and Pattern Recognition (CVPR), Florida, June 2009.
[Tech Report]

Learning Sparse Markov Network Structure via Ensemble-of-Trees Models, Y. Lin, S. Zhu, D. Lee, B. Taskar. Artificial Intelligence and Statistics (AISTATS), Florida, April 2009.

Joint Covariate Selection and Joint Subspace Selection for Multiple Classification Problems, G. Obozinski, B. Taskar, and M. Jordan. Journal of Statistics and Computing, Appeared online 2009.

2008

Movie/Script: Alignment and Parsing of Video and Text Transcription, T. Cour, C. Jordan, E. Miltsakaki, B. Taskar. European Conference on Computer Vision (ECCV), Marseille, France, October 2008.
Video demos

Multi-View Learning over Structured and Non-Identical Outputs, K. Ganchev, J. Graca, J. Blitzer and B. Taskar. Uncertainty in Artificial Intelligence (UAI), Helsinki, Finland, July 2008.

Better Alignments = Better Translations?, K. Ganchev, J. Graca and B. Taskar. Association for Computational Linguistics (ACL), Columbus, Ohio, June 2008.
Code available: Constrained Alignment Toolkit

Online, Self-supervised Terrain Classification via Discriminatively Trained Submodular Markov Random Fields, P. Vernaza, B. Taskar and D. Lee. International Conference on Robotics and Automation (ICRA). Pasadena, California, May 2008.

2007

Tutorial: Structured Prediction: A Large Margin Approach. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2007.

Expectation Maximization and Posterior Constraints, J. Graca, K. Ganchev, and B. Taskar. Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2007.

Book Chapter: Graphical Models in a Nutshell. D. Koller, N. Friedman, L. Getoor, and B. Taskar. In L. Getoor and B. Taskar, editors, Introduction to Statistical Relational Learning, 2007.

Book Chapter: Probabilistic Relational Models. L. Getoor, D. Koller, N. Friedman, A. Pfeffer, B. Taskar. In L. Getoor and B. Taskar, editors, Introduction to Statistical Relational Learning, 2007.

Book Chapter: Relational Markov Networks. B. Taskar, P. Abbeel, M.F. Wong, and D. Koller. In L. Getoor and B. Taskar, editors, Introduction to Statistical Relational Learning, 2007.

Book: Introduction to Statistical Relational Learning, Edited by L. Getoor and B. Taskar. MIT Press, November 2007.

Book: Predicting Structured Data, Edited by G. H. Bakir, T. Hofmann, B. Schölkopf, A. J. Smola, B. Taskar and S. V. N. Vishwanathan. MIT Press, September 2007.

Mixture-of-Parents Maximum Entropy Markov Models, D. Rosenberg, D. Klein and B. Taskar. Uncertainty in Artificial Intelligence (UAI), Vancouver, BC, July 2007.

A Permutation-Augmented Sampler for Dirichlet Process Mixture Models, P. Liang, M. Jordan and B. Taskar. International Conference on Machine Learning (ICML), Corvallis, OR, June 2007.

2006

An End-to-End Discriminative Approach to Machine Translation, P. Liang, A. Bouchard-Cote, D. Klein and B. Taskar. Association for Computational Linguistics (ACL06), Sydney, Australia, July 2006.

Alignment by Agreement, P. Liang, B. Taskar, and D. Klein. Human Language Technology conference - North American chapter of the Association for Computational Linguistics (HLT-NAACL06), New York, June 2006.

Word Alignment via Quadratic Assignment, S. Lacoste-Julien, B. Taskar, D. Klein, and M. Jordan. Human Language Technology conference - North American chapter of the Association for Computational Linguistics (HLT-NAACL06), New York, June 2006.

Structured Prediction, Dual Extragradient and Bregman Projections, B. Taskar, S. Lacoste-Julien, and M. Jordan. Journal of Machine Learning Research (JMLR), Volume 7, 2006. Special Topic on Machine Learning and Large Scale Optimization.

2005

Structured Prediction via the Extragradient Method, B. Taskar, S. Lacoste-Julien, and M. Jordan, Neural Information Processing Systems Conference (NIPS05), Vancouver, British Columbia, December 2005. [Longer version]

A Discriminative Matching Approach to Word Alignment, B. Taskar, S. Lacoste-Julien, and D. Klein, Empirical Methods in Natural Language Processing (EMNLP05), Vancouver, British Columbia, October 2005.

Tutorial: Max-Margin Methods for NLP: Estimation, Structure, and Applications. The Association for Computational Linguistics (ACL05), Ann Arbor, MI, June 2005.

Learning Structured Prediction Models: A Large Margin Approach.  B. Taskar, V. Chatalbashev, D. Koller and C. Guestrin. Twenty Second International Conference on Machine Learning (ICML05), Bonn, Germany, August 2005.

Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data.   D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. International Conference on Computer Vision and Pattern Recognition (CVPR05), San Diego, CA, June 2005.
See 3D Segmentation Project Page

2004

Thesis: Learning Structured Prediction Models: A Large Margin Approach. Stanford University, CA, December 2004.

Exponentiated gradient algorithms for large-margin structured classification, P. Bartlett, M. Collins, B. Taskar and D. McAllester. Neural Information Processing Systems Conference (NIPS04), Vancouver, Canada, December 2004.

Max-Margin Parsing,  B. Taskar, D. Klein, M. Collins, D. Koller and C. Manning. Empirical Methods in Natural Language Processing (EMNLP04), Barcelona, Spain, July 2004. Received best paper award.

Learning Associative Markov Networks,  B. Taskar, V. Chatalbashev and D. Koller. Twenty First International Conference on Machine Learning (ICML04), Banff, Canada, July 2004.

2003

Max-Margin Markov Networks,  B. Taskar, C. Guestrin and D. Koller. Neural Information Processing Systems Conference (NIPS03), Vancouver, Canada, December 2003. Received best student paper award.
OCR dataset from the paper

Link Prediction in Relational Data,  B. Taskar, M. F. Wong, P. Abbeel and D. Koller. Neural Information Processing Systems Conference (NIPS03), Vancouver, Canada, December 2003.

Learning on the Test Data: Leveraging Unseen Features, B. Taskar, M. F. Wong and D. Koller. Twentieth International Conference on Machine Learning (ICML03), Washington, DC, August 2003.

2002

Discriminative Probabilistic Models for Relational Data,  B. Taskar, P. Abbeel and D. Koller. Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI02), Edmonton, Canada, August 2002.

Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller and B. Taskar. Journal of Machine Learning Research (JMLR), 2002.

2001

Probabilistic Clustering in Relational Data,   B. Taskar, E. Segal, and D. Koller. Seventeenth International Joint Conference on Artificial Intelligence (IJCAI01), Seattle, Washington, August 2001.

Probabilistic Models of Text and Link Structure for Hypertext Classification, L. Getoor, E. Segal, B. Taskar, D. Koller. IJCAI01 Workshop on "Text Learning: Beyond Supervision", Seattle, Washington, August 2001.

Rich Probabilistic Models for Gene Expression, E. Segal, B. Taskar, A. Gasch, N. Friedman, and D. Koller. Ninth International Conference on Intelligent Systems For Molecular Biology (ISMB01), Copenhagen, Denmark, July 2001.

Learning Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller and B. Taskar. Eighteenth International Conference on Machine Learning (ICML01), Williamstown, Massachusetts, June 2001.

Selectivity Estimation using Probabilistic Models, L. Getoor, B. Taskar and D. Koller, ACM SIGMOD01 International Conference on Management of Data, Santa Barbara, California, May 2001.

 Funding

My work has been supported by grants from NSF, DARPA, the Sloan Foundation, ONR, ARL and Google. Current grants include:

Sloan: Precise Structured Learning from Imprecise Supervision
NSF CAREER: Computation and Approximation in Structured Learning
NSF: Statistical Learning of Language Universals
NSF: Dynamically-Structured Conditional Random Fields for Complex, Natural Domains
NSF: From Actors to Actions: Analysis and Alignment of Images, Video and Text
ONR YIP: Beyond Labels: Generalized Supervision for Structured Learning
Google: Towards Action Parsing In The Wild
DARPA CSSG: Adaptive Joint Inference and Learning for Visual Recognition
DARPA: Detection, Explanation and Prediction of Emerging Network Developments
ONR MURI: Rich Representations with Exposed Semantics for Deep Visual Reasoning
ARL: Robotics Collaborative Technology Alliance