PEDRO DOMINGOS

Professor Emeritus

Address:
Allen School of Computer Science & Eng.
University of Washington
Box 352350
Seattle, WA 98195-2350

Telephone: (206) 543-4229
Fax: (206) 543-2969
Email: pedrod@cs.washington.edu
Office: 466 Allen Center

Twitter: @pmddomingos

Brief Bio

I'm a professor emeritus of computer science and engineering at the University of Washington and the author of 2040 and The Master Algorithm. I'm a winner of the SIGKDD Innovation Award and the IJCAI John McCarthy Award, two of the highest honors in data science and AI. I'm a Fellow of the AAAS and AAAI, and I've received an NSF CAREER Award, a Sloan Fellowship, a Fulbright Scholarship, an IBM Faculty Award, several best paper awards, and other distinctions. I received an undergraduate degree (1988) and M.S. in Electrical Engineering and Computer Science (1992) from IST, in Lisbon, and an M.S. (1994) and Ph.D. (1997) in Information and Computer Science from the University of California at Irvine. I'm the author or co-author of over 200 technical publications in machine learning, data science, and other areas. I'm a member of the editorial board of the Machine Learning journal, co-founder of the International Machine Learning Society, and past associate editor of JAIR. I was program co-chair of KDD-2003 and SRL-2009, and I've served on the program committees of AAAI, ICML, IJCAI, KDD, NIPS, SIGMOD, UAI, WWW, and others. I've written for the Wall Street Journal, Spectator, Scientific American, Wired, and others. I helped start the fields of statistical relational AI, data stream mining, adversarial learning, machine learning for information integration, and influence maximization in social networks.

Vita

Research Interests

My main research interests are in the fields of machine learning and data mining. I'd like to make computers do more with less help from us, learn from experience, adapt effortlessly, and discover new knowledge. We need computers that reduce the information overload by extracting the important patterns from masses of data. This poses many deep and fascinating scientific problems: How can a computer decide autonomously which representation is best for target knowledge? How can it tell genuine regularities from chance occurrences? How can pre-existing knowledge be exploited? How can a computer learn with limited computational resources? How can learned results be made understandable by us?

My research addresses these and related questions. Research topics that I'm working on, or have recently worked on, include:

Learning concepts represented by sets of rules
Using examples as implicit definitions of concepts
Using probabilistic representations and analyses to address the uncertainty inherent in learning
Automating the process of selecting representations for concepts
Learning several models and combining them to improve accuracy and stability
Evaluating and selecting candidate models to avoid "overfitting" (i.e., to distinguish between genuine regularities and chance occurrences)
Learning models that can be easily understood by people
Using pre-existing knowledge to guide and improve learning
Developing knowledge discovery algorithms that run in linear or near-linear time, and so scale up to large databases
Using subsampling techniques to scale up pre-existing approaches
Developing algorithms that take into account the costs of decisions
Understanding the probabilistic properties and foundations of data mining algorithms
Developing techniques for mining semi-structured data sources (e.g., text, the Web)

Current Projects

Statistical Relational Learning: Learning from noisy data in rich representations
Tractable Deep Learning: Learning deep models where inference is tractable
Machine Reading: Extracting knowledge bases from text
Collective Knowledge Bases: Merging knowledge from a multitude of sources
Large-Scale Machine Learning: Mining massive data streams

Students and Postdocs

Corin Anderson, Software Engineer, Google.
Jesse Davis, Professor, University of Leuven.
AnHai Doan, Professor, University of Wisconsin, Madison. (Winner of the 2003 ACM Distinguished Dissertation Award.)
Abram Friesen, Research Scientist, DeepMind.
Robert Gens, Research Scientist, Google.
Vibhav Gogate, Professor, University of Texas at Dallas.
Geoff Hulten, VP of Engineering, Dropbox.
Stanley Kok, Assistant Professor, National University of Singapore.
Tessa Lau, Founder/CEO, Dusty Robotics.
Daniel Lowd, Associate Professor, University of Oregon.
Xu Miao, Applied Researcher, Microsoft Corp.
Aniruddh Nath, Software Engineer, Google.
Mathias Niepert, Professor, University of Stuttgart.
Hoifung Poon, General Manager, Microsoft Health Futures.
Matt Richardson, Senior Principal Researcher, Microsoft Research.
Parag Singla, Professor, IIT Delhi.

Software

Alchemy: Statistical relational AI.
SPN: Sum-product networks for tractable deep learning.
RDIS: Recursive decomposition for nonconvex optimization.
BVD: Bias-variance decomposition for zero-one loss.
NBE: Bayesian learner with very fast inference.
RISE: Unified rule- and instance-based learner.
VFML: Toolkit for mining massive data sources.

Selected Talks

How AI Will Exponentially Increase Our Collective Intelligence. TEDAI, San Francisco, 2024.
Deep Networks Are Kernel Machines. University of Lisbon, 2024.
Machine Learning: The Last 20 Years and the Next. University of Washington, 2022.
Managing in the Age of AI. C3 Transform, Miami, 2022.
Unifying Logical and Statistical AI with Markov Logic. IBIS-20, 2020.
The Dangers of AI: Real and Imaginary. University of Toronto, 2019.
How Will AI Change Ethics? Institute for Advanced Study, Princeton, 2019.
Sum-Product Networks: The Next Generation of Deep Models. Georgia Tech, Atlanta, 2017.
The Next Hundred Years of Your Life. TEDxLA, Los Angeles, 2016.
The Master Algorithm. Google, Mountain View, 2015.
The Quest for the Master Algorithm. TEDxUofW, Seattle, 2015.
Symmetry-Based Learning. ICLR-14, Banff, 2014.
Principles of Very Large Scale Modeling. KDD-14, New York, 2014.

Books

2040: A Silicon Valley Satire. Random Noise Books, 2024.
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Basic Books, 2015.
Markov Logic: An Interface Layer for Artificial Intelligence, with Daniel Lowd. Morgan & Claypool, 2009.
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, with C. Faloutsos, T. Senator, H. Kargupta and L. Getoor (eds.). ACM Press, 2003.
A Unified Approach to Concept Learning. Ph.D. Dissertation, Department of Information and Computer Science, University of California, Irvine, 1997.

Selected Book Chapters

What's Missing in AI: The Interface Layer. In P. Cohen (ed.), Artificial Intelligence: The First Hundred Years. Menlo Park, CA: AAAI Press. To appear.
Markov Logic, with various coauthors. In L. De Raedt, P. Frasconi, K. Kersting and S. Muggleton (eds.), Probabilistic Inductive Logic Programming (pp. 92-117), 2008. New York: Springer.
Markov Logic: A Unifying Framework for Statistical Relational Learning, with Matt Richardson. In L. Getoor and B. Taskar (eds.), Introduction to Statistical Relational Learning (pp. 339-371), 2007. Cambridge, MA: MIT Press.
Combining Link and Content Information in Web Search, with Matt Richardson. In M. Levene and A. Poulovassilis (eds.), Web Dynamics (pp. 179-193), 2004. New York: Springer.
Ontology Matching: A Machine Learning Approach, with AnHai Doan, Jayant Madhavan and Alon Halevy. In S. Staab and R. Studer (eds.), Handbook on Ontologies in Information Systems (pp. 385-403), 2004. New York: Springer.
Machine Learning. In W. Klosgen and J. Zytkow (eds.), Handbook of Data Mining and Knowledge Discovery (pp. 660-670), 2002. New York: Oxford University Press.
Learning Repetitive Text-Editing Procedures with SMARTedit, with Tessa Lau, Steve Wolfman and Dan Weld. In H. Lieberman (ed.), Your Wish Is My Command: Giving Users the Power to Instruct their Software (pp. 209-225), 2001. San Francisco, CA: Morgan Kaufmann.

Selected Essays

There's Only One Good Way to Regulate AI. Medium, 2024.
Is AI a Danger to Democracy? Medium, 2024.
AI's Greatest Risk Is Not Having Enough of It. Medium, 2024.
No, the AI Sky Isn't Falling. Medium, 2023.
Pay Researchers for Results, Not Plans. Times Higher Education, 2022.
Beating Back Cancel Culture. Quillette, 2021.
Stop the Politicization of AI. Spectator, 2020.
Artificial Intelligence Will Serve Humans, Not Enslave Them. Scientific American, 2018.
How Not to Regulate the Data Economy. Medium, 2018.
Ten Myths About Machine Learning. Medium, 2016.
The Race for the Master Algorithm Has Begun. Wired, 2016.
The Business Opportunity of the Century. The Globe and Mail, 2016.
A Mystery in the Machine. OECD Yearbook, 2016.
How to Train Your AI. Medium, 2016.
Get Ready for Your Digital Model. Wall Street Journal, 2015.
Five Profound Ways that AI Will Change the Way You Live. Omnivoracious: The Amazon Book Review, 2015.
Solving AI: We Need a New Language for Artificial Intelligence. MIT Technology Review, 2009.

Selected Preprints

Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning, with William Agnew. ArXiv, 2021.
Every Model Learned by Gradient Descent Is Approximately a Kernel Machine. ArXiv, 2020.

Selected Journal Papers

Unifying Logical and Statistical AI with Markov Logic, with Daniel Lowd. Communications of the ACM, 62 (7), 74-83, 2019.
On the Latent Variable Interpretation in Sum-Product Networks, with Robert Peharz, Robert Gens and Franz Pernkopf. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39 (10), 2030-2044, 2017.
Probabilistic Theorem Proving, with Vibhav Gogate. Communications of the ACM, 59 (7), 107-115, 2016.
A Few Useful Things to Know about Machine Learning. Communications of the ACM, 55 (10), 78-87, 2012.
Structured Machine Learning: Ten Problems for the Next Ten Years (Section 5 in Structured Machine Learning: The Next Ten Years). Machine Learning, 73, 3-23, 2008.
Toward Knowledge-Rich Data Mining (position paper). Data Mining and Knowledge Discovery, 15, 21-28, 2007.
Markov Logic Networks, with Matt Richardson. Machine Learning, 62, 107-136, 2006.
Mining Social Networks for Viral Marketing (short paper). IEEE Intelligent Systems, 20 (1), 80-82, 2005.
Learning to Match Ontologies on the Semantic Web, with AnHai Doan, Jayant Madhavan, Robin Dhamankar and Alon Halevy. VLDB Journal, 12, 303-319, 2003.
Programming by Demonstration Using Version Space Algebra, with Tessa Lau, Steve Wolfman and Dan Weld. Machine Learning, 53, 111-156, 2003.
Tree Induction for Probability-Based Ranking, with Foster Provost. Machine Learning, 52, 199-216, 2003.
Learning to Match the Schemas of Data Sources: A Multistrategy Approach, with AnHai Doan and Alon Halevy. Machine Learning, 50, 279-301, 2003.
A General Framework for Mining Massive Data Streams, with Geoff Hulten (short paper). Journal of Computational and Graphical Statistics, 12, 2003.
Prospects and Challenges for Multi-Relational Data Mining (position paper). SIGKDD Explorations, 5, 80-83, 2003.
The Role of Occam's Razor in Knowledge Discovery. Data Mining and Knowledge Discovery, 3, 409-425, 1999.
Knowledge Discovery Via Multiple Models. Intelligent Data Analysis, 2, 187-202, 1998.
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, with Michael Pazzani. Machine Learning, 29, 103-130, 1997.
Context-Sensitive Feature Selection for Lazy Learners. Artificial Intelligence Review, 11, 227-253, 1997.
Unifying Instance-Based and Rule-Based Induction. Machine Learning, 24, 141-168, 1996.
Two-Way Induction. International Journal on Artificial Intelligence Tools, 5, 113-125, 1996.

Selected Conference Papers

Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity, with William Agnew and others. Conference on Robot Learning, 2020.
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing, with Abram Friesen. Advances in Neural Information Processing Systems 31, 2018. Montréal, Canada: NIPS Foundation.
Deep Learning as a Mixed Convex-Combinatorial Optimization Problem, with Abram Friesen. Proceedings of the Sixth International Conference on Learning Representations, 2018. Vancouver, Canada: CBLS.
Compositional Kernel Machines, with Robert Gens. Proceedings of the Fifth International Conference on Learning Representations, 2017. Toulon, France: CBLS.
The Sum-Product Theorem: A Foundation for Learning Tractable Models, with Abram Friesen. Proceedings of the Thirty-Third International Conference on Machine Learning, 2016. New York, NY: JMLR.
Learning Tractable Probabilistic Models for Fault Localization, with Aniruddh Nath. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016. Phoenix, AZ: AAAI Press.
Recursive Decomposition for Nonconvex Optimization, with Abram Friesen. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015. Buenos Aires, Argentina: AAAI Press. Winner of the Distinguished Paper Award.
Learning and Inference in Tractable Probabilistic Knowledge Bases, with Mathias Niepert. Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015. Amsterdam, Netherlands: AUAI Press.
Learning Relational Sum-Product Networks, with Aniruddh Nath. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015. Austin, TX: AAAI Press.
On Theoretical Properties of Sum-Product Networks, with Robert Peharz, Sebastian Tschiatschek and Franz Pernkopf. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015. San Diego, CA: JMLR.
Deep Symmetry Networks, with Robert Gens. Advances in Neural Information Processing Systems 27, 2014. Montréal, Canada: Curran Associates.
Symmetry-Based Semantic Parsing, with Chloé Kiddon. Proceedings of the ACL-2014 Workshop on Semantic Parsing, 2014. Baltimore, MD.
Exchangeable Variable Models, with Mathias Niepert. Proceedings of the Thirty-First International Conference on Machine Learning, 2014. Beijing, China: Omnipress.
Approximate Lifting Techniques for Belief Propagation, with Parag Singla and Aniruddh Nath. Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014. Quebec City, Canada: AAAI Press.
Learning the Structure of Sum-Product Networks, with Robert Gens. Proceedings of the Thirtieth International Conference on Machine Learning, 2013. Atlanta, GA: Omnipress.
Structured Message Passing, with Vibhav Gogate. Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, 2013. Bellevue, WA: AUAI Press.
Tractable Probabilistic Knowledge Bases with Existence Uncertainty, with Austin Webb. Proceedings of the Third International Workshop on Statistical Relational Artificial Intelligence, 2013. Bellevue, WA.
Discriminative Learning of Sum-Product Networks, with Robert Gens. Advances in Neural Information Processing Systems 25, 2012. Red Hook, NY: Curran Associates. Winner of the Outstanding Student Paper Award. [Talk video]
A Tractable First-Order Probabilistic Logic, with Austin Webb. Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012. Toronto, Canada: AAAI Press.
Sum-Product Networks: A New Deep Architecture, with Hoifung Poon. Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, 2011. Barcelona, Spain: AUAI Press. Winner of the Best Paper Award.
Probabilistic Theorem Proving, with Vibhav Gogate. Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, 2011. Barcelona, Spain: AUAI Press.
Approximation by Quantization, with Vibhav Gogate. Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, 2011. Barcelona, Spain: AUAI Press.
Coarse-to-Fine Inference and Learning for First-Order Probabilistic Models, with Chloé Kiddon. Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011. San Francisco, CA: AAAI Press.
Implementing Weighted Abduction in Markov Logic, with Jim Blythe, Jerry Hobbs, Rohit Kate and Ray Mooney. Proceedings of the Ninth International Conference on Computational Semantics, 2011. Oxford, UK: ACL SIGSEM.
Learning Efficient Markov Networks, with Vibhav Gogate and Austin Webb. Advances in Neural Information Processing Systems 23, 2010. Red Hook, NY: Curran Associates.
Approximate Inference by Compilation to Arithmetic Circuits, with Daniel Lowd. Advances in Neural Information Processing Systems 23, 2010. Red Hook, NY: Curran Associates.
Formula-Based Probabilistic Inference, with Vibhav Gogate. Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, 2010. Catalina Island, CA: AUAI Press. The algorithm described in this paper was co-winner of the UAI-2010 Inference Challenge.
Unsupervised Ontology Induction from Text, with Hoifung Poon. Proceedings of the Forty-Eighth Annual Meeting of the Association for Computational Linguistics, 2010. Uppsala, Sweden: ACL.
Efficient Lifting for Online Probabilistic Inference, with Aniruddh Nath. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010. Atlanta, GA: AAAI Press.
Efficient Belief Propagation for Utility Maximization and Repeated Inference, with Aniruddh Nath. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010. Atlanta, GA: AAAI Press.
Bottom-Up Learning of Markov Network Structure, with Jesse Davis. Proceedings of the Twenty-Seventh International Conference on Machine Learning, 2010. Haifa, Israel: Omnipress.
Learning Markov Logic Networks Using Structural Motifs, with Stanley Kok. Proceedings of the Twenty-Seventh International Conference on Machine Learning, 2010. Haifa, Israel: Omnipress.
Unsupervised Semantic Parsing, with Hoifung Poon. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009. Singapore: ACL. Winner of the Best Paper Award.
Deep Transfer via Second-Order Markov Logic, with Jesse Davis. Proceedings of the Twenty-Sixth International Conference on Machine Learning (pp. 217-224), 2009. Montréal, Canada: Omnipress.
Learning Markov Logic Network Structure via Hypergraph Lifting, with Stanley Kok. Proceedings of the Twenty-Sixth International Conference on Machine Learning (pp. 505-512), 2009. Montréal, Canada: Omnipress.
A Language for Relational Decision Theory, with Aniruddh Nath. Proceedings of the Sixth International Workshop on Statistical Relational Learning, 2009. Leuven, Belgium.
Joint Unsupervised Coreference Resolution with Markov Logic, with Hoifung Poon. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (pp. 649-658), 2008. Honolulu, HI: ACL.
Extracting Semantic Networks from Text via Relational Clustering, with Stanley Kok. Proceedings of the Nineteenth European Conference on Machine Learning (pp. 624-639), 2008. Antwerp, Belgium: Springer.
Learning Arithmetic Circuits, with Daniel Lowd. Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (pp. 383-392), 2008. Helsinki, Finland: AUAI Press.
Lifted First-Order Belief Propagation, with Parag Singla. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (pp. 1094-1099), 2008. Chicago, IL: AAAI Press.
Hybrid Markov Logic Networks, with Jue Wang. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (pp. 1106-1111), 2008. Chicago, IL: AAAI Press.
A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC, with Hoifung Poon and Marc Sumner. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (pp. 1075-1080), 2008. Chicago, IL: AAAI Press.
Efficient Weight Learning for Markov Logic Networks, with Daniel Lowd. Proceedings of the Eleventh European Conference on Principles and Practice of Knowledge Discovery in Databases (pp. 200-211), 2007. Warsaw, Poland: Springer.
Markov Logic in Infinite Domains, with Parag Singla. Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (pp. 368-375), 2007. Vancouver, Canada: AUAI Press.
Joint Inference in Information Extraction, with Hoifung Poon. Proceedings of the Twenty-Second National Conference on Artificial Intelligence (pp. 913-918), 2007. Vancouver, Canada: AAAI Press.
Statistical Predicate Invention, with Stanley Kok. Proceedings of the Twenty-Fourth International Conference on Machine Learning (pp. 433-440), 2007. Corvallis, OR: ACM Press.
Recursive Random Fields, with Daniel Lowd. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (pp. 950-955), 2007. Hyderabad, India: AAAI Press.
Entity Resolution with Markov Logic, with Parag Singla. Proceedings of the Sixth IEEE International Conference on Data Mining (pp. 572-582), 2006. Hong Kong: IEEE Computer Society Press.
Unifying Logical and Statistical AI, with various coauthors. Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 2-7), 2006. Boston, MA: AAAI Press.
Sound and Efficient Inference with Probabilistic and Deterministic Dependencies, with Hoifung Poon. Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 458-463), 2006. Boston, MA: AAAI Press.
Memory-Efficient Inference in Relational Domains, with Parag Singla. Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 488-493), 2006. Boston, MA: AAAI Press.
Object Identification with Attribute-Mediated Dependences, with Parag Singla. Proceedings of the Ninth European Conference on Principles and Practice of Knowledge Discovery in Databases (pp. 297-308), 2005. Porto, Portugal: Springer. Winner of the Best Paper Award.
Learning the Structure of Markov Logic Networks, with Stanley Kok. Proceedings of the Twenty-Second International Conference on Machine Learning (pp. 441-448), 2005. Bonn, Germany: ACM Press.
Naive Bayes Models for Probability Estimation, with Daniel Lowd. Proceedings of the Twenty-Second International Conference on Machine Learning (pp. 529-536), 2005. Bonn, Germany: ACM Press.
Discriminative Training of Markov Logic Networks, with Parag Singla. Proceedings of the Twentieth National Conference on Artificial Intelligence (pp. 868-873), 2005. Pittsburgh, PA: AAAI Press.
Markov Logic: A Unifying Framework for Statistical Relational Learning, with Matt Richardson. Proceedings of the ICML-2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (pp. 49-54), 2004. Banff, Canada: IMLS.
Multi-Relational Record Linkage, with Parag Singla. Proceedings of the KDD-2004 Workshop on Multi-Relational Data Mining (pp. 31-48), 2004. Seattle, WA: ACM Press.
Adversarial Classification, with Nilesh Dalvi, Mausam, Sumit Sanghai and Deepak Verma. Proceedings of the Tenth International Conference on Knowledge Discovery and Data Mining (pp. 99-108), 2004. Seattle, WA: ACM Press.
Learning Bayesian Network Classifiers by Maximizing Conditional Likelihood, with Dan Grossman. Proceedings of the Twenty-First International Conference on Machine Learning (pp. 361-368), 2004. Banff, Canada: ACM Press.
iMAP: Discovering Complex Semantic Matches between Database Schemas, with Robin Dhamankar, Yoonkyong Lee, AnHai Doan and Alon Halevy. Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (pp. 383-394), 2004. Paris, France: ACM Press.
Building Large Knowledge Bases by Mass Collaboration, with Matt Richardson. Proceedings of the Second International Conference on Knowledge Capture (pp. 129-137), 2003. Sanibel Island, FL: ACM Press.
Learning Programs from Traces Using Version Space Algebra, with Tessa Lau and Dan Weld. Proceedings of the Second International Conference on Knowledge Capture (pp. 36-43), 2003. Sanibel Island, FL: ACM Press.
Trust Management for the Semantic Web, with Matt Richardson and Rakesh Agrawal. Proceedings of the Second International Semantic Web Conference (pp. 351-368), 2003. Sanibel Island, FL: Springer.
Learning with Knowledge from Multiple Experts, with Matt Richardson. Proceedings of the Twentieth International Conference on Machine Learning (pp. 624-631), 2003. Washington, DC: Morgan Kaufmann.
Mining Massive Relational Databases, with Geoff Hulten and Yeuhi Abe. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data (pp. 53-60), 2003. Acapulco, Mexico: IJCAII.
Research on Statistical Relational Learning at the University of Washington, with various coauthors. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data (pp. 43-47), 2003. Acapulco, Mexico: IJCAII.
Relational Markov Models and their Application to Adaptive Web Navigation, with Corin Anderson and Dan Weld. Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (pp. 143-152), 2002. Edmonton, Canada: ACM Press.
Mining Knowledge-Sharing Sites for Viral Marketing, with Matt Richardson. Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (pp. 61-70), 2002. Edmonton, Canada: ACM Press.
Mining Complex Models from Arbitrarily Large Databases in Constant Time, with Geoff Hulten. Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (pp. 525-531), 2002. Edmonton, Canada: ACM Press.
Representing and Reasoning about Mappings between Domain Models, with Jayant Madhavan, Phil Bernstein and Alon Halevy. Proceedings of the Eighteenth National Conference on Artificial Intelligence (pp. 80-86), 2002. Edmonton, Canada: AAAI Press.
Learning to Map between Ontologies on the Semantic Web, with AnHai Doan, Jayant Madhavan and Alon Halevy. Proceedings of the Eleventh International World Wide Web Conference (pp. 662-673), 2002. Honolulu, HI: ACM Press.
Learning from Infinite Data in Finite Time, with Geoff Hulten. Advances in Neural Information Processing Systems 14 (pp. 673-680), 2002. Cambridge, MA: MIT Press.
The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank, with Matt Richardson. Advances in Neural Information Processing Systems 14 (pp. 1441-1448), 2002. Cambridge, MA: MIT Press.
Mining the Network Value of Customers, with Matt Richardson. Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (pp. 57-66), 2001. San Francisco, CA: ACM Press.
Mining Time-Changing Data Streams, with Geoff Hulten and Laurie Spencer. Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (pp. 97-106), 2001. San Francisco, CA: ACM Press.
Adaptive Web Navigation for Wireless Devices, with Corin Anderson and Dan Weld. Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (pp. 879-884), 2001. Seattle, WA: Morgan Kaufmann.
A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering, with Geoff Hulten. Proceedings of the Eighteenth International Conference on Machine Learning (pp. 106-113), 2001. Williamstown, MA: Morgan Kaufmann.
Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach, with AnHai Doan and Alon Halevy. Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data (pp. 509-520), 2001. Santa Barbara, CA: ACM Press.
Personalizing Web Sites for Mobile Users, with Corin Anderson and Dan Weld. Proceedings of the Tenth International World Wide Web Conference (pp. 565-575), 2001. Hong Kong: ACM Press.
Mixed Initiative Interfaces for Learning Tasks: SMARTedit Talks Back, with Steve Wolfman, Tessa Lau and Dan Weld. Proceedings of the 2001 Conference on Intelligent User Interfaces (pp. 167-174), 2001. Santa Fe, NM: ACM Press.
Mining High-Speed Data Streams, with Geoff Hulten. Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (pp. 71-80), 2000. Boston, MA: ACM Press. Winner of the 2015 SIGKDD Test of Time Award.
A Unified Bias-Variance Decomposition for Zero-One and Squared Loss. Proceedings of the Seventeenth National Conference on Artificial Intelligence (pp. 564-569), 2000. Austin, TX: AAAI Press.
Version Space Algebra and its Application to Programming by Demonstration, with Tessa Lau and Dan Weld. Proceedings of the Seventeenth International Conference on Machine Learning (pp. 527-534), 2000. Stanford, CA: Morgan Kaufmann.
A Unified Bias-Variance Decomposition and its Applications. Proceedings of the Seventeenth International Conference on Machine Learning (pp. 231-238), 2000. Stanford, CA: Morgan Kaufmann.
Bayesian Averaging of Classifiers and the Overfitting Problem. Proceedings of the Seventeenth International Conference on Machine Learning (pp. 223-230), 2000. Stanford, CA: Morgan Kaufmann.
Learning Source Descriptions for Data Integration, with AnHai Doan and Alon Levy. Proceedings of the Third International Workshop on the Web and Databases (pp. 81-86), 2000. Dallas, TX: ACM SIGMOD.
MetaCost: A General Method for Making Classifiers Cost-Sensitive. Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (pp. 155-164), 1999. San Diego, CA: ACM Press. Winner of the Best Paper Award for Fundamental Research.
Process-Oriented Estimation of Generalization Error. Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (pp. 714-719), 1999. Stockholm, Sweden: Morgan Kaufmann.
Occam's Two Razors: The Sharp and the Blunt. Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (pp. 37-43), 1998. New York, NY: AAAI Press. Winner of the Best Paper Award for Fundamental Research.
A Process-Oriented Heuristic for Model Selection. Proceedings of the Fifteenth International Conference on Machine Learning (pp. 127-135), 1998. Madison, WI: Morgan Kaufmann.
How to Get a Free Lunch: A Simple Cost Model for Machine Learning Applications. Proceedings of the AAAI-1998/ICML-1998 Workshop on the Methodology of Applying Machine Learning (pp. 1-7), 1998. Madison, WI: AAAI Press.
Knowledge Acquisition from Examples Via Multiple Models. Proceedings of the Fourteenth International Conference on Machine Learning (pp. 98-106), 1997. Nashville, TN: Morgan Kaufmann.
Why Does Bagging Work? A Bayesian Account and its Implications. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (pp. 155-158), 1997. Newport Beach, CA: AAAI Press.
Bayesian Model Averaging in Rule Induction. Preliminary Papers of the Sixth International Workshop on Artificial Intelligence and Statistics (pp. 157-164), 1997. Ft. Lauderdale, FL: Society for Artificial Intelligence and Statistics.
Linear-Time Rule Induction. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (pp. 96-101), 1996. Portland, OR: AAAI Press.
Using Partitioning to Speed Up Specific-to-General Rule Induction. Proceedings of the AAAI-1996 Workshop on Integrating Multiple Learned Models (pp. 29-34), 1996. Portland, OR: AAAI Press.
Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier, with Michael Pazzani. Proceedings of the Thirteenth International Conference on Machine Learning (pp. 105-112), 1996. Bari, Italy: Morgan Kaufmann.
From Instances to Rules: A Comparison of Biases. Proceedings of the Third International Workshop on Multistrategy Learning (pp. 147-154), 1996. Harpers Ferry, WV: AAAI Press.
Two-Way Induction. Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence (pp. 182-189), 1995. Herndon, VA: IEEE Computer Society Press.
Rule Induction and Instance-Based Learning: A Unified Approach. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (pp. 1226-1232), 1995. Montréal, Canada: Morgan Kaufmann.
The RISE System: Conquering Without Separating. Proceedings of the Sixth IEEE International Conference on Tools with Artificial Intelligence (pp. 704-707), 1994. New Orleans, LA: IEEE Computer Society Press.

Teaching

Winter 2018: Statistical Methods in Computer Science (CSE 515).
Autumn 2017: PMP Data Mining (CSEP 546).
Spring 2017: PMP Data Mining (CSEP 546).
Winter 2017: Statistical Methods in Computer Science (CSE 515).
Spring 2016: PMP Data Mining (CSEP 546).
Winter 2016: Statistical Methods in Computer Science (CSE 515).
Spring 2015: Machine Learning (CSE 446).
Winter 2015: Statistical Methods in Computer Science (CSE 515).
Spring 2014: Artificial Intelligence I (CSE 573).
Winter 2014: Machine Learning (CSE 446).
Spring 2012: PMP Data Mining (CSEP 546).
Autumn 2011: Artificial Intelligence II (CSE 574).
Spring 2011: Foundations of Computing II (CSE 312).
Autumn 2010: Introduction to Artificial Intelligence (CSE 473).
Winter 2010: Machine Learning (CSE 546).
Spring 2009: Statistical Methods in Computer Science (CSE 515).
Winter 2009: Machine Learning (CSE 446).
Autumn 2008: Markov Logic Networks (Carnegie Mellon 10-803).
Spring 2008: Data Mining (CSE 546).
Autumn 2007: Applications of Artificial Intelligence (CSEP 573).
Spring 2007: PMP Data Mining (CSEP 546).
Autumn 2006: Artificial Intelligence I (CSE 573).
Spring 2005: Artificial Intelligence II (CSE 574).
Autumn 2004: PMP Data Mining (CSEP 546).
Spring 2004: Statistical Methods in Computer Science (CSE 590ST).
Autumn 2003: Data Mining (CSE 546).
Spring 2003: PMP Data Mining (CSEP 546).
Autumn 2002: Introduction to Artificial Intelligence (CSE 473).
Spring 2002: Introduction to Artificial Intelligence (CSE 473).
Autumn 2001: Data Mining (CSE 546).
Spring 2001: Applications of Artificial Intelligence (CSE 592).
Winter 2001: Machine Learning and Data Mining (CSE 590PD).
Autumn 2000: Introduction to Artificial Intelligence (CSE 473).
Spring 2000: Introduction to Artificial Intelligence (CSE 473).
Winter 2000: Artificial Intelligence II (CSE 574).
Autumn 1999: Artificial Intelligence I (CSE 573).

Other Interests

Literature, cinema, music, travel. Sports: swimming, long-distance running.

Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle

Last modified: March 13, 2025