Scalable Probabilistic Inference for Large Knowledge Bases

Description

Small: Scalable Probabilistic Inference for Large Knowledge Bases

NSF Proposal Number: NSF III-1614738

NSF Abstract page: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1614738

The proposal addresses the query evaluation problem in Probabilistic Databases. It is motivated by Knowledge Base Construction, an important new problem in data management, which constructs a large structured database from unstructured data. The approach to query evaluation is based on a novel technique, called lifted inference, which allows very efficient evaluation of SQL queries over large probabilistic databases. Lifted inference computes the probability of a SQL query inductively on the structure of the query, without having to first ground the query to compute the large factor graph. While lifted inference is very efficient, it is possible only for some queries. The project pursues several avenues to develop lifted inference, such as combining lifted inference with sampling, extending lifted inference to symmetric databases, and extending lifted inference to queries with negation. It also examines a variety of applications of probabilistic databases and lifted inference.
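
As a rough illustration of the idea (a sketch only, not the project's implementation), consider the Boolean query "exists x, y such that R(x) and S(x, y)" over a tuple-independent probabilistic database. The relations R and S, their probabilities, and the function names below are hypothetical and chosen only for this example. The lifted version follows the structure of the query: an independent project over y, an independent join of R(x) with S(x, y), and an independent project over x. The grounded version enumerates every possible world and is included only to check the result on a tiny instance.

```python
from itertools import product

# Hypothetical tuple-independent database: each tuple maps to its marginal probability.
R = {"a": 0.5, "b": 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.9}


def lifted_probability(R, S):
    """P(exists x, y: R(x) AND S(x, y)), computed inductively on the query structure."""
    prob_no_witness = 1.0
    for x, p_r in R.items():
        # Independent project on y: P(exists y: S(x, y)) = 1 - prod_y (1 - p(S(x, y))).
        prob_no_s = 1.0
        for (x2, _y), p_s in S.items():
            if x2 == x:
                prob_no_s *= 1.0 - p_s
        p_exists_s = 1.0 - prob_no_s
        # Independent join: R(x) and the S(x, .) tuples are distinct, independent tuples.
        # Independent project on x: different values of x touch disjoint sets of tuples.
        prob_no_witness *= 1.0 - p_r * p_exists_s
    return 1.0 - prob_no_witness


def grounded_probability(R, S):
    """Brute force over all 2^n possible worlds; exponential, for checking only."""
    tuples = [("R", k, p) for k, p in R.items()] + [("S", k, p) for k, p in S.items()]
    total = 0.0
    for world in product([False, True], repeat=len(tuples)):
        weight = 1.0
        present_r, present_s = set(), set()
        for (rel, key, p), present in zip(tuples, world):
            weight *= p if present else 1.0 - p
            if present:
                (present_r if rel == "R" else present_s).add(key)
        # The query holds in this world if some S(x, y) is present together with R(x).
        if any(x in present_r for (x, _y) in present_s):
            total += weight
    return total


if __name__ == "__main__":
    print(lifted_probability(R, S))
    print(grounded_probability(R, S))
```

The lifted computation makes one pass over the tuples for each value of x, while the grounded check grows exponentially with the number of tuples. The two agree here because this particular query is one for which lifted inference applies; as noted above, that is not the case for every query.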

Principal Investigator: Dan Suciu


Supported by:

NSF III-1614738

Members

Mahmoud Abo Khamis,
Hung Ngo,
Dan Suciu,
Shumo Chu,
Daniel Li,
Chenglong Wang,
Alvin Cheung,
Laurel Orr,
Magdalena Balazinska

Publications

Mahmoud Abo Khamis, Hung Ngo, Dan Suciu,
What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?
In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pp. 429--444, 2017
Shumo Chu, Daniel Li, Chenglong Wang, Alvin Cheung, Dan Suciu,
Demonstration of the Cosette Automated SQL Prover
In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, pp. 1591--1594, 2017
Laurel Orr, Magdalena Balazinska, Dan Suciu,
Probabilistic Database Summarization for Interactive Data Exploration
Published in CoRR, vol. abs/1703.03856, 2017