# Scalable Probabilistic Inference for Large Knowledge Bases

### Description

Small: Scalable Probabilistic Inference for Large Knowledge Bases

NSF Proposal Number: NSF III-1614738

NSF Abstract page:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1614738

The proposal addresses the query evaluation problem in Probabilistic Databases. It is motivated by Knowledge Base Construction, an important new problem in data management, which constructs a large structured database from unstructured data. The approach to query evaluation is based on a novel technique, called lifted inference, which allows very efficient evaluation of SQL queries over large probabilistic databases. Lifted inference computes the probability of a SQL query inductively on the structure of the query, without having to first ground the query to compute the large factor graph. While lifted inference is very efficient, it is possible only for some queries. The project pursues several avenues to develop lifted inference, such as combining lifted inference with sampling, extending lifted inference to symmetric databases, and extending lifted inference to queries with negation. It also examines a variety of applications of probabilistic databases and lifted inference.
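To make the idea of computing a probability "inductively on the structure of the query" concrete, here is a minimal sketch in Python. It is not the project's implementation. It assumes a tuple-independent probabilistic database with two hypothetical relations, `R(x)` and `S(x, y)`, and evaluates the Boolean query "there exist x, y such that R(x) and S(x, y)", a standard textbook example of a query that admits lifted inference. All relation names, tuple values, and probabilities below are made up for illustration. A brute-force enumeration of possible worlds is included only as a cross-check.

```python
from itertools import product

# Hypothetical tuple-independent database: each tuple maps to the
# probability that it is present, independently of all other tuples.
R = {"a1": 0.5, "a2": 0.8}
S = {("a1", "b1"): 0.4, ("a1", "b2"): 0.5, ("a2", "b1"): 0.9}


def lifted_probability(R, S):
    """P(exists x, y: R(x) AND S(x, y)), computed on the query structure."""
    prob_query_false = 1.0
    for a, p_r in R.items():
        # Independent project on y: probability that some S(a, y) is present.
        prob_no_s = 1.0
        for (x, _y), p_s in S.items():
            if x == a:
                prob_no_s *= 1.0 - p_s
        p_some_s = 1.0 - prob_no_s
        # Independent join: R(a) is independent of every S(a, y) tuple.
        # Independent project on x: different values of x share no tuples.
        prob_query_false *= 1.0 - p_r * p_some_s
    return 1.0 - prob_query_false


def brute_force_probability(R, S):
    """Same probability, obtained by enumerating every possible world."""
    tuples = [("R", k, p) for k, p in R.items()]
    tuples += [("S", k, p) for k, p in S.items()]
    total = 0.0
    for world in product([False, True], repeat=len(tuples)):
        weight = 1.0
        present_r, present_s = set(), set()
        for keep, (rel, key, p) in zip(world, tuples):
            weight *= p if keep else 1.0 - p
            if keep:
                (present_r if rel == "R" else present_s).add(key)
        # The query holds if some present S(x, y) has a present R(x).
        if any(x in present_r for (x, _y) in present_s):
            total += weight
    return total


if __name__ == "__main__":
    print(lifted_probability(R, S))
    print(brute_force_probability(R, S))
```

On this toy data both functions agree (about 0.818), but the lifted version touches each tuple a small number of times, while the brute-force version enumerates 2^5 worlds and grows exponentially with the number of tuples. The lifted rules used here apply only because this particular query has the required structure, which matches the point above that lifted inference is possible only for some queries.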

Principal Investigator:
Dan Suciu

### Supported by:

NSF III-1614738

### Members

Mahmoud Abo Khamis

Hung Ngo

Dan Suciu

Shumo Chu

Daniel Li

Chenglong Wang

Alvin Cheung

Laurel Orr

Magdalena Balazinska

### Publications

Mahmoud Abo Khamis, Hung Ngo, Dan Suciu,

*What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?*

In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pp. 429--444, 2017

Shumo Chu, Daniel Li, Chenglong Wang, Alvin Cheung, Dan Suciu,

*Demonstration of the Cosette Automated SQL Prover*

In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, pp. 1591--1594, 2017

Laurel Orr, Magdalena Balazinska, Dan Suciu,

*Probabilistic Database Summarization for Interactive Data Exploration*

Published in CoRR, vol. abs/1703.03856, 2017