Scalable Probabilistic Inference for Large Knowledge Bases

Description

Small: Scalable Probabilistic Inference for Large Knowledge Bases

NSF Proposal Number: NSF III-1614738

NSF Abstract page: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1614738

The proposal addresses the query evaluation problem in Probabilistic Databases. It is motivated by Knowledge Base Construction, an important new problem in data management, which constructs a large structured database from unstructured data. The approach to query evaluation is based on a novel technique, called lifted inference, which allows very efficient evaluation of SQL queries over large probabilistic databases. Lifted inference computes the probability of a SQL query inductively on the structure of the query, without having to first ground the query to compute the large factor graph. While lifted inference is very efficient, it is possible only for some queries. The project pursues several avenues to develop lifted inference, such as combining lifted inference with sampling, extending lifted inference to symmetric databases, and extending lifted inference to queries with negation. It also examines a variety of applications of probabilistic databases and lifted inference.
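To make the idea of computing a probability "inductively on the structure of the query" concrete, here is a minimal sketch in Python. It is not the project's implementation. It assumes a tuple-independent probabilistic database and the Boolean query "there exist x, y such that R(x) and S(x, y)"; the relations R and S, their probabilities, and both function names are hypothetical and chosen only for illustration. The brute-force function enumerates all possible worlds as a simple stand-in for grounded inference, to show that the two computations agree.

```python
from itertools import product

# Hypothetical tuple-independent database (an assumption for this sketch):
# each tuple is present independently with the given marginal probability.
R = {"a": 0.5, "b": 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.1}


def lifted_probability(R, S):
    """P(exists x, y: R(x) AND S(x, y)), computed on the query structure."""
    prob_no_witness = 1.0
    for x, p_r in R.items():
        # Independent project over y: P(no S(x, y) tuple is present).
        prob_no_s = 1.0
        for (sx, _), p_s in S.items():
            if sx == x:
                prob_no_s *= 1.0 - p_s
        # Independent join: R(x) and the S(x, _) tuples are distinct tuples.
        p_x = p_r * (1.0 - prob_no_s)
        # Independent project over x: different x values use disjoint tuples.
        prob_no_witness *= 1.0 - p_x
    return 1.0 - prob_no_witness


def brute_force_probability(R, S):
    """Same probability by enumerating every possible world (exponential)."""
    tuples = [("R", k, p) for k, p in R.items()]
    tuples += [("S", k, p) for k, p in S.items()]
    total = 0.0
    for world in product([False, True], repeat=len(tuples)):
        weight = 1.0
        r_present, s_present = set(), set()
        for present, (rel, key, p) in zip(world, tuples):
            weight *= p if present else 1.0 - p
            if present:
                (r_present if rel == "R" else s_present).add(key)
        # The query holds if some present S(x, y) has a present R(x).
        if any(x in r_present for (x, _) in s_present):
            total += weight
    return total


lifted = lifted_probability(R, S)
brute = brute_force_probability(R, S)
print(lifted, brute)
```

The lifted computation touches each tuple once, while the enumeration visits 2^n worlds for n tuples. The sketch works because this particular query decomposes into independent parts; as the description notes, not every query admits such a decomposition.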

Principal Investigator: Dan Suciu


Supported by:

NSF III-1614738

Members

Babak Salimi,
Dan Suciu,
Shumo Chu,
Alvin Cheung,
Mahmoud Abo Khamis,
Hung Ngo,
Daniel Li,
Chenglong Wang,
Laurel Orr,
Magdalena Balazinska,

Publications

Babak Salimi, Dan Suciu,
Bias in OLAP Queries: Detection, Explanation, and Removal
In Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pp. 1021--1035, 2018
Shumo Chu, Alvin Cheung, Dan Suciu,
Axiomatic Foundations and Algorithms for Deciding Semantic Equivalences of SQL Queries
Published in CoRR, vol. abs/1802.02229, 2018
Mahmoud Abo Khamis, Hung Ngo, Dan Suciu,
What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?
In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pp. 429--444, 2017
Shumo Chu, Daniel Li, Chenglong Wang, Alvin Cheung, Dan Suciu,
Demonstration of the Cosette Automated SQL Prover
In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, pp. 1591--1594, 2017
Laurel Orr, Magdalena Balazinska, Dan Suciu,
Probabilistic Database Summarization for Interactive Data Exploration
Published in CoRR, vol. abs/1703.03856, 2017
Laurel Orr, Dan Suciu, Magdalena Balazinska,
Probabilistic Database Summarization for Interactive Data Exploration
Published in PVLDB, vol. 10, no. 10, pp. 1154--1165, 2017
Shumo Chu, Alvin Cheung, Dan Suciu,
HoTTSQL: proving query rewrites with univalent SQL semantics
In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017, pp. 510--524, 2017