Query Processing with Optimal Communication Cost

Description

AITF: FULL Query Processing with Optimal Communication Cost

NSF Proposal Number: NSF AITF 1535565

NSF Abstract page: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1535565

The project develops new query processing algorithms for large distributed systems, optimized for communication cost, and then implements and evaluates these algorithms using an open-source big data management system and service. To optimize the communication cost, the project studies a new approach to query evaluation that computes the entire query at once, replacing the traditional approach based on a query plan. The theoretical part of the project builds on a new model, called the Massively Parallel Communication (MPC) model, in which communication is the only cost. The systems development is performed on the Myria big data management system and service.

Principal Investigator: Dan Suciu


Supported by:

NSF AITF-1535565

Members

Babak Salimi
Dan Suciu
Paraschos Koutris
Mahmoud Abo Khamis
Hung Ngo
Shumo Chu
Daniel Li
Remy Wang
Chenglong Wang
Alvin Cheung
Laurel Orr
Magdalena Balazinska
Bas Ketsman

Publications

Babak Salimi, Dan Suciu,
Bias in OLAP Queries: Detection, Explanation, and Removal
In Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pp. 1021-1035, 2018
Paraschos Koutris, Dan Suciu,
Algorithmic Aspects of Parallel Query Processing
In Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pp. 1659-1664, 2018
Mahmoud Abo Khamis, Hung Ngo, Dan Suciu,
What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?
In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pp. 429-444, 2017
Shumo Chu, Daniel Li, Remy Wang, Chenglong Wang, Alvin Cheung, Dan Suciu,
Demonstration of the Cosette Automated SQL Prover
In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, pp. 1591-1594, 2017
Dan Suciu,
Communication Cost in Parallel Query Evaluation: A Tutorial
In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, p. 319, 2017
Laurel Orr, Magdalena Balazinska, Dan Suciu,
Probabilistic Database Summarization for Interactive Data Exploration
Published in CoRR, vol. abs/1703.03856, 2017
Bas Ketsman, Dan Suciu,
A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries
In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pp. 417-428, 2017
Shumo Chu, Alvin Cheung, Dan Suciu,
HoTTSQL: proving query rewrites with univalent SQL semantics
In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017, pp. 510-524, 2017