III: Small: BeliefDB - Adding Belief Annotations to Databases


The goal of this project is the design and implementation of a new data annotation model. In this model, annotations allow users of a shared data repository to maintain different possible "belief worlds" about what should be in the database (with partly overlapping and conflicting information) and to engage in structured argumentation with one another about both content and annotations. The key idea is to give annotations a clearly defined semantics that a database can understand and manage efficiently. The project studies data models, language design, algorithms for conflict resolution, and query evaluation on uncertain or inconsistent data.

Our motivation comes from many of today's large-scale scientific data applications, in which a community of users works together to assemble, revise, and curate a shared repository of data. The traces of individual users in such databases are commonly known as annotations. Examples of such collaborations include curated protein databases, where users store known protein functions and comment on the stored data, and registries of animal sightings, where scientists strive to keep track of animal populations by having volunteers register animal sightings. As the community accumulates knowledge and the database content evolves over time, the database may come to contain conflicting information, and members can disagree on what it should store. In these instances, the database serves not only as a repository for data, but also as a means of communication within the community.

Supported by:

NSF IIS-0915054


Paraschos Koutris,
Dan Suciu,
Paul Beame,
Sudeepa Roy,
Wolfgang Gatterbauer,
Daniel Li,
Gerome Miklau,
Prasang Upadhyaya,
Magdalena Balazinska,
Bill Howe,
Nodira Khoussainova,
Alexandra Meliou,
Abhay Jha,

Web Page:



Paraschos Koutris, Dan Suciu,
A Dichotomy on the Complexity of Consistent Query Answering for Atoms with Simple Keys
In ICDT, 2014
Paul Beame, Jerry Li, Sudeepa Roy, Dan Suciu,
Model Counting of Query Expressions: Limitations of Propositional Methods
In ICDT, 2014
Wolfgang Gatterbauer, Dan Suciu,
Oblivious Bounds on the Probability of Boolean Functions
Published in ACM TODS, vol. 39, no. 1, pp. 191-208, 2014
Chao Li, Daniel Li, Gerome Miklau, Dan Suciu,
A theory of pricing private data
In ICDT, pp. 33-44, 2013
Paul Beame, Jerry Li, Sudeepa Roy, Dan Suciu,
Lower Bounds for Exact Model Counting and Applications in Probabilistic Databases
In UAI, pp. 52-61, 2013
Paul Beame, Paraschos Koutris, Dan Suciu,
Communication steps for parallel query processing
In PODS, pp. 273-284, 2013
Dan Suciu,
Big Data Begets Big Database Theory
In BNCOD, pp. 1-5, 2013
Prasang Upadhyaya, Magdalena Balazinska, Bill Howe, Dan Suciu,
Stop That Query! The Need for Managing Data Use
In CIDR, 2013
Prasang Upadhyaya, Magdalena Balazinska, Bill Howe, Dan Suciu,
The power of data use management in action (demonstration)
In SIGMOD Conference, pp. 1117-1120, 2013
Nodira Khoussainova, Magdalena Balazinska, Dan Suciu,
PerfXplain: Debugging MapReduce Job Performance
Published in PVLDB, vol. 5, no. 7, pp. 598-609, 2012
Alexandra Meliou, Wolfgang Gatterbauer, Dan Suciu,
Reverse Data Management
Published in PVLDB, vol. 4, no. 12, pp. 1490-1493, 2011
Wolfgang Gatterbauer, Abhay Jha, Dan Suciu,
Dissociation and Propagation for Efficient Query Evaluation over Probabilistic Databases
In MUD, 2010
Wolfgang Gatterbauer, Dan Suciu,
Data conflict resolution using trust mappings
In SIGMOD Conference, pp. 219-230, 2010
Wolfgang Gatterbauer, Dan Suciu,
Integrating and Ranking Uncertain Scientific Data
In ICDE, pp. 1235-1238, 2009
Nodira Khoussainova, Magdalena Balazinska, Wolfgang Gatterbauer, Dan Suciu,
A Case for A Collaborative Query Management System
In CIDR, 2009
Wolfgang Gatterbauer, Magdalena Balazinska, Nodira Khoussainova, Dan Suciu,
Believe It or Not: Adding Belief Annotations to Databases
Published in PVLDB, vol. 2, no. 1, pp. 1-12, 2009