Data Security in Global Information Exchange
Description
How can we ensure data confidentiality (i.e. only authorized parties
are allowed to see the data) and data integrity (i.e. only changes
permitted by the author have been performed on the data) in global
information exchange ? This project studies several fundamental
problems related to security in data exchange.
-
Information theoretic definitions of data privacy.
In our
SIGMOD'04 paper we have proposed to use probabilistic data to reason
about information leakages in data exchange and proposed a notion of
perfect security
. In our ICDT'2005 paper we relaxed this to
practical security
and developed algorithm for checking whether
a query/view pair is practically secure.
-
Encryption and anonymization methods
In our VLDB'04 paper we
have proposed a new model for enforcing access control policies over a
large XML document through encryption; this method was implemented by
Colin Pop
a.pop@comcast.net
, and
the tool is available here
http://www.cs.washington.edu/homes/suciu/CRYPTOXML/cryptoXML.tar.gz
.
In our VLDB'2007 paper we are proposing a new anonymization algorithm
for published data based on adding
random noise
to the private
data set, and prove that this algorithm achieves almost optimal
security/utility tradeoffs. In our VLDB'07 paper we have described a
novel, randomized algorithm for anonymizing tabular data. The
algorithm, called the
alpha-beta algorithm
, randomly
removes some tuples from a table, then randomly inserts some new,
spurious tuples in the table (it constructs such tuples by randomly
combining attribute values from other, existing tuples in the table).
This algorithm is both provably secure, and has provable utility
guarantees.
-
Access control on uncertain data
In our VLDB'2008 paper we
study the following problem: The system needs to make grant/deny
decisions, but the data that the system typically uses to make that
decision is uncertain. For example when managing RFID data, access
control rules grant access to data conditional on information such as
locations, co-locations, possession: a user is allowed to query for
the location of another user if they were physically co-located; or
she is allowed to ask "where is my coffee mug ?" only if her coffee
mug is not in the possession of another person. But RFID data is
often uncertain, hence probabilistic, and the system needs to make
grant/deny decisions based on probabilist data. We have developed an
access control mechanism based on random perturbations. In response
to a request, the system never denies the request. Instead it returns
a perturbed answer, where the amount of perturbation is proportional
to its degree of confidence it has that the user is allowed to access
the data.
-
Differential v.s. adversarial privacy
In our PODS'2009 paper
we ask the following question. Suppose a data perturbation
algorithm is differentially private. What are the adversarites that
A protects against ? We have given a precise characterization of
these adversaries: they are, intuitively, those adversaries that do
not know
positive correlations
between the tuples in the
private database. We this insight, we describe a practical
algorithm for perturbing queries over networked data, such as social
networks, which cannot be answered accurately enough using a
differentially private algorithm. By relaxing the class of
adversaries, we show how queries over networked data can be answered
while preserving privacy.
-
Boosting accuracy of differentially private mechansisms by
using consistency constraints
In our VLDB 2010 paper we
describe novel ways to improve the accuracy of differentially
private mechanisms, by enforcing consistency constraints. We
showed that it is possible to signicantly improve the accuracy of
a general class of histogram queries while satisfying differential
privacy. Our approach carefully chooses a set of queries to
evaluate, and then exploits consistency constraints that should
hold over the noisy output. In a postprocessing phase, we compute
the consistent input most likely to have produced the noisy
output. The final output is differentially-private and consistent,
but in addition, it is often much more accurate. We show, both
theoretically and experimentally, that these techniques can be
used for estimating the degree sequence of a graph very precisely,
and for computing a histogram that can support arbitrary range
queries accurately.
Students, past and current:
-
Vibhor Rastogi (graduated in 2010; Encryption and anonymization
methods, Access control on uncertain data, Differential
v.s. adversarial privacy, Boosting accuracy of differentially
private mechansisms by using consistency constraints).
-
Nodira
Koussainova
(Access control on uncertain data)
Supported by:
NSF IIS-0415193, NSF IIS-0428168 (Purdue) and NSF IIS-0627585 (Cornell)
Members
Vibhor Rastogi,
Gerome Miklau,
Dan Suciu,
Nodira Khoussainova,
Magdalena Balazinska,
Wolfgang Gatterbauer,
Evan Welbourne,
Nilesh Dalvi,
Travis Kriplean,
Gaetano Borriello,
Publications
Vibhor Rastogi, Gerome Miklau, Dan Suciu,
Boosting the Accuracy of Differentially Private Histograms Through Consistency
In VLDB, 2010
Nodira Khoussainova, Magdalena Balazinska, Wolfgang Gatterbauer, Dan Suciu,
A Case for A Collaborative Query Management System
In CIDR, 2009
Vibhor Rastogi, Gerome Miklau, Dan Suciu,
Relationship privacy: output perturbation for queries with joins
In PODS, pp. 107-116, 2009
Vibhor Rastogi, Dan Suciu, Evan Welbourne,
Access Control over Uncertain Data
In VLDB, 2008
Nilesh Dalvi,
Query Evaluation on a Database Given by a Random Graph
In ICDT, pp. 149-163, 2007
Travis Kriplean, Evan Welbourne, Nodira Khoussainova, Vibhor Rastogi, Magdalena Balazinska, Gaetano Borriello, Dan Suciu,
Physical Access Control for Captured RFID Data
Published in IEEE Pervasive Computing (Special issue on Security and Privacy in Pervasive Computing), vol. 6 , no. 4 , pp. 48-55 , October-December , 2007
Gerome Miklau, Dan Suciu,
A formal analysis of information disclosure in data exchange
Published in J. Comput. System Sci., vol. 73 , no. 3 , pp. 507-534 , 2007
Vibhor Rastogi, Dan Suciu,
The Boundary Between Privacy and Utility in Data Publishing
In VLDB, 2007
Nilesh Dalvi, Gerome Miklau, Dan Suciu,
Asymptotic Conditional Probabilities for Conjunctive Queries
In ICDT, 2005
Gerome Miklau, Dan Suciu,
A Formal Analysis of Information Disclosure in Data Exchange
In SIGMOD, 2004
Gerome Miklau, Dan Suciu,
Controlling Access to Published Data Using Cryptography
In VLDB, pp. 898-909, September, 2003
Gerome Miklau, Dan Suciu,
Cryptographically Enforced Conditional Access for XML
In Proceedings of WebDB, 2002