Kunal Talwar: The Geometry of Approximate Differential Privacy
In this talk, I will discuss trade-offs between accuracy and privacy
in the context of linear queries over histograms. This is a rich class
of queries that includes contingency tables and range queries, and has
been a focus of a long line of work. For a set of $d$ linear queries
over a database $x \in \R^N$, we seek the differentially
private mechanism with the minimum mean squared error. I will
first describe an $O(\log^2 d)$-approximation algorithm for this
problem, for the case of $(\eps,\delta)$-differential privacy. The
mechanism is simple and efficient: it adds correlated Gaussian noise
to the answers. We prove its approximation guarantee relative to the
hereditary discrepancy lower bound of Muthukrishnan and Nikolov, using
tools from convex geometry.
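To make the noise-addition step concrete, below is a minimal sketch of a generalized (correlated) Gaussian mechanism for linear queries. The function name, the shaping matrix L, and the neighboring-database convention (histograms differing by at most one in a single coordinate) are illustrative assumptions; the particular correlation used in the talk, which comes from convex geometry, is not reproduced here.

```python
import numpy as np

def correlated_gaussian_mechanism(A, x, eps, delta, L=None, rng=None):
    """Answer the d linear queries Ax with correlated Gaussian noise.

    A : (d, N) query matrix;  x : (N,) histogram.
    L : (d, d) invertible shaping matrix; the noise covariance is
        sigma^2 * L L^T, and L = I recovers the standard Gaussian
        mechanism.  (The choice of L in the talk is not reproduced here.)
    Assumes neighboring histograms differ by at most 1 in one coordinate.
    """
    rng = np.random.default_rng() if rng is None else rng
    d, N = A.shape
    if L is None:
        L = np.eye(d)
    # l2-sensitivity of the queries after whitening by L^{-1}:
    # the largest column norm of L^{-1} A.
    sens = np.max(np.linalg.norm(np.linalg.solve(L, A), axis=0))
    # Standard (eps, delta) Gaussian-mechanism calibration (valid for eps <= 1).
    sigma = sens * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    # Correlated noise z = L w with w ~ N(0, sigma^2 I).
    z = L @ rng.normal(scale=sigma, size=d)
    return A @ x + z
```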
We will next consider this question in the case when the number of
queries exceeds the number of individuals in the database, i.e. when
$d > n = \|x\|_1$. It is known that better mechanisms exist in this
setting. I will then describe an $(\eps,\delta)$-differentially
private mechanism which is optimal up to a $\polylog(d,N)$ factor for
any given query set $A$ and any given upper bound $n$ on $\|x\|_1$.
This approximation is achieved by coupling the Gaussian noise addition
approach with a linear regression step. We give an analogous result
for the $\eps$-differential privacy setting. We also improve on the
mean squared error upper bound for answering counting queries on a
database of size $n$ by Blum, Ligett, and Roth, and match the lower
bound implied by the work of Dinur and Nissim up to logarithmic
factors.
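One schematic way to read the regression step, under the assumption that it is a constrained least-squares fit of the noisy answers over databases of size at most $n$ (the precise constraint set used in the talk may differ), is
\[
  y = Ax + z, \qquad
  \hat{x} \in \arg\min_{x' \ge 0,\ \|x'\|_1 \le n} \|Ax' - y\|_2, \qquad
  \text{output } A\hat{x}.
\]
Since the true answer vector $Ax$ lies in the convex set onto which $y$ is projected, this projection can only decrease the $\ell_2$ error of the noisy answers.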
The connection between hereditary discrepancy and the privacy
mechanism also enables us to derive the first polylogarithmic
approximation to the hereditary discrepancy of a matrix $A$.
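Here discrepancy and hereditary discrepancy refer to the standard notions
\[
  \mathrm{disc}(A) = \min_{z \in \{-1,+1\}^N} \|Az\|_\infty, \qquad
  \mathrm{herdisc}(A) = \max_{S \subseteq [N]} \mathrm{disc}(A|_S),
\]
where $A|_S$ denotes the submatrix of $A$ consisting of the columns indexed by $S$.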
This talk is based on joint work with Alex Nikolov and Li Zhang.
