Introduction
Alice wants to query a database but she does not want the database to learn what she is querying. She can ask for the entire database. Can she get her query answered with less communication? We model a database as (1) an n-bit string x=x1 x2 ... xn, together with (2) a computational agent that can do computations based on both x and queries made to it. Alice wants to obtain xi such that the database does not learn i.
The earliest references for problems of this sort are Rivest et al. [40], Blakely [12] and Feigenbaum [22]. The model in [22] was refined by Abadi et al. [1]. This refined model was the basis for several later papers [4,3].
We will consider a later formulation by Chor et al. [19]-FIRSTPAPER.ps-FIRSTPAPER.pdf. Informally, the database is an n bits string x and Alice wants to access xi without revealing i.
There is another website of papers on PIRs: PRACTICALPIR It is about more practical PIR's then I consider here.
I have also written a survey of the material which appeared in the EATCS; however, I link to an updated survey. MYSURVEY.PS, MYSURVEY.PDF.
Information Theoretic PIR
Assume you have k >= 2 copies of the database. Then there are PIR s of complexity « n which achieve complete information theoretic security. The following list summarizes results.
Conjectures that Imply sublinear PIR
Gentry and Ramzan [23]- LOGN.pdf have an -bit PIR based on a variant of the -hiding problem.
Lipmaa [34]- LOGSQ.ps LOGSQ.pdf have a -bit PIR based on a public key system.
Kushilevitz and Ostrovsky [32]-QR.ps: If Quadratic Residue is hard then there is a 1-DB -bit PIR.
Yamamur and Saito have generalized this scheme to any group in [46]-GROUP.pdf:
Cachin et al. [15]-PHI.ps: If Phi-Hiding Problem is hard then there is a 1-DB polylog PIR .
Stern [42]-H-ENCRTYP-Stern.pdf, and Mann [35]-H-ENCRYPT-MannMasters.ps: If there is a homomorphic encryption then there is an o(n) 1-DB PIR .
Kushilevitz and Ostrovsky [33]-TRAPDOOR.ps: if there exists a One-way Trapdoor Permutation then there is an n-o(n) 1-DB PIR .
Kiayias and Yung, in a paper about secure games, showed that certain assumptions about oblivious transfer imply 1-DB polylog-bits PIRs [31]-GAMES.pdf. Coppersmith and Sudan have recently shown that the underlying hardness assumptions of this scheme are false [43]-BREAKPIR.ps. Bleichenbacher, Kiayias, and Yung also showed how to break it [20]-BREAKPIR2.pdf.
Boneh, Kushilevitz, Ostrovsky, and Skeith show how to get PIR from a particular Public Key Encryption. [14]-publickey.pdf.
Chor and Gilboa [16]-ONEWAY-IMPLIES.ps: if one-way functions exist then there is a 2-DB PIR.
What Do 1-DB Sublinear PIRs Imply?
1-DB (n-o(n))-bit PIR implies One-Way[8]-IMPLIES-ONEWAY.ps.
1-DB (n-o(n))-bit PIR implies Oblivous Transfer[37]-OT.ps.
Practical PIR
A position paper on why PIR might not be practical[41]. PIR-NOT-PRACTICAL.pdf
A paper which has Usable PIR protocols[44]. PIR-IS-PRACTICAL.pdf
Retrieving Different Types of Data
In the standard model Alice only wants one bit. It is more realistic that Alice wants a block of bits. What if the data is partitioned into blocks of m each and Alice wants an entire block. She could invoke a PIR m times. Can she do better? This question was raised by Chor et al. [19]-FIRSTPAPER.ps, FIRSTPAPER.pdf.
What if the database is a list of good stocks to buy and Alice just wants to know if BEATCS Inc. is a good stock? This does not fit our framework since she does not know exactly where in the database that information would be. This problem was considered by Chor and Gilboa [17]-KEYWORD.ps, KEYWORDS.pdf.
Variants of PIR and CPIR
In the standard PIR model the databases never break down (return no answer) and are never Byzantine (return a false answer). Beimel and Stahl [11]-ROBUST.pdf consider what can be done if some of the databases break down or return false answers.
In the standard model Alice may end up learning more than the one bit she is curious about. Gertner et al. [25]-SPIR.ps considered Symmetric PIR where Alice learns only the one bit she asked about. The notion of SPIR has also been looked at in the context of computational PIR by Mishra and Sarkar [36]-CSPIR-CONF.pdf.
In the standard model there are several copies of the database, which may be a security risk. This problem was addressed by Gertner et al. [24]-NO-REPLICATE.ps who suggested that the data not be replicated.
In the basic model we assumed that none of the databases talk to each other. Chor et al. looked at t-private s [18] where no subset of t of them can determine anything about i. This was later pursued by [6]-PRIVATE-PIR.pdf, [13]-PRIVATE2-PIR.ps, and [7]-GENERAL.ps.
In all of the PIR s discussed in the prior sections the database has to do O(n) work. Beimel et al. [10]-PREPROCESS.ps looked at cutting the work down by preprocessing.
In the standard model of PIR there is a lot of communication between Alice and the databases. If random bits could be supplied to both parties, this might help. DiCrescenzo [21]-RANDBITS.ps.
Lower Bounds
Lower Bounds For 2-DB 1-Round PIR Schemes:
Assume only linear queries are allowed. Goldreich et al. [26]-LINEARQUERIES show that if the database sends back a query of length a then Alice must send a query of length at least c(n/) for some constant .
Assume only linear queries are allowed. Chor et al. [19]-FIRSTPAPER.ps, FIRSTPAPER.pdf. show that if the database sends back an answer of length one then each database must get a query of length at least n-1 bits.
(No restrictions on the query.) Kerenidis and de Wolf [30]-QUANTUM.ps. show that if the database sends back an answer of length a then Alice must send a query of length at least c(n/) for some constant . (Note that it was ) in the conference version.) In the case a=1 at least (1-H(11/14))n-40.25n bits are required. They use quantum techniques.
(No restrictions on the query.) Wehner and De Wolf QUANTUM2.ps have used quantum techniques to improve the results of Kerendis and de Wolf to . They also show that, in the general case, every 2-server PIR has total communication at least .
(No restrictions on the query.) Beigel et al. [5]-EXACTLOWER.ps, EXACTLOWER.pdf show that if the database sends back a query of length one then Alice must send a query of length least n-2.
General lower bounds:
Mann [35]-LOWERBOUND-MannMasters.ps: Let k >= 2 and epsilon>0. Every k-DB alpha(n)-bit PIR where every database receives the same number of bits has alpha(n) >= (k*k/(k-1)-epsilon) log n. In particular, taking k=2 and epsilon=1/2, any 2-DB PIR where every database receives the same number of bits has complexity at least 3.5 log n.
Itoh [29]-LOWERBOUNDS-GENERAL.ps. showed the following. Let k,L be naturals. Let epsilon>0. Let P be any k-DB L-multilinear PIR . Let alpha(n) be its complexity. For almost all n, alpha(n)>= .
Lower bounds via locally decodable codes are in [38]. LOWER-LCD.PS.