Privacy becomes an issue whenever a system collects information about its users, so important social issues arise on an individual scale as well. In commercial applications, for example, it may be desirable to restrict access to profile information in order to protect a competitive advantage. And users of personal applications may demand that their profiles remain private simply on moral grounds.
For content-based filtering systems, the privacy issue has two aspects: preventing unauthorized access to the profile and preventing reconstruction of useful information about the profile. The first is a straightforward security problem for which a variety of techniques, such as password protection and encryption, may be appropriate depending on the nature of the anticipated threat. Preventing reconstruction of useful information about the profile is a much more subtle problem. In Tapestry, for example, it would be possible to infer a good deal of information about the profile registered at the server simply by noting which documents were forwarded. An unauthorized observer who can detect which documents are being forwarded to specific users could conceivably build a second text filtering system (e.g., a social filter with an implicit user model) and then train it using the observed document forwarding decisions. Preventing such an attack would require that unauthorized observers be denied access to information about the sources and destinations of individual messages. In the computer security field this is known as the ``traffic analysis problem,'' and cryptographic techniques that address it have been devised (cf. [5,6]).
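The profile-reconstruction attack described above can be sketched in a few lines. This is a hypothetical illustration, not Tapestry's actual protocol: the observer never sees the profile itself, only (document, forwarded-or-not) decisions, and builds a crude surrogate filter from word counts.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

class InferredFilter:
    """Hypothetical observer's model: learns which terms predict
    that a document will be forwarded to the target user."""

    def __init__(self):
        self.forwarded = Counter()
        self.dropped = Counter()

    def observe(self, document, was_forwarded):
        # Each observed forwarding decision is a labeled training example.
        counts = self.forwarded if was_forwarded else self.dropped
        counts.update(tokenize(document))

    def score(self, document):
        # Higher score: document resembles those the hidden profile matched.
        return sum(self.forwarded[w] - self.dropped[w]
                   for w in tokenize(document))

# Toy traffic: the observer watches which documents reach the user.
obs = InferredFilter()
obs.observe("quarterly earnings report for acme", True)
obs.observe("recipe for lentil soup", False)
obs.observe("acme stock earnings forecast", True)

print(obs.score("acme earnings update") > obs.score("soup recipe"))  # True
```

Even this trivial surrogate ranks new documents the way the hidden profile would, which is why hiding the profile alone is insufficient: the forwarding decisions themselves leak it.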
In the case of collaborative filtering, the situation is further complicated by the imperative to share document annotations. A simple approach (used by GroupLens) is to allow each user to adopt a pseudonym. While pseudonyms make it more difficult to associate annotations with users, traffic analysis can still reveal which users read a document. Unfortunately, information about who is reading specific documents is exactly what other authorized users must know in order to perform social filtering. Furthermore, Hill has observed that users choosing which information to examine may find it useful to know the identity (not merely the pseudonym) of the users who made the annotations. While encrypted transmission of annotations to other authorized users is a possibility in such cases, significantly limiting the user group in that way may prevent a social filtering system from reaching the necessary critical mass. This tension between the desire for privacy and the benefit of free exchange of information may ultimately limit the domains in which social filtering can be applied.
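The weakness of pseudonyms against traffic analysis can be made concrete. In this hedged sketch (the pseudonym scheme, user names, and log format are invented for illustration, not taken from GroupLens), an attacker matches the set of documents each pseudonym annotated against the set of documents each network address received; a unique match links pseudonym to identity.

```python
import hashlib

def pseudonym(user_id, salt="example-salt"):
    # Illustrative stable alias per user; any opaque identifier would do.
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:8]

# Annotations are shared under pseudonyms...
annotations = [
    (pseudonym("alice"), "doc-17", "insightful"),
    (pseudonym("alice"), "doc-42", "too long"),
    (pseudonym("bob"),   "doc-17", "disagree"),
]

# ...but a traffic log of (document, recipient) pairs records the
# same reading pattern under the real network identity.
traffic_log = [("doc-17", "alice@host"),
               ("doc-42", "alice@host"),
               ("doc-17", "bob@host")]

def link(annotations, traffic_log):
    """Return, for each pseudonym, the addresses whose received
    documents cover everything that pseudonym annotated."""
    by_pseudo, by_addr = {}, {}
    for p, doc, _ in annotations:
        by_pseudo.setdefault(p, set()).add(doc)
    for doc, addr in traffic_log:
        by_addr.setdefault(addr, set()).add(doc)
    return {p: [a for a, docs in by_addr.items() if docs >= seen]
            for p, seen in by_pseudo.items()}

print(link(annotations, traffic_log))
```

Here the pseudonym that annotated both doc-17 and doc-42 matches only one address, so that user is deanonymized; the one-document pseudonym still has two candidates. The more a user annotates, the more distinctive (and linkable) the pattern becomes.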
The level of protection which must be afforded to privacy varies widely across applications. By common agreement, many details of our private lives (e.g., birth, marriage and death) are a matter of public record. On the other hand, in the state of Maryland it is a crime to divulge the borrowing history of a library patron without a court order. One can even envision applications in which a user might prefer not to know information represented in their own profile. Where these lines should be drawn is a matter of judgement that must ultimately be resolved by those who control the information resources that are being used.