Opinion
Security and Privacy Viewpoint

Inverse Privacy

Seeking a market-based solution to the problem of a person's unjustified lack of access to their private information.

Call an item of your personal information inversely private if some party has access to it but you do not. The provenance of your inversely private information can be totally legitimate. Your interactions with various institutions—employers, municipalities, financial institutions, health providers, police, toll-road operators, grocery chains, and so forth—create numerous items of personal information, for example, shopping receipts and refilled prescriptions. Due to progress in technology, institutions have become much better than you at recording data. As a result, shared data decays into inversely private data. More inversely private information is produced when institutions analyze your private data.

Your inversely private information, whether collected or derived, allows institutions to serve you better. But access to that information—especially if it were presented to you in a convenient form—would do you much good. It would allow you to correct possible errors in the data, to have a better idea of your health status and your credit rating, and to identify ways to improve your productivity and quality of life.

In some cases, the inaccessibility of your inversely private information can be justified by the necessity to protect the privacy of other people and the legitimate interests of institutions. We argue that there are numerous scenarios where the chances of hurting other parties by giving you access to your data are negligible. The inaccessibility of your inversely private information in such safe scenarios is the inverse privacy problem. A good solution to the problem should not only give you access to your inversely private information but should also make that access convenient.

We analyze the root causes of the inverse privacy problem and discuss a market-based solution for it. We concentrate here on the big picture, leaving many finer points for later analysis.

Some explanations are more natural in a dialogue, and so we include here some discussions between Quisani, ostensibly a former student of the first author, and the authors.


Personal Infoset

For brevity, items of information are called infons.11 An infon is tangible if it has a material embodiment, for example, written down on a piece of paper or recorded in some database. The same infon (as an abstract item of information) may have distinct material embodiments. Herein we restrict attention to infons that are tangible.

We are interested in scenarios where a person interacts with an institution, for example, a shop, a medical office, or a government agency. We say that an infon x is personal to an individual P if (a) x is related to an interaction between P and an institution and (b) x identifies P. A typical example of such an infon is a receipt for a credit-card purchase by a customer in a shop.

Define the personal infoset of an individual P to be the collection of all infons personal to P. Note that the infoset evolves over time. It acquires new infons. It may also lose some infons. But, because of the tangibility restriction, the infoset is finite at any given moment.
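To make the definitions concrete, here is a minimal sketch in Python; the representation (the names Infon and personal_infoset, and the choice of fields) is our own illustration, not a formalism from the article. An infon records whom it is personal to and which parties hold an embodiment of it.

```python
# A toy model of tangible infons and personal infosets; all names here
# are our own illustrative choices.
from dataclasses import dataclass

@dataclass(frozen=True)
class Infon:
    description: str     # e.g., "credit-card receipt, 2016-03-14"
    subjects: frozenset  # individuals the infon is personal to
    holders: frozenset   # parties with access to some embodiment

def personal_infoset(individual, infons):
    """All infons personal to the given individual. The tangibility
    restriction keeps this collection finite at any given moment."""
    return {x for x in infons if individual in x.subjects}

# The running example: a receipt personal to customer P,
# held by both P and the shop.
receipt = Infon("credit-card receipt",
                subjects=frozenset({"P"}),
                holders=frozenset({"P", "Shop"}))
assert receipt in personal_infoset("P", [receipt])
```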

Q: Give me an example of an intangible infon.

A: A fleeting impression that you have of someone who just walked by you.

Q: What about information announced but not recorded at a meeting? One can argue that the collective memory of the participants is a kind of embodiment.

A: Such a case of unrecorded information becomes less and less common. People write notes, write and send email messages, tweet, use their smartphones to make videos, and so forth. Companies tend to tape their meetings. Numerous sensors, such as cameras and microphones, are commonplace and growing in pervasiveness, even in conference rooms. But yes, there are border cases as far as tangibility is concerned. At this stage of our analysis, we abstract them away.

Q: In the shopping receipt example, the receipt may also mention the salesclerk that helped the customer.

A: The clerk represents the shop on the receipt.

Q: But suppose that something went wrong with that particular purchase, the customer complained that the salesclerk misled her, and the shop investigates. In the new context, the person of interest is the salesclerk. The same infon turns out to be personal to more than one individual.

A: This is a good point. The same infon may be personal to more than one individual, but we are interested primarily in contexts where the infon in question is personal to a single individual.


Classification

The personal infoset of an individual P naturally splits into four buckets; a code sketch of the classification follows the list.

  1. The directly private bucket comprises the infons that P has access to but nobody else does.
  2. The inversely private bucket comprises the infons that some party has access to but P does not.
  3. The partially private bucket comprises the infons that P has access to and a limited number of other parties do as well.
  4. The public bucket comprises the infons that are public information.
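Continuing the toy sketch above, the four buckets can be computed from who holds an infon; treating "public" as a distinguished holder is our own simplification, not a definition from the article.

```python
def bucket(x: Infon, individual: str) -> str:
    """Classify an infon personal to `individual` into one of the four
    buckets, based on who holds an embodiment of it."""
    assert individual in x.subjects
    if "public" in x.holders:   # simplification: public = held by "public"
        return "public"
    others = x.holders - {individual}
    if individual in x.holders:
        return "partially private" if others else "directly private"
    return "inversely private"
```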

Q: Why do you call the second bucket "inversely private"?

A: The Merriam-Webster dictionary defines “inverse” as “opposite in order, nature, or effect.” The description of bucket 2 is the opposite of that of bucket 1.

Q: As far as I can see, you discuss just two dimensions of privacy: whom a given infon is personal to, and who has access to the infon. The world is more complex, and there are other dimensions to privacy. Consider for example the pictures in the directly private bucket of my infoset that are personal to me only. Some of the pictures are clearly more private than others; there are degrees of privacy.

A: Indeed, we restrict attention to the two dimensions. But this restricted view is informative, and it allows us to carry on our analysis. Recall that we concentrate here on the big picture, leaving many finer points for later analysis.

Q: Concerning the public bucket of my infoset, how can public information be personal? Personal and public are opposites.

A: You may be confusing personal information with its sensitive part. Not every personal infon is sensitive. For example, the name of our president is personal information as well as public.


Provenance

With time, the personal infoset of an individual acquires new infons. She may create new infons on her own, for example, by taking a selfie, by writing down some observation, or by writing down conclusions she inferred from information available to her.

But the infoset acquires many more new infons due to the interactions of the individual with other parties. The other parties could be people, such as relatives, neighbors, coworkers, clerks, waiters, and medical personnel. They could be institutions, such as employers, banks, Internet providers, brick-and-mortar shops, online shops, and government agencies. The new infons could be factual records, gossip, rumors, or derived information.


The infoset may also lose some infons, especially those with a unique embodiment. For example, the individual may destroy old letters or delete a selfie without sending it to anybody. Institutions also may lose or delete (embodiments of) infons but, in general, these days institutions are much better than people at keeping records.

New items of a personal infoset do not necessarily stay in the bucket where they arose. Because of the modern superiority of institutional bookkeeping, there is a steady flow of information from the partial privacy bucket to the inverse privacy bucket—we look into these dynamics next.
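In terms of the toy sketch above, this decay is a change of holders, not of content: the institution keeps its embodiment while the person's embodiment (or memory) disappears, and the classification flips.

```python
# Decay from partially to inversely private: the shop keeps its copy,
# while P's copy of the record is lost or forgotten.
shared = Infon("loyalty-card purchase record",
               subjects=frozenset({"P"}),
               holders=frozenset({"P", "Shop"}))
print(bucket(shared, "P"))   # partially private

decayed = Infon(shared.description, shared.subjects,
                holders=frozenset({"Shop"}))
print(bucket(decayed, "P"))  # inversely private
```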


The Rise of Inverse Privacy to Dominance

People have always interacted among themselves, and people have interacted with institutions for a very long time, certainly since ancient governments started to collect taxes. Until recently, the capacity of a person to take and keep records was comparable to that of institutions. Yes, the government kept tax records but, by and large, people knew as much about their taxes as the government did. Traditionally, the partial privacy bucket easily dominated the inverse privacy bucket.

Later on, governments, especially dictatorial governments, could marshal resources to collect information on people; a novelist illustrated this power best.13 The most radical change, however, is due to technology introduced in the last 20-30 years. The capacity of public and private institutions to take and keep records became vastly superior to that of a regular person. As a result, the large majority of items in a personal infoset are now generated as inversely or partially private. Often infons start as partially private but quickly decay into inversely private because the institutions remember it all, while the person often hardly remembers that the interaction took place.

For a regular citizen of an advanced society today, the volume of the inverse privacy bucket vastly exceeds that of the partial privacy bucket. Of course it may be simplistic to count bits or even items. A picture of a car has many bits but only so much useful information; even many pictures of the same car may have only so much useful information. It makes more sense to speak about the value of information rather than its volume.

Determining the value of personal information is a difficult problem, particularly because of a gap between what people are willing to pay for keeping an item of information directly private and what they are willing to accept for sharing that same item of information; see Acquisti et al.1 and its references. Nevertheless, we posit that typically the value of the inverse privacy bucket exceeds that of the partial privacy bucket and grows much faster.

Thus, in advanced societies today, the inverse privacy bucket of a typical personal infoset dominates the whole infoset. We see the dominance of inverse privacy as a problem. In this connection, it is important to understand the legal, political, sociological, ethical, and technological implications of the inverse privacy domination.

It is worth emphasizing that the main reason that we live now in a world dominated by inverse privacy is not the invasion of privacy (the tremendous importance of that issue notwithstanding) but the gross disparity in the capability to take and keep records.


The Inverse Privacy Entitlement Principle

Enterprises have legitimate reasons to collect data about their customers; this allows them to serve their customers better. Medical institutions have legitimate reasons to collect data about their patients; this helps them diagnose and treat diseases. Governments have legitimate reasons to collect data about their citizens; this helps them address societal problems.

As noted earlier, institutions are much better than individuals at collecting data. So, in the course of all this collection of data about customers, patients, and citizens, partially private data quickly becomes inversely private. Aside from any surreptitious collection of personal information, this conversion of data from partially private to inversely private is central to the provenance of inversely private information.

Access to your inversely private infons would allow you to correct possible errors in the data, to have a better idea of your health status and your credit rating, and so on.

From an ethical point of view, it is only fair to give you access to your personal infons. Already the 1973 HEW report16 advocated that "[t]here must be no personal-data record-keeping systems whose very existence is secret," and "[t]here must be a way for an individual to find out what information about him is in a record and how it is used." And the 1970 Fair Credit Reporting Act (FCRA) stipulated that, subject to various technical exceptions, "[e]very consumer reporting agency shall, upon request, … clearly and accurately disclose to the consumer" all information in the consumer's file, the sources of the information, and so on.6

Concentrating on the big picture, we ignore technical exceptions here. But we cannot ignore that governments have legitimate security concerns, and businesses have legitimate competition concerns. The 2012 Federal Trade Commission (FTC) report on “Protecting Consumer Privacy in an Era of Rapid Change” is more nuanced: “Companies should provide reasonable access to the consumer data they maintain; the extent of access should be proportionate to the sensitivity of the data and the nature of its use.”8 To this end, we posit:

The Inverse Privacy Entitlement Principle. As a rule, individuals are entitled to access their personal infons. There may be exceptions, but each such exception needs to be justified, and the burden of justification is on the proponents of the exception.

One obvious exception is related to national security. The proponents of that exception, however, would have to justify it. In particular, they would have to justify which parts of national security fall under the exception.


The Inverse Privacy Problem

We say that an institution shares back your personal infons if it gives you access to them. This technical term will make the exposition easier. Institutions may be reluctant to share back personal information, and they may have reasonable justifications: the privacy of other people needs to be protected, there are security concerns, there are competition concerns. But there are numerous safe scenarios where the chances are negligible that sharing back your personal infons would violate the privacy of another person or damage the legitimate interests of the information holding institution or any other institution.

The inverse privacy problem is the inaccessibility to you of your personal information in such safe scenarios.

Q: Give me examples of safe scenarios.

A: Your favorite supermarket has plentiful data about your shopping there. Do you have that data?

Q: No, I don’t.

A: But, in principle, you could have it. So how can sharing that data with you hurt anybody? Similarly, many other businesses and government institutions have information about you that you could in principle have but in fact do not. Some institutions share a part of your inversely private information with you, but only a part. For example, Fitbit sends you weekly summaries, but it has much more information about you.


Q: As you mentioned earlier, institutions have not only raw data about me but also derived information. I can imagine that some of that derived information may be sensitive.

A: Yes, there may be a part of your inversely private information that is too sensitive to be shared with you. Our position is, however, that the burden of proof is on the information-holding institution.

Q: You use judicial terminology. But who is the judge here?

A: The ultimate judge is society.

Q: Let me raise another point. Enabling me to access my inversely private information makes it easier for intruders to find information about me.

A: This is true. Any technology invented to allow inversely private information to be shared back has to be made secure. Communication channels have to be secure, encryption has to be secure, and so forth. Note, however, that today hackers are in a much better position to find your inversely private information than you are. Sharing that information with you should improve the situation.
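As one illustration of the machinery involved, here is a minimal sketch of protecting a shared-back record with the Fernet recipe from the Python cryptography package; the record format and the key handling are purely hypothetical, and a real share-back service would also need authenticated channels and careful key management.

```python
# Encrypting a shared-back record so only the consumer can read it.
# Illustrative only; requires `pip install cryptography`.
import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice, provisioned per user and stored securely
f = Fernet(key)

record = {"store": "R", "date": "2016-03-14", "items": ["milk", "bread"]}
token = f.encrypt(json.dumps(record).encode("utf-8"))

# Only the key holder can decrypt the record.
assert json.loads(f.decrypt(token)) == record
```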


Going Forward

As we pointed out previously, the inverse privacy problem is not simply the result of ill will of governments or businesses. It is primarily a side effect of technological progress. Technology influences the social norms of privacy.17 In the case of inverse privacy, technology created the problem, and technology is instrumental in solving it. Here, we argue that the inverse privacy problem can be solved and will be solved. By default we restrict attention to the safe scenarios described previously.

Social norms. Individuals would greatly benefit from gaining access to their inversely private infons. They would have a much fuller picture of their health, their shopping history, the places they visited, and so on. Besides, they would have an opportunity to correct possible errors in inversely private infons. To what extent do people understand the great benefits of accessing their inversely private infons? We do not have data on the subject, but one indication appears in Leon et al.12: "We asked participants to '[t]hink about the ability to view and edit the information that advertising companies know about you. How much do you agree or disagree with the following,' showing them six statements. 90% of participants believed (agreed, strongly agreed) they should be given the opportunity to view and edit their profiles. A large percentage wanted to be able to decide what advertising companies can collect about them (85%) and saw benefits in being able to view (79%) and edit profiles (81%). The majority thought that the ability to edit their profiles would provide companies with more accurate data (70%) and allow them to better serve the participants (64%)."

As more and more people realize these benefits, they will demand access to their inversely private infons ever more loudly. Indeed, it is easy to underestimate the amount of information that businesses have about their clients. The story of Austrian privacy activist Max Schrems is instructive. "In 2011, Schrems demanded that Facebook give him all the data the company had about him. This is a requirement of European Union (EU) law. Two years later, after a court battle, Facebook sent him a CD with a 1,200-page PDF."14

Social norms will evolve accordingly, toward a broad acceptance of the Inverse Privacy Entitlement Principle described earlier. Institutions should share back personal information as a matter of course. Furthermore, they should do so in a convenient way. Your personal infons should be available to you routinely and easily—just as the photos that you upload to a reputable cloud store. You do not have to file a legal request to obtain them.

The evolving social norms influence the law, and the law helps to shape social norms. Here, for brevity, we restrict attention to U.S. law. We already quoted the 1970 Fair Credit Reporting Act, the 1973 HEW report, and the 2012 FTC report. Here are some additional laws and FTC reports of relevance:

  • A 2000 report of an FTC Advisory Committee on “providing online consumers reasonable access to personal information collected from and about them by domestic commercial Web sites, and maintaining adequate security for that information”7;
  • The 2003 Fair and Accurate Credit Transactions Act providing consumers with annual free credit reports from the three nationwide consumer credit reporting companies;5
  • California's "Shine the Light" law of 2003, according to which a business cannot, without your knowledge, disclose your personal information to a third party for "direct marketing purposes"2; and
  • A 2014 FTC report that calls for laws making data broker practices more transparent and giving consumers greater control over their personal information.9

Clearly the law favors transparency and facilitates your access to your inversely private infons.

Market forces. The sticking point is whether companies will share back our personal information. This information is extremely valuable to them. It gives them competitive advantages, and so it may seem implausible that companies will share it back. We contend that they will, because doing so will be in their business interest.

Sharing back personal information can be competitively advantageous as well. Other things being equal, wouldn't you prefer to deal with a company that shares your personal infons with you? We think so. Companies will compete on (a) how much personal data, collected and derived, is shared back and (b) how conveniently that data is presented to customers.

The evolution toward sharing back personal information seems slow. This will change. Once some companies start sharing back personal data as part of their routine business, the competitive pressure will quickly force their competitors to join in. The competitors will have little choice.

There is money to be made in solving the inverse privacy problem. As sharing back personal information gains ground, the need will arise to mine large amounts of customers' personal data on their behalf. The benefits of owning and processing this data will grow, especially as the data involves financial and quality-of-life domains. We anticipate the emergence of a new market of companies that compete in processing large sets of private data for the benefit of the data producers, that is, consumers.


The miners of personal data will work on behalf of consumers and compete on how helpful and how trustworthy they are. This emerging market will generate its own pressure on the holders of personal data and might even find ways to benefit them as well. For example, if you shop at some retailer R, your personal data miner M may show you a separate webpage devoted to R, suggest ways for you to save money as you shop there, and show you how R intends to improve your shopping experience. The last part may even be written by R but, working on your behalf, M may also suggest better deals or shopping experiences elsewhere. The retailer R will benefit if it can beat the competition.
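To fix ideas, here is a hypothetical interface that such a personal data miner might expose; every name in it is invented for illustration and indicates only the division of labor described above.

```python
# A hypothetical interface for a personal data miner M working on the
# consumer's behalf; all names are invented for illustration.
from typing import Protocol

class PersonalDataMiner(Protocol):
    def ingest_shared_back(self, holder: str, records: list) -> None:
        """Store infons shared back by an institution such as retailer R."""

    def page_for(self, holder: str) -> str:
        """Render the consumer's page about one holder, e.g., savings tips."""

    def better_deals(self, holder: str) -> list:
        """Suggest competing offers elsewhere, acting for the consumer."""
```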

Better record keeping. Finally, technology can enhance people’s capacity to take and keep records. For example, your smartphone or wearable device may eventually become a trusted and universal recorder of many things you do. Technology will help people maintain a personal diary effortlessly.
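One conceivable ingredient of such a trusted recorder, sketched here under our own assumptions, is a tamper-evident log: each diary entry carries a hash chaining it to its predecessor, so later alteration of the record is detectable.

```python
# A tamper-evident personal diary as a hash chain; an illustrative
# sketch, not a description of any existing product.
import hashlib
import json
import time

def append_entry(log, event):
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"time": time.time(), "event": event, "prev": prev}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()
    log.append(body)

diary = []
append_entry(diary, "shopped at retailer R")
append_entry(diary, "30-minute run recorded by a wearable")
```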

The project "Small Data," led by Deborah Estrin at Cornell Tech,3 pioneers such an approach in the domain of health. "Consider a new kind of cloud-based app that would create a picture of your health over time by continuously, securely, and privately analyzing the digital traces you generate as you work, shop, sleep, eat, exercise, and communicate."4

The “small” in “Small Data” reflects the fact that the personal health-related data of one individual isn’t big data.18 In contrast to Estrin’s work, we do not restrict attention to any particular data vertical. In our case, inversely private data of an individual tends to be on the biggish side10—recall the story of Max Schrems described earlier in this Viewpoint.


Figures

UF1 Figure. Watch the authors discuss their work in this exclusive Communications video. http://cacm.acm.org/videos/inverse-privacy
