A fundamental shift is under way in pervasive computing. Within academic research, pervasive computing in the form of embedded networked sensing has leapt from the laboratory to the natural environment. Simultaneously, in the domain of personal communication and corporate marketing, pervasive computing has entered the backpack, purse, and coat pocket in the form of mobile phones, laying the groundwork for Mark Weiser's vision of ubiquitous computing. We characterize this contextual shift as "urban sensing," which augurs a fundamental transition from science and engineering into the realms of politics, aesthetics, interpretation, and motivation. More than a change in degree, this is a change in kind that warrants careful, transdisciplinary study.
In bucolic Lake Fulmor in the San Jacinto mountains, seven incongruous buoys dangle strings of thermistors to acquire time series of temperature at several different depths (see Figure 1). Suspended from each buoy, at half a meter below the surface, is a submersible fluorometer recording chlorophyll concentrations. A team of biologists and engineers from the University of Southern California oversees the system and collects sensor data wirelessly from shore; visualization tools help this group examine both the physical and biological dynamics in the lake. For a more complete picture of the local environment, data from the buoys is combined with wind speed and other microclimate measurements from a nearby weather tower at the James Reserve, a biological field station that is part of the University of California Natural Reserve System.
A robotic sensing device is installed at the deep end of the lake, sponsored by UCLA's Center for Embedded Networked Sensing (CENS), an NSF Science and Technology Center (CCF-0120778). The robotic system consists of a cable that spans the lake at its widest point, oriented perpendicular to the line of buoys. A small shuttle rides along this cable carrying with it a sensor node that is dipped into the lake at regular intervals. The shuttle submerges the node and its cluster of sensors, taking measurements at several depths. The resulting data forms a grid, profiling temperature, chlorophyll concentration, and about a dozen other variables in the plane of the cable system. When paired with the static buoy data, a model can be formed that captures the important chemical, physical, and biological processes in the lake.
Nearby, buried wireless sensor nodes record soil temperature, moisture, and CO2 concentration, and a robotic camera rides through an acrylic tube shooting pictures of roots and fungi. Still other devices monitor activity in nestboxes, where an image is collected every 15 minutes, then subjected to a series of processing algorithms that recognize whether the box is occupied as well as higher-level events like nest building, egg laying, and hatching. The use of imagers as biological sensors is a new research thread for CENS.
The rollout of these embedded networked sensors has coincided with other advances within the larger area of information technologies, and specifically the proliferation of geocoded data and the accompanying GIS platforms for its visualization. Services such as Google Earth have driven to nearly zero the cost of this visualization measured in terms of dollars, time-to-deploy, and technical sophistication required. So-called "mashups" with Google Maps provide anyone with a Web browser the ability to display data (sounds, images, video, statistics, and so on) in map layers. In combination with the embedded networked sensors, such systems have greatly reduced the technical barrier to visualize data in real space, to construct maps of layered information, and to analyze locational phenomena over time.
The move by CENS from the lab to the forest has been a radical leap forward, pushing the capabilities of sensors and robots, as well as offering rich new understandings of the forest itself. In the last five years, we have seen a shift in the emphasis of sensing research, with greater importance being placed on data, data processing, and mathematical and statistical models for environmental phenomena. While the move to the forest directly furthers CENS' mission to grow technology in the context of specific scientific questions, the forest was an ideal site for scientists to conduct a series of fast experiments that sidestep the thorny cultural problems of ubiquitous surveillance that have entered public debate. With James Reserve as today's reality, we can ask: What happens tomorrow, when pervasive computing comes out of the woods and goes urban?
The James Reserve represents what might be called a full centralization model: sensors, the data they collect, and the ways in which the data is processed are all subject to centralized control by the scientists who plan the sensor deployments. This model cannot, however, scale to the city. Even if the enormous funds were available, scientists lack the property rights to instrument everywhere, and individuals enjoy privacy rights not granted to sparrows.
How then will pervasive computing ever manifest itself in the city? We believe it will be through the little device that 70% of U.S. citizens already carry: the cell phone. Although we think of cell phones as communication devices that we episodically and intentionally use, we should recognize they are also passive sensors that can silently collect, exchange, and process information all day long. Obviously, they are engineered to sense sound (our voices), but they also can sense images and movement through their built-in cameras. Still more interesting, they can sense location through GPS receivers or basic cell phone triangulation. In addition to sight, sound, and location, inside of 15 years, cheap sensors that detect other aspects of the environment, like pollution, will be available as plug-ins. Although various factors such as infrastructural rollout and pricing plans will influence adoption rates, we are confident that within this 15-year period, in most urban centers around the world, processing, visualizing, and uploading sensor data, even large amounts of it, will be accessible to a large percentage of their populations.
If the vector of entry will be an individual's cell phone, we necessarily move away from the James Reserve's full centralization toward a model of distributed citizen-sensing, sometimes called participatory sensing [2, 3]. In this model, although some central authority maintains the basic terms and conditions of data collection as well as the centralized data repository, that authority employs local data collectors (people like us) who voluntarily and idiosyncratically record data. A good example of distributed citizen-sensing is the Great Backyard Bird Count. Still further along this spectrum, we could imagine a fully decentralized model, with no central authority beyond some actor providing basic storage and search. This approach is even more in line with the Web 2.0 ethos, which values unconstrained user participation.1 Another model of decentralized sensing comes from architecture, when measurements and models are shared between buildings with control systems that allow one building to shade another or mitigate the so-called urban canyon effect (for examples of these models, see the sidebar "Making Urban Sense").
Although it is difficult to predict where precisely on this spectrum we will end up, urban sensing shifts focus and control away from the scientist at the center. We can anticipate new forms of science built from large-scale citizen-initiated data collection. Data will also be collected, then interpreted, in ad hoc ways by everyday citizens going about their daily lives. This suggests that urban sensing will go not only beyond scientists, but beyond science itself. Should we be worried?
There are at least two concerns: bad data processing and the "observer effect." First, when amateurs collect data through cheap, unverified, uncalibrated sensors, the immediate fear is "junk data." This may be merely incidental as when cell phone images frame what the photographer wants to show, with no pretense about neutrality or comprehensiveness. Or it may be more purposeful when the data collector has no commitment to epistemic objectivity. For example, neighbors documenting traffic congestion will rarely record traffic-free periods. Further, when statistically unsophisticated individuals interpret data, the immediate fear is "garbage analysis." With many eyes watching a set of data, the opportunities for incorrect inference multiply.
Second, observation generally and surveillance specifically alter human behavior. For example, video cameras for traffic or security are explicitly intended to alter conduct. When data collection is situated "outside" the thing being studied, observation remains arguably neutral. But when data collection is embedded among the actors within a setting, as in participant observation, a cycle of interactivity is launched in which observation changes behavior that changes observation and so on.
These concerns are serious, but not insurmountable. Various forms of distributed accountability can make data collection more reliable. For instance, a user may tell the network that certain sensors should "agree" with her measurements in order to register as acceptable or valid. Given its ubiquity, the sensing network itself can provide the redundancy necessary to identify and interrogate faulty data. For example, if one cell phone reports that it is moving 75 mph on a freeway, whereas all adjacent phones moving in the same direction report 45 mph, we can be skeptical of the outlier datum. Even in so-called bottom-up systems, guarantors of data quality exist. In some sense, this is how we have come to identify reliable sources on the Web, when search engines like Google return millions of possibly relevant sources of information. In addition, data sharing projects like ManyEyes at IBM offer a kind of "social data analysis," in which graphics are open to discussion; and through such interactions, inferences improve.
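The freeway example above can be made concrete with a small redundancy check. The following is only a minimal sketch under stated assumptions: the function name `flag_outliers`, the use of the group median as the reference value, and the 15 mph tolerance are all illustrative choices, not something the sensing networks discussed here are known to use.

```python
from statistics import median


def flag_outliers(speeds_mph, tolerance_mph=15.0):
    """Return indices of readings that disagree with the group median.

    Hypothetical helper: `tolerance_mph` is an assumed threshold, and the
    median is just one simple way to summarize what adjacent phones report.
    """
    # With fewer than three readings there is no meaningful majority.
    if len(speeds_mph) < 3:
        return []
    center = median(speeds_mph)
    return [
        i for i, speed in enumerate(speeds_mph)
        if abs(speed - center) > tolerance_mph
    ]


# Adjacent phones moving in the same direction; one reports 75 mph.
readings = [45.0, 44.0, 46.0, 75.0, 45.0]
suspect_indices = flag_outliers(readings)
```

A flagged reading is not necessarily wrong; the check only identifies data worth a second look, which matches the skepticism described above rather than an automatic rejection.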
More important, we embrace the idea that urban sensing can and should go beyond science, unabashedly, into the realm of art and politics. In these arenas, data quality may not be what is most important. To make this discussion more concrete, consider, for example, the D-Tower, created by Lars Spuybroek/NOX for the city of Doetinchem in the Netherlands (see Figure 2). The D-Tower is a 12-meter tall public sculpture activated by responses to a Web site that surveys the mood of the townspeople. If most of the Doetinchemers are feeling fearful, it glows yellow, but when they're in love, the beacon burns red. Here, we've added an entirely new dimension to urban sensing: an aesthetic one in which data (responses to the online mood survey) is subjected to a heuristic devised for purposes of pleasure, humor, curiosity, and to a lesser extent, scientific truth. Numerous cities, public institutions, and designers are collaborating on urban-sensing projects, creating dynamic events that engage the citizenry. By generating publicity, these sensing mechanisms spark action and interaction, rather than merely record it. Although standards of data quality and participation apply to political and artistic projects, we can hardly ask whether the citizens are truthfully feeling happy when we observe a purple D-Tower. But we can debate whether the tower is responsive to and provokes some community feeling.
In going beyond science, urban sensing has the potential to generate a "data commons." By this, we mean a data repository generated through decentralized collection, shared freely, and amenable to distributed sense-making not only for the pursuit of science but also advocacy, art, play, and politics. One might ask how the data commons differs from the Web we already have, and indeed, the same question might be asked of the blogosphere and its relationship to the Web. Specific technological choices (syndication, linking, commenting) provide the blogosphere with its unique character. The same can be said of the myriad Web 2.0 applications that elicit user-generated content. Whether we view these new developments as an outgrowth of the open source movement or of the success of a few participatory models (Wikipedia, YouTube), the applications on the Web today and the way data is structured and shared are fundamentally different than they were a decade ago. We think of the evolution of the data commons as an extension of this movement, offering a host of new applications, new data types, and data processing tools. As Natalie Jeremijenko contends, every sensor in the environment is a question. The data commons and citizen-initiated sensing will provide answers, pose new questions, and open new opportunities for public discourse.
The data commons resembles what we have previously called a public sphere. In prior work, Kang and Cuff provided a minimalist definition of the public sphere with four principal attributes: the public sphere must be accessible to diverse members; provide opportunity for multiple uses; encourage some sort of (and not always political) exchange among participants (in the case of a data commons, this implies both the sharing and consumption of information); and be recognizable as such a space. Although these attributes were used to describe physical realms and social practices, they can also be usefully applied to the data commons.
We are enthusiastic about a flourishing data commons for the same reasons we care about a vibrant public sphere. In particular, we are skeptical that democratic commitments will continue to be manifested (if they ever were) through stylized practices of voting, political contributions, and face-to-face participation in local town hall meetings. Instead, individuals are increasingly manifesting their civics and politics through engagement of a public sphere understood far more broadly. This includes, for instance, "political shopping," in which individuals use pervasive urban computing to inform their marketplace decisions to further non-marketplace values, such as environmentalism or fair trade. Indeed, in the modern globalized capitalist environment, individuals express their social and political values as much through consumption choices, like (Product) Red, as they do through voting.
But practices such as political shopping (or "political eating" or "political commuting") require access to the varied sorts of information that a sprawling data commons could best provide. Indeed, civic participation itself may be measured by our contributions to the data commons. Take, for instance, the use of YouTube in the recent Presidential debates or the activity recorded by VideoTheVote.org in the last round of nationwide elections. Other examples that foreshadow a process of deeper civic engagement include reviews of products, tagging of data, comments on blogs, uploading of photographs and other information when newsworthy events take place (whether natural disasters or armed invasions) and citizen sensors are there to capture the moment. Consider a recent development, the so-called "placeblog." These are sites that function somewhere between a local paper and a blog; they aspire to record the details of a particular place. Similar moves can be found in daylife.com and newassignment.net. Such initiatives are consistent with the spirit of participatory GIS, which explicitly enlists the community to make a case or to study some aspect of life locally.
A data commons is valuable because it allows all of us to engage each other about what we newly "see" in the places and communities we inhabit. And we cannot take the building of the data commons for granted. Notwithstanding the buzz over Web 2.0 and sites like Flickr, it is presumptuous to think we will naturally and inevitably have a vibrant data commons, and the best possible one at that. What sort of data commons gets built depends on legal, policy, and technological (especially user interface) decisions we make now.
Property. Consider, for instance, the law of intellectual property. Copyright law only protects creative expressions; it does not protect the underlying data. Accordingly, one might be anxious that "data" will be underproduced because there will be no easy way to incentivize its collection and distribution. This fear misunderstands what drives our contributions to the data commons. Countless examples of cooperation, collaboration, and even play, especially mediated through the Internet, demonstrate that many substantial projects are not motivated by the prospect of significant financial remuneration (for example, Amazon's Mechanical Turk, weatherunderground.com, and the Google Image Labeler).
Instead of financial gain, one of the biggest motivators for citizen-sensors to share data may be attribution. Content providers tabulate hits and the number of blog links, while the USGS's "Did you feel it?," which asks citizen-sensors to record how strongly they felt earthquakes, allows data collectors to see how their data contributes to a larger whole. Such attribution can be designed without any expansion of intellectual property rights over data.
In fact, creating robust intellectual property rights over data risks a tragedy of the anticommons, a concept introduced by Michael Heller. If too many property rights are created, the cost of coordinating permissions among multiple, fragmented property rights owners prevents otherwise interesting, useful, and dynamic engagement with the data. To provide an urban sensing example, imagine geocoded digital images uploaded to some photo-sharing site. If some 3D visualization mashup required IP clearances for each and every photograph, the transaction costs, even imagining efficient intermediaries, would be prohibitive.
Privacy. Because urban sensing collects information in environments inhabited by and directly connected to human beings, the data collected will often constitute personal information. Accordingly, urban sensing raises serious privacy concerns in a way that surveillance in the woods largely avoids. To be precise, by privacy, we mean information privacy: an individual's claim to control how personal data is collected, distributed, and processed [1, 3, 6, 10]. A patchwork of privacy laws already pertains to various aspects of urban sensing, especially when it takes place in private property not generally accessible to the public. The common law tort of invasion of privacy as well as statutory limitations on video and audio taping could prevent various forms of urban sensing, an obvious issue to consider. Two less obvious aspects of the privacy problem are worth mentioning: self-surveillance and network solutions.
We tend to think of privacy claims being stated by the target of observation, and infractions as generated by others, be they corporate, state, or individual. While the threats such agents could wage in a distributed network should not be discounted, there is another type of incursion that is rarely debated. Because sensors will be carried on our bodies, in our automobiles, or sited on our real property, the persons about whom most information will be collected are ourselves. Persuading individuals to engage in such constant self-surveillance and then subsequently to share that data poses nontrivial hurdles entirely independent of the privacy claims raised by third parties. This is so even in the world of JenniCam and YouTube exhibitionism.
Whether we decide to engage in self-surveillance for the purposes of urban sensing depends in part on what the underlying computing technologies allow us to do. For example, if computer security is weak and information collected for personal use is vulnerable to third-party hacks, we will be less likely to collect that information in the first place. Similarly, if personal data cannot be easily scrubbed to become anonymous or pseudonymous or if it is difficult to control the granularity of data being released, people may be less likely to share that data publicly.
The network itself can develop services to help individuals negotiate their various privacy relationships [3, 11]. For example, two of the most extensively studied problems in traditional sensor networks are localization and time synchronization. The network knows (or will shortly know) precisely when and where data is published. While these two pieces of data are critical for scientific applications, they raise privacy concerns in urban settings. The network could, if properly designed, implement a kind of resolution control by verifying data up to whatever resolution that a user permits. The tighter the resolution, the more useful the data downstream; but this choice could be left up to the individual provider.
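One way to picture this kind of resolution control is as a coarsening step applied before a reading is published. The sketch below is an assumption-laden illustration, not the design in the cited work: the `Reading` type, the `coarsen` function, the use of decimal places for spatial resolution, and the fixed time buckets are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor reading tagged with place and time."""
    lat: float
    lon: float
    timestamp: datetime
    value: float


def coarsen(reading, decimals, time_bucket):
    """Return a copy of `reading` at the resolution the provider permits.

    `decimals` limits latitude/longitude precision; `time_bucket` is the
    width of the time window the timestamp is floored to (same day only).
    """
    bucket_s = int(time_bucket.total_seconds())
    if bucket_s <= 0:
        raise ValueError("time_bucket must be at least one second")
    midnight = reading.timestamp.replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    elapsed_s = int((reading.timestamp - midnight).total_seconds())
    floored = midnight + timedelta(seconds=(elapsed_s // bucket_s) * bucket_s)
    return Reading(
        lat=round(reading.lat, decimals),
        lon=round(reading.lon, decimals),
        timestamp=floored,
        value=reading.value,
    )


# A provider who permits roughly kilometer-scale, 15-minute resolution.
raw = Reading(34.06892, -118.44518, datetime(2008, 3, 1, 14, 37, 52), 12.5)
shared = coarsen(raw, decimals=2, time_bucket=timedelta(minutes=15))
```

Fewer decimals and wider buckets reveal less about the provider but make the data less useful downstream, which is the trade-off left to the individual in the passage above.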
Of course, whenever we think about "choice," we must recognize not only the cognitive limitations in the exercise of such choice but also that privacy preferences depend heavily on the background culture. Privacy preferences are adaptive, as should be evident from a cursory analysis of the kinds of information disclosure deemed sensitive over time. Thus, today's exotic and disturbing data collection practices may appear banal 10 years hence. To the extent that privacy preferences are adaptive to the environment in this manner, we must be aware that today's policy choices will have long-term path-dependent effects.
Interface. Even if individuals are motivated to participate and the underlying legal regimes make it possible to do so, user interface is critical to both data collection and interpretation. For example, if collecting and uploading (known as sensor blogging or "slogging") local pollution data is too difficult or costly, people will simply avoid the hassle, or if searching for useful data is futile, the data commons will not grow.
Part of the success of blogs can be attributed to extremely simple tools for creating and publishing content. Perhaps more important than content creation, the simplicity of sharing this information is key, as are distribution mechanisms like RSS that allow people to register interest in content and (thanks to reblogging tools) republish selected portions. This pipeline model is not dissimilar from what we might expect from citizen-sensing. Easy data discovery, subscription, and republication will be crucial.
Far from a database query, it would be a reasonable outgrowth of existing technologies if the data commons were built from disparate sources of shared data (following simple publication mechanisms), informally organized (as with meta tags), open for discovery, visualization, and comparison, and subject to republication (modeling feed-forward). In part, discovery in the data commons might borrow from these existing services, relying on republication/aggregation/modeling as a kind of link between data sources (for example, if I generate an interesting graphic or fit a regression using your data, a link is automatically established).
The mere existence of a data commons is not a panacea since it is essentially public infrastructure: a powerful resource that can be misused or under-realized. Here, we forecast potential fates of the data commons by drawing on Greek mythology.
Sirens. A mantra in the field of embedded network sensing is that it will "make the invisible visible." This has already taken place in scientifically controlled natural environments. It will soon take place within our cities, through decentralized processes, often without scientific goals or consequences. New sensing capabilities can make insidious urban qualities, such as buried toxic waste sites or ground water toxins, more visible to citizens. However, the new vision may be more like the Siren's song seducing us to make poor choices.
More information does not necessarily produce more rational (in the sense of instrumentally efficacious) decision making. Well-known cognitive biases might lead us to pay more attention to particular types of data than they rightly deserve. Consider, for example, the rough and ready risk calculation that individuals make in deciding where to live. If the data commons offer ready depictions of violent crime rates in the city, such information might persuade people to move to the distant suburbs in spite of the far greater mortality risk created through the increased highway driving. Relying on this highly salient, unidimensional "crime statistic" could produce a self-fulfilling prophecy that makes those areas with high crime rates grow more dangerous, while other areas get ever lower crime rates.
In response, we encourage data presentation practices that self-critically examine how individuals might easily misunderstand the data. And since none of us is entirely objective, we encourage what might be called "rights of reply." Just as blogs often provide spaces for comments, we envision data visualizations linking interpretations to counter the Sirens with their own melody.
Cyclops. In one version of the tale, the Cyclops is granted the power to see the future: again, the invisible suddenly becomes visible. Unfortunately, the Cyclops is deeply saddened because the only future he is permitted to see is the circumstance of his inevitable death. As they say, when the gods want to punish you, they grant you your wish.
The tragedy of the Cyclops (that is, the impossibility of effecting change notwithstanding foreknowledge) might be visited upon us as part of the data commons. For example, it is possible that distributed environmental sensors could detail with alarming precision the nature and extent of our environmental poisoning. Those without financial or political means may be left with debilitating information about the nature of their demise without any practical ability to change their circumstance. Might there be some way out of this fate?
Without attempting any grand theory about how and when new information might catalyze change in political, social, and economic systems, we offer one novel idea: arbitrage our ignorance. This draws on the idea of a "veil of ignorance" offered by the widely read political philosopher John Rawls, who famously argued in favor of adopting principles of justice that would be agreed upon by persons in an ideal choice position (called the "original position"), which included deliberation behind a "veil of ignorance." This veil prevented persons from knowing what station of life they would find themselves in. If urban sensing lifts the veil by making the invisible visible, we must find ways to create some consensus before we learn the new information. After the information arrives, the predictable reaction is for the rich and powerful to respond to that information in self-serving ways. By precommitting to a particular principled response before the veil is lifted, we may be able to mobilize the collective resources necessary to avoid a tragedy of the Cyclops.
Embedded network sensing has made the leap from the laboratory to the natural environment through the careful design of professional scientists. It is now crossing into the urban context, but leaving behind the primacy of both scientists and science. The widespread use of cell phones, availability of GIS-related technologies, growth of Web 2.0, along with advances in sensor technologies have unleashed urban sensing. This new arena is fertile ground for participatory, collaborative efforts between citizens and scientists, artists, urbanists, and business people. As a form of public infrastructure, the data commons is essential for citizen participation in politics, civics, and aesthetics, as well as science. What we do today will influence what the data commons becomes tomorrow. And only through deliberative effort and political engagement can citizens navigate around Siren calls and the tragic Cyclops.
1. Abdelzaher, T., Anokwa, Y., Boda, P., Burke, J., Estrin, D., Guibas, L., Kansal, A., Madden, S., and Reich, J. Mobiscopes for human spaces. IEEE Pervasive Computing: Mobile and Ubiquitous Systems 6, 2 (Apr.-June 2007).
2. Burke, J., Estrin, D., Hansen, M., Parker, A., Ramanathan, N., Reddy, S., and Srivastava, M.B. Participatory sensing. In Proceedings of the World Sensor Web Workshop, ACM SENSYS (Boulder, CO, 2006); www.sensorplanet.org/wsw2006/6_Burke_wsw06_ucla_final.pdf.
3. Campbell, A.T., Eisenman, S.B., Lane, N.D., Miluzzo, E., and Peterson, R. People-Centric Urban Sensing. In Proceedings of the 2nd ACM/IEEE Annual International Wireless Internet Conference (Boston, MA, Aug. 25, 2006).
9. Reddy, S., Chen, G., Fulkerson, B., Kim, S.-J., Park, U., Yau, N., Cho, J., Hansen, M., and Heidemann, J. Sensor-Internet share and search: Enabling collaboration of citizen scientists. In Proceedings of the Workshop for Data Sharing and Interoperability, 2007.
11. Srivastava, M.B., Burke, J.A., Hansen, M., Parker, A., Reddy, S., Schmid, T., Chang, K., Ganeriwal, S., Allman, M., Paxson, V., and Estrin, D. Network System Challenges in Selective Sharing and Verification for Personal, Social, and Urban-Scale Sensing Applications. Technical report. Center for Embedded Network Sensing, 2006.
1 See Tim O'Reilly's What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. O'Reilly Media (Sept. 30, 2005); www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html.
This work was supported in part by UCLA School of Law, UCLA cityLAB, UCLA Academic Senate, and the Center for Embedded Networked Sensing (NSF CCF-0120778). Preliminary versions and drafts of this article were presented at workshops at Google, UCLA School of Law, the Center for Embedded Networked Sensing, and the Computer Science Technology Board of the National Academy of Sciences. Hansen's research was supported by NeTS-FIND: Sensor-Internet Sharing and Search (NSF CNS-0626702) and NeTS-FIND: Network Fabric for Personal, Social and Urban Sensing Applications (NSF CNS-0627084).
©2008 ACM 0001-0782/08/0300 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.