Data Optimization in Developing Nations

Nathan Eagle (pictured), Eric Horvitz, and others are creating an Artificial Intelligence for Development community to address problems in economically developing countries.

By now, many scientists and CEOs have begun to seize the opportunities that lie within the exabytes of data being generated each day. Banks trawl data to detect criminal fraud, marketers to spot emerging trends, researchers to uncover new patterns, and governments to reduce crime and provide better services.

Most data analyses thus far have focused on developed societies. Yet, a growing community of computer scientists is calling for new applications that would harness these data-analysis methods to improve the lives of people in developing nations. Machine learning and artificial intelligence, they say, are perfectly poised to promote socioeconomic development, respond more effectively to natural disasters, expand access to health care, and improve the quality of education. Now, thanks to the efforts of Eric Horvitz, a distinguished scientist at Microsoft Research, and Nathan Eagle, a researcher who lives in Kenya and holds faculty appointments at the Massachusetts Institute of Technology (MIT) Media Lab and Northeastern University, a small but diverse group of computer scientists is banding together to share ideas and information, and to define itself as a community.

Interest in the developing world has been growing in the field of Information and Communication Technology for Development (ICT-D), which encompasses projects that range from managing the delivery of basic services like health care and education to developing network infrastructure. ICT-D, however, has rarely focused on opportunities to apply artificial intelligence or mine data from developing nations. Last year, ICT-D experts set out to rectify that situation with the formation of the ACM Special Interest Group on Global Development (SIGDEV), which held its first conference at the University of London in December. What Horvitz, Eagle, and others aim to do is foster the creation of a subfield within ICT-D to address these deficiencies. The name they’ve proposed for it: Artificial Intelligence for Development, or AI-D.

The effort began two years ago at a Princeton University conference called Studying Society in a Digital World, which was organized by Edward W. Felten, director of the university’s Center for Information Technology Policy. Eagle presented a paper about using large data sets (in this case, phone calls in Britain) to test American sociologist Mark Granovetter’s “The Strength of Weak Ties” theory, which argues that innovation often travels most effectively via weak social connections. Did factors like the geographical distance between callers correlate with socioeconomic indicators like income and education? As it turns out, they did: Regions with a higher volume of geographically diverse calls scored lower on the Index of Multiple Deprivation, a statistical study that covers factors like employment, crime, and health care. Horvitz was intrigued. “I’m passionate about machine intelligence and its applications,” he explains. “And I realized there’s a lot we can do to stimulate thought.” Horvitz was president of the Association for the Advancement of Artificial Intelligence (AAAI); with Eagle’s help, he set up an AAAI symposium titled Artificial Intelligence for Development at Stanford University, which took place last March.
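The kind of comparison Eagle describes, call diversity set against a deprivation score for each region, can be illustrated with a short sketch. It is not the study's actual method: the normalized-entropy diversity measure, the function names, and every number below are assumptions made up for illustration.

```python
import math


def call_diversity(calls_by_destination):
    """Normalized Shannon entropy of a region's calls across destination areas.

    0.0 means every call goes to one area; 1.0 means calls are spread evenly.
    (Assumed measure for illustration; the article does not specify one.)
    """
    total = sum(calls_by_destination)
    k = len(calls_by_destination)
    if total == 0 or k < 2:
        return 0.0
    probs = [c / total for c in calls_by_destination if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return abs(entropy / math.log(k))  # abs() avoids returning -0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical regions: (calls per destination area, deprivation score).
# These values are invented and are NOT data from the study.
regions = [
    ([90, 5, 3, 2], 41.0),
    ([70, 15, 10, 5], 35.0),
    ([50, 25, 15, 10], 28.0),
    ([40, 30, 20, 10], 22.0),
    ([25, 25, 25, 25], 12.0),
]

diversity = [call_diversity(calls) for calls, _ in regions]
deprivation = [score for _, score in regions]
r = pearson(diversity, deprivation)
print(f"correlation between call diversity and deprivation: {r:.2f}")
```

With these invented values the coefficient comes out strongly negative, which mirrors the direction of the reported finding: more diverse calling goes with lower deprivation. A correlation of this kind says nothing about cause.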

“Our idea was that we have so much data, and the majority of it is being generated by people in the developed world,” says Eagle. “There’s a real opportunity for us to repurpose that data and serve these under-served communities.”

The diverse set of projects presented at the Artificial Intelligence for Development symposium underscored his point. Much of the research was preliminary, but the initial results were promising. Shawndra Hill, an assistant professor in Operations and Information Management at the Wharton School of the University of Pennsylvania, who has also taught at Addis Ababa University (AAU), spoke of efforts to improve Ethiopia’s road safety. Ethiopia has the world’s highest rate of traffic fatalities, according to the World Health Organization, with a reported 114 deaths per 10,000 vehicles per year. By comparison, the U.K. has one death per 10,000 vehicles per year.

“The Ethiopian Traffic Enforcement Agency collects data on every accident that’s reported,” Hill explains. “Where did the accident happen, what did the intersection look like, what’s the road quality, was it raining, and so on.” Working with AAU lecturer Tibebe Beshah, Hill investigated the role of road-related factors in accident severity. The researchers tested classification models to predict the severity of more than 18,000 car accidents and used a projective adaptive resonance theory algorithm to identify the data’s significant patterns. One research finding: Severe physical injuries were more likely to occur on straight, flat roads than on all other types of roads in the same area.

“The methods don’t change,” says Hill. “You could do the same analysis with data from the United States.” In a country that has the highest rate of traffic fatalities in the world, however, and where those accidents are among the leading causes of death, the potential socioeconomic impact is huge. In the future, Hill and her fellow researchers hope to develop new predictive models that combine road data with driver information, and develop a decision support tool for the Ethiopian Traffic Office.

At the Artificial Intelligence for Development symposium, Eagle and Horvitz presented research in which they deduced the impact of seismic activity in the Lac Kivu region of the Democratic Republic of the Congo from three years of mobile phone data in neighboring Rwanda. “By watching anomalous call behavior, we could infer the epicenter of the earthquake,” Horvitz explains. The researchers could then make inferences about which areas in the Lac Kivu region were likely to have suffered the greatest damage and be of higher priority for emergency assistance workers. Eagle has used the same data to better understand the dynamics of urban slums and model the effects of social networks on infectious disease outbreaks.

And University of California, Berkeley postdoctoral research fellow Emma Brunskill spoke of using traveling salesman techniques to help community health workers in the developing world (some of whom can be responsible for up to 4,000 people) improve the efficiency and timing of their visits to patients in rural areas. The data analysis was exploratory, but Brunskill says she is encouraged by the potential of existing techniques.

Another area she finds promising is education. Schools in developing nations often rely on a single computer per classroom. In experimental trials in Bangalore, India, Brunskill and a team of researchers built on foundational studies in multi-input interfaces to test the efficacy of an adaptive multi-user learning game. Initial trials suggested that customizing each student’s experience could increase her or his engagement by reducing the likelihood that a single student dominated the game.
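The routing problem Brunskill describes, choosing the order of a health worker's visits, can be illustrated with two textbook traveling salesman heuristics. This is a minimal sketch under stated assumptions, not the method from her paper: the household coordinates are invented, distances are straight lines rather than road travel times, and the function names are hypothetical.

```python
import math


def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def nearest_neighbor_tour(points, start=0):
    """Greedy construction: always walk to the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        last = points[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def two_opt(points, order):
    """Local search: reverse a segment whenever that shortens the tour."""
    best = order[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best = candidate
                    improved = True
    return best


# Index 0 is the health post; the rest are hypothetical household locations (km).
households = [(0, 0), (4, 1), (1, 5), (6, 6), (2, 2), (5, 3), (0, 4), (3, 6)]

nn_route = nearest_neighbor_tour(households)
route = two_opt(households, nn_route)
print("greedy route:  ", nn_route, round(tour_length(households, nn_route), 2))
print("after 2-opt:   ", route, round(tour_length(households, route), 2))
```

Both heuristics are approximate: they return a reasonable route quickly but do not guarantee the shortest one. A practical scheduler would also have to account for visit timing and patient priority, which this sketch ignores.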

Constraints, Costs, Challenges

While AI-D research methods may be the same as they are in mainstream Western science, other factors in developing nations are quite different. First and foremost are the technology constraints. Access to electricity, computers, and the Internet is limited in many areas. Language presents another barrier, as does cost. “The design considerations are much different,” says Lakshmi Subramanian, an assistant professor at the Courant Institute of Mathematical Sciences at New York University. Subramanian’s research includes the use of document classification and focused crawling methods to build offline educational portals, and computer vision techniques to detect diabetic retinopathy, the world’s leading cause of adult blindness. Yet, according to Subramanian, constraints are what make the problems interesting. “If you can only use SMS, what can you do? Turns out, you can do a lot, thanks to semantic compression and other tools,” he says. “In fact, we’ve built an SMS search engine in Kenya.”

Gaining access to useful data can also be a challenge. “There’s no culture of data like there is in the West,” says Hill. “Even businesses in Ethiopia aren’t collecting data like we are.” As a result, one of Horvitz and Eagle’s priorities is to create a central data repository to support new research projects. They began by compiling a list of useful resources at the AI-D symposium Web site, http://www.ai-d.org, from organizations like the World Bank, World Trade Organization, and UNICEF. They are also working with regional organizations, such as telephone companies, to share additional data.

“We’re trying to set up a Switzerland for data sets,” says Horvitz.

Beyond that, Horvitz and Eagle hope to get more computer scientists involved. Not surprisingly, in such a young field, there are differences of opinion about research, strategies, and direction. “There is a tension inherent in this area, as in the broader computing for development community, about whether you’re trying to solve a problem or define a new research area,” admits Brunskill. Many successful projects are trying to solve immediate societal problems, but researchers in the community are nonetheless convinced there are possibilities for serious long-term science.

“For example, I believe computer scientists can lead the way in combining data- and model-centric methods to design proactive plans that can mitigate the spread of diseases or the shortages of food and water following a disaster,” Horvitz explains. “The idea is to use computational models and data from similar situations to make inferences about actions that promise to have a high expected value. Partial plans might be generated proactively with the goals of maximizing the survival of people who have been injured or are trapped. The details of such ‘contingency plans’ for transporting medications, food, and water could be instantiated in real time based on sensor data.”
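The expected-value reasoning Horvitz outlines can be made concrete with a toy calculation. The sketch below is only an illustration: the plan names, scenarios, probabilities, and outcome figures are all invented, and real planning models would be far richer.

```python
def expected_value(outcomes, probabilities):
    """Probability-weighted average outcome of one plan across scenarios."""
    return sum(probabilities[s] * outcomes[s] for s in probabilities)


def best_plan(plans, probabilities):
    """Return the name of the plan with the highest expected value."""
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return max(plans, key=lambda name: expected_value(plans[name], probabilities))


# Hypothetical contingency plans: people reached with supplies in each scenario.
plans = {
    "truck_convoy":    {"roads_open": 1000, "roads_blocked": 200, "bridge_out": 100},
    "helicopter_drop": {"roads_open": 400,  "roads_blocked": 400, "bridge_out": 400},
    "mixed":           {"roads_open": 700,  "roads_blocked": 500, "bridge_out": 300},
}

# Beliefs before the disaster, estimated from similar past situations (invented).
prior = {"roads_open": 0.5, "roads_blocked": 0.3, "bridge_out": 0.2}

# Beliefs revised after sensor reports suggest heavy road damage (invented).
updated = {"roads_open": 0.2, "roads_blocked": 0.5, "bridge_out": 0.3}

for label, beliefs in (("prior", prior), ("updated", updated)):
    choice = best_plan(plans, beliefs)
    value = expected_value(plans[choice], beliefs)
    print(f"{label}: choose {choice} (expected people reached: {value:.0f})")
```

The plans are drawn up in advance, and only the scenario probabilities change as data arrives. That is one simple reading of instantiating a contingency plan in real time: the preferred plan shifts from the convoy to the mixed option once the revised beliefs make blocked roads more likely.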

In the meantime, the field’s lack of definition suits many scientists just fine. “Fundamental research in this domain is about understanding the ground realities,” says Subramanian, who points out that in “the 1970s, you didn’t have specialties like architecture or networking. You were trying to get things done.”

“It’s been said that computer science has failed to attract young people seeking a ‘noble profession’ like medicine or law, where they can directly help people,” says Eric Horvitz. “This area of research highlights how computer scientists can touch the lives of people in need.”

Further Reading

Beshah, T. and Hill, S.
Mining road traffic accident data to improve safety: role of road-related factors on accident severity in Ethiopia, Proceedings of AAAI Artificial Intelligence for Development (AI-D’10), Stanford, CA, March 22-24, 2010.

Brunskill, E. and Lesh, N.
Routing for rural health: optimizing community health worker visit schedules, Proceedings of AAAI Artificial Intelligence for Development (AI-D’10), Stanford, CA, March 22-24, 2010.

Eagle, N. and Horvitz, E.
Artificial intelligence for development (AI-D), Proceedings of the CCC Workshop on Computer Science and Global Development, Berkeley, CA, Aug. 12, 2009.

Kapoor, A., Eagle, N., and Horvitz, E.
People, quakes, and communications: inferences from call dynamics about a seismic event and its influences on a population, Proceedings of AAAI Artificial Intelligence for Development (AI-D’10), Stanford, CA, March 22-24, 2010.

Silberman, N., Ahrlich, K., Fergus, R., and Subramanian, L.
Case for Automated Detection of Diabetic Retinopathy, Proceedings of AAAI Artificial Intelligence for Development (AI-D’10), Stanford, CA, March 22-24, 2010.
