How machine learning can improve understanding of global poverty
One of the biggest challenges in fighting poverty is the lack of reliable information. In order to aid the poor, agencies need to map the dimensions of distressed areas and identify the absence or presence of infrastructure and services. But in many of the poorest areas of the world such information is rare.
“There are very few data sets telling us what we need to know,” says Marshall Burke, an assistant professor of Earth system science at Stanford University. “We have surveys of a limited number of households in some countries, but that’s about it. And conducting new surveys in hard-to-reach corners of the world, such as parts of sub-Saharan Africa, can be extremely time-consuming and expensive.”
A new technique to map poverty offers cause for hope. It’s based on millions of high-resolution satellite images of likely poverty zones. To analyze these images, the researchers used machine learning, a discipline within the broader field of artificial intelligence.
In machine learning, scientists provide a computational model with raw data and an objective—but do not directly program the system to solve the problem. Instead, the idea is to design an algorithm that learns how to solve the puzzle by combing through the data without direct human intervention.
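As a rough illustration of that idea, here is a minimal Python sketch that is not drawn from the researchers' work. It hands a model raw data (synthetic points) and an objective (mean squared error) and lets gradient descent find the parameters; no rule for the answer is written by hand. The function name, the synthetic data, and the learning-rate settings are all assumptions made for the example.

```python
import numpy as np

def fit_by_gradient_descent(x, y, steps=2000, lr=0.1):
    """Learn a slope and intercept that minimize mean squared error."""
    w, b = 0.0, 0.0  # the model starts out knowing nothing
    for _ in range(steps):
        err = (w * x + b) - y               # prediction error on every point
        w -= lr * 2.0 * np.mean(err * x)    # gradient of the MSE w.r.t. w
        b -= lr * 2.0 * np.mean(err)        # gradient of the MSE w.r.t. b
    return w, b

# Synthetic raw data (an assumption): y depends on x via a rule the code never sees.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.5 + rng.normal(0.0, 0.05, size=200)

w, b = fit_by_gradient_descent(x, y)
print(f"learned slope={w:.2f}, intercept={b:.2f}")
```

The only things supplied are the data and the objective; the slope and intercept are recovered by the update loop rather than programmed in.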
The researchers began their poverty-mapping project knowing that nighttime lights provide an excellent proxy for economic activity by revealing the presence of electricity and the creature comforts it represents. Nighttime imagery was therefore half of the raw data that their system needed; daytime imagery was the other half.
“Basically, we provided the machine-learning system with daytime and nighttime satellite imagery and asked it to make predictions on poverty,” says Stefano Ermon, an assistant professor of computer science. “The system essentially learned how to solve the problem by comparing those two sets of images.”
The method is a variant of machine learning known as transfer learning. Ermon likens this to how the skills for driving a car are transferable to riding a motorcycle. In the case of poverty mapping, the model used daytime imagery to predict the distribution and intensity of nighttime lights—and hence relative prosperity.
The model then “transferred” what it had learned to the task of predicting poverty. It did so by constructing “filters” associated with different types of infrastructure that are useful in estimating poverty. The system repeated this process again and again, making day-to-night comparisons and predictions and constantly reconciling its machine-devised analytical constructs with details it gleaned from the data.
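The two-stage idea can be sketched in a few lines of Python. This is a deliberately simplified, linear stand-in and not the researchers' model: a single learned weight vector plays the role of the “filters,” and all of the data is synthetic. The feature size, noise levels, sample counts, and every function and variable name are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 50  # hypothetical size of a daytime-image feature vector

# Assumed ground truth: one hidden "infrastructure" direction drives both
# nighttime light intensity and household wealth.
true_direction = rng.normal(size=n_features) / np.sqrt(n_features)

def make_data(n):
    """Generate synthetic daytime features, nightlights, and wealth."""
    day = rng.normal(size=(n, n_features))
    infra = day @ true_direction
    night = infra + rng.normal(0.0, 0.1, size=n)
    wealth = 2.0 * infra + 1.0 + rng.normal(0.0, 0.3, size=n)
    return day, night, wealth

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression weights."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

# Stage 1 (proxy task): plentiful imagery, predict nightlights from daytime.
day_big, night_big, _ = make_data(5000)
filter_weights = ridge_fit(day_big, night_big)

def extract_feature(day):
    """Apply the filter learned on the proxy task."""
    return day @ filter_weights

# Stage 2 (target task): only 30 surveyed locations, reuse the learned feature.
day_survey, _, wealth_survey = make_data(30)
design = np.column_stack([extract_feature(day_survey), np.ones(len(day_survey))])
slope, intercept = np.linalg.lstsq(design, wealth_survey, rcond=None)[0]

# Evaluate on held-out locations using the coefficient of determination.
day_test, _, wealth_test = make_data(1000)
pred = slope * extract_feature(day_test) + intercept
ss_res = np.sum((wealth_test - pred) ** 2)
ss_tot = np.sum((wealth_test - wealth_test.mean()) ** 2)
r2_transfer = 1.0 - ss_res / ss_tot
print(f"held-out R^2 with transferred feature: {r2_transfer:.2f}")
```

The design choice worth noting is that the scarce survey data is used to fit only two numbers, a slope and an intercept, because the heavy lifting was done on the abundant nightlights data.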
“As the model learns, it picks up whatever it associates with increasing light in the nighttime images, compares that to daytime images of the same area, correlates its observations with data obtained from known field-surveyed areas and makes a judgment,” says David Lobell, an associate professor of Earth system science.
Those judgments were exceptionally accurate. “When we compared our model with predictions made using expensive field-collected data, we found the performance levels were very close,” Ermon says.
Highly effective machine-learning models can be very complex. The model the team developed has more than 50 million tunable, data-learned parameters. So although the researchers know what their mapping model is doing, they don’t know exactly how it is doing it.
“To a very real degree we only have an intuitive sense of what it is doing,” says Lobell. “We can’t say with certainty what associations it is making, or precisely why or how it is making them.”
Ultimately, the researchers believe, this model could supplant the expensive and time-consuming ground surveys currently used for poverty mapping.
“This offers an unbelievable opportunity for cheap, scalable, and surprisingly accurate measurement of poverty,” Burke says. “And the beauty with developing and working with these huge data sets is that the models should do a better and better job as they accumulate more and more information.”
The availability of information is something of a limiting factor. Right now satellite coverage of impoverished areas is spotty. More imagery, acquired on a more consistent basis, would be needed to give the researchers’ system the raw material to take the next step and predict whether locales are inching toward prosperity or getting further bogged down in misery.
But such data constraints could soon be lifted, or at least mitigated.
“There’s a huge number of new high-resolution satellite images that are being taken right now that should be available in the next 18 months,” Burke says. “That should help us predict in time as well as space. Also, there are several micro-sat companies that plan to provide images of the planet almost daily, and we’re rapidly getting enough satellites up to do that.
“I don’t think it will be too long before we’re able to do cheap, scalable, highly accurate mapping in time as well as space.”
Even as they consider what they might be able to do with more abundant satellite imagery, the researchers are contemplating what they could do with different raw data—say, mobile phone activity. Mobile phone networks have exploded across the developing world, says Burke, and he can envision ways to apply machine-learning systems to identify a wide variety of prosperity indicators.
“We won’t know until we try,” Lobell says. “The beauty of machine learning in general is that it’s very useful at finding that one thing in a million that works. Machines are quite good at that.”
The team detailed their approach in a paper for the proceedings of the 30th AAAI Conference on Artificial Intelligence.
SOURCE: World Economic Forum