By Holli Riebeek, NASA Earth Observatory (http://earthobservatory.nasa.gov), Greenbelt, Md.
When Rebecca Moore, a Google employee, received a formal notice in 2006 about a proposed forest-thinning initiative, she nearly tossed it out. The letter included an unintelligible black-and-white topographic map of her central California community. The map marked the location of the forest and the path helicopters would take as they hauled logs off the steep hillsides. Just before the proposal hit the trash, it captured her curiosity.
“I wanted to understand what this map was saying, because I wanted to know if my community was threatened,” recalls Moore. “We have the largest remaining stand of old-growth redwood trees in Santa Clara County.”
Moore wondered how the map might look in the newly released Google Earth. When she looked at the forest-thinning plan on the satellite-based map, it became clear that helicopters would be towing trees over several high-traffic areas, including a school. The Google map helped persuade the community to stop the thinning initiative.
The experience got Moore thinking: Could Google Earth do more? Would it be possible to combine the satellite data behind Google Earth with the company’s powerful data-sorting algorithms and cloud computing?
“YouTube did a major machine learning (analysis) of the corpus of YouTube videos, and they found there are a lot of cats,” says Moore. That analysis is just one example of a growing phenomenon known as “big data” in which computers crunch vast quantities of information to identify useful patterns. Moore wondered: What patterns would emerge if you applied the big data approach to Earth imagery?
It’s a question Earth scientists also have been asking. In 2008, the U.S. Geological Survey took 3.6 million images acquired by Landsat satellites and made them free and openly available on the Internet. Dating back to 1972, the images are detailed enough to show the impact of human decisions on the land, and they provide the longest continuous view of Earth’s landscape from space.
“With the full Landsat record available, we can finally look at really big problems, like the global carbon cycle,” says Jeff Masek, a Landsat 7 project scientist and a researcher at NASA’s Goddard Space Flight Center.
Because carbon dioxide gas amplifies greenhouse warming, understanding how it moves into and out of the atmosphere through the carbon cycle is central to understanding Earth’s climate. Forests store carbon, and the Landsat series of satellites offers the most consistent, detailed and global means to measure changes in forest health—both natural and human-caused, such as deforestation. But to get a sense of how much carbon is entering the atmosphere from forests, scientists have to figure out how to sort through petabytes of data.
Evan Brooks, a statistician turned forestry scientist at Virginia Tech, began his expedition into the Landsat record with a single forest. In western Alabama, row after row of loblolly pine trees grow in tall straight lines. For the Westervelt Co., that forest is a crop. For Brooks, the forest is a proving ground for a new method of understanding Landsat measurements.
“In the past, if we wanted to look for change, we would find a cloud-free scene and then find another cloud-free scene from the same season in another year,” explains Brooks. Before 2008, Earth scientists didn’t try anything more complex, because Landsat data cost as much as $600 per scene. Once all of the Landsat data became available for free, the statistician in Brooks wondered if he could find a better way to identify change by using the entire record.
“I didn’t want to think about still-lifes anymore,” he says. “I am now treating each Landsat measurement like a frame in a movie. When you start seeing things moving through time, you get a deeper sense of what is happening. If a picture is worth a thousand words, then the movie is worth a thousand pictures.”
To build his movie of change in Westervelt’s forest, Brooks broke each Landsat scene into its smallest elements. Landsat satellites measure reflected and emitted light, both visible and infrared, for each 30-by-30-meter parcel of land on Earth. This area, about the size of a baseball diamond, makes up the smallest element of a Landsat image: a pixel.
Brooks gathered every Landsat measurement of the Westervelt forest taken between 2009 and 2011 (roughly one image every two weeks) and charted the raw measurements for each pixel as a line graph. A forest, field or city block changes in predictable ways from year to year—greening in the spring, browning in the winter—so the measured light values from a particular pixel trace a curve that looks much the same from year to year.
“But where we see odd jumps or leaps, we know that some change occurred,” notes Brooks.
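Brooks’ actual statistical machinery isn’t detailed here, but the core idea—fit each pixel’s expected seasonal curve, then flag observations that jump away from it—can be sketched in a few lines of Python. The harmonic model, thresholds and synthetic data below are purely illustrative:

```python
import numpy as np

def flag_jumps(t, y, train_years=2.0, z_thresh=3.0):
    """Fit a mean plus one annual harmonic to a stable 'training' period,
    then flag later observations whose residuals jump past the threshold.
    A toy stand-in for per-pixel time-series change detection."""
    X = np.column_stack([np.ones_like(t),
                         np.sin(2 * np.pi * t),
                         np.cos(2 * np.pi * t)])
    train = t < train_years
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    resid = y - X @ coef                    # departure from the expected curve
    sigma = resid[train].std()              # typical scatter in stable years
    return np.abs(resid) > z_thresh * sigma

# Synthetic pixel: biweekly reflectance with an annual cycle, a little noise,
# and a harvest-like drop partway through year 3
rng = np.random.default_rng(42)
t = np.arange(0, 3, 1 / 26)                 # one observation every two weeks
y = 0.3 + 0.1 * np.sin(2 * np.pi * t) + rng.normal(0, 0.01, t.size)
y[t >= 2.3] -= 0.25                         # abrupt clearing-like change
jumps = flag_jumps(t, y)
```

The seasonal fit absorbs the ordinary greening and browning, so only the abrupt drop stands out as an “odd jump.”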
Westervelt keeps precise records of harvests and thinning and planting. When Brooks began to match his time-series, movie-like image analysis with the company’s records, he found he could observe even subtle changes. Because even a small change in the landscape alters the quantity and type of light reflected from a pixel, the light-curve approach allowed Brooks to detect changes smaller than a single Landsat pixel.
“Previous Landsat analyses might see the change from a forest to a golf course,” he says. “We can see the removal of 10 percent of a forest stand, well below Landsat’s pixel resolution.”
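A rough linear-mixing argument shows why a change smaller than a pixel is still detectable: a pixel’s measured reflectance is approximately an area-weighted mix of its cover types, so removing part of a stand shifts the whole pixel’s value. The reflectance numbers here are hypothetical, not from the article:

```python
# Red-band reflectance of the two cover types (illustrative values:
# dark forest canopy vs. brighter exposed soil)
forest_refl, soil_refl = 0.05, 0.25

before = 1.00 * forest_refl                      # fully forested pixel
after = 0.90 * forest_refl + 0.10 * soil_refl    # 10% of the stand removed
change = after - before                          # a measurable brightening
```

Even though no individual 30-meter pixel flipped from “forest” to “not forest,” the pixel as a whole brightens by a measurable amount.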
That level of detail is important, because human development and land management cause many of the changes Brooks sees.
“The scale of human development is getting finer,” says Randy Wynne, a geographer at Virginia Tech. “In Virginia, the average rural parcel size is under 70 acres, and it’s dropping. Some 85 percent of forests are privately owned in the eastern United States. If you want to see human influence in most of the world’s forests, you need this kind of analysis. Nothing else can get at it.”
Brooks can run his forest analyses on a laptop, but he can only survey a small area. How do you scale up a pixel-based analysis to figure out how forests are changing around the world across four decades?
“The issue is scaling it up,” says Robert Kennedy, a remote sensing scientist at Boston University who has developed a similar analysis tool. “Even if you have a simple algorithm, you need a lot of computing to manage all of the data.”
How much data?
“There are roughly 400 billion land pixels in a single global mosaic,” says Rama Nemani, a scientist who runs the NASA Earth Exchange (NEX), a supercomputing collaborative at NASA’s Ames Research Center. With at least one image of every location on Earth per season every year, the entire 43-year Landsat record contains more than 50 trillion pixels.
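A quick back-of-envelope check, using only the figures quoted above, confirms the scale:

```python
# Scale check from the numbers in the text (four seasonal mosaics per year
# is a rough lower bound, not an exact acquisition count)
pixels_per_mosaic = 400e9     # land pixels in one global mosaic
mosaics_per_year = 4          # at least one image per season
years = 43                    # the Landsat record since 1972

total_pixels = pixels_per_mosaic * mosaics_per_year * years
# on the order of tens of trillions, consistent with "more than 50 trillion"
```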
“How could you handle that on your desktop?” asks Rebecca Moore. “You can’t. This is where cloud computing comes in.”
It’s a conclusion that some University of Maryland scientists reached over the course of two decades. Matthew Hansen and Sam Goward are geographers and remote sensing specialists who have been part of a team mapping Earth’s land cover—forests and cities, farms and water—since the mid-1990s.
“We wanted to know the impact of disturbance—harvesting, thinning, fires, storms—things that lead to changes in forests,” explains Goward. “Every time you disturb a forest, it restarts the growth cycle, and when you do that, you impact the carbon cycle. Very few forests make it through a full growth cycle because of disturbances, but no one knows the patterns of forest disturbance or how they impact the carbon cycle.”
For years, Goward and Hansen worked with low-resolution data that didn’t have a lot of detail. But disturbance happens on a small scale, and to see it they needed something like Landsat’s 30-meter resolution. Until 2008, Landsat data were too expensive for a global map to be feasible.
“We did the science we could afford, not the science we wanted to do,” says Goward.
Then in 2008, the game started to change.
“When the Landsat archive opened up, we mapped forests in Indonesia and European Russia,” says Hansen. “We then knew we could make a global map, but we didn’t have the computing power yet.”
Finally, while attending an international meeting about deforestation and forest disturbance, Hansen was introduced to Google’s Rebecca Moore. He saw an opportunity.
“[Google’s] computing expertise fit perfectly with our geographic knowledge,” explains Hansen. “So we ported our code for mapping forests to the Google system.”
In just a couple of days, a Google team applied the University of Maryland code to 700,000 Landsat scenes, discarding cloudy pixels and keeping clear ones. The team analyzed the remaining sequence of pixels and assigned a flag to each—was it forested or not? The analysis noted the date when forests were cleared and the date when they had grown in enough to be counted as forest again. The entire process consumed 1 million hours of processor time, spread across 10,000 central processing units (CPUs). According to Moore, the analysis would have taken 15 years on a single computer.
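The per-pixel logic described here—drop cloudy observations, flag the rest as forest or not, and record the dates the flag flips—might look something like this toy sketch. The vegetation-index threshold is an illustrative stand-in for the team’s real classifier:

```python
import numpy as np

def forest_change_dates(dates, ndvi, cloudy, forest_thresh=0.5):
    """Toy per-pixel pipeline: discard cloudy observations, flag each clear
    one as forest/non-forest by a simple vegetation-index threshold, and
    record the dates when the flag flips (clearing or regrowth)."""
    clear = ~np.asarray(cloudy)
    d = np.asarray(dates)[clear]
    is_forest = np.asarray(ndvi)[clear] > forest_thresh

    loss_dates, regrowth_dates = [], []
    for prev, curr, when in zip(is_forest[:-1], is_forest[1:], d[1:]):
        if prev and not curr:
            loss_dates.append(int(when))       # forest cleared
        elif curr and not prev:
            regrowth_dates.append(int(when))   # grown in enough to count again
    return loss_dates, regrowth_dates

# Hypothetical single-pixel record: a clearing in 2003, regrowth by 2006,
# and one cloud-contaminated observation to discard
dates  = [2000, 2001, 2002, 2003, 2004, 2005, 2006]
ndvi   = [0.70, 0.75, 0.10, 0.20, 0.25, 0.40, 0.60]
cloudy = [False, False, True, False, False, False, False]
loss, regrowth = forest_change_dates(dates, ndvi, cloudy)
```

The real analysis ran this kind of decision for every one of the hundreds of billions of land pixels, which is why it needed 10,000 processors rather than a laptop.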
The resulting map, released in 2013, shows how Earth’s forests changed between 2000 and 2012.
“It is the first global assessment of forest change in which you can see the human impact,” says Masek. And the message is: People have had a huge impact on forests.
“Less than 1 percent of old-growth forest remains in the United States,” says Hansen.
But the real surprise was how quickly tropical forests are disappearing. For example, according to Hansen, Brazil has deservedly gotten a lot of credit for reducing its deforestation rate in the past decade, but forest cover loss has increased so much in other tropical countries that the global rate is soaring. Such a revelation wouldn’t have been possible without the “big data” approach.
“In the past, we were confounded by clouds in the tropics,” relates Hansen. “Being able to mine the full Landsat archive allowed us to literally see places we haven’t seen before at this resolution. We have cloud-free data built from thousands of inputs over tropical locations like Gabon or Papua New Guinea.”
Apart from revealing patterns and trends in forest cover, the global forest map represents a major shift in the way Earth science is done.
“In the past, I used to bring data to my computer and analyze it,” notes Masek. “Now it’s impossible to bring all the data to my computer.”
Instead, scientists develop analysis tools and bring them to computation workhorses like Google Earth Engine or NEX. Until 2008, only 4 percent of the Landsat archive had been examined; since the archive opened, the big data approach has let scientists dig deeper into all of the data and make connections they couldn’t make before.
“You can look at changes over time and see how one process affects another,” says Kennedy. “We are now able to ask questions about where, when and why two processes interact.”
Kennedy has started to map biomass yearly, allowing him to quantify how much carbon is lost every year due to fire or clearing. And for the first time, he can ask questions like the following: Is the system responding differently now than it has in the past? Are we losing more carbon to fires? To insects? Are forests growing back more quickly?
And the scale and scope can grow even wider. For instance, knowing how forests have changed leads to other questions about global change.
Asks Masek, “How much carbon is going into the atmosphere through forest clearing and management? How are ecosystems changing because of climate change? What are the vegetation patterns of the planet going to look like in 200 to 300 years? Landsat gives a 40-year synopsis of what has happened, and that not only lets us see how the forests are changing now, but it could help us understand how life on Earth will change in the future.”
“We are doing statistics on the planet,” says Moore. “I’m really curious to see what people find in all of this satellite data.”
Editor’s Note: For more information on big data and image analysis, see “Orbital Insight Tackles Global Trends Through Advanced Image"