By Matteo Luccio, founder and president of Pale Blue Dot
The current explosion in the variety, capability and sheer number of sensors is generating a flood of data intended to enhance decision-making and human-system performance. However, this vast amount of data can easily lead to information overload and obscure the most relevant aspects of a situation. This is a critical problem for many users, most notably war fighters, who need to make rapid and correct tactical decisions. Increasingly, sensor fusion algorithms are merging and analyzing geospatial data to help extract actionable intelligence.
Sensor fusion can be accomplished in several ways. According to Alina Zare, an assistant professor in the Electrical & Computer Engineering Department at the University of Missouri, sensor fusion can include:
Sean Anklam, president of Exogenesis (https://exogenesis.solutions), a company that specializes in advanced analytics, algorithm development, and predictive modeling and simulation, cites site-selection modeling for solar energy panels as an example of sensor fusion’s power. Using a single data source for the entire United States—say a digital elevation model from which one can extract slope and aspect—one might find that locations which are both flat to mildly sloped and face south are the best for solar panels. One could refine the results by adding weather data, such as rainfall, humidity and temperature, and by calculating the mean incident radiation across the country to find the regions most exposed to solar energy.
“Those are all examples of fusing traditional geospatial datasets to arrive at a better answer,” says Anklam. “However, what that model doesn’t take into account is that in the United States it’s far better to build westward-facing solar panels, because peak electricity consumption occurs in the afternoon and evening, when people are home, watching TV, cooking, doing laundry and running the air conditioning. This is why completely expansive data fusion is so important.”
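The slope-and-aspect screening Anklam describes can be sketched in a few lines. This is a minimal illustration, not his model: the function names, the tiny elevation grid, the 15-degree slope limit, the 2-degree "flat" threshold and the 45-degree aspect tolerance are all assumptions made for the example. The preferred aspect is a parameter, so the same screen can be rerun for west-facing sites.

```python
import math

def slope_aspect(dem, cell_size):
    """Slope and aspect (degrees) for interior cells of an elevation grid.

    Rows run north to south, columns west to east. Aspect is the compass
    bearing of the downslope direction (0 = north, 90 = east); it is None
    for perfectly flat cells and for edge cells.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    aspect = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences: elevation change per meter toward east and north.
            dz_east = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_north = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)
            gradient = math.hypot(dz_east, dz_north)
            slope[r][c] = math.degrees(math.atan(gradient))
            if gradient > 0:
                # Downslope is the negative gradient; bearing = atan2(east, north).
                aspect[r][c] = math.degrees(math.atan2(-dz_east, -dz_north)) % 360.0
    return slope, aspect

def suitable(slope_deg, aspect_deg, preferred=180.0,
             max_slope=15.0, flat_below=2.0, tolerance=45.0):
    """True if a cell is nearly flat, or mildly sloped and facing `preferred`."""
    if slope_deg is None or slope_deg > max_slope:
        return False
    if slope_deg < flat_below or aspect_deg is None:
        return True  # effectively flat, so aspect does not matter
    # Smallest angular difference between the cell's aspect and the preferred bearing.
    diff = abs((aspect_deg - preferred + 180.0) % 360.0 - 180.0)
    return diff <= tolerance

# Hypothetical 3x3 grid (meters) that drops 1 m per 10 m cell toward the south.
dem = [[12, 12, 12],
       [11, 11, 11],
       [10, 10, 10]]
slope, aspect = slope_aspect(dem, cell_size=10.0)
south_ok = suitable(slope[1][1], aspect[1][1])                  # south-facing screen
west_ok = suitable(slope[1][1], aspect[1][1], preferred=270.0)  # west-facing screen
```

The center cell passes the south-facing screen and fails the west-facing one. A terrain-only model changes its answer once a non-terrain factor such as demand timing is brought in.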
Sensor fusion is at the core of three of the seven U.S. Department of Defense (DOD) science and technology planning priorities for 2013-2017:
Data to Decisions—science and applications to reduce the cycle time and manpower requirements for analyzing and using large data sets
Cyber Science and Technology—science and technology for efficient, effective cyber capabilities across the spectrum of joint operations
Human Systems—science and technology to enhance human-machine interfaces to increase productivity and effectiveness across a broad range of missions
Sensor fusion is also an important factor in DOD’s other four priorities for the current five-year period (engineered resilient systems, electronic warfare/electronic protection, counter weapons of mass destruction and autonomous systems).
Sensor fusion and other automated algorithms can greatly benefit war fighters by drawing their attention to important features in the data or to key anomalies, according to Zare.
“Often, even with just one sensor, an individual can miss key information without algorithmic help due to variation both within and between operator performance,” she explains. “Given all of this variation with ‘human-in-the-loop’ analysis, automated/algorithmic analysis of incoming sensor data can provide consistent, reliable performance and help mitigate much of these variations. If you have many sensors that provide consistent or orthogonal [i.e., statistically independent] information, algorithms that can fuse the information from all of them and highlight important information for an operator are needed and can be extremely helpful.”
Although the military leverages every type of sensor conceivable, it relies primarily on imaging satellites and aircraft to gather data. During the last decade, U.S. armed forces have operated predominantly in what in military jargon are known as “permissive environments.” For example, in Iraq and Afghanistan U.S. military aircraft aren’t at significant risk of being jammed, let alone shot down, and have been able to collect intelligence, surveillance and reconnaissance (ISR) data fairly easily. However, in the future, U.S. forces may be operating in more contested domains, where it will be much harder to collect ISR data.
“I think that’s the key driver pushing us to look at new ways of integrating the kinds of data we have,” says Mica Endsley, chief scientist with the U.S. Air Force. “That’s where I see unique innovation occurring now.”
“Big data,” a concept closely related to sensor fusion and now a buzzword, had already become a concern for the defense and intelligence communities prior to 9/11. According to Anklam, for data to be truly big, it must be produced rapidly, in large quantities, from multiple sources and with widely variable pedigrees. Many entities within DOD and the intelligence agencies draw a distinction between traditional big data and “big geodata.”
“Unlike big data, big geodata is concerned with information that’s always spatially and temporally enabled,” explains Anklam. “It primarily consists of geospatial content—imagery collected from satellites and aircraft, GPS, maps, charts, surveys and geographic information systems.”
On the battlefield, Anklam points out, the “Internet of things” already existed more than a decade ago, and DOD was already contending with big data as it introduced new space, air, ground and maritime sensors and acquired many satellites, unmanned aircraft systems and other collection platforms.
“UAVs alone were collecting more than 8,000 man hours of data daily,” says Anklam. “Individual marines and soldiers were transformed into data collectors via handheld GPS units and blue force trackers, all of which were similar to the mobile devices and handsets available in the commercial world now. A geosocial universe evolved in Iraq and Afghanistan. Use cases for big data and the need for data fusion expanded and changed daily.”
Research on sensor fusion has been under way for at least 25 years.
“It’s a multifaceted problem,” explains Endsley. “Part of it relates to getting data on the same network or into the same database. Really, that’s only half the problem. The other half relates to transforming data that comes from a wide variety of sources into information the user or decision-maker needs.”
However, as Endsley points out, sensor fusion researchers often fail to integrate data in useful ways, leaving end users inundated with data and struggling to find in it the information they need.
“One reason sensor fusion researchers have had a hard time with this is because they’ve tried to do it from a data up standpoint,” she says. “They haven’t really understood how users need to use the information. The key opportunities in this field are in marrying it with a lot of what’s gone on in the cognitive engineering field, which has good methodologies for understanding user goals and decisions and their real situation-awareness requirements.”
According to Endsley, such requirements should guide data integration and fusion. One suitable methodology is goal-directed task analysis, which makes it possible to map data users’ goals, decisions and situation-awareness requirements clearly and to guide fusion researchers in “deep, meaningful integrations” of data.
“If we can marry techniques like that with a lot of the new computer science techniques,” she explains, “I think we can make strong progress in getting through the information glut that exists out there.”
Another challenge with sensor fusion is ensuring the best sensors are being used, according to Zare.
“For any given problem/application, often only a small subset of sensors is applicable,” she says.
“Incorrect sensor selection may provide confusing/distracting information. I believe the best approach depends on the particular application and sensor. So, for a given set of sensors and a given problem (e.g., detecting explosive objects vs. mapping an area for scene exploration), a different sensor fusion approach may be best. A significant amount of application-specific algorithmic development needs to be conducted for effective and reliable sensor fusion. A single approach that encompasses all possible sets of sensors is unlikely to perform robustly for any given problem.”
A key concern in sensor fusion is the different levels of accuracy and reliability of the data, which are derived from many different kinds of sensors as well as human intelligence. Additionally, in the military and intelligence communities, many agencies and service branches produce and use geospatial data according to their own standards and practices.
“For data fusion to work, considerable pedigree has to be maintained in how disparate data sources are collected, processed, stored and analyzed,” explains Anklam.
“If the reliability of fused data gets lost, then you find that people actually have difficulty using that information,” says Endsley. “How much confidence can be placed in a piece of information is a critical piece of situation awareness itself. So, when we start fusing data, it becomes really important to maintain the pedigree of that information in ways that will allow the user to understand how much confidence they should place in the fused data source.”
Tagging each piece of data in a database with a measure of its reliability is a straightforward computer science problem. However, it’s hard to then communicate that confidence level to human beings.
“You can’t just say this thing is 57 percent reliable or 82 percent reliable,” claims Endsley. “People don’t deal with digital probabilities that well. That information has to be readily accessible and almost inherent in the data. For instance, if you’re displaying the location of an aircraft, you can display a certain amount of fuzziness if it’s coming from sources that don’t actually have location precision. You don’t want to pretend that it’s precise and actually mislead people into understanding they have what they don’t have. That information can be encoded in symbology.”
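One way to picture Endsley's suggestion is to let a track's stored position error drive the size of its map symbol instead of printing a percentage. The sketch below is only an illustration: the `Track` fields, the `symbol_for` function, the two-sigma radius and the 3-pixel minimum are assumptions made for the example, not part of any fielded display standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Track:
    source: str      # pedigree: which sensor or report produced the position
    x_m: float       # easting in meters
    y_m: float       # northing in meters
    sigma_m: float   # assumed 1-sigma horizontal position error in meters

def symbol_for(track, meters_per_pixel, min_radius_px=3.0):
    """Choose a map symbol whose size reflects the track's position uncertainty."""
    if meters_per_pixel <= 0:
        raise ValueError("meters_per_pixel must be positive")
    # Draw roughly a 2-sigma region around the reported position.
    radius_px = 2.0 * track.sigma_m / meters_per_pixel
    if radius_px <= min_radius_px:
        # Uncertainty is smaller than the marker itself: show a crisp point.
        return {"shape": "point", "radius_px": min_radius_px, "source": track.source}
    # Otherwise draw a soft-edged circle so the imprecision is visible at a glance.
    return {"shape": "uncertainty_circle", "radius_px": radius_px, "source": track.source}

precise = symbol_for(Track("gps", 1000.0, 2000.0, sigma_m=5.0), meters_per_pixel=10.0)
coarse = symbol_for(Track("radar", 1000.0, 2000.0, sigma_m=500.0), meters_per_pixel=10.0)
```

At this map scale the first track renders as a sharp point and the second as a circle 100 pixels in radius, so the operator sees the difference in precision without reading a number.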
According to Anklam, the veracity of data from vendors such as DigitalGlobe is unimpeachable.
“They have thoroughly calibrated sensors and data with traceable processing steps, and they adhere to vetted data standards,” he says. “But for every DigitalGlobe, there are thousands of data producers creating data of unknown quality. Much of this data could be tremendously useful in a fusion context, but if you can’t verify its quality or trust its origin, you can’t use it.”
“The more advanced the sensor is the more difficult it becomes to create a portable ‘reliability’ score that is actually technically correct between two sensors,” says Nima Negahban, chief technology officer at GIS Federal (www.gisfederal.com).
According to Zare, the geospatial industry needs fusion methods that incorporate the accuracy and reliability of a particular sensor given a particular problem. In fact, this is one of her major research interests, trying to answer questions such as the following: How can we estimate the accuracy and reliability of a data source? How can we make use of inaccurate and unreliable data? How can we fuse data with varying levels of uncertainty, imprecision and incompleteness in an automated way to present information and highlight key features to an operator?
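A textbook starting point for the last of these questions is inverse-variance weighting, in which each measurement counts in proportion to how precise it is. The sketch below is that standard technique, not Zare's method. It assumes the measurements are independent, unbiased and have known error variances, which real sensor data often violates. The function name and sample numbers are hypothetical.

```python
def fuse_inverse_variance(estimates):
    """Fuse independent estimates of one quantity.

    `estimates` is a list of (value, variance) pairs. Returns the
    inverse-variance weighted mean and the variance of that fused estimate.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    if any(variance <= 0 for _, variance in estimates):
        raise ValueError("variances must be positive")
    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance

# Two hypothetical range readings in meters: a precise one and a noisier one.
fused_value, fused_variance = fuse_inverse_variance([(10.0, 1.0), (20.0, 4.0)])
```

The fused value sits closer to the more precise reading, and the fused variance is smaller than either input variance. That holds only under the independence assumption stated above.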
According to Endsley, the main challenge in scaling up sensor fusion stems not from the explosion in the number and variety of sensors but from the many different types of users of this information.
“Your challenge becomes not so much one of the proliferation of technologies but one of having individually tailored algorithms that are specifically focused on the needs of individual decision-makers,” she says. “That’s where we need to focus.”
Explains Anklam, “The difficult part is maintaining consistency in the fusion algorithms being used, common data fusion spaces, and predicting and adapting to every type of sensor and platform.”
As new types of sensors and platforms come online every day, one of the biggest limiting factors to processing imagery at scale is a user’s ability to access new imagery as soon as it’s collected. Traditionally, commercial users have had to log into a Web portal, search for images, contact third-party sales people and wait for a price quote.
“To truly process big geodata, more content providers need to invest in cloud-ready infrastructure to allow frictionless accessibility,” explains Anklam. “Imagery and other geospatial datasets need to be instantly discoverable and buyable.”
According to Negahban, the initial hurdles in sensor fusion have been processing scale, flexibility and visualization.
“We have worked hard on solving the latter problem,” he says, “and we’re trying to understand how to integrate with standards to achieve better automatic conceptual alignment as data is ingested into our platform. Making sensor fusion algorithms scalable will take innovation in data structures, programming concurrency models and semantic ontologies.”
According to Zare, the key issues with scaling depend on the approach one takes to sensor fusion.
“For example, if you’re trying to do target detection offsite after collecting aerial imagery,” she says, “you may have access to large computing power and can leverage advances in high-performance computing. Thus, algorithmic computational complexity may be less of an issue than if you were attempting target detection in real time while flying over an area. In the second case, you’ll have constraints on computational complexity, as you need to process that data as it comes in to be able to make real-time decisions.”
What would help end users the most, says Negahban, are “large-scale semantic data models that can intelligently and flexibly relate two entities in a given context, high-speed distributed data processing systems that can process user queries at run time without a-priori indexing at scale and the ability to create high visualizations for the end user at run time.”
When looking at the future of sensor fusion, Endsley points to improvements in dealing with things like wide-area motion imagery data.
“We need to process a lot more motion imagery as well as stills,” she says, “and be able to deal with integrating that with geographical information and information coming from other types of sensors.”
By Alina Zare, assistant professor, Electrical & Computer Engineering Department, University of Missouri (http://engineering.missouri.edu/ece), Columbia, Mo.
An interesting area of sensor fusion research is considering the relationships among sensor responses and how they change in different contexts.
For example, if Algorithm A applied to Sensor 1 and Algorithm B applied to Sensor 2 agree that Target Type 1 is present in an image, after fusion you may have high confidence the target is in the scene. However, if algorithmic responses on sensors diverge, you may trust one sensor more than the other for certain targets in certain contexts/environments (e.g., thermal sensors may be more reliable for a particular problem, such as detecting vehicles at night).
Such relationships are based on physics and/or features of the algorithms applied to each sensor. A relationship may change depending on context (daytime vs. nighttime, under tree canopy vs. in the open). Being able to learn the relationships among all sets of sensors, as well as all possible algorithms and all contexts, requires a thorough understanding of a problem’s physics and significant amounts of training data for various problems and scenarios.
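A deliberately simple way to illustrate context-dependent trust is a weighted average whose weights change with the context. In the sketch below, the sensor names, the contexts and every weight are hypothetical placeholders. In practice such weights would have to be learned from labeled data or derived from the physics, and a linear average is only one of many possible fusion rules.

```python
# Hypothetical trust weights per context; real values would be learned, not hand-set.
CONTEXT_WEIGHTS = {
    "day": {"visible": 0.6, "thermal": 0.4},
    "night": {"visible": 0.2, "thermal": 0.8},
}

def fuse_confidences(scores, context, weights=CONTEXT_WEIGHTS):
    """Combine per-sensor detection confidences (0 to 1) for one target type.

    `scores` maps sensor name to that sensor's algorithm confidence. Each
    sensor is weighted by how much it is trusted in the given context.
    """
    context_weights = weights[context]
    total = sum(context_weights[sensor] for sensor in scores)
    if total <= 0:
        raise ValueError("no trusted sensors for this context")
    weighted = sum(context_weights[sensor] * score for sensor, score in scores.items())
    return weighted / total

# Sensors agree: the fused confidence stays high in any context.
agree = fuse_confidences({"visible": 0.9, "thermal": 0.9}, "night")
# Sensors diverge: the result leans toward whichever sensor the context favors.
diverge_night = fuse_confidences({"visible": 0.2, "thermal": 0.9}, "night")
diverge_day = fuse_confidences({"visible": 0.2, "thermal": 0.9}, "day")
```

With these placeholder weights, the same pair of diverging responses yields a higher fused confidence at night, when the thermal sensor is trusted more, than during the day.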
Collecting and accurately labeling such training data to develop sensor fusion methods can be time consuming, and it is difficult to assemble a data set that is statistically reliable. As the number of sensors increases, the amount of training data needed to characterize all the relationships increases exponentially.
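One rough way to see where exponential growth could come from is to count the sensor combinations whose joint behavior might need characterizing. The sketch below is an illustrative counting argument under assumptions of its own, not a formula from the article. It treats every subset of two or more sensors as one relationship and every context as requiring its own characterization.

```python
def relationships_to_characterize(num_sensors, num_contexts=1):
    """Count subsets of two or more sensors, multiplied by the number of contexts."""
    if num_sensors < 0 or num_contexts < 1:
        raise ValueError("num_sensors must be >= 0 and num_contexts >= 1")
    # All subsets (2**n) minus the empty set (1) and the single-sensor sets (n).
    multi_sensor_subsets = 2 ** num_sensors - num_sensors - 1
    return multi_sensor_subsets * num_contexts

three_sensors = relationships_to_characterize(3)
ten_sensors_four_contexts = relationships_to_characterize(10, num_contexts=4)
```

Under this counting, three sensors give four relationships, while ten sensors across four contexts give more than 4,000.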
However, such training data is essential to support algorithm development for sensor fusion so the fusion methods can be trained and adequately tested to understand each system’s capabilities and limits. Algorithmic techniques, such as sparsity promotion or automatically identifying data contexts, can help manage the inherent complexity of a multisensor environment. Building machine-learning methods that can tolerate incomplete, uncertain and unspecific labels can make more data usable for learning.