Alan Fern and Margaret Burnett walking together.

Pulling back the curtain on neural networks

Introduction

When researchers at Oregon State University created new tools to evaluate the decision-making algorithms of an advanced artificial intelligence system, study participants assigned to use them did, indeed, find flaws in the AI's reasoning. But once investigators instructed participants to use the tools in a more structured and rigorous way, the number of bugs they discovered increased markedly.

"That surprised us a bit, and it showed that having good tools for visualizing and interfacing with AI systems is important, but it's only part of the story," said Alan Fern, professor of computer science at Oregon State.

Since 2017, Fern has led a team of eight computer scientists funded by a four-year, $7.1 million grant from the Defense Advanced Research Projects Agency to develop explainable artificial intelligence, or XAI: algorithms through which humans can understand, build trust in, and manage the emerging generation of artificial intelligence systems.

Dramatic advancements in the artificial neural networks, or ANNs, at the heart of advanced AI have created a wave of powerful applications for transportation, defense, security, medicine, and other fields. ANNs comprise tens of thousands, even millions, of individual processing units. Despite their dazzling ability to analyze mountains of data en route to learning and solving problems, ANNs operate as "black boxes" whose outputs are unaccompanied by decipherable explanations or context. Their opacity baffles even those who design them, yet understanding an ANN's "thought processes" is critical for recognizing and correcting defects.

For trivial tasks, such as choosing movies or online shopping, explanations don't much matter. But when stakes are high, they're vital. "When errors can have serious consequences, like for piloting aircraft or medical diagnoses, you don't want to blindly trust an AI's decisions," Fern said. "You want an explanation; you want to know that the system is doing the right things for the right reasons."

In one cautionary example, team member and Assistant Professor of Computer Science Fuxin Li developed an XAI algorithm that revealed serious shortcomings in a neural network trained to recognize COVID-19 from chest X-rays. It turned out that the ANN was using, among other features, a large letter "R," which simply identified the right side of the image, in its classification of the X-rays. "With a black-box network, you'd never know that was the case," he said.
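
Attribution methods of this kind estimate how much each image region contributes to a classifier's output. The sketch below is a minimal occlusion-based saliency check in Python, not Li's actual algorithm: it hides one patch of a grayscale image at a time and records how much the model's COVID score drops, so a spurious cue like the "R" marker would stand out in the resulting heatmap. The predict_covid_prob callable is a hypothetical stand-in for any trained classifier.

    import numpy as np

    def occlusion_saliency(image, predict_covid_prob, patch=16, stride=8):
        """Slide a gray patch across a 2-D image and measure how much the
        classifier's predicted COVID probability drops when each region is
        hidden. Regions whose occlusion changes the score most are the ones
        the model relies on, which is how a spurious cue (like a letter "R"
        burned into the film) can be exposed."""
        h, w = image.shape
        baseline = predict_covid_prob(image)          # score on the intact image
        heatmap = np.zeros((h, w))
        counts = np.zeros((h, w))
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                occluded = image.copy()
                occluded[y:y + patch, x:x + patch] = image.mean()  # hide region
                drop = baseline - predict_covid_prob(occluded)
                heatmap[y:y + patch, x:x + patch] += drop
                counts[y:y + patch, x:x + patch] += 1
        return heatmap / np.maximum(counts, 1)        # per-pixel average drop

If the brightest region of such a heatmap sits over a corner marker rather than lung tissue, the model is classifying for the wrong reasons.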

To pull back the curtain on neural networks, Fern and his colleagues created Tug of War, a simplified version of the popular real-time strategy game StarCraft II that involves multiple ANNs. In each game, two competing AI "agents" deploy their forces and attempt to destroy each other's bases.

In one study, human observers trained to evaluate the game's decision-making process watched replays of multiple games. At any time, they could freeze the action to examine the AI's choices using the explanation user interface tools. The interface displays information such as the actions an agent considered, the predicted outcome for each considered action, the actions taken, and the actual outcomes. For example, large discrepancies between a predicted and actual outcome, or suspect strategic choices, indicate errors in the AI's reasoning.
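
One way to picture what such an interface records: each decision point pairs the agent's expectation with what actually happened, and large gaps are flagged for a human to inspect. The sketch below is illustrative only; the field names and threshold are hypothetical, not the project's actual schema.

    from dataclasses import dataclass

    @dataclass
    class DecisionRecord:
        """One decision point from a game replay, as an explanation
        interface might log it."""
        tick: int                      # game time of the decision
        actions_considered: list[str]  # alternatives the agent evaluated
        action_taken: str              # what the agent actually did
        predicted_outcome: float       # agent's estimate of the action's value
        actual_outcome: float          # value realized later in the game

    def flag_suspect_decisions(records, threshold=0.3):
        """Return decisions whose prediction missed reality by more than
        `threshold`, i.e., candidates for closer human review."""
        return [r for r in records
                if abs(r.predicted_outcome - r.actual_outcome) > threshold]

Flagged records give a reviewer a short list of moments to freeze and examine, rather than a whole game to sift through.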

The aim is to find bugs in the AI, particularly any dubious decisions that lead to losing a game. If, for instance, the ANN believes that damaged bases can sometimes be repaired (they can't), then decisions based on that belief may be flawed. "The interface allows humans who aren't AI experts to spot such common-sense violations and other problems," Fern said.

At first, the reviewers were free to explore the AI using whatever ad hoc approach they chose, which resulted in wide variation in success rates across study participants. "That suggests that a structured process is an important component for being successful when using the tools," Fern said.

So, the researchers added an after-action review to the study. An AAR is a well-established military protocol for analyzing, after a mission, what happened and why. Using an AAR designed specifically to assess AI, study participants identified far more bugs with greater consistency. "The results impressed the people at DARPA, which extended our funding for an additional year," Fern said.

Throughout the project, the researchers also emphasized the human factors of XAI, another reason for DARPA's continued interest. "When you're explaining AI, you're explaining it to a person, and you have to be sure they're getting the greatest benefit from those explanations," said team member Margaret Burnett, Distinguished Professor of computer science, who noted that attention to humans guided the development of the interface tools. "Explainable AI is not something you produce or consume. It's an educational experience, and the bottom line is that we need to focus on helping the humans to solve problems."

As they complete their work during the DARPA contract extension, Fern and Burnett, two of the original grantees, are seeking partners with whom to further validate the strategy of applying after-action reviews to the explainable AI interface tools.

In addition to collaborations with government and the military, they're interested in pursuing connections in other important AI application domains, including agriculture, energy systems, and robotics. Fern and Burnett, along with 11 colleagues at Oregon State, recently became involved with a federally funded, $20 million AI institute for agriculture that will tackle some of the industry's greatest challenges. Explainable AI will be part of the institute's work.

Dec. 30, 2021
