Awkward—A.I. Struggles to Understand Human Social Interactions, Study Finds

While A.I. excels at solving complex logical problems, it falls short at understanding social dynamics. A new study by researchers at Johns Hopkins University reveals that A.I. systems still struggle to read the nuances of human behavior, a skill crucial for real-world applications like robotics and self-driving cars.

To test A.I.’s ability to navigate human environments, researchers designed an experiment in which both humans and A.I. models watched short, three-second videos of groups of people interacting at varying levels of intensity. Each participant—human or machine—was asked to rate how intense the interactions appeared, according to findings presented last week at the International Conference on Learning Representations.

When it comes to technologies like autonomous vehicles, the stakes are high, because human drivers make decisions based not just on traffic signals but also on predictions of how other drivers will behave. “The A.I. needs to be able to predict what nearby people are up to,” said study co-author Leyla Isik, a cognitive science professor at Johns Hopkins. “It’s vital for the A.I. running a vehicle to be able to recognize whether people are just hanging out, interacting with one another or preparing to walk across the street.”

The experiment revealed a stark difference between human and machine performance. Among the 150 human participants, evaluations of the videos were remarkably consistent. In contrast, the 380 A.I. models’ assessments were scattered and inconsistent, regardless of their sophistication.
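The article doesn’t spell out how “consistent” versus “scattered” was measured; one simple way to make that comparison concrete is to look at how widely ratings of the same clip spread within each group. The Python sketch below uses synthetic numbers (only the group sizes, 150 humans and 380 models, come from the article; the clips, ratings and noise levels are invented for illustration) and is not the study’s actual analysis.

```python
# Illustrative only: hypothetical ratings, NOT data from the Johns Hopkins study.
# Each row is a rater (human or model), each column is a short video clip,
# and each value is an intensity rating for that clip.
import numpy as np

rng = np.random.default_rng(0)
true_intensity = rng.uniform(1, 5, size=20)  # 20 hypothetical clips

# Humans: ratings cluster tightly around a shared judgment of each clip.
human_ratings = true_intensity + rng.normal(0, 0.3, size=(150, 20))

# Models: ratings scatter widely, with little agreement on any given clip.
model_ratings = true_intensity + rng.normal(0, 1.2, size=(380, 20))

def mean_clip_spread(ratings):
    """Average per-clip standard deviation: lower means more agreement."""
    return ratings.std(axis=0).mean()

print(f"human spread: {mean_clip_spread(human_ratings):.2f}")
print(f"model spread: {mean_clip_spread(model_ratings):.2f}")
```

Under this toy setup, the human group’s per-clip spread comes out much smaller than the models’, which is the shape of the result the researchers describe.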

The study highlights key limitations of current A.I. technology, particularly “when it comes to predicting and understanding how dynamic systems change over time,” Dan Malinsky, a professor of biostatistics at Columbia University, told Observer.

Understanding the thoughts and emotions at play in an interaction involving multiple people can be challenging even for humans, said Konrad Kording, a bioengineering and neuroscience professor at the University of Pennsylvania. “There are many things, like chess, that A.I. is better at and many things we might be better at. There are lots of things I would never trust an A.I. to do and some I wouldn’t trust myself to do,” Kording told Observer.

Researchers believe the problem may be rooted in the infrastructure of A.I. systems: A.I. neural networks are modeled after the part of the human brain that processes static images, a region distinct from the one that processes dynamic social scenes.

“There’s a lot of nuances, but the big takeaway is none of the A.I. models can match human brain and behavior responses to scenes across the board, like they do for static scenes,” Isik said. “I think there’s something fundamental about the way humans are processing scenes that these models are missing.”

“It’s not enough to just see an image and recognize objects and faces. That was the first step, which took us a long way in A.I. But real life isn’t static. We need A.I. to understand the story that is unfolding in a scene. Understanding the relationships, context, and dynamics of social interactions is the next step, and this research suggests there might be a blind spot in A.I. model development,” said Kathy Garcia, a co-author of the study.