Robots learn to anticipate chaos, but still fail to read a decidedly human signal

Cornell researchers are investigating whether artificial intelligence can give robots social intelligence: the ability to read facial cues, anticipate the needs of those around them, and function within society. The new study tested whether vision language models (VLMs), AI systems that can interpret and generate both visual information and language, could predict if a tense scenario in a short video, such as a toddler carrying an overly full mug of coffee, would end well or badly.