How Scientists Are Using AI to Talk to Animals

Portable sensors and artificial intelligence are helping researchers decode animal communication—and begin to talk back to nonhumans

By Sophie Bushwick

Portrait of Karen Bakker in nature — Irene Rinaldi

In the 1970s a young gorilla known as Koko drew worldwide attention with her ability to use human sign language. But skeptics maintain that Koko and other animals that “learned” to speak (including chimpanzees and dolphins) could not truly understand what they were “saying”—and that trying to make other species use human language, in which symbols represent things that may not be physically present, is futile.

“There's one set of researchers that's keen on finding out whether animals can engage in symbolic communication and another set that says, ‘That is anthropomorphizing. We need to understand nonhuman communication on its own terms,’” says Karen Bakker, a professor at the University of British Columbia and a fellow at the Harvard Radcliffe Institute for Advanced Study. Now scientists are using improved sensors and artificial-intelligence technology to observe and decode how a broad range of species, including plants, already share information with their own methods. This field of “digital bioacoustics” is the subject of Bakker's 2022 book The Sounds of Life: How Digital Technology Is Bringing Us Closer to the Worlds of Animals and Plants (Princeton University Press).

Scientific American spoke with Bakker about how technology can help humans communicate with creatures such as bats and honeybees—and how these conversations are forcing us to rethink our relationship with other species.

[An edited transcript of the interview follows.]

Can you give us a brief history of humans attempting to communicate with animals?

There were numerous attempts in the mid-20th century to try to teach human language to nonhumans, including primates such as Koko. And those efforts were somewhat controversial. As we look back, one view we have now (that may not have been so prevalent then) is that we were too anthropocentric in our approaches. The desire then was to assess nonhuman intelligence by teaching nonhumans to speak like we do, when in fact we should have been thinking about their abilities to engage in complex communication on their own terms, in their own embodied way, in their own worldview.

One of the terms used in the book is the notion of umwelt, which is this idea of the lived experience of organisms. If we are attentive to the umwelt of another organism, we wouldn't expect a honeybee to speak human language, but we would become very interested in the fascinating language of honeybees, which is vibrational and positional. It's sensitive to nuances such as the polarization of sunlight that we can't even begin to convey with our bodies. That is where the science is today. The field of digital bioacoustics—which is accelerating exponentially and unveiling fascinating findings about communication across the tree of life—is now approaching these animals and asking not “Can they speak like humans?” but “Can they communicate complex information to one another? How are they doing so? What is significant to them?” I would say that's a more biocentric approach, or at the very least it's less anthropocentric.

Taking a bigger view, I think it's also important to acknowledge that listening to nature, “deep listening,” has a long and venerable tradition. It's an ancient art that is still practiced in an unmediated form. There are long-standing Indigenous traditions of deep listening that are deeply attuned to nonhuman sounds. So if we combine digital listening—which is opening up vast new worlds of nonhuman sound and decoding that sound with artificial intelligence—with deep listening, I believe that we are on the brink of two important discoveries. The first is language in nonhumans. And that's a very controversial statement, which we can dig into. The second is: I believe we're at the brink of interspecies communication.

What kind of technology is enabling these breakthroughs?

Digital bioacoustics relies on very small, portable, lightweight digital recorders, which are like miniature microphones that scientists are installing everywhere from the Arctic to the Amazon. You can put these microphones on the backs of turtles or whales. You can put them deep in the ocean or on the highest mountaintop or attach them to birds. They can record continuously, 24/7, in remote places scientists cannot easily reach, even in the dark, and without the disruption that comes from introducing human observers in an ecosystem.

That instrumentation creates a data deluge, and that is where artificial intelligence comes in—because the same natural-language-processing algorithms that we are using to such great effect in tools such as Google Translate can also be used to detect patterns in nonhuman communication.

What's an example of these communication patterns?

In the bat chapter where I discuss the research of Yossi Yovel of Tel Aviv University, there's a particular study in which his team monitored [nearly two] dozen Egyptian fruit bats for two and a half months and recorded their vocalizations. They then adapted a voice-recognition program to analyze [15,000 of] the sounds, and the algorithm correlated specific sounds with specific social interactions captured via videos—such as when two bats fought over food. Using this, the researchers were able to classify the majority of bats' sounds. That is how Yovel and other researchers such as Gerry Carter of the Ohio State University have been able to determine that bats have much more complex language than we previously understood. Bats argue over food; they distinguish between genders when they communicate with one another; they have individual names, or “signature calls.” Mother bats speak to their babies in an equivalent of “motherese.” But whereas human mothers raise the pitch of their voices when talking to babies, mother bats lower the pitch—which elicits a babble response in the babies that learn to “speak” specific words or referential signals as they grow up. So bats engage in vocal learning.

That's a great example of how deep learning is able to derive these patterns from this instrumentation, all of these sensors and microphones, and reveal to us something that we could not access with the naked human ear. Because most of bat communication is in the ultrasonic, above our hearing range, and because bats speak much faster than we do, we have to slow it down to listen to it, as well as reduce the frequency. So we cannot listen like a bat, but our computers can. The next insight is that our computers can also speak back to the bat. The software produces specific patterns and uses those to communicate back to the bat colony or to the beehive, and that is what researchers are now doing.
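[Editor's illustration: The point about slowing bat calls down can be made concrete with a little arithmetic. Playing a recording back N times slower divides every frequency by N and stretches its duration by N. The sketch below is only an illustration of that relationship; the function names, the 40 kHz example call, its 5-millisecond length and the roughly 20 kHz upper limit of human hearing are assumptions chosen for the example, not figures from the interview or from any bat study.]

```python
# Illustrative sketch only: names and numbers are assumptions, not taken
# from the research described in the interview.

HUMAN_HEARING_LIMIT_HZ = 20_000  # commonly cited approximate upper limit


def time_expand(freq_hz: float, duration_ms: float, factor: float) -> tuple[float, float]:
    """Return (frequency, duration) heard when playback is `factor` times slower."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Slower playback lowers pitch and lengthens the sound by the same factor.
    return freq_hz / factor, duration_ms * factor


def is_audible(freq_hz: float) -> bool:
    """True if the frequency is at or below the assumed human hearing limit."""
    return freq_hz <= HUMAN_HEARING_LIMIT_HZ


# A hypothetical 40 kHz call lasting 5 ms, played back 10 times slower.
heard_hz, heard_ms = time_expand(40_000, 5, 10)
print(heard_hz, heard_ms, is_audible(heard_hz))
```

Under these assumed numbers, a call that was far above human hearing lands at 4 kHz and lasts 50 ms, which is comfortably within the range a person can hear.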

How are researchers talking to bees?

The honeybee research is fascinating. A researcher named Tim Landgraf of Freie Universität Berlin studies bee communication, which, as I mentioned earlier, is vibrational and positional. When honeybees “speak” to one another, it's their body movements, as well as the sounds, that matter. Now computers, and particularly deep-learning algorithms, are able to follow this because you can use computer vision, combined with natural-language processing. They have now perfected these algorithms to the point where they're actually able to track individual bees, and they're able to determine what impact the communication of an individual might have on another bee. From that emerges the ability to decode honeybee language. We found that they have specific signals. Researchers have given these signals funny names. Bees toot; they quack. There's a “hush” or “stop” signal, a whooping “danger” signal. They've got piping [signals related to swarming] and begging and shaking signals, and those all direct collective and individual behavior.

The next step for Landgraf was to encode this information into a robot that he called RoboBee. Eventually, after seven or eight prototypes, he came up with a “bee” that could enter the hive, and it would essentially emit commands that the honeybees would obey. So Landgraf's honeybee robot can tell the other bees to stop, and they do. It can also do something more complicated, which is the very famous waggle dance—it's the communication pattern they use to convey the location of a nectar source to other honeybees. This is a very easy experiment to run, in a way, because you put a nectar source in a place where no honeybees from the hive have visited. You then instruct the robot to tell the honeybees where the nectar source is, and then you check whether the bees fly there successfully. And indeed, they do. This result happened only once, and scientists are not sure why it worked or how to replicate it. But it is still an astounding result.
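[Editor's illustration: The waggle dance is usually described in textbooks as a two-part code. The angle of the waggle run relative to vertical on the comb matches the angle of the food source relative to the sun, and the length of the run grows with distance. The sketch below encodes and decodes a location under that simplified description. It is not the software used in the robot experiments; the names `Waggle`, `encode` and `decode` are hypothetical, and the calibration of one second of waggling per kilometre is a rough placeholder, since real calibrations vary.]

```python
# Simplified textbook model of the waggle dance. Not the robot's actual
# software; all names and the calibration constant are assumptions.
from dataclasses import dataclass

SECONDS_PER_KM = 1.0  # hypothetical calibration: waggle time per kilometre


@dataclass(frozen=True)
class Waggle:
    angle_from_vertical_deg: float  # direction of the run on the comb
    run_duration_s: float           # how long the waggle run lasts


def encode(food_bearing_deg: float, sun_azimuth_deg: float, distance_km: float) -> Waggle:
    """Turn a food location into dance parameters."""
    # The dance angle is the food's compass bearing measured from the sun's azimuth.
    angle = (food_bearing_deg - sun_azimuth_deg) % 360
    return Waggle(angle, distance_km * SECONDS_PER_KM)


def decode(dance: Waggle, sun_azimuth_deg: float) -> tuple[float, float]:
    """Recover (compass bearing in degrees, distance in km) from a dance."""
    bearing = (sun_azimuth_deg + dance.angle_from_vertical_deg) % 360
    return bearing, dance.run_duration_s / SECONDS_PER_KM


# Food half a kilometre away at bearing 225 degrees, with the sun due south (180).
dance = encode(225.0, 180.0, 0.5)
print(dance, decode(dance, 180.0))
```

In this toy model the dance for that location is a run angled 45 degrees from vertical, and decoding it with the same sun position returns the original bearing and distance.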

This raises a lot of philosophical and ethical questions. You could imagine such a system being used to protect honeybees—you could tell honeybees to fly to safe nectar sources and not polluted ones that had, let's say, high levels of pesticides. You could also imagine this could be a tool to domesticate a previously wild species that we have only imperfectly domesticated or to attempt to control the behavior of other wild species. The insights about the level of sophistication and the degree of complex communication in nonhumans raise some very important philosophical questions about the uniqueness of language as a human capacity.

What impact is this technology having on our understanding of the natural world?

The invention of digital bioacoustics is analogous to the invention of the microscope. When Dutch scientist Antonie van Leeuwenhoek started looking through his microscopes, he discovered the microbial world, and that laid the foundation for countless future breakthroughs. So the microscope enabled humans to see anew with both our eyes and our imaginations. The analogy here is that digital bioacoustics, combined with artificial intelligence, is like a planetary-scale hearing aid that enables us to listen anew with both our prosthetically enhanced ears and our imagination. This is slowly opening our minds not only to the wonderful sounds that nonhumans make but to a fundamental set of questions about the so-called divide between humans and nonhumans, our relationship to other species. It's also opening up new ways to think about conservation and our relationship to the planet. It's pretty profound.