Sperm whale vocalizations are among the most intriguing communication systems in the animal kingdom. Traditionally, sperm whale codas, or groups of clicks, have been primarily analyzed in terms of the number of clicks and their inter-click timing. This paper argues that acoustic properties of clicks in codas are likely meaningful and actively controlled by whales. We present a visualization technique that allows us to describe several previously unobserved patterns. We argue that sperm whale codas are on many levels analogous to human vowels and diphthongs: vowel duration and pitch correspond to the number of clicks and their timing (traditional coda types), while spectral properties of clicks correspond to formants in human vowels. We identify two recurrent patterns that appear across individual sperm whales: the a-coda vowel and the i-coda vowel. Both coda vowels can occur across different traditional coda types. Our discovery thus suggests that codas are highly compositional. We also show that sperm whales have diphthongal patterns on individual codas: rising, falling, rising-falling, and falling-rising formant patterns are observed. Finally, we control for whale movement and present several pieces of evidence suggesting that the observed patterns are not artifacts, but are actively controlled by sperm whales. We also show that the two coda vowels (the a-vowel and the i-vowel) are actively exchanged by sperm whales in dialogues. The uncovered spectral properties suggest that codas are highly compositional, more informative, and more complex than previously thought.
This paper follows up on predictions obtained from generative models in "Approaching an unknown communication system by latent space exploration and causal inference" and uncovers patterns corresponding to vowels and diphthongs in the communication of sperm whales.
The work has received significant attention, with more than 1.7 million impressions on X:
Sperm whales have equivalents to human vowels.
We uncovered spectral properties in whales’ clicks that are recurrent across whales, independent of traditional types, and compositional.
We got clues to look into spectral properties from our AI interpretability technique CDEV. pic.twitter.com/8sEAzPkMfo
(Gašper Beguš, @begusgasper, December 5, 2023)
The work has also been covered in media outlets (some collected in this Medium post).