[Image from the work]

The Songbird Speaks

  • by Alinta Krauth

  • Digital Poetry Webapp, listener model trained on database of animal signals, database of interpretations.
  • doi: https://doi.org/10.7273/bpw7-an93

Please note: this work is made specifically for smartphones. While it can be accessed via other devices, the listener model may not work correctly, and the screen sizing will be incorrect. See below for instructions.



When the Songbird Speaks: AI Interspecies Translation Interfaces, by Alinta Krauth

Australian magpies (Gymnorhina tibicen, unrelated to Eurasian magpies) are known for their incredibly impressive improvisational singing abilities. Their songs sit outside of their general communicative vocalizations – these are songs that appear to be sung for no other reason than to sing, or perhaps as general territorial displays (Roper, 2007; Walsh et al., 2023). They perhaps express to other magpies “here we are on our land”. As vocalists, magpies can also mimic almost any sound they hear, including complete human-made songs (Kaplan, 1999). Though this is uncommon in the wild, my mother once taught a magpie to sing the children’s song Pop Goes the Weasel, a song he then only conjured up when in a particularly bad mood. My favorite magpie fact is that, according to Emma Roper (2007), magpies live in families of blood-related and non-blood-related birds, within which it appears it is the unrelated, childless female birds who are the main vocalists. These aunties are welcomed into the family for the almost sole purpose of singing.

The Australian magpie’s songs are combinatory, in that they recombine and remix their own phrases to create vocalizations that can appear to be novel as the song progresses (Walsh et al., 2023). As a digital poet, I am intrigued by the connection between Australian magpies’ abilities for combinatory song, and the creative expressions of digital poets in combinatory poetry. You could say that magpies are inherently OuLiPo-ean in their approach to warbling, as if following an algorithm that recombines auditory information in a way that one might compare to Raymond Queneau’s Cent mille milliards de poèmes (1961), or for a more contemporary example, Nick Montfort and Stephanie Strickland’s Sea and Spar Between (2010).

Why am I telling you about the combinatory techniques of magpies? Firstly, because it is tremendously interesting. But importantly, because this is explored in "The Songbird Speaks." If there is still much left to learn about magpies, then I believe a speculative door is open for creative practitioners to interpret their songs from a variety of imaginative perspectives, such as imagining magpies as combinatory poets or storytellers, whose songs relay messages of things they have seen and memories they hold. The Songbird Speaks is born from a Leonardo Imagination Fellowship sponsored by Leonardo Journal and the Arizona State University Center for Science and the Imagination. This fellowship allowed me to ask a big, imaginative, speculative, science-fiction-inspired question, and to then address that question through creative and technological development. The question that I chose to ask was: How can machine learning and artificially intelligent models help us to imagine animal vocalizations in a human language context?

"The Songbird Speaks" is an audio listening device that has been developed using machine learning and combinatory poety techniques. It can listen to the combinatory songs of Australian Magpies, and translate those into a human language in real time. Presented as a digital interface, members of the public can load the audio model to their smartphone, ask it to listen to Australian Magpies in situ, and then see visual and written ‘translations’ of the animal’s voice in real-time via their phone screen. It is currently capable of understanding sixty-eight different magpie song-bits, such that if a magpie were to sing in listening range of the device, it could give translations for any vocalization combinations that audibly correspond with those sixty-eight options.

The work uses experimental and innovative techniques in the development of audio recognition AI models for interpreting these communications in real-time, and then uses techniques from combinatory and algorithmic poetry creation to present that data in a human language. It required the development of a vocalization corpus for Australian magpies on which to create an AI model – such a corpus and model did not previously exist, as models for species beyond humans are perhaps seen as less important both to culture and to technology. The process of developing "The Songbird Speaks" has involved research into the known and observable signal (vocalization and behavioural) habits of Australian magpies, audio recording fieldwork, choosing strong audio samples, audio classification of vocalization samples into classes, ‘teaching’ these classes to an AI, and then importantly, considering how to present the outcomes of this. While this technology would be particularly useful in science or citizen science applications, I present it here as art. These interfaces do not look like the UI design of currently popular translator systems such as Google Translate. This is firstly because I want to explore the notion that animal communications shouldn’t be imagined as straightforward and easily interpretable: I want them to require multimodal interpretation, and a linguistic awkwardness that reveals animal speakers as mysterious, rather than simple-minded. And secondly, because I can’t help but make interfaces look distinctly anti-UI design.

I use supervised machine learning, which means I create a magpie AI model that is taught on vocalizations that I have gathered predominantly through fieldwork. But rather than allow the AI to decide for itself which vocalization is which, I teach this to the AI. The magpie AI model does not use reinforcement learning, meaning that the public cannot disrupt or change what the model has learnt. This results in a narrow AI – an AI model made for one specific purpose, drawing from one specific corpus of data. To put this into a simple context, imagine an AI is taught everything it needs to know to sit a big multiple-choice exam, where each question represents a different magpie song-bit. Its job is to take that exam and answer each question correctly by selecting the correct translation. It is also constantly barraged with trick questions, which are the myriad of other environmental sounds around those it has been taught, to which it must not react. Though, inevitably, there will be some environmental sounds that emulate what it has been taught.
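The "exam with trick questions" above can be sketched as a fixed, narrow classifier that only answers when a sound lies close enough to a taught class, and stays silent for everything else. The two-dimensional features, centroids, and threshold here are hypothetical stand-ins for the learned audio model:

```python
import math

# Taught classes: toy 2-D centroids standing in for learned
# audio embeddings of magpie song-bits.
TAUGHT = {
    "carol_rising": (0.9, 0.1),
    "warble_low":   (0.2, 0.8),
}
THRESHOLD = 0.3  # maximum distance at which the model will 'answer'

def classify(features):
    """Answer the 'exam question': pick the nearest taught song-bit,
    but refuse to react to sounds that match nothing well enough."""
    label, best = None, float("inf")
    for name, centroid in TAUGHT.items():
        d = math.dist(features, centroid)
        if d < best:
            label, best = name, d
    return label if best <= THRESHOLD else None  # None = ignore as noise

print(classify((0.85, 0.15)))  # close to a taught song-bit -> 'carol_rising'
print(classify((0.5, 0.5)))    # environmental 'trick question' -> None
```

The threshold is what makes the model narrow: sounds outside its one specific corpus simply produce no answer, though, as noted above, some environmental sounds will inevitably fall close enough to fool it.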

What would a future look like where humans can, more simply and directly, understand the communication signals of other animals? What does it mean for society, for culture, for religion? For our moral values? For our inbuilt biases regarding who is labelled conscious and thinking, and who isn’t? What would it mean for food production, for animal agriculture, for companionship? What would it mean for our philosophies and humanities? For our misguided human-exceptionalism in Western culture? Or would it change nothing at all; would humans continue to inflict harm and oppression towards other species no matter what they might say? We are a species capable of great atrocities, but I have hope that perhaps, with a little nudge from the right technologies, we can reconsider our relations with other species.


INSTRUCTIONS:

  1. To see the work in full you will need a smartphone with working microphone and a computer with working speakers. If you are located in Australia, you may only need the smartphone, and will be able to go outside and locate wild magpies in your neighbourhood. If not, you can use the provided audio track as an example that replaces the need for a real Australian magpie.
  2. Download the audio sample file to a computer or mp3 player connected to strong speakers. Do not use headphones.
  3. Go to https://www.alintakrauth.com/magpieai on your smartphone.
  4. Press ‘Enter’ to continue, then press ‘allow’ when asked to use your microphone permissions on your smartphone.
  5. Play the audio sample file at a good volume such that your phone can clearly hear the audio.
  6. See what emerges on your phone screen.

More about my processes for building animal vocalization AI models:

Krauth, A. (forthcoming). ‘AI and Care for Other(ed) Animals: Art, machine learning and interspecies futures’, Leonardo Journal.

Krauth, A. (forthcoming). ‘AI, what do I need you for when I have wings to fly?’, Leonardo Journal.

Reference list:

Kaplan, G. (1999). ‘Song Structure and Function of Mimicry in the Australian Magpie (Gymnorhina tibicen): Compared to Lyrebird (Menura ssp.).’ International Journal of Comparative Psychology, 12(4). DOI: 10.46867/C4J30H

Roper, E. R. (2007). ‘Musical Nature: Vocalisations of the Australian Magpie (Gymnorhina Tibicen Tyrannica).’ Context: Journal of Music Research, (32), 59–72.

Walsh, S., Engesser, S., Townsend, S., & Ridley, A. (2023). ‘Multi-level combinatoriality in magpie non-song vocalizations.’ Journal of the Royal Society Interface, 20(199).

Attributions:

The sample audio track given here to help those around the world engage with this work is a combination of recordings by the artist, as well as altered snippets from sample files held under CC license at xeno-canto.org by recordists: nick talbot, Marc Anderson, Richard Fuller, Toby Esplin, Mike FitzGerald, James Lambert.

Alinta Krauth is an Australian new media artist whose practices include interactive art and sound, digital literature, and digitally connected spaces. Over time she has worked in this capacity as a sole trader, a teacher, and in industry. Alinta’s drive toward creative practice comes from her personal connections to the environments she lives within. She is interested in climate, extreme weather, human/wildlife coexistence, and sharing their importance through her work. She is keenly interested in how her work might contribute in spaces of post-environmental crisis. She is the Leonardo-ASU Imagination Fellow for 2023/24, and has recently been shortlisted for the prestigious Ars Electronica S+T+ARTS Prize (2023) from the European Commission. Her works have been seen in spaces such as large screens in Times Square for ZAZ10st Gallery NY, Science Gallery Detroit USA, The Glucksman Gallery Ireland, HOTA Australia, Gallery 3.14 Norway, and Art Laboratory Berlin, Germany.