Accepted Paper:

Sonic similitude: the significance of similarity metrics in the context of AED  
Johan Malmstedt (Media and Communications Studies)

Paper short abstract:

This article explores audio event detection algorithms for human-machine interaction, analyzing the philosophical and mathematical challenges of audio similarity through both historical and media theoretical perspectives.

Paper long abstract:

Audio event detection (AED) is becoming an increasingly significant cog in contemporary human-machine interaction (Sterne, Mehak, Kalfou: 2022; Kang: 2022). Such algorithms depend on the metrical determination of similitude (Mackenzie: 2017). However, determining similarity is never trivial, and when adapted to the framework of acoustics it conceals a range of philosophical problems, from the objective qualities of the sound object to the mathematical problem of geometric distance. To start unpacking the assumptions behind audio similarity metrics, this article examines and historicizes a sample of popular applications.

The analysis focuses on a set of contemporary models trained on the state-of-the-art data set AudioSet (Parker & Dockray: 2023). Close analysis of the data processing involved in these algorithms can reveal the key metrical intervention involved in producing similarity scores for audio files. This entails a critical reading of how audio fares under the conditions of, for example, cosine distance and K-means methods. After considering the implications of these components for the classification of sound, the analysis proceeds to trace the historical practices behind these metrics, discussing the enduring influence of these mathematical trajectories on today's technological practices. In combining media theory and the history of applied mathematics, the article aims to contribute to our understanding of the cultural implications of sonic similarity in the contemporary digital landscape.

Panel P084
Machine listening: dissonance and transformation
  Session 1 Wednesday 17 July, 2024