Examining the origins of bias in speech recognition technology

Accepted Paper

Johann Diedrick (NYU)

Short abstract

This paper examines the history of speech recognition technology, from pre-electronic formulations to contemporary AI systems, to understand how bias has been embedded in these systems throughout their history of development.

Long abstract

Machine listening technologies, specifically speech recognition, have been designed without taking variations in speech into account. This area of research has been (intentionally) neglected, and the creators of this technology have known of the problem since the technology's earliest formulations.

Alexander Graham Bell, along with his father, devised a method of visualizing speech in order to teach the Deaf and hard of hearing to speak. This project of oralism is the starting point for disciplining bodies to conform to a type of regularized speech production, where one must perform speech for a hearing body (biological or technological) in order to access services, society, and recognition of one’s identity, personhood, culture, and humanity.

This paper traces the early origins of a taxonomy of speech (Visible Speech), the development of theories of how speech is produced and how meaning is carried through the vocal signal (Dudley and information theory), the invention of electronic and digital technologies of speech recognition (Bell Labs and others), and finally our modern-day technologies powered by AI systems and constructed with the logic of machine learning in mind. Interwoven with this history is an account of how we as a society have evolved our thinking around speech and its recognition in popular culture, and how this has shifted how we relate to these technologies, our expectations around their function, and how we might internalize a sense of how we ought to speak in order to be heard by them.

Traditional Open Panel P084
Machine listening: dissonance and transformation
Session 1 Wednesday 17 July, 2024, 10:30-12:00
