Accepted Paper
Paper short abstract
This paper argues that AI transcription reproduces socio-digital inequality by treating language as computational capital. Dialect erasure and misrecognition render marginalized speakers algorithmically illegible; in response, the paper calls for participatory and linguistically just digital infrastructures.
Paper long abstract
Language is increasingly positioned as a core socio-digital infrastructure through which access to rights, services, and political recognition is mediated. Yet in contemporary digital systems, language is operationalized as computational capital: measurable, standardizable, and unevenly distributed. This paper theorizes AI-mediated transcription as a critical site where language-based socio-digital inequalities are produced and normalized, particularly in low-resource and high-stakes contexts.
Drawing on conceptual insights from linguistic anthropology, critical data studies, and migration scholarship, the paper reframes transcription not as a neutral technical process but as a socio-technical practice embedded in power relations. Automated transcription systems, trained on limited and hierarchical language datasets, privilege standardized linguistic forms while misrecognizing dialectal variation, affect, and strategic ambiguity. These misrecognitions disproportionately affect marginalized speakers, transforming linguistic difference into algorithmic illegibility and constraining who can meaningfully engage with, interpret, or contest digital systems.
Using forced migration as a critical analytic lens, the paper shows how AI transcription operates as a gatekeeping mechanism within humanitarian and legal infrastructures, shaping credibility, visibility, and institutional legibility. Language misrecognition thus becomes a mechanism of epistemic injustice, whereby certain forms of speech are rendered inaudible within automated governance regimes.
The paper advances a theoretical intervention by proposing a shift from extractive, automation-centered language technologies toward participatory and plural models of linguistic mediation. By foregrounding language as a relational and political infrastructure rather than neutral data, it contributes to debates on inclusive digital futures and highlights the necessity of embedding linguistic justice and human agency into AI-driven systems.
Lost in translation: Linguistic infrastructures of inclusion in the age of AI