Accepted Poster
Poster Short Abstract
Maarallee.be is a citizen science project that collects and transcribes Flemish speech to improve AI’s understanding of regional accents and dialects. Anyone can contribute by recording short audio clips and helping refine transcripts. This poster presents our approach, the challenges we faced, and our results.
Poster Abstract
Speech technology is becoming part of everyday life, from automatic subtitles to voice assistants. But most AI systems are trained on data in the standard language, and only marginally or not at all on regional accents or dialects. This is the case in the Dutch-Flemish region. As a result, the technology struggles to understand how people in the periphery really speak. Regional accents, dialects, and speech patterns are often overlooked or ignored.
Maarallee.be aims to change that. It’s a citizen science project that invites people across Flanders to help build a large, representative database of spoken Flemish. Through a user-friendly web app, participants record short audio clips in their own voice, accent, and dialect. Volunteers then help improve the transcripts of these recordings. This data is used to train automatic speech recognition (ASR) models — systems that convert speech into text. By including a wide range of Flemish voices, we make AI more inclusive and better suited to real-life communication.
This poster presents the approach we co-designed with our target groups, highlights the barriers we encountered, and shares the solutions we tested to overcome them. The project focuses specifically on participation from groups that are underrepresented in speech and language data, such as young people, women, and multilingual speakers. Every voice matters.
Maarallee.be is a collaboration between KU Leuven (PSI), Scivil (the Flemish knowledge center for citizen science), and the Flemish government (WEWIS), supported by the Flemish AI policy plan.
Poster Session