Accepted Paper:

Automatic transcriptions of folklore audio recordings  
Trausti Dagsson (The Árni Magnússon Institute for Icelandic Studies), Rosa Thorsteinsdottir (The Árni Magnússon Institute for Icelandic Studies), Luke O'Brien (Tiro), Finnur Ágúst Ingimundarson

Paper short abstract:

We present a project in which speech recognition was used to transcribe over 2000 hours of Icelandic interviews recorded from the mid-twentieth century to the present.

Paper long abstract:

The audio collection of The Árni Magnússon Institute for Icelandic Studies in Reykjavík, Iceland, contains over 2000 hours of interviews, recorded mostly between 1960 and 1980, although some date from the first decade of the 20th century. The collection includes examples of music and poetry, but the majority of the recordings are narratives containing legends, fairy tales and descriptions of everyday life in the early 20th century. The collection is available digitally, almost in its entirety, on the website ismus.is. Although keywords and a summary are presented for each audio entry, the user must still listen to the whole recording to find relevant information.

In this paper, we present the results of a one-year project in which speech recognition was used to create automatic transcriptions of the audio recordings, in collaboration with a technology company specializing in speech recognition. The work involves adapting a general ASR system to our specific audio collection. The project, expected to be completed in early 2023, will provide a large volume of transcriptions that will extend the database's search functionality and give better access to the material. Currently this material is searchable only via very short abstracts and keywords, forcing researchers to spend a long time listening through potential findings in the database.

We will discuss the project's process and challenges and evaluate the quality of the transcriptions. We will also show how the transcriptions will change the way this material can be used, both in folklore research and as a text corpus for language research.

Panel Arch05
Documenting and living uncertainty in tradition archives today and in the future [Working Group on Archives]
  Session 2 Saturday 10 June, 2023