
Accepted Contribution

No one’s face. A critical examination of the use of synthetic (face) data in algorithmic facial processing (AFP) development.   
Assia Wirth

Short abstract

Synthetic data is increasingly used in AFP development as a response to privacy concerns. However, its use appears to obfuscate rather than solve the privacy issues related to AFP development, which only reinforces the racialized and gendered systems of control/domination enabled by AFP.

Long abstract

Synthetic data is increasingly used in AFP development as a response to privacy concerns. AFP is largely based on machine learning techniques that require, among other things, very large datasets to train their algorithmic components. However, assembling face training data can be taxing, both because data protection law regulates the use of face images and because collecting and annotating the required training data is costly. The latter means that the most vulnerable people are responsible for annotating photographic images of people, often extracted from the internet or video surveillance recordings. As a result, synthetic data has increasingly been seen as an efficient option to generate face training data cheaply and with greater care for privacy.

This paper draws on a qualitative study comprising 36 semi-structured interviews with workers operating on freelance platforms from 11 different countries of the majority world and 11 interviews with AFP researchers based mostly in the global north. It makes sense of the production and use of synthetic data as a response to the legal and economic costs of AFP training data production. It offers an empirical analysis of AFP supply chains and argues that synthetic data use only makes these supply chains more opaque, since generating synthetic data itself requires original training data. The use of synthetic face data thus appears to obfuscate rather than solve the privacy issues related to AFP development, which only reinforces the racialized and gendered systems of control/domination enabled by AFP.

Combined Format Open Panel CB027
Synthetic data and representation: The politics of AI generated computational practices
  Session 2