
Accepted Contribution

Reassembling the Local in Synthetic Data Infrastructures  
Svea Kiesewetter (Södertörn University), Francis Lee (Södertörn University), Ericka Johnson (Linköping University)

Short abstract

If all data are local achievements, how should we understand locality when data are generated artificially? This paper revisits locality through the lens of synthetic data, examining how representation, reference, and reality are (re)configured under generative, probabilistic, and modelled conditions.

Long abstract

Synthetic data are frequently framed as privacy-preserving alternatives to empirical data, positioned as solutions to technical, ethical, and political constraints (Nicolenko, 2021; Steinhoff, 2022). By contrast, STS scholarship has long emphasized data as practical achievements, or sublata (Latour, 1999). However, synthetic data seem to confound much analysis, even in STS, reintroducing dichotomous accounts of the boundary between the “real” and the “synthetic”. Approaching synthetic data as achievements invites us to think about their situated and local character (Loukissas, 2019), redirecting our analytic attention from data as a given to synthetic data as achievements in particular practices, settings, and infrastructures. When “ground truth” is constituted through iterative stabilization internal to generative modelling processes, and resemblance becomes probabilistic and architecturally pre-structured, where is locality enacted, and how can we look at the data setting instead of the data set? Engaging with recent work on fabrication, simulation, and synthetic data infrastructures (e.g. Suchman, 2023; van Voorst & Ahlin, 2022; Seta et al., 2024), we argue that synthetic data do not negate locality but displace the sites at which it becomes analytically visible. Although synthetic datasets are framed as context-free, their capacity to function as evidence rests on modelling assumptions, validation practices, and institutional thresholds. Attending to these conditions foregrounds the sociomaterial practices through which probabilistic outputs become credible, comparable, and actionable. In doing so, the paper examines how representation, reference, and reality are stabilized under generative and synthetic conditions and considers how synthetic data shape the ongoing politics of locality in knowledge production.

Combined Format Open Panel CB027
Synthetic data and representation: The politics of AI generated computational practices
Session 2