Accepted Contribution

Fabricated Fieldwork: Synthetic data as qualitative research's hidden companion  
Mark Friis Hau (Roskilde University)

Short abstract

Synthetic data is widely seen as threatening qualitative research. Drawing on anthropology's history of composites, pseudonyms, and constructed temporalities, this paper reframes LLM-generated data as continuing the fabrication practices through which qualitative knowledge was always produced.

Long abstract

Debates about synthetic data typically center on quantitative concerns: biased training sets, WEIRD behavioral models, the impossibility of demographic diversity. This paper shifts the terrain to qualitative knowledge production, where synthetic data poses a more fundamental epistemological challenge but where, paradoxically, synthetic fabrication has always been a constitutive method.

Drawing on anthropology's history of methodological fabrication, I demonstrate that qualitative research has long operated through structurally synthetic practices: pseudonymization that transforms observation into literary construction, composite characters, reconstructed dialogue, compressed temporalities, and the fieldnote itself as the first site where lived complexity becomes selective inscription. From Kroeber's "generalized dummies" of the 1920s through the Writing Culture debates to contemporary speculative ethnographies, the discipline has continuously trafficked in synthetic data while maintaining the fiction that its authority rests on unmediated empirical encounter.

This genealogy reframes LLM-generated qualitative data. Rather than asking whether synthetic interlocutors can substitute for "real" human responses, a framing that assumes we know what "real" means, I argue they should be evaluated by the thinking they enable. Following Trouillot's distinction between productive fiction and deceptive fake, the relevant criterion is transparency about constructedness, not fidelity to an empirical referent. Synthetic encounters function as methodological mirrors, externalizing the interpretive labor that qualitative researchers have always performed internally.

This argument carries broader implications for the ontological politics of synthetic data. The anxiety it provokes reveals less about machines than about disciplines' unexamined boundary work: the selective policing of which synthetic products count as empirical knowledge.

Combined Format Open Panel CB027
Synthetic data and representation: The politics of AI generated computational practices
  Session 3