Accepted Paper:

Scientific Open Data: Questions of Labor and Public Benefit  

Authors:

Irene Pasquetto (Harvard)
Ashley E. Sands (UCLA)

Paper short abstract:

While “open data” has become a norm in science and public policy, the costs and benefits of making data open are rarely made explicit. The social costs translate into changes in workforce dynamics and vary by domain.

Paper long abstract:

Openness of publicly funded scientific data is enforced by policy, and its benefits are normally taken for granted: increasing scientific trustworthiness, enabling replication and reproducibility, and preventing duplication of effort.

However, when public data are made open, a series of social costs arise. In some fields, such as biomedicine, scientific data have great economic value, and new business models based on the reuse of public data are emerging. In this session we critically analyze the relationship between the potential benefits and social costs of opening scientific data, which translate into changes in the workforce and challenges for current science funding models. We conducted two case studies: one medium-scale collaboration in biomedicine (the FaceBase II Consortium) and one large-scale collaboration in astronomy (the Sloan Digital Sky Survey, SDSS). We have conducted ethnographic participant observation and semi-structured interviews at SDSS since 2010 and at FaceBase since 2015. Analyzing two domains sharpened our focus on each by enabling comparisons and contrasts. The discussion also draws on extensive document analysis.

Our goal is to unpack open data rhetoric by highlighting its relation to the emergence of new mixed private and public funding models for science and to changes in workforce dynamics. We show (1) how open data are made open "in practice" and by whom; (2) how public data are reused in private industry; and (3) who benefits from their reuse and how. This paper contributes to the field of Critical Data Studies by analyzing the connections between big data approaches to science, social power structures, and the policy rhetoric of open data.

Panel T113
Critical data studies