"An afternoon hack": Enabling data-driven scientific computing in the open

Accepted Paper

Charlotte Mazel-Cabasse (University of California, Berkeley)

Paper short abstract

Scientific computing, or e-science, has enabled the development of large data-driven scientific initiatives. This research focuses on the socio-technical conditions under which free and reproducible computational scientific tools are developed, and on the system of values that supports them.

Paper long abstract

Scientific computing, or e-science, has enabled the development of large data-driven scientific initiatives. A significant part of these projects relies on the software infrastructures and tool stacks that make it possible to collect, clean and compute very large data sets.

Based on anthropological research among a community of open developers and/or scientists contributing to SciPy, the open source Python library used by scientists to enable the development of technologies for big data, this research focuses on the socio-technical conditions under which free and reproducible computational scientific tools are developed, and on the system of values that supports them.

Entering the SciPy community for the first time is entering a community of learners. Its members are convinced that for each problem there is a function (and if there is not, one should actually create one). They think that everybody can (and probably should) code. They have long been living between at least two worlds (sometimes more): academia and the open software community, and, for some, different versions of the corporate world.

Looking at the personal trajectories of these scientists turned open software developers, this paper investigates how a relatively small group of dedicated people has been advancing a new agenda for science, defined as open and reproducible, through carefully designed data infrastructures, workflows and pipelines.

Panel T113
Critical data studies
Session 1 Saturday 3 September, 2016, 9:00-10:45
