- Convenors:
- Jamie Wong (Harvard University)
- Wanheng Hu (Cornell University)
- Discussant:
- Danah Boyd (Microsoft Research)
- Format:
- Traditional Open Panel
- Location:
- HG-02A36
- Sessions:
- Thursday 18 July, -, -
Time zone: Europe/Amsterdam
Short Abstract:
This panel investigates “data supply chains.” We invite contributions that help clarify the practices, techniques, and assemblages of networks and channels – formal and informal, legal and illegal, regional and global – that enable the commodification and economization of digital data.
Long Abstract:
Data, like any commodity, do not come already commodified. While "raw data" may be an "oxymoron" (Gitelman 2013), the "rawness" of data is, nonetheless, relative to those who deal with digital data and the specific contexts where data are produced, traded, or consumed. To manufacture data as "commodity" and as "product" involves many stages, transformations, and negotiations, requiring skillful sourcing and combination of materials, quality control, and marketing. Important recent research has begun to unveil the hidden labors behind data-driven technologies and businesses (e.g. Gray and Suri 2019). Yet, little is understood about the configuration of networks and channels – formal and informal, legal and illegal – that enable the commodification and economization (Çalışkan and Callon 2009) of data. This panel seeks to further clarify the regional and global operations of “data supply chains” (Spanaki et al. 2018).
We invite papers that use real-world examples to ground speculative rhetorics and debates – especially those pertaining to the AI industry – about data and their value, economic or otherwise. Possible perspectives include but are not limited to: What actors, practices, techniques, and technologies comprise the infrastructure necessary for data brokerage? How do data brokers make digital information fungible so that it can be sold at set prices? What are the conventions of pricing data at different levels of “rawness,” and how do these vary across domains and industries? What are the marketing practices and rhetorics around “valuable” data? How is data’s value construed differently along different stages of the “data supply chain”? What do real-world cases tell us about the “interoperability” of data (Ribes 2017) across different models and domains and its impact on the data market?
Accepted papers:
Session 1 Thursday 18 July, 2024, -

Paper short abstract:
This study draws on three years of ethnographic work and 154 interviews to explore data annotation reintermediation in China, highlighting mediator organizations’ role in re-embedding data annotation in local societies and shaping human-algorithm complementarities in the data annotation process.
Paper long abstract:
Data annotation, often unseen yet vital for data-centric technologies, is the focus of this investigation (Gray and Suri, 2019). Drawing on three years of ethnographic research and 154 interviews within China's annotation industry, this study adds to the debate on data work reintermediation. It responds to the emerging trend of reintermediation identified by Graham and colleagues, which emphasizes the role of intermediary institutions in balancing algorithmic control and labor, challenging earlier literature that favored disintermediation for its technological advantages (Graham & Lehdonvirta, 2017).
The concept of Complementary Organizations to Algorithms (COTAs) is introduced, drawing on human-computer interaction research. Supported by economists and sociologists such as Autor (2015) and Shestakofsky (2017), it highlights a move towards complementarity rather than substitution between humans and computers. The study examines how COTAs mitigate the shortcomings of computerization and algorithms by providing organizational resources for China's data supply chain and data annotation industry. It also illustrates that local governments, NGOs, and vocational training institutions can act as COTAs. For instance, the Guizhou local government functioned as a COTA, creating mechanisms to stabilize fluctuations in demand for annotations from tech hubs and adopting a "people-optimization" approach to aid "product-optimization" in the data annotation ecosystem.
Data collection involved three years of thorough fieldwork across seven data annotation centers in China and semi-structured interviews with stakeholders from major tech firms such as Alibaba, TikTok, Tencent, and Baidu, as well as data bureaus and exchange centers. This comprehensive method uncovers the complex interplay between human labor, technology, and the evolving data annotation landscape, highlighting COTAs' role in connecting these areas.
Paper short abstract:
When working with data science and AI, data negotiation is crucial. Professionals must build trust, consider concerns, and navigate regulations when transferring and working with data. Understanding these principles is vital for data workers across various fields.
Paper long abstract:
In data science and AI, data is a crucial resource: it is necessary for demonstrating robust results, and there is competition in both academia and industry to obtain the most and best data as quickly as possible. Negotiating the supply of data is therefore critical to the practice of data science.
This study investigates data negotiation through the lens of 'data diplomacy'. To explore this concept, we conducted interviews with professionals from various organizations who work with data. Here, diplomacy refers to the practice of building relationships between organizations, or within an organization, with data as the central resource.
Initial findings indicate that certain data negotiation skills share similarities with traditional diplomatic virtues. It is important to establish trust and consider the concerns and feelings of other parties when transferring data. Additionally, those who possess data express concerns about potential misuse and misinterpretation. It is worth noting that strict regulations within an organization can lead to tension and improper data handling.
The ability to understand and apply these principles is a crucial skill for data workers, as data plays an increasingly central role in a wide range of fields, from industry to government.
Paper short abstract:
This paper examines both the power and future-making capabilities of historical web data by turning its attention to the role that large-scale open access web archives (like the Internet Archive, CommonCrawl and others) are playing in the circulation and commodification of web data.
Paper long abstract:
The proliferation and commodification of new and emergent forms of data has been a key area of interest within the digital social sciences. Previous debates have focused on the ways that online platforms and technologies are implicated in the datafication of everyday life, as well as social science claims to expertise in the realm of so-called ‘big data’. Whereas studies of datafication have heavily focused on corporate-owned social media and communication platforms, this paper turns its attention to the role that large-scale open access web archives are playing in the circulation and commodification of web data. The paper conceptualises the sociotechnical significance of web archives through the lens of Thrift’s (2005) concept of ‘knowing capitalism’. The paper explores how web archives are fundamentally premised on the mass accumulation of web content over time, and outlines the ‘value chains’ that organisations (such as the Internet Archive, CommonCrawl and others) enact through the collection, maintenance and transformation of the Web into stable data archives. Example use-cases demonstrate how these archives embody and generate diverse forms of (social, cultural, economic and political) value when deployed online. This analysis enables a broader interrogation of web archives beyond repositories for web-based research data (as they are frequently framed), towards critical sites for examining both the power and future-making capabilities of historical web data. The paper concludes by mapping a research agenda for the study of web archival use to further understand these data infrastructures and their place in the digital economy.
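To give a concrete sense of how such archives are accessed in practice, the sketch below queries CommonCrawl's public CDX index for archived captures of a URL. This is an illustrative example of the kind of programmatic access the paper discusses, not code from the paper; the crawl label and URL pattern are assumptions (current crawls are listed at https://index.commoncrawl.org/).

```python
# Illustrative sketch (not from the paper): query CommonCrawl's public CDX
# index for archived captures of a domain. The crawl label is an example.
import json
import urllib.parse
import urllib.request

CRAWL = "CC-MAIN-2024-10"  # hypothetical crawl snapshot
pattern = "example.com/*"  # hypothetical URL pattern to look up

query = urllib.parse.urlencode({"url": pattern, "output": "json"})
endpoint = f"https://index.commoncrawl.org/{CRAWL}-index?{query}"

with urllib.request.urlopen(endpoint) as resp:
    for line in resp:
        record = json.loads(line)
        # Each index record points into the raw WARC files that hold the
        # archived page content (timestamp, url, filename, offset, ...).
        print(record["timestamp"], record["url"], record["filename"])
```

Each returned record locates a capture inside the archive's WARC storage, which is one concrete way the "stable data archives" described above become tradable, reusable data.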
Paper short abstract:
This empirical study traces a UN data supply chain in Jordan used to inform a cross-cutting set of humanitarian priorities, from operational needs to global comparability. Provisional findings point to how 'uncertainty' can be leveraged to align diverse stakeholders in humanitarian response.
Paper long abstract:
Since 2014, the UN humanitarian response in Jordan has used the Vulnerability Assistance Framework (VAF) initiative to produce data monitoring the situation in Jordan, enable comparisons with other humanitarian emergencies globally, and determine the eligibility and prioritization of refugees for assistance. Making this data, and making it interoperable across these different domains, involves an overlapping set of multinational, governmental, non-profit, and private-sector actors. Through a material-semiotic methodological approach inspired by Latour's Actor-Network Theory, I interview relevant staff and review the VAF's bureaucratic literature in order to trace this data production pipeline and how these competing priorities are navigated. My preliminary findings point to a flexible operationalization of 'uncertainty' which enables consensus-building across various technical and political domains. This empirical work has the potential to contribute to the academic literature as a case study on data production and algorithmic governance.
Paper short abstract:
This presentation applies code studies to environmental communication and explores the challenges of environmental NGOs acting as environmental data brokers through the application of web scraping techniques, tracing the socio-techno-natural-political networks conditioning their activity.
Paper long abstract:
In an era marked by information superabundance and fragmentation, environmental communication faces both unprecedented challenges and transformative opportunities. This presentation explores how different social actors, particularly non-governmental organizations (NGOs), act as data brokers by harnessing web scraping techniques to aggregate dispersed data and pursue sustainability agendas. From the perspective of software studies, the paper investigates two cases: one regarding air quality data and the other regarding water quality data. The methodology draws on code studies, ethnography, and interviews. The first case study concerns an Italian citizen association scraping dispersed air quality data from various online sources to supplement its own DIY monitoring network. The second case study focuses on the failure of another Italian citizen initiative to aggregate water quality data through web scraping. Both cases showcase the techno-social webs that propel and limit such approaches in fragmented information landscapes. We highlight how web scraping proves instrumental – often the only available solution – in aggregating dispersed environmental data, enabling NGOs to craft targeted and impactful communication strategies that foster public awareness and engagement. At the same time, we identify material and immaterial costs and challenges associated with web scraping, including ethical considerations, data accuracy issues, and legal implications. The new role of NGOs as environmental information brokers therefore emerges as risky, expensive, and marked by significant tradeoffs.
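For illustration, the following is a minimal sketch of the kind of scraping-and-aggregation workflow the paper describes, not the association's actual code; the source URL and table layout are hypothetical assumptions.

```python
# Minimal illustrative sketch (hypothetical URL and table layout): scrape a
# published HTML table of air-quality readings and append the rows to a
# locally aggregated CSV dataset.
import csv

import requests
from bs4 import BeautifulSoup

SOURCE_URL = "https://example.org/air-quality/daily"  # hypothetical source

resp = requests.get(SOURCE_URL, timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

rows = []
for tr in soup.select("table tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) >= 3:  # assumed columns: station, date, PM2.5 reading
        station, date, pm25 = cells[:3]
        rows.append({"station": station, "date": date, "pm2_5": pm25})

# Append new readings to the aggregated dataset, writing the header only
# when the file is empty.
with open("aggregated_air_quality.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["station", "date", "pm2_5"])
    if f.tell() == 0:
        writer.writeheader()
    writer.writerows(rows)
```

In practice each source typically needs its own parser, and parsers break whenever a source redesigns its pages, which is part of the maintenance cost the paper identifies.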
Paper short abstract:
Tracing data journeys in the context of traffic transformation in Frankfurt, we explore the intricate data supply chains involving the municipality, enterprises, and citizens. We examine the multifaceted roles these actors take on in data trading, shedding light on the forms of data valorization at play.
Paper long abstract:
Urban traffic infrastructure matters to a variety of actors: the municipal administration, private enterprises, and, most numerously, citizens. The traffic data these actors generate – for different purposes and at different locations – are as manifold as the relations between them, as are the evolving supply chains.
This presentation is based on an ongoing research project on data politics in the proclaimed “sustainable traffic transformation” (Verkehrswende) of Frankfurt, Germany. We use the concept of data journeys (Bates et al. 2016) to trace how data is produced, processed, and shared within the complex networks at play.
Many actors take on ambivalent roles in this entanglement: citizens are simultaneously customers, objects of datafication, and data consumers, and sometimes even data manufacturers, while the municipality acts as both a key data provider and a customer. In addition, relevant legal regulations and political interests contribute to a mixture of commodification logics and practices. As a public actor within private markets, the municipality finds itself in various double binds.
We will illustrate the diversity of data trading practices with two examples. While the data purchases involved in the development of a new municipal traffic model follow traditional logics of marketization, other constellations elude those logics. In the case of pedestrian counters installed in Frankfurt, the agreements are based on an exchange of various services and mutual data provision rather than monetary payments. Attending to these diverse forms of bargaining allows us to investigate the multifaceted practices of data valorization along different stages of the supply chains.