Accepted Contribution
Short abstract
Through interviews with expert data labelers and a socio-technical analysis of RLHF, this paper shows how expertise stabilizes uncertainty and legitimizes ground truths in non-convergent domains like the humanities. It also reveals the opacity of labor platforms when assigning expert credentials.
Long abstract
This paper examines the recent practice of hiring “experts” – defined by model developers and labor platforms as holders of Master’s and PhD degrees in relevant domains – to enhance model performance in the Reinforcement Learning from Human Feedback (RLHF) phase of LLM development. It combines a socio-technical analysis of the RLHF process with in-depth interviews with “expert” workers across different fields on platforms like SurgeAI and Outlier to understand how model developers conceive of expertise, and the assumptions underlying the technical infrastructure that attempts to “encode” it.
Performance improvements in RLHF rely on human reviewers converging on one solution to a problem. While this works in STEM domains, where there is usually one correct answer, in fields like the social sciences and humanities that thrive on debate, expertise is narrowed to mean fact retention rather than nuanced engagement. This epistemic weakness reaches its limits when “expert” logic is applied to creative fields, reducing creativity to credentials and foreclosing the potential for radical uncertainty.
Despite these limitations, the insistence on using “experts” across fields shows that expertise acts as a brittle legitimizing category for ground truths rather than a step in model improvement. The contingent foundations of this ground truth are compounded by the erratic behavior of labor platforms, where the title of “expert” is granted opaquely and revoked arbitrarily, further undermining the epistemic foundation of the process. This paper highlights the limits of expertise in RLHF and raises questions about how best to encode knowledge in domains defined by uncertainty.
Ground truths and the epistemology of AI
Session 2