- Convenor: Dieuwertje Luitse (University of Amsterdam)
- Chair: Anna Schjøtt Hansen (University of Amsterdam)
- Discussant: Tobias Blanke (King's College London)
- Format: Roundtable
Short Abstract
Why does the evaluation of ML systems remain both crucial and exceedingly problematic in ML development? This roundtable invites contributors from the Topical Collection on ‘The Politics of Machine Learning Evaluation’ in Digital Society to engage in a cross-cutting discussion on this urgent issue.
Description
Is the data good enough for training? Does the model perform accurately enough? Is the error rate low enough? Such questions of ‘good enough’ are at the very core of the evaluation of Machine Learning (ML) and can themselves be considered highly political processes in the development of ML systems. There is already growing interest in the political implications of ML, including, for example, dataset construction and the political capacities of specific ML models or foundational algorithmic techniques. However, there has been less focus on the politics of evaluation practices and techniques in ML. In this roundtable, we invite five contributors from the Topical Collection on ‘The Politics of Machine Learning Evaluation’ in Digital Society (Luitse et al., 2024) to engage in a cross-cutting discussion on why the evaluation of ML systems remains both crucial and exceedingly problematic in ML development.

The contributors will address these questions from different angles. Benedetta Catanzariti (University of Edinburgh) will discuss the politics of dataset construction and how ‘affect’ is tamed within these processes; Nanna Bonde Thylstrup (University of Copenhagen) will look towards the public sector and how different ecologies of evaluation shape the implementation of ML systems; Théophile Lenoir (Médialab, Sciences Po) brings attention to the growing range of evaluation practices that try to capture the environmental footprint of AI systems; Claudia Aradau (King's College London) addresses the difficulties of contesting ML systems by pointing out their errors or biases, as evaluations of accuracy and error in ML are argued to produce ‘enriched things’; finally, Louis Ravn (University of Amsterdam) critically addresses the risks and opportunities that synthetic data pose for evaluation.

The roundtable will take the form of an open panel discussion: each contributor will prepare a short opening statement, followed by a Q&A led first by the moderator and then opened to the audience.