Accepted Paper:

“Ducks absorb heavy metals” and other quackery: lessons from a live ‘ethical hacking’ experiment on AI-generated scientific disinformation  
June Brawner (The Royal Society), Areeq Chowdhury (The Royal Society), Denisse Albornoz, Rumman Chowdhury, Jutta Williams

Long abstract:

The impact of AI on the generation and spread of misinformation is a unique challenge for democracies, particularly in times of conflict or crisis. Detecting scientific misinformation is especially difficult: scientific topics are complex, consensus evolves, and trusted voices are required to translate findings and promote understanding. Removing scientific disinformation from online platforms is not a solution, because doing so limits scientific literacy and debate. With accessible AI tools, the volume and quality of scientific misinformation could increase, suggesting that AI experiments must include expertise beyond ‘AI practitioners’.

This paper explores red-teaming, or ‘ethical hacking’, as an experimental method to build AI assurance capabilities among scientists and the wider public. In red-teaming, participants interact with an AI system to intentionally produce undesired outcomes and identify vulnerabilities. A red-teaming event organised by the authors before the 2023 Global AI Safety Summit convened health and climate scientists to test the guardrails of Meta’s Llama 2 model by eliciting scientific misinformation. Through event ethnography and prompt analysis, we explore the utility and limitations of red-teaming for AI safety. We find red-teaming useful for increasing understanding of LLM vulnerabilities in civil society, allowing a range of experts to contribute to societal challenges (e.g. scientific disinformation) and to participate in the ‘making’ of AI for social good.

Traditional Open Panel P231
STS, AI Experiments, and the social good
  Session 2