AI in Peer Review: A Large-Scale Multidisciplinary Replication Study and the Impact of AI on Human Reviewer Performance
Ashia Livaudais
(SymbyAI)
Short abstract
This large-scale replication study investigates the use of AI in scientific evaluation, comparing AI and human reviewer outputs while assessing AI's impact on reviewer performance. We provide insights into reproducibility trends, AI-driven evaluation systems, and the future of on-demand replication.
Long abstract
As the quality of AI systems continues to improve and the costs of developing and deploying these models plummet, there has been a growing discourse about the role of AI in scientific discovery. This work investigates the use of AI in multidisciplinary scientific evaluation, with a particular focus on peer review and experimental replication. We conducted a large-scale replication study across various scientific disciplines, examining reproducibility trends and differences across domains. Alongside this, we implemented a large-scale AI-assisted peer review process, comparing the outputs of AI systems to those of human reviewers. This comparison not only highlights the strengths and limitations of AI in evaluating scientific work but also provides, to our knowledge, the first empirical evaluation of AI's impact on human reviewer performance. How does the presence of AI influence human judgment, confidence, and decision-making in the peer review process? Our findings offer novel insights into these questions, shedding light on the evolving dynamics between humans and machines in scientific evaluation.
Furthermore, this study explores the potential for AI to serve as the foundation for a scientific evaluation "operating system", capable of streamlining and enhancing the peer review process. We discuss the implications of our results for the future of peer review, including the possibility of increased automation and the ethical considerations that accompany it. By bridging the gap between AI and human expertise, this work contributes to the broader conversation about the role of technology in shaping an optimistic future for scientific discovery and evaluation.
Peer review: pressures and possibilities
Session 1, Tuesday 1 July 2025