Accepted papers
Session 1
Paper short abstract
The K+ programme is King’s College London’s flagship outreach activity to support progression into higher education. We demonstrate how evaluation activities, ranging from pre-post surveys to multiple RCTs, have shaped and influenced the programme, and how evaluation is embedded within it.
Paper long abstract
The K+ programme is King’s College London’s flagship outreach activity to support progression into higher education (HE) for sixth-form students from under-represented backgrounds (e.g. non-selective state schools or first in family to progress to HE). The two-year programme consists of events and activities to equip students with the confidence, knowledge and skills to succeed at university. Approximately 600 students join each year, so around 1,200 students are enrolled in the programme at any one time, organised into nine pathways based on the subjects they wish to study.
Students begin their first year with a welcome induction event, also attended by parents and carers, to launch the programme, followed by a non-residential spring or summer school offering academic lectures, careers experiences and bespoke skills-based workshops. Their second year focuses on the skills needed to succeed in their A-levels and on their transition to university, including support with UCAS applications. Alongside the core K+ programme, additional intervention components are delivered to support students of Black and mixed heritage and LGBTQ+ students, and to raise attainment.
The K+ programme has undergone several rounds of revision throughout its lifecycle and has been subject to a range of evaluation activity, from pre-post surveys, to examinations of individual components, to two RCTs currently underway (one in collaboration with TASO) aiming to demonstrate the causal impact of the programme on student progression to HE.
We demonstrate how the K+ programme has been shaped and informed by evaluation throughout its lifecycle and how this body of evaluative work has been built upon, including the emergence of longer-term findings. This covers multiple evaluation designs, from pre-post surveys, process evaluations and participant focus groups to causal designs including two RCTs. It also includes the use of validated measures such as the Access and Success Questionnaire (ASQ) and the Academic Behavioural Confidence – Revised (ABC-R) scales; the ABC-R is a recently validated measure developed jointly by the Social Mobility and Widening Participation team and the Institute of Psychiatry, Psychology & Neuroscience at King’s.
We will showcase how evaluation activities are embedded throughout regular delivery, including the use of pre-post surveys within programme activities to permit consistent monitoring of impact alongside open-ended student feedback. In addition to showcasing evaluation activities, we will bring together perspectives from those delivering the programme on the practicalities of embedding evaluation and of engaging with evaluation to effect change.
Paper short abstract
How do you establish a new impact system and build an evaluation culture across a global non-profit with a lean team? This is a case study on co-creating an M&E framework at a global food awareness charity.
Paper long abstract
A robust evaluation culture is a goal for many charities, but the path to achieving it can be unclear. This is particularly true for small teams in large, international organisations. This presentation details a practical case study of a newly appointed Head of Impact tasked with building an impact system from scratch at ProVeg International, a food awareness organisation operating in 15 countries with ~250 staff.
With a lean team of 2.6 FTE, the core challenge was not just to design a technical framework but to embed an evaluative culture that empowers programme teams to value, use, and generate evidence. This case study addresses the administrative burden and fragmentation of a legacy system based on inconsistent Google Sheets. The session will detail the co-creation process, which began with 31 stakeholder interviews, and outlines a new system using technology (Google Forms, Zapier, and Google BigQuery) to establish a "single source of truth." We leveraged automation to free up time for busy project managers, providing a streamlined system that replaces manual data collection wherever possible.
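As an illustration of the "single source of truth" pattern, the sketch below shows the kind of ingestion step such a pipeline automates: batches of form responses appended to one central BigQuery table. This is a minimal sketch only; the system described above wires this up with Zapier rather than custom code, and the project, table and field names here are entirely hypothetical.

```python
from google.cloud import bigquery

# Hypothetical destination table; the live pipeline described in the abstract
# uses Zapier for this wiring rather than custom code.
TABLE_ID = "proveg-impact.monitoring.programme_outputs"

client = bigquery.Client()  # authenticates via application-default credentials

def ingest_form_responses(responses: list[dict]) -> None:
    """Append a batch of form responses to the central impact table."""
    rows = [
        {
            "submitted_at": r["timestamp"],   # hypothetical response fields
            "country_team": r["country"],
            "metric": r["metric"],
            "value": r["value"],
        }
        for r in responses
    ]
    errors = client.insert_rows_json(TABLE_ID, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```

Centralising every programme team's data in one queryable table, rather than scattered spreadsheets, is what makes the cross-country comparison of unified variables described below feasible.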
This approach establishes best practices for measuring impact in 'hard to measure' areas like corporate engagement, policy advocacy, and public campaigns. We have also moved beyond simple output metrics to establish unified variables that allow us to compare impact across our diverse country teams. The system includes an option for "deep dive" case studies that use methods like process tracing to uncover why projects succeed. In parallel, we developed "scenario models" to estimate impact against core metrics like CO2 emissions averted and animal consumption avoided. This approach provides a model for how other organisations with limited resources can create a unified system that serves multiple purposes—from strategic planning and fundraising to communications and partnerships—shifting from top-down reporting to a bottom-up culture of reflection and learning.
Paper short abstract
This study critically evaluates Botswana’s Scholarship Programme through a Tswanacentric Capabilities lens, exploring whether recipients truly flourish and attain Seriti. It blends critical realist and human development approaches to reframe education policy evaluation around indigenous values.
Paper long abstract
This study emerges from a deeply personal and critical reflection on my own journey as a recipient of Botswana’s Top Achievers Scholarship Programme (TASP). Despite graduating with a master’s degree from a globally ranked institution under the full sponsorship of the Government of Botswana, I found myself grappling with the sense that I was not ‘flourishing’. While the lack of meaningful employment was certainly a significant factor, I sensed that my lack of flourishing was shaped by deeper, less tangible issues. I reimagine flourishing not merely as an economic or professional outcome, but as a condition for possessing Seriti, the moral and spiritual weight that defines one’s dignity, purpose, and standing in the community. Without flourishing, one’s Seriti is diminished. Conversations with fellow TASP recipients revealed similar experiences—a disconnect not just between our educational investments and the opportunities to apply our skills, but between the paths we chose and the sense of purpose we hoped to find.
We were selected in our late teens and handed a rare and extraordinary gift: the freedom to study anything, anywhere in the world, fully funded by our government. While the openness of opportunity offered by TASP can be empowering, it also introduces a paradox of choice that may complicate graduates’ ability to navigate toward a life of meaning. When post-graduation outcomes fall short of expectations, heightened reflexivity may lead some to perceive themselves as lacking Seriti.
This study investigates whether Botswana’s TASP has enabled its recipients to truly flourish—and, in doing so, attain Seriti. To understand this, we engage in a critical realist evaluation of the programme, incorporating a tracer study of scholarship recipients, and then extend the analysis to a broader human development evaluation.
Crucially, this study builds on Nussbaum’s emphasis on protecting those capabilities whose absence would render a life not worthy of human dignity (Nussbaum, 2011, p. 15). Seriti is more than a linguistic equivalent of dignity; it reflects a deeply rooted cosmological understanding of what it means for individuals to live a life they value in Tswana culture.
Situated within a Critical Realist ontology, this study rejects reductionist explanations and seeks to uncover the real but often unobservable mechanisms within higher education finance that shape graduates’ capacity to flourish—understood here as the attainment of Seriti. Methodologically, it employs a tracer study design, combining regression analysis with qualitative interviews to uncover causal effects and the deeper structural and cultural mechanisms that condition them.
My positionality as a TASP graduate now undertaking a PhD grounds this inquiry in both critical reflection and insider understanding. It enables me to engage empathetically and thoughtfully with other recipients. Ultimately, we seek to marry traditional education policy evaluation with a more expansive, human development–centered and culturally contextualised approach, recognising the importance of indigenous values and lived experiences in shaping meaningful educational outcomes.
Paper short abstract
Our participatory evaluation of Into the Light uses creative methods and systemic mapping to embed learning in cultural practice. This session explores how inclusive evaluation cultures can inform policy, challenge conventions, support regeneration, and inspire sector-wide change.
Paper long abstract
This presentation shares insights from the evaluation of Into the Light, an ambitious, place-based cultural programme designed to support cultural regeneration, talent development, and inclusive participation across County Durham over three years. Grounded in Socio-Cultural Historical Activity Theory (SCHAT), the ongoing evaluation adopts a participatory, arts-based approach to embed learning and reflection within everyday cultural practice.
Work Package 1 focused on co-developing an inclusive evaluation framework and toolkit, shaped by a review of UK and European place-based programmes. This framework supports capability-building among cultural practitioners through co-designed tools, inclusive principles, and creative methods that embed evaluation into everyday cultural practice.
Work Packages 2 and 3 apply creative methods, including Photovoice, LEGO® Serious Play, philosophical dialogues, and community storytelling, to surface lived experience and identify contradictions within the programme. These tensions are reframed as opportunities for transformation, supporting iterative learning and collaborative sensemaking.
A key strand of the evaluation is the development of a Community of Practice across County Durham, bringing together freelance creative practitioners, cultural organisations, educators, and community partners. This network is designed to support peer learning, reflective practice, and shared inquiry, enabling practitioners to engage with evaluation not as a separate activity but as an embedded part of cultural development. Through co-created resources, thematic workshops, and collaborative storytelling, the Community of Practice fosters a culture of openness, experimentation, and mutual support. It also provides a platform for surfacing diverse perspectives and amplifying voices that are often underrepresented in formal evaluation and policy processes.
The evaluation challenges conventional hierarchies by positioning participants as co-researchers and valuing narrative, visual, and performative data alongside traditional evidence. It also demonstrates how creative inquiry and arts-based methods can deepen understanding in complex systems, offering cultural organisations, governing bodies, regional and national policy actors, cultural funders, and strategic decision-makers new ways to surface insight, foster collaboration, and amplify community voice.
Embedded within Durham University’s Policy Hub, the evaluation contributes to cultural policy by generating inclusive, context-sensitive insights. It informs strategies for regeneration, workforce development, and place-shaping, demonstrating how embedded evaluation cultures can support both practice and policy. The approach offers a transferable model for other regions and sectors seeking to embed evaluation in complex, creative, and community-led contexts.
This session offers an original contribution to the field by showcasing how evaluation can be rigorous, relational, and responsive, supporting transformation across complex systems through creative and participatory approaches. The session will also reflect on how this approach can be adapted for other regions and sectors, contributing to wider learning in cultural evaluation, public policy, and community-led development. It invites dialogue on how creative methods can support inclusive decision-making and long-term cultural change.
Paper short abstract
Integrating evaluation into decision-making presents challenges. This presentation shares strategies for engaging stakeholders, building evaluation allies, and expanding into new areas. Gain practical insights to foster a culture where evidence is valued and used effectively.
Paper long abstract
Integrating evaluation into daily decision-making processes poses a significant challenge for large organisations, and National Highways is no exception. This presentation delves into the intricate task of establishing a clear vision for evaluation, particularly in the face of influential stakeholders who may question its relevance or resist its adoption. Drawing from recent experiences in developing National Highways’ evaluation strategy for their upcoming five-year funding period, we will explore how evaluation leaders can maintain resilience and adaptability while upholding professional integrity and fostering motivation and ambition within their teams.
The session will outline effective strategies for understanding and engaging with challenging stakeholders and making a compelling case for the importance of evaluation. We will discuss how to navigate internal resistance while addressing external demands for evaluation evidence, as well as identifying and managing the reputational risks associated with such pressures. Additionally, we will focus on identifying and nurturing evaluation allies, individuals who can help build momentum and legitimacy throughout the organisation.
Special emphasis will be placed on expanding evaluation into new and emerging areas where it is difficult to show evidence of its impact. Demonstrating value in these contexts requires strategic thinking, creativity, and persistence. The presentation will also address how to tackle resource challenges posed by stakeholders by enhancing evaluation efficiency and improving the usefulness and usability of evaluation evidence.
Paper long abstract
Our approach to evaluation culture is to get everyone involved. We will do the same in this session, engaging conference participants using tools that we have found effective in our own practice. We will discuss concrete approaches to embedding evaluative cultures, presenting our experiences and the challenges we have faced in instilling evaluation culture from within an organisation. We encourage others to share methods that have worked for them, as well as obstacles they have encountered.
Imperial is a world-leading university for science, technology, engineering, medicine and business (STEMB), with a wide portfolio of outreach and public engagement work. As in-house evaluators in this setting, we will share tools we have used to build trust, relevance, and agency into an evaluation culture that values and benefits from different perspectives.
We will cover a range of scenarios and starting points, from working with colleagues who are confident using surveys but don’t feel a sense of ownership of their evaluation, to others who aren’t sold on the value of evaluation at all. We will discuss how we have approached each of these as professional evaluators and the tools that have helped us build these cultures.
One of the key methods that has helped unify our evaluative approaches is the co-creation of shared outcomes. These outcomes provided essential buy-in from stakeholders and developed a shared language and sense of purpose, anchoring the benefits and need for evaluation beyond data collection. We will touch on how our shared outcomes helped us navigate organisational changes and set us up to protect programmes and communicate our shared purpose across an organisation of over 8,000 members of staff and 22,000 students.
With a show-don’t-tell approach, we will demonstrate some of the simple tools we have found effective for encouraging engagement, generating discussion around the challenges and opportunities of building evaluation culture. We will candidly share challenges we have faced with data availability, as well as with over-collection of data that is not used to its full potential.
Central to this session will be engaging conference participants in discussions of what has worked well (and not so well), using some of the tools we have implemented with colleagues at Imperial. We will create a safe space to grapple with challenges, discuss opportunities, and scaffold key take-aways for conference participants.
Paper short abstract
Involving young people in decision-making can benefit the community and improve local services. This case study assesses the influence of youth engagement on local government decision-making by examining local leaders’ perspectives on how it translates to actionable insights.
Paper long abstract
A recent review undertaken by the UK government identified that young people wanted to be involved in decision-making processes. Involving young people in decision-making can benefit the young people themselves and the wider community by shaping and improving local services. However, little is known about the policy impact of youth engagement. Understanding the mechanisms through which youth engagement translates to actionable insights for decision-makers could help local governments strengthen youth engagement to shape local policy decisions. This case study, involving document analysis together with interviews and a focus group with leaders in a local authority in England, describes how youth engagement can inform decision-making and what factors reinforce or weaken these processes. The study found that even where a range of youth engagement activities is supported, the absence of strategic corporate commitment can result in an approach that is fragmented, without adequate resource to ensure insights reach the relevant decision-making forums. Services and policies are more likely to change where the pathway from insights to service provision is short, for example in Children’s services, where service providers directly seek insights from the young people they support. However, outcomes were not routinely fed back to young people, and their input was not consistently acknowledged in relevant strategies; this may limit ongoing engagement. Creating a broader organisational culture that values youth engagement requires leaders willing to challenge the status quo and demand consideration of young people’s perspectives. This could involve adopting a systematic approach to embed youth engagement into key decision-making structures within the local authority.
Paper short abstract
All evaluations require good governance and adaptation, but these take on new meanings and importance in long-term evaluations of new interventions. In this session, the commissioner and evaluator reflect on how to build effective evaluation cultures in lengthy and novel evaluations.
Paper long abstract
Commissioning Better Outcomes (CBO) was funded by The National Lottery Community Fund. It operated from 2013 to 2024, with a mission to support the development of more social outcomes contracts in England. It made up to £40m available to pay for a proportion of outcomes payments for social outcomes contracts (SOCs, previously known as social impact bonds (SIBs)) commissioned locally (i.e. by local authorities, clinical commissioning groups, police and crime commissioners etc.; hereafter referred to as ‘commissioners’). Alongside the CBO programme, The National Lottery Community Fund commissioned Ecorys and ATQ Consultants to evaluate the CBO and to explore the ‘SOC Effect’. Running from 2013 to 2025, the evaluation aimed to explore the advantages and disadvantages of commissioning via a social outcomes contract; the challenges in developing social outcomes contracts and how they can be overcome; and the extent to which CBO met its aim of growing the market for social outcomes contracts.
At the time of commissioning the evaluation, SOCs were a very new mechanism, with limited examples of how they had been evaluated previously. Furthermore, the evaluation was over a very long timescale – 12 years. All evaluations require good governance, strong working relationships and adaptation, but these take on new meanings and importance in an evaluation of such novelty and duration. This session highlights the key learnings of how to develop an effective evaluation culture that stands the test of time, drawing on both the commissioner (The National Lottery Community Fund) and evaluator (Ecorys) perspectives. In particular, it encourages stakeholders to be cognisant of, and embrace, the Forming, Storming, Norming, Performing process that takes place in any new team.
Paper long abstract
Evaluation policies play a critical role in shaping evaluation practice and outcomes. However, their development and theoretical foundations have received limited scholarly attention. Such research is important as it reveals how earlier policies inform subsequent policymaking and how policy can serve as a bridge between theory and practice by embedding theoretical concepts into organizational requirements (Klein and Marmor, 2008; Christie and Lemire, 2019).
This study traces the evolution of federal evaluation policy in Canada from 1977 to 2016, analyzing six evaluation policies using Al Hudib and Cousins’ (2022) ten-component taxonomy. Findings reveal both continuity and incremental change in policy content, with certain expectations for evaluation practice persisting over time while others have shifted in scope, emphasis, and language. By linking these patterns to broader theoretical and historical influences, the study demonstrates how evaluation policy functions as an instrument that reflects, reinforces and institutionalizes prevailing evaluation theories. In particular, findings highlight how successive policy iterations embed conceptual and methodological assumptions that shape evaluative action and institutional norms, effectively bridging the gap between theory, policy, and practice. Understanding these dynamics provides insights for policymakers and evaluators seeking to design policies that better support effective, theory-informed evaluation practice and contribute to the ongoing strengthening of results-based governance.
Paper short abstract
Mercy Corps’ GIRL-H evaluation embedded learning within adolescent girl programming in six countries in East and West Africa. By applying participatory methods, a learning agenda and iterative reflection cycles, the evaluation enhanced adaptive learning, supported inclusion, and program development.
Paper long abstract
This paper presents how the Mercy Corps GIRL-H programme integrated learning and programme development through a deliberately cultivated evaluation culture. GIRL-H provides tailored interventions for adolescent girls and young women to gain skills and transition on pathways to formal education, economic opportunities, and civic engagement. GIRL-H has operated in Kenya, Nigeria, Tanzania, Uganda, South Sudan and Sudan since 2020.
The multi-country GIRL-H evaluation, published in 2024, informed programme adaptation. The evaluation employed participatory qualitative tools, notably the River of Life, enabling adolescents to narrate their own journeys using locally accessible materials. This co-creative method surfaced insights on programme relevance within peer networks and communities. The collected data were analysed using MAXQDA AI Assist to draw out trends and common patterns across the dataset, and the findings were used to reflect on contributions and constraints during participatory analysis. Informed by data from regular review meetings, monitoring visits and learning sessions, the programme made significant adaptations, such as in financial inclusion and social and behaviour change communication (SBCC). Crucially, learning sessions brought together mentors, enumerators, and programme participants to co-interpret results and guide course corrections in real time. The paper includes reflections on power dynamics, inclusion (e.g. whose voices were heard), and ethical tensions in conducting and using the evaluation.
Evaluation findings affirmed the importance of mental health and psychosocial support and SBCC to address harmful gender norms, while noting that more time and resources are required for meaningful norms change in communities. Based on participatory interpretation, the evaluation influenced mid-course adjustments and shaped partner decisions about scaling these components. This case contributes to evaluative practice by demonstrating how embedding routine evidence-based learning informed by programme monitoring and learning data into programme management processes and decision-making can shift an organisation toward being reflexively evaluative and how participatory methods enrich both uptake and ownership.
Paper short abstract
In 2019, UK lung cancer survival rates hadn't improved in 50 years. NHS England initiated a targeted screening programme for early detection. Ipsos and the Strategy Unit evaluated this, providing real-world delivery insights which helped inform the UK’s decision for national roll-out.
Paper long abstract
The UK has historically lagged behind comparable countries in cancer survival rates, emphasising the need for earlier diagnosis. In 2019, the NHS set a target to increase early-stage (1 and 2) cancer diagnoses from half to three-quarters by 2028. Lung cancer accounted for 21% of cancer deaths, making it the most common cause of cancer death in the UK, with late-stage diagnosis being a critical issue.
NHS England's Targeted Lung Health Check (TLHC) programme (2019-2024) was initiated to enable earlier lung cancer diagnosis in real-world settings, following positive results from several small-scale trials and pilots.
Ipsos, working with our data partners the NHS Strategy Unit, was commissioned to conduct a process, impact and economic evaluation of the programme. The main objective of the evaluation was to assess whether the encouraging results shown in earlier trials were replicated when the programme was delivered in real world NHS settings. The main outcome of interest was to assess whether there was a shift in cancer staging at diagnosis. We were also tasked with exploring how effectively the programme was delivered, what participants thought of the health checks, and to provide advice on how the programme should be rolled out in future.
During the lifetime of the programme, the UK National Screening Committee recommended that a national lung cancer screening programme should be initiated, and NHS England is now working on national roll-out.
The evaluation showed that 1.22 million invitations were sent, with an overall 44% uptake rate, leading to 324,000 Lung Health Checks and 163,000 CT scans. A total of 2,748 participants received a lung cancer diagnosis, representing a 1.7% conversion rate from initial CT scan. Approximately 75% of these cancers were diagnosed at stages 1 or 2, meeting key benchmarks for early detection. Furthermore, 2,056 other cancers were diagnosed and the programme identified incidental findings in three-quarters of CT scans.
The robust quantitative impact evaluation – which used a Propensity Score Matching and Difference-in-Differences methodology – estimated that an additional 781 lung cancers were diagnosed at stage 1 or 2 that would otherwise have been diagnosed at a later stage or not diagnosed at all. The programme also enabled the detection of an additional 341 lung cancers at stage 3 or 4.
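For readers unfamiliar with the approach, the sketch below shows the general shape of a Propensity Score Matching plus Difference-in-Differences analysis in Python. It is a minimal illustration under assumed data only, not the evaluation's actual model: the dataset, covariates and column names are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical panel: one row per area and period ("pre"/"post"), with a
# treated flag for programme areas and baseline covariates.
df = pd.read_csv("areas.csv")

# 1) Propensity scores: model the probability of being a programme area.
covs = ["deprivation_index", "smoking_rate", "age_65_plus_share"]
baseline = df[df["period"] == "pre"].copy()
ps = LogisticRegression(max_iter=1000).fit(baseline[covs], baseline["treated"])
baseline["pscore"] = ps.predict_proba(baseline[covs])[:, 1]

# 2) Match each treated area to its nearest comparison area on the score.
treated = baseline[baseline["treated"] == 1]
control = baseline[baseline["treated"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched = pd.concat([treated["area_id"], control.iloc[idx.ravel()]["area_id"]])

# 3) Difference-in-differences on the matched panel: the treated:post
#    interaction is the estimated effect on early-stage diagnoses.
panel = df[df["area_id"].isin(matched)].copy()
panel["post"] = (panel["period"] == "post").astype(int)
did = smf.ols("early_stage_diagnoses ~ treated * post", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["area_id"]}
)
print(did.params["treated:post"])  # the DiD estimate
```

The matching step constructs a comparison group that resembles programme areas on observed characteristics, so the subsequent DiD interaction isolates the change attributable to the programme rather than to pre-existing differences.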
While no immediate impacts were seen on lung cancer mortality rates, this aligns with clinical expectations within the timeframes of the evaluation. Programme challenges included lower participation rates in deprived areas and among ethnic minorities, despite projects reporting that they delivered engagement strategies to address this. However, the programme as a whole was delivered in some of the most deprived areas in England, and these areas have therefore disproportionately benefitted. High delivery costs, largely due to staffing, highlighted the complexity and resource demands of implementation, a finding supported by testimonies from various projects.
Insights from the evaluation have been instrumental in shaping the national rollout strategy. NHS England is integrating findings to optimise programme delivery, addressing engagement disparities, driving overall uptake and focusing on engaging the most high-risk individuals.
Paper long abstract
Developing theories of change for influencing and diplomacy interventions is knotty and hard, due to their propensity for multiple futures, their penchant for potential-building interventions that have no causal pathways (yet), and their frequent integration with programmatic interventions.
Developing useful and useable theories of change for influencing is challenging, but not impossible. In this talk we will discuss the features of these types of theories of change that are distinctive, and how to take account of these when delivering theory-based evaluation.
We will start the session by outlining common features of influencing theories of change, and how this impacts evaluation considerations. We will also outline the importance of articulating influencing theories of change well, and their power as a communication and sensemaking tool, as well as an analytical one.
We will then showcase two examples of theories of change for influencing, and how we were able to structure a theory-based evaluation around them. These examples will cover (i) a reconstructed portfolio-level influencing theory of change and (ii) an influencing intervention that is integrated with a programme, where the theory was designed with the support of the evaluator at the start of the intervention. We will engage the audience on these two scenarios, using a voting system to see if they can identify the challenges and opportunities that arose in each scenario, as well as how these theories were used to catalyse action at different points of the evaluation.
We will then workshop a scenario with the audience, where the presenters represent an intervention designer and an evaluator respectively. The scenario will be an intervention designed by a think tank to influence policymakers to improve equity legislation through research provision, and the evaluator must design a theory-based evaluation. The presenters will roleplay a theory of change discussion typical of influencing interventions, and the audience will be invited to join in and support the poor evaluator who is suffering in the discussion. Any audience members who may have attended our potential (to be finalised) UKES training on influence and diplomacy monitoring and evaluation will be invited to participate first due to greater familiarity with the subject matter (though noting this in no way overlaps with our proposed training content). If we have a large group, we will pivot to two breakout groups and draw on additional Integrity colleagues to roleplay the discussion.
Relevance to the theme: this is relevant to theme 3, ‘Communicating evaluation for action’. Well-articulated and easily accessible theories of change are essential for theory-based evaluation: the better the theory of change, the more actionable and understandable these evaluations are. Our work is centred on the articulation of useful theories of change in influencing evaluations, and on using behavioural insights and scenario planning methods to support their creation. We will also aim to cover how evaluators communicate these theories and their use in evaluation processes, as well as how to use them in sensemaking processes.
Paper long abstract
I am excited to submit a poster presentation to showcase our approach and findings from the evaluation of the Arts and Humanities Research Council's (AHRC) Follow on Fund (FoF) scheme and how they have informed the new evolution of the programme.
For over a decade, the AHRC FoF scheme has supported researchers in transforming arts and humanities insights into tangible change across knowledge exchange, skills development, commercialisation, policy engagement, and public life. As the scheme reached its fifteen-year mark, AHRC commissioned an independent evaluation to explore its effectiveness, relevance, and future direction.
This evaluation took place against a backdrop of increasing expectations for publicly funded research to demonstrate impact beyond academia. While this imperative spans all disciplines, what constitutes ‘impact’ (and how it unfolds) differs markedly between fields. In the arts and humanities, pathways to impact are often non-linear, relational, and co-produced, contrasting with the more structured trajectories typical of science and innovation funding (such as TRLs). The evaluation sought to reflect these distinctive pathways while recognising the growing policy interest in the economic and societal value of arts and humanities research.
To capture the richness of ten years’ worth of evidence, we took an exploratory approach, co-developing and iterating a Theory of Change with AHRC and using Outcome Harvesting to understand and analyse the full body of evidence. This approach allowed the team to identify, substantiate, and analyse hundreds of outcomes (and trace these back to the AHRC FoF scheme), capturing nuanced examples of how arts and humanities research creates social, cultural, and economic value.
The evaluation confirmed that FoF is a valued and effective part of the funding landscape. Between 2015 and 2024, FoF awards leveraged £193 million in further funding, outperforming comparator schemes. Furthermore, many FoF awards were referenced in the 2021 Research Excellence Framework (REF) impact case studies, underlining the scheme’s significant contribution to research impact and reach across the UK.
Yet, the evaluation also surfaced a clear message: there remains untapped potential, particularly in supporting projects to drive economic impact, commercialisation, and bold new pathways to impact. These insights came at a pivotal time, as the arts and humanities sector (and the funding system more broadly) grapples with how to demonstrate value and relevance in a changing innovation landscape.
The poster will showcase how Outcome Harvesting and iterative Theory of Change development can be used to generate actionable insights even late in a programme’s lifecycle. It will share practical lessons on how evaluative learning can inform funding redesign, demonstrating that it is never too late to evaluate, reflect, and adapt. Ultimately, it argues that embracing discipline-sensitive definitions of impact and adaptive evaluation methods is essential to supporting the full potential of arts and humanities research to contribute to the UK’s economic, cultural, and societal wellbeing.
Paper short abstract
WYCA’s new Outcomes Framework embeds evaluation into strategy, translating complex evidence into actionable insights that can guide regional investment and strengthen feedback loops. This work has solidified a focus on outcomes and will help promote a culture of evidence-based decision-making.
Paper long abstract
The West Yorkshire Combined Authority (WYCA) is a Mayoral Strategic Authority with responsibilities across transport, culture, skills, housing, and economic development. In recent years, our Evaluation Team has worked to embed robust monitoring and evaluation practices across an organisation undergoing rapid change and under pressure to deliver at pace. We have had to balance the urgency of tackling entrenched inequalities with the need for evidence-based decision-making and learning from past interventions.
As our ambitions expand—particularly in areas such as mass transit, bus franchising, and home retrofit—while funding remains constrained, the need to prioritise investment around strategic outcomes has become increasingly important. Equally, to support learning and accountability, we must distil complex evaluation evidence from a multibillion-pound investment portfolio into clear, actionable insights for senior leaders and elected members.
In response, throughout 2025, the Evaluation Team has led the development of an overarching Outcomes Framework. Grounded in WYCA’s Local Growth Plan and other strategies such as the Local Transport Plan, the framework identifies key outputs and outcomes across policy areas, supported by both intervention-level and regional metrics. This enables us to track progress at a regional level while assessing the contribution of individual programmes.
The framework is summarised in a series of one-page logic models. Whilst these are deceptively simple, we developed them through extensive stakeholder engagement, negotiation of competing priorities, and alignment with national devolution targets.
We showcase the development of an outcomes framework as a practical strategy for generating and disseminating evaluation evidence in ways that meaningfully inform strategic decision-making, by gaining buy-in from organisational leadership up front. By using theory of change models, evaluation practitioners can provide strategic clarity and lead conversations around organisational outcomes and objectives. As the outcomes framework becomes established, it will help embed robust and proportionate approaches to evaluation within a complex, multi-stakeholder public sector organisation. This case study highlights the critical role of sustained communication and relationship-building with a range of stakeholders in ensuring that evaluation has real-world impact and can inform better policy-making and delivery.
We will share our approach to developing this framework through iterative stakeholder engagement and explore its implications for improving evaluation design, evidence use, and strategic learning. We argue that a shared set of outcomes and metrics can strengthen the ROAMEF cycle, support continuous improvement, and embed evaluation more deeply into regional strategy and decision-making, offering applicable insights for attendees working across diverse evaluation contexts.
Paper short abstract
Whole systems approaches and realist evaluation are positioned as antidotes to reductionist methods, due to their preoccupation with understanding the role of multiple layers of context and causal forces. Here, we communicate findings about how practitioners value and use them to inform their work.
Paper long abstract
Pressing healthcare issues and health inequalities are recognised as complex problems that are irreducible to their constituent parts. Evaluation approaches suited to complexity, including whole systems approaches and realist evaluation, have burgeoning credibility in their ability to account for learning and innovation across complex issues. However, their deployment is often fraught with challenges, and understanding of how stakeholders become engaged in these approaches and integrate cycles of learning is lacking. Questions exist surrounding how and in what ways stakeholders react to this “participation” in complexity-congruent evaluation and how this evidence is valued and used. The aim of this research was to understand how large-scale transformation of whole systems realist practice and evaluation occurs, for whom, and in what circumstances.
The National Evaluation and Learning Partnership, commissioned by Sport England, has worked collaboratively with a wide range of place partnerships engaged in whole system place-based approaches to tackle physical inactivity. The team has supported places to explore how Place Partnerships can build capacity to undertake appropriate evaluation. A focus of the work has been to substantially raise capability in whole systems realist evaluation. Drawing upon a bricolage of participatory evaluation methods, this approach has worked with places to appreciate the importance of complexity and the conditions for change, and then enable them to operationalise realist-informed evaluation methods. In this paper we reflect on the findings from 11 realist interviews with stakeholders who have been engaged in this place partnership journey, to explore ideas on how capability and consciousness may develop to inform everyday decision-making and delivery.
Emerging results verify that places initially require an increased recognition of the need to accept uncertainty and alternative evaluation approaches. A prominent feature was the need for a senior leader who advocates for, supports, and facilitates change by “feeding the beast” of traditional ways of thinking whilst highlighting the need for broader ways of capturing impact. Another resource influencing change was the presence of a credible external voice who “fights the corner” of innovative ways of thinking. Findings indicate that once places understand complexity, they become ready to alter practices. The influence of funder expectations, engrained beliefs about evaluation as a performance metric, and the role of shared social spaces for knowledge exchange were prominent. Commissioned activities and external frameworks can be persuasive due to the competitive landscape, meaning organisations will conform to meet funders’ requirements. However, in other instances, without enforced expectations, some used the approach to embellish their work. The evolution of places towards being reflexive with cycles of learning was not as discrete; it was often complicated by the various levels of the system and by trying to influence multiple varying agendas. Often, this cross-boundary work required “translation”, which many within the system found alien.
Sustainable uptake of whole systems place-based realist work is influenced by historical practices of evaluation, enduring beliefs about practice, the funding landscape, the provision of external support and social spaces, the wider stakeholder belief system, and the interplay of senior and middle management in discursive ways.
Paper short abstract
Explores how feminist evaluation strengthens climate governance in Kenya, Nigeria, and Pakistan. Highlights participatory learning, gender gaps, and proposes practical principles to embed accountability, inclusion, and gender-sensitive evidence use in policy.
Paper long abstract
Embedding Feminist Evaluation Cultures in Climate Governance: Insights from Kenya, Nigeria & Pakistan
Aligned with the UK Evaluation Society 2026 Conference theme “Bridging the Gap: Evaluation to Action,” this paper explores how feminist evaluation cultures can transform climate governance systems in Kenya, Nigeria, and Pakistan. Climate adaptation frameworks increasingly commit to gender equity, yet evaluation practice still prioritises technical indicators over lived experience, learning, and accountability.
This study examined how evaluation systems recognise or marginalise women’s climate knowledge and agency. Using a systematic qualitative analysis of climate policies, M&E frameworks, and evaluation reports, the research analysed how evaluation culture shapes equity and learning. Findings reveal that Kenya’s decentralised structures foster participatory learning and feedback, while Nigeria’s centralised, externally driven evaluation limits gender accountability. Pakistan’s dryland agriculture context illustrates risks where climate-smart frameworks and trade systems exacerbate women’s unpaid labour and water burdens when gender-sensitive evaluation is absent.
The paper proposes feminist evaluation principles that strengthen local learning cultures and promote relational accountability, inclusion, and epistemic justice. These principles offer evaluators practical guidance for embedding gender-sensitive evaluation approaches that ensure women’s knowledge and experiences inform climate action. By bridging evaluative insight and policy change, feminist evaluation cultures can help realise climate justice in the Global South.
Keywords: feminist evaluation; climate governance; participatory learning; evaluation capacity; gender equity; Global South.
Speaker Bio
Cynthia Jebichii KERING is a Gender and Development scholar and evaluation researcher completing her Master’s degree at Keele University, United Kingdom. Her work focuses on feminist evaluation, climate governance, and gender-responsive public policy in Africa. She has researched comparative gender-responsive climate adaptation in Kenya and Nigeria and explored climate-smart agriculture and women’s economic precarity in Pakistan. Her research advances participatory and equity-driven evaluation approaches that centre women’s lived knowledge in climate decision-making systems.
Paper short abstract
Evaluability assessments (EAs) are tools that can help evidence impact and bridge the evaluation-action gap. We conducted four EAs with organisations undertaking dog population management (DPM) to strengthen their monitoring and evaluation capacity and to advocate for humane DPM globally.
Paper long abstract
Background
Evaluability assessment (EA) is a quick and useful tool that can support organisations facing challenges in demonstrating impact. Recent applications of EA have supported evaluation planning and the improvement of monitoring and evaluation (M&E) systems (Hamilton-West et al., 2019). Within the field of dog population management (DPM), numerous organisations around the world conduct passionate and intensive work to humanely manage dogs, yet they lack the necessary M&E knowledge and tools to evidence the impact of their work (Hiby et al., 2017). Animal welfare organisations such as the International Companion Animal Management Coalition (ICAM) are working to overcome these challenges by investing in research and methodological expertise to support charities and local governments carrying out DPM to increase their M&E capacity. The aim of this research was to demonstrate how EAs can bridge the evaluation-action gap by increasing M&E support to organisations carrying out DPM, helping them learn how to evidence their impacts. In doing so, successful case studies may be used to champion humane DPM globally.
Methods
An M&E team, formed through a partnership between ICAM and the University of Glasgow, comprised evaluation scientists, DPM experts and epidemiologists with expertise in quantitative methods. The team worked collaboratively to provide direct support to a selected group of organisations implementing DPM. We conducted four EAs with organisations located in Thailand, Sri Lanka, Georgia, and India. For each organisation, the EA process comprised three participatory workshops (one online, two in person) to meet with stakeholders, co-develop a theory of change, prioritise outcomes, and identify key performance indicators, data availability and data needs. The process for each culminated in a clear and actionable set of M&E recommendations co-developed with the local organisation. After recommendations were identified, data experts worked intensively and collaboratively with the organisations to share, analyse and interpret data to showcase the impacts of their DPM activities.
Results
The four organisations that participated in the EAs had varying levels of M&E capacity. Three were collecting data on their DPM efforts, with basic analysis, interpretation and reporting, while one had a strong track record of publications. The M&E team was able to provide direct support to each organisation, and a bespoke plan was co-developed with each to strengthen their M&E capacity going forward. Specific actions varied across organisations and included providing input for improving data collection tools, data cleaning, data analysis, data visualisation and interpretation, with the ultimate aim of publication of results. In some cases, the organisations adapted their practices for more effective data capture.
Conclusions
We conclude that evaluability assessments can work towards bridging the evaluation-action gap within DPM by supporting organisations to increase their M&E capacity, and in turn facilitate operational decision-making towards evidencing impact. This strengthens the evidence base for successful DPM approaches, which may be used to advocate for humane DPM globally.
Paper short abstract
ReAct embeds evaluation into test-and-learn approaches, aligning efforts across the employment sector. Through co-produced insights and adaptive methods, it has shaped employer engagement, recruitment, and participant support, translating evaluation into action.
Paper long abstract
The Get Britain Working white paper highlights the need for systems change, supported by test and learn and adaptation. This session will explore how the ReAct Partnership* has integrated evaluative thinking into test-and-learn environments across the employment sector. The ReAct Partnership is an industry-led, active collaboration to support a continuous improvement community in the Restart programme through action research, shared and iterative learning, and the development of applied, evidence-based resources.
At the heart of ReAct is a commitment to co-produced evaluation, funded and overseen by the Restart Prime Providers. Practitioners, policymakers, and other stakeholders are actively involved in shaping evaluation questions, interpreting findings, and driving change. This approach ensures that evaluation is relevant, grounded in context, and more likely to influence decisions.
The session will highlight three case examples where ReAct has contributed to positive outcomes and influenced action. First, in shaping how organisations engage employers. Second, in workforce recruitment and development, where evaluation insights prompted a redesign of recruitment processes to attract candidates from a wider range of backgrounds. Third, in shaping participant support, ReAct developed targeted resources such as carers webinars and top tips sheets to improve engagement and outcomes.
The session will reflect on how this change was achieved through action evaluation, including how evaluation is resourced, when and how evaluation questions are agreed, and how findings are shared. The session will also reflect on challenges and lessons for the evaluation community, including funding for non-traditional evaluation activity across organisations, building trust for collaborative evaluation, creating impact with the right audiences and ensuring timely evaluation insights.
*The ReAct Partnership is co-funded by the eight ‘prime providers’ for the Restart programme — FedCap Employment, AKG, G4S, Ingeus, Maximus, Reed, Seetec and Serco — and is being managed by the Institute of Employment Studies (IES), working alongside the Institute for Employability Professionals (IEP) and the Employment Related Services Association (ERSA).
Paper short abstract
We examine the intersection of evaluation policy and ECB as a foundation for strengthening evaluation culture within the Canadian federal government by showing how evaluation policies can operate as ECB strategies, and how additional strategies can be leveraged to enhance policy implementation.
Paper long abstract
Evaluation policy and evaluation capacity are two critical influences on evaluation practice. However, their relationship, including the fact that policy may be considered an evaluation capacity building (ECB) strategy in some contexts, remains relatively unexplored in the academic literature. Understanding how these two areas intersect might uncover how evaluation systems operate in practice, as well as how certain factors facilitate and hinder policy uptake and implementation.
Evaluation policy is often defined as the rules and principles that guide an organization’s decisions and actions when planning, designing, conducting, reporting, or using evaluations within specific organizational, cultural and/or political contexts (Al Hudib & Cousins, 2022; Christie & Lemire, 2019; Trochim, 2009). Consequently, such policies play a pivotal role in shaping evaluation practice. Like evaluation policy, ECB is context-dependent, offering a range of strategies intended to facilitate and sustain quality evaluations (Bourgeois et al., 2013; Stockdill et al., 2002). ECB strategies may target individuals (e.g., training, technical assistance) or organizations (e.g., building data systems, designating evaluation champions, allocating resources) (Labin et al., 2012; Preskill & Boyle, 2008). Multi-level approaches are typically required because strategies implemented at one level often reinforce those implemented at another (LaMarre et al., 2020). For instance, organizational resources are often needed to support individual training opportunities.
There are several ways in which ECB strategies may intersect with evaluation policy. First and foremost, evaluation policy can be an ECB strategy that builds a common language, improves institutional knowledge, and establishes a long-term vision for evaluation practice (Sutter et al., 2024, p. 537). ECB can also serve as a bridge between policy and practice by developing the capacity of individuals responsible for interpreting and implementing evaluation policy, which helps them recognize and understand key policy requirements. Such strategies may include training, embedding policy language in key organizational documents, and communications materials (e.g., newsletters) (Fierro et al., 2022). Conversely, evaluation policy can drive ECB, as policy requirements guide capacity building strategies and signal where organizations must strengthen their capacity to meet policy expectations (Al Hudib & Cousins, 2022). Together, these perspectives position ECB as a mediating mechanism that enables evaluation policy to move beyond its role as a written directive to one that actively shapes practice. Even the most robust evaluation policies risk remaining aspirational without the necessary individual and organizational capacity to translate policy expectations into effective practice.
Our presentation examines the intersection of evaluation policy and ECB as a foundation for strengthening evaluation culture within the Canadian federal government. Drawing on interviews with federal policymakers, evaluation leaders, and scholars, we will share findings on how federal evaluation policies operate as ECB strategies, how individual and organizational capacities, or the lack thereof, affect the implementation of evaluation policy, and how additional ECB strategies can be leveraged to enhance policy uptake and implementation. The results from this study illustrate the relationship between evaluation policy and ECB, and how this relationship can be leveraged to create environments where evaluation is embedded in organizational culture to support evidence-informed decision-making, continuous learning, and ongoing improvement.
Paper long abstract
My company, Nexus Evaluation Ltd, had the pleasure of working with an organisation that works to improve the working conditions of people in global supply chains. For years, they have been collecting micro stories: a few lines to a paragraph or two, on the type of conversations, narratives and exchanges emerging from their factory visits and convenings. These narratives include quotes from factory workers, main questions raised during meetings, stories shared and field observations.
The organisation has now collected over 4,000 micro stories, and this number is growing as we speak. They had spent considerable time coding each story against a set of themes and categories, ranging from organisational values to specific human rights issues, and they engaged Nexus to lead a new type of analysis of all these stories, albeit with a very limited budget and timeframe.
We quickly realised that each story offered but a glimpse of very complex systems and challenging lived experiences, and that together they told us something more than the sum of their parts. Given all this, we were keen to use a mix of approaches as follows:
1) Systems thinking, which included pattern and trend identification and systems mapping.
2) Feminist and gender-transformative approaches to address country-specific and emergent global issues.
3) Strategic and organisational design principles, to add more value. This meant carefully crafting questions that guided a couple of facilitated sense-making discussions. The questions aimed to inform new ways of working and strategic direction, and to improve organisational capabilities and potential for impact.
4) A decolonial and humanising approach to storytelling.
I will describe in more detail how we put the above into practice, and share the findings and recommendations; a small illustrative sketch of the pattern-identification step follows below.
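To make the pattern-and-trend step concrete, here is a minimal Python sketch of how theme co-occurrence across a large set of coded micro stories might be tallied. The records, theme labels and field names are invented for illustration; this is not the organisation's actual coding scheme or our production pipeline.

    from collections import Counter
    from itertools import combinations

    # Invented micro-story records, each already coded against themes.
    stories = [
        {"id": 1, "year": 2021, "themes": {"worker voice", "overtime"}},
        {"id": 2, "year": 2022, "themes": {"worker voice", "grievance channels"}},
        {"id": 3, "year": 2022, "themes": {"overtime", "grievance channels"}},
    ]

    # Trend: how often each theme appears, by year.
    trend = Counter((s["year"], t) for s in stories for t in s["themes"])

    # Pattern: which pairs of themes appear in the same story.
    pairs = Counter(p for s in stories
                    for p in combinations(sorted(s["themes"]), 2))

    for (year, theme), n in sorted(trend.items()):
        print(f"{year} {theme}: {n}")
    for (a, b), n in pairs.most_common():
        print(f"{a} + {b}: {n} stories")

In practice, tallies like these only surface candidate patterns; the facilitated sense-making discussions described above are where such patterns are interpreted.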
Paper short abstract
Drawing on Foundations’ toolkit and their Changemakers programme, Foundations and Cordis Bright share insights into how local evidence-based leadership bridges the longstanding gap between evaluation and action in children’s services.
Paper long abstract
There is a longstanding gap between what the evidence tells us improves outcomes and what is available for children and families locally. This session will explore how to close that gap by using a suite of tools developed by Foundations and partners to support local areas to move from understanding evidence to embedding it into practice.
Using Foundations’ Changemakers programme as a case example, we will demonstrate how Foundations supports evidence-informed decision making at the local level. The toolkit brings together two key resources: Practice Guides, offering evidence-based recommendations for commissioning and delivering family support, and the Guidebook, which summarises tested interventions that put these practices into action. Together, these provide a practical starting point for local authorities (LAs) seeking to make evidence-informed choices.
Despite a strong evidence base for parenting interventions, the interventions we know to be effective have yet to reach scale. The Changemakers programme, funded by Foundations in partnership with the Department for Education and the Youth Endowment Fund, was designed to address this challenge. It empowered LAs to bridge the gap between evidence and practice by appointing ‘Local Evidence Leaders’ to champion evidence use, embed evidence-based interventions, and build capacity for evaluation-informed decision-making.
Cordis Bright conducted an implementation and process evaluation following the programme from inception to completion (2024–2026). Using mixed methods, they explored how the model operated in different contexts, what supported or constrained implementation, and how dedicated local evidence leadership can influence system-wide change.
In this session, Foundations and Cordis Bright will reflect on what it takes to embed evaluation and evidence use in local systems. We will share findings on enablers of and barriers to evidence leadership, from leadership commitment and organisational readiness to practical supports such as time, networks, and peer learning. The session will conclude with interactive discussion, encouraging reflection on how What Works Centres and evaluators can collaborate to make evaluation useful, used and usable.
Aligned with the theme “Bridging the Gap: evaluation to action,” this session provides practical insights for What Works Centres, policymakers and evaluators seeking to move from isolated evidence use toward sustainable evidence-led cultures and explores what it really takes to make evidence-based leadership stick.
Paper short abstract
This session shows how structured products like evidence maps and digital summaries help turn evaluations into decisions. With examples from youth, employment, and climate policy, it highlights how design and communication make evidence easier for policymakers to use.
Paper long abstract
Evaluation is intended to serve two functions: accountability and lesson learning. The lessons from an evaluation can go beyond the intervention being evaluated to be applicable to other interventions in other settings. One channel for transferring evaluation findings to other settings is summarising them in systematic reviews. But, like evaluation reports, systematic reviews often remain unread and are not accessible to decision-makers. Traditional outputs like academic papers or policy briefs fail to connect with decision-makers. These formats often assume too much time, background knowledge, or interest from the audience. As a result, evidence ends up being underused, no matter how strong it is.
Knowledge brokering, or knowledge translation, has emerged as a means of getting evidence into use. This presentation shares examples of a specific form of evidence product, called Evidence-Based Decision-Making Products, based on systematic reviews of evidence from evaluations. These products include interactive toolkits, visual platforms, and digital summaries based on systematic reviews and evidence and gap maps.
In this presentation, we share insights from three approaches:
1. Evidence toolkits
a. Youth Endowment Fund Toolkit (UK) – A platform that presents evidence on what works to reduce youth violence. Each approach is rated for its impact, strength of evidence, and cost. The toolkit has been used by government agencies, local councils, and even the Prime Minister’s Office to shape funding decisions and policy strategies.
b. Youth Employment Evidence Platform (Sub-Saharan Africa) – A collaboration with the European Commission to help guide investments in youth employment. The platform includes a meta-analysis across ten interventions, plain-language summaries, and policy-relevant metrics. It supports planning by showing what works, where, and at what cost.
2. Evidence Q&A (Global) – The CGIAR Evidence portal is designed to support gender-responsive climate and agriculture policies. It organizes complex evidence into a simple question-and-answer format, helping users explore topics like how women adapt to climate change or how gender affects access to resources.
3. Evidence summary evidence and gap maps – Evidence and Gap Maps (EGMs) are of growing interest, being used to identify what evidence exists in particular policy areas. These maps show what evidence exists, not what it says. We have worked on two projects in which the maps do contain cell-wise evidence summaries: Child Protection Research, and Conflict and Atrocity Prevention.
Across all these cases, the core idea is the same: communication is not just the last step in evaluation, it is part of the design. We involve stakeholders early, ask what they need, and build tools around their questions. We use plain language, strong visuals, clear structure, and digital formats that are easy to navigate. We also build in features like filtering, comparisons, and implementation guidance to help people move from knowing to doing.
These efforts have already led to tangible results: budget decisions tied to toolkit ratings, local governments revising programs based on evidence, and greater awareness among international donors of where their money can make the biggest difference.
Paper short abstract
We report learnings from the deployment of realist and participatory approaches, with limited resources and time constraints, to evaluate a multi-faceted intervention in rural Ethiopia to improve health outcomes of podoconiosis patients, improve prevention, care-seeking and access, and reduce stigma.
Paper long abstract
Podoconiosis (endemic non-filarial elephantiasis) is a non-infectious disease caused by long-term exposure of bare feet to red clay soil derived from volcanic rock. Since 2023, Malaria Consortium has been implementing a project called “Happy Feet: Strengthening Community-based Podoconiosis Prevention and Control in Ethiopia”. The project involves a community-based, innovative intervention package, including training and support for health providers to improve access and quality of morbidity management, disability prevention and psychological support services; community messaging campaigns (billboards, radio messages and community events) to improve preventative and care-seeking behaviours, and reduce stigma against patients; and distribution of customised shoes to aid physical recovery.
To evaluate this multi-faceted intervention, and provide usable evidence of what worked, for whom and in which contexts, we will employ aspects of realist and participatory approaches: adapted Ripple Effects Mapping will first be undertaken with providers, community members and patients to understand anticipated and unanticipated outcomes, and to challenge and add to the existing theoretical mechanisms and pathways to these outcomes. These theories and hypotheses will then be further tested with quantitative surveys carried out at households and health centres, allowing for analysis by gender and other factors. Finally, a participatory feedback and reflection event with local and national stakeholders will be held, following all data capture and preliminary analysis, to feed into final conclusions.
Both participatory and realist approaches have challenges, as they require expertise and time to implement, and they have also not been widely used in African settings. In this session we will report on lessons learned from implementing this theory-driven evaluation in a rural Ethiopian setting, with limited resources and time constraints. The lessons will be recorded systematically and prospectively throughout the evaluation (November 2025-March 2026) through individual, team and participant reflections.
Paper short abstract
This presentation explores the role of a collaborative PhD partnership in bridging the gap between academia, policy, and practice to deliver innovative, impactful skills research, influence policy, and develop a new generation of researchers in Scotland.
Paper long abstract
This presentation will describe how Skills Development Scotland (SDS) and the Scottish Graduate School of Social Science (SGSSS) have formed a collaborative PhD research partnership spanning 13 years. The main aim of this partnership is to connect PhD students, universities, practitioners, policymakers, and stakeholders, so that academic research can more directly inform policy and practical action in Scotland’s skills landscape.
The partnership is designed to close the gap between academic theory and real-world policy by encouraging innovative, impactful research on skills issues. It serves as a useful example of how to achieve research impact, involve stakeholders in research, and share knowledge in practice.
A key feature of this partnership is that collaboration and impact are built in from the start. The programme doesn’t just look at the quality and relevance of research produced on skills policy—it also examines how well and in what ways those research findings are shared. The partnership uses a range of events and outputs to make sure research outcomes reach both policymakers and practitioners.
We will share lessons on some of the challenges of our approach. These challenges include involving multiple stakeholders on a continual basis over the lengthy period of a PhD and making sure that complex research produced by PhD students is turned into clear, practical insights for people outside academia. The programme tackles these issues by using a variety of communication methods, such as student-led seminars and events, to make sure knowledge is shared widely and effectively.
The partnership also pays close attention to diversity, equality, and inclusion. It brings together a wide range of people—students, academics, practitioners, and stakeholders—ensuring that many voices and experiences are included in the research process and in the evaluation of the programme itself.
The presentation will highlight several practical results from the partnership. These include evidence that the research has influenced policy and practice in Scotland, increased the employment prospects of PhD students by giving them real-world policy experience, and developed a model for collaborative research partnerships.
Another major strength of the partnership is the transfer of innovative research methods from academia into practice. These include advanced approaches such as utilising AI in the research process, innovative methods like photo-elicitation, and working directly with young people to co-produce research. These methods have brought fresh perspectives and real innovation to benefit everyday professional practice in SDS and in skills policy research more broadly.
The presentation will highlight that by promoting knowledge exchange and supporting student development, the collaboration has become a model of good practice, showing how partnerships between academia and the public sector can lead to meaningful, impactful research that shapes policy and practice. Finally, we will highlight recent developments in the programme that demonstrate our commitment to continuous improvement, for example through our use of AI and innovative research methods.
Paper short abstract
We applied a realist evaluation approach to test the programme assumption that individual researchers can be institutional change agents in African universities. The resultant evidence catalysed practice-relevant dialogue among stakeholders and highlighted the need to strengthen the research ecosystem.
Paper long abstract
In the global health space, health research capacity strengthening (HRCS) has been deemed a strategic way of fostering [health] research equity, especially in low and middle-income countries (LMICs). While the majority of HRCS initiatives focus on developing a critical mass of individual researchers, evidence on the effectiveness of the ‘individuals as agents of institutional change’ model remains underdeveloped. We conducted a realist evaluation to examine how and why research partnerships under the ‘Developing Excellence in Leadership, Training and Science in Africa’ (DELTAS Africa) programme – an initiative delivered through a global North-South research partnership – strengthen the health research capacity of African universities. Two cases representing unique research consortia were studied using realist-informed qualitative methods to test an initial programme theory (IPT). We conducted realist interviews with African principal investigators (PIs), collaborators, research support staff, PhD researchers and postdoctoral fellows, and programme-level staff. Retroductive theorising guided the testing of the IPT through the Context-Mechanism-Outcome (CMO) configuration framework. Through theoretical abstraction, we refined the IPT using CMOs from the case theories.
Multiple mechanisms (e.g., empowerment, inspiration, sense of agency, vulnerability) were triggered to generate varied research capacity outcomes for individual researchers and their institutions across the two cases. Findings show that the research partnerships provided researchers with access to research resources and opportunities, triggering an empowerment, motivation and inspiration mechanism that resulted in short-term outcomes such as improved research outputs (e.g., increased publications and funding), enhanced technical and soft research skills, and researchers’ career growth in a context where there was buy-in and support from university leadership. A sense of agency mechanism was activated to generate medium-term outcomes, such as improved supervisory capacities in research departments and the establishment of research hubs, in a context where the university research environment was conducive, with researchers spending more time on research than on teaching activities. Even when researchers were empowered with the appropriate skills to mobilise research funding through grant writing, they were often frustrated and rendered vulnerable in contexts where the environment was less supportive, for example through poor remuneration, a lack of protected time for research, and deprioritised funding by national governments.
The evidence challenges the use of individuals as change agents as an HRCS model and argues that the institutions within which the individuals are based should have at least minimally supportive research systems in place. Shared with the programme stakeholders, the evidence catalysed discussions about the need to extend beyond individual-level research capacity to sustainably address systemic challenges and weaknesses, thereby building a conducive research environment that retains individual talent and enables research to thrive.
Paper short abstract
Nourish is a long-term, flexible intervention to improve food environments in schools. This presentation explores how adaptive evaluation was used over five years to develop and shape the Nourish delivery model, as well as to demonstrate impact and wider applicable learning in a compelling way.
Paper long abstract
Nourish is a long-term, flexible intervention designed to improve school food environments in individual schools. This presentation explores how adaptive evaluation was used to develop and shape the Nourish delivery model, as well as to demonstrate impact and wider applicable learning in a compelling way.
The Nourish programme supports schools to adopt a whole school approach to food. This approach, recommended by both the World Health Organisation and the UK’s School Food Plan, promotes nutritious food across the school day - from the classroom to the dining room - while engaging the whole school community.
Delivered over five years in Southwark and Lambeth, the programme evolved iteratively, shaped by continuous feedback from both the evaluation and the frontline team.
We will focus on how adaptive evaluation can:
- Rise to the challenge of evaluating a programme with no fixed delivery model at the outset
- Fully explore the nuance which makes evaluation learning more practically applicable to a broader range of audiences
- Strengthen relationships between evaluation and frontline teams, and magnify the value of an iterative process
- Help longer-term programmes adapt to changing policy climates
We share the strategies that helped build strong relationships between the evaluation and delivery teams and how we supported the delivery team to work iteratively and reflectively.
This session will showcase how the adaptive approach shaped not only programme delivery but also future iterations of the work, including new strands of the programme in secondary and special schools. It will also demonstrate how this approach supported School Food Matters’ wider policy and campaigning work around improving school food, including the government’s roll out of universal breakfast provision.
Paper short abstract
This workshop will introduce EvalC3, a free open-source online tool designed to help evaluators model and test multiple causal configurations using cross-case data, supported by within-case inquiries.
Paper long abstract
Evaluating complex interventions presents unique challenges, particularly when it comes to understanding causal pathways and communicating findings in ways that support learning and action.
This workshop will introduce EvalC3, a free open-source online tool designed to help evaluators model and test multiple causal configurations using cross-case data, supported by within-case inquiries.
Participants will explore how EvalC3 can be used to identify, explore and visualise plausible pathways to change. The session will demonstrate how EvalC3, together with collective sense-making, and within-case examples, helps evaluators and practitioners embrace complexity in their analysis to inform ongoing decision making.
The workshop will include a live interactive demonstration of the software. This will be further illuminated with a real-world example of how it has been used in an ongoing evaluation of Sport England’s investment into place-based systemic approaches to tackle physical activity inequalities. The example will walk through the steps of using EvalC3, explaining the circumstances in which local actions have encouraged communities to lead on initiatives which support them to be physically active.
This session is ideal for evaluators, researchers, and practitioners working in complex systems who are seeking practical tools to strengthen their evaluation practice and better support change. No prior experience with EvalC3 is required.
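For readers new to configurational analysis, the following standalone Python sketch illustrates the kind of cross-case test EvalC3 supports: scoring a candidate configuration of conditions against case outcomes using confusion-matrix measures. This is a conceptual illustration with invented cases and condition names, not EvalC3's actual interface or code.

    # Conceptual sketch: test one candidate causal configuration against
    # cross-case data. Cases, conditions and outcomes are invented.
    cases = [
        {"community_led": 1, "funding": 1, "outcome": 1},
        {"community_led": 1, "funding": 0, "outcome": 1},
        {"community_led": 0, "funding": 1, "outcome": 0},
        {"community_led": 0, "funding": 0, "outcome": 0},
        {"community_led": 1, "funding": 1, "outcome": 0},
    ]
    model = {"community_led": 1}  # configuration predicted to produce the outcome

    def predicts(case):
        # The model "fires" when every condition in the configuration holds.
        return all(case[c] == v for c, v in model.items())

    tp = sum(predicts(c) and c["outcome"] == 1 for c in cases)
    fp = sum(predicts(c) and c["outcome"] == 0 for c in cases)
    fn = sum(not predicts(c) and c["outcome"] == 1 for c in cases)
    tn = sum(not predicts(c) and c["outcome"] == 0 for c in cases)

    accuracy = (tp + tn) / len(cases)  # share of cases classified correctly
    coverage = tp / (tp + fn)          # share of positive-outcome cases captured
    print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
    print(f"accuracy={accuracy:.2f} coverage={coverage:.2f}")

In this toy data, the false-positive case (configuration present but outcome absent) is exactly the sort of case the workshop's within-case inquiries would then examine in depth.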
Paper short abstract
We assess whether evaluating the construction phase of major scientific infrastructure is worthwhile. Using the National Satellite Test Facility, we show early spillovers to UK firms, offering policy-relevant insights for innovation and industrial capability building.
Paper long abstract
Major research infrastructures represent cornerstone public investments intended to enhance national innovation capacity, stimulate industry engagement, and attract global R&D. Yet evaluation often begins only once facilities become operational. This study asks whether evaluating the construction phase itself can provide policy-relevant evidence on early impacts and capability building.
The case examined is the National Satellite Test Facility (NSTF), a £100 million ISCF-funded investment completed in 2023 to provide nationally accessible satellite and payload testing capabilities. The NSTF enables UK firms of all sizes to compete internationally by offering co-located, world-class testing environments at a single site.
In line with the UKRI Industrial Strategy Challenge Fund (ISCF) objectives of increasing R&D investment, multidisciplinary research, collaboration, and overseas investment, the NSTF evaluation framework was designed to capture both construction and operational impacts. The construction-phase evaluation examined:
- Direct, indirect and induced economic impacts
- Knowledge and skill development within the UK supply chain
- Market advantage and learning among contractors
- New jobs, collaborations, and technological progress
- Procurement impacts and UK content
- Public awareness and outreach benefits
Drawing on interviews with contractors and suppliers, we identify early, measurable spillovers arising from highly technical and specialised construction activities. Firms reported that involvement in NSTF led directly to new technical competencies, enhanced reputations, and follow-on contracts in the UK and overseas. These findings reflect well-established evidence from international infrastructure evaluations (e.g., Florio et al., 2018; CERN studies) showing how participation in scientific construction projects stimulates industrial learning and productivity gains.
The analysis demonstrates that construction-phase evaluation is not merely about cost tracking—it can illuminate pathways of innovation diffusion and capability growth that inform policy and programme design. Early identification of spillover effects provides actionable intelligence for policymakers on how large-scale capital projects contribute to national R&D capacity, supply-chain resilience, and skills development long before the facility becomes operational.
This presentation will:
- Outline the NSTF’s role within the UK space and innovation ecosystem;
- Present the evaluation design applied to the construction phase;
- Discuss empirical evidence of short-term impacts and knowledge spillovers; and
- Reflect on implications for policy and evaluation practice, particularly for public investment in scientific infrastructure.
By demonstrating the policy value of early-phase evaluation, this work contributes to Theme 1 of the UK Evaluation Society Conference, showing how evidence from infrastructure construction can directly inform future investment decisions, strengthen industrial strategy, and embed evaluation across the full lifecycle of major R&D programmes.
Paper short abstract
Sport England wants more people to play sport and be active. Join Ipsos, NPC, Sport England and representatives from their ‘System Partners’ to learn how a groundbreaking evaluation is helping drive Sport England’s investment into 137 organisations across the sport and physical activity sector.
Paper long abstract
Sport England’s 'Uniting the Movement' strategy exemplifies a transformative approach to address inequalities in sport and physical activity by investing in over 137 'System Partners'. This bold investment approach is designed to catalyse system change over the long term and on a broad scale. Here, evaluation is not just a measure of progress, but a dynamic process that drives action.
Ipsos will present our 'Learning & Knowledge Exchange' model, prioritising timely, utilisation-focused insights that are shared through clear reporting and visual storytelling. This model supports partners' ability to swiftly adapt based on insights, translating complex evaluations into actionable strategies.
NPC will share more about their Capability & Capacity building offer, which ensures that partners develop confidence around evaluation and learning techniques, are empowered to implement them effectively, and are supported to understand systems change. This offer is about building a shared understanding that transforms evaluation findings into practical applications.
Representatives from System Partner organisations will bring valuable insights into how integrating the 'Learning & Knowledge Exchange' model with Capability & Capacity support fosters actionable change. Their testimonies will highlight how evaluations have been pivotal in refining their approach and realising strategic goals.
Concluding the presentation, Sport England will underscore the essential role of this approach to drive change alongside their Theory of Change. The holistic integration of evaluation, learning, and action exemplifies a sustainable, impactful approach to achieving system change towards the Uniting the Movement vision.
Please note that a separate abstract from Kev Harris (Hartpury University) has been submitted relating to the National Evaluation and Learning Partnership, a separate but overlapping Sport England investment into ‘Place’. We are in touch with Kev and the team and would be delighted to work with them to ensure that, if selected, our presentations complement one another.
Paper short abstract
A case study about embedding participatory and theory-based evaluation in a research team. The example used contribution analysis and co-produced tools to build an evaluation culture that demonstrates value while avoiding M&E becoming a tick-box activity.
Paper long abstract
As institutional budgets tighten and non-income-generating activities face increasing scrutiny, evaluation has become a crucial means not only of improving programmes and projects but also of demonstrating their value and relevance. This paper explores how evaluation practices were introduced and embedded within a research team, referred to here as English Language Research, which had not previously been expected to evidence the impact or value of its work in such a systematic way. Introducing evaluation in this context required a sensitive and participatory approach that recognised both the autonomy of research practice and the need for accountability.
The paper presents a case study of an evaluator joining the team as a member of staff to establish monitoring and evaluation (M&E) practices that were both practical and theoretically informed. Drawing on participatory and co-production principles, the approach aimed to integrate evaluation into the team’s existing culture of inquiry, positioning it as a tool for learning and reflection rather than as an external audit mechanism. The process sought not only to demonstrate outcomes but also offer learning opportunities for the team.
Evaluating the impact of a research team working across a large and complex organisation presented distinct challenges. Activities such as dissemination, relationship-building, and collaboration often contributed indirectly to outcomes, making attribution difficult. Furthermore, there was initial concern that introducing a monitoring culture might reduce research activity to a ‘tick-box exercise’ or fail to recognise the value of exploratory, developmental work.
To address these challenges, a participatory evaluation framework was developed, engaging the team at every stage. Evaluation tools were co-created and refined through consultation, including the use of collaborative digital platforms (such as Padlet) that allowed members to share feedback, build collective insights and co-construct an evolving picture of outcomes. Regular team meetings were used to share findings and invite reflection on the M&E process, embedding evaluation within the team’s ongoing practices rather than positioning it as a separate requirement.
Alongside this participatory process, a contribution analysis approach was applied to explore the team’s influence within the wider organisational system. Contribution analysis provides a structured, theory-informed method for testing whether the evidence reasonably supports a hypothesised chain of outcomes. Combined with a collaboratively developed theory of change, this enabled the team to articulate how their research, dissemination and partnership activities contributed to longer-term institutional outcomes, even where direct attribution was not possible. However, aligning anecdotal and qualitative insights with structured evidence remains an area for further development. The next stage will involve developing case studies to explore how different areas of contribution interconnect within a broader picture of institutional impact.
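As a purely illustrative aside (not the team's actual tooling), the bookkeeping behind such a test can be pictured in a few lines of Python: rate the evidence supporting each link in the hypothesised chain of outcomes, then flag the weakest link as the priority for further evidence gathering. The links and ratings below are invented.

    # Invented example: rate the evidence behind each link in a
    # hypothesised chain of outcomes and flag the weakest link.
    chain = [
        ("research outputs shared", "practitioners aware of findings", "strong"),
        ("practitioners aware of findings", "guidance updated", "moderate"),
        ("guidance updated", "institutional outcomes improve", "weak"),
    ]
    strength = {"strong": 3, "moderate": 2, "weak": 1}

    for cause, effect, rating in chain:
        print(f"{cause} -> {effect}: evidence {rating}")

    weakest = min(chain, key=lambda link: strength[link[2]])
    print(f"priority for further evidence: {weakest[0]} -> {weakest[1]}")

The substance of contribution analysis lies, of course, in the qualitative judgement behind each rating; a structure like this only makes those judgements, and their gaps, visible.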
The paper concludes by reflecting on how participatory and theory-based approaches can be combined to build evaluative capacity, foster ownership of evidence and support research teams to demonstrate value in ways that are meaningful, proportionate, and aligned with academic practice.
Paper short abstract
This presentation explores the equity and power dynamics in co-creating a Theory of Change for a systems change programme. It shares how evaluations can navigate equity-related challenges while embedding reflection, inclusion and usability in participatory evaluation practice.
Paper long abstract
Participatory approaches have been increasingly promoted as ways to ensure diverse voices are heard in evaluation and to build a shared understanding across stakeholders. Yet in practice, they often present evaluators with challenges and difficult trade-offs - balancing divergent perspectives and reconciling conflicting priorities while still delivering evaluation responsibilities within resource constraints.
This presentation reflects on what this “equity knot” (Gates et al. 2024) looks like in practice, drawing on our experience of co-designing a Theory of Change (ToC) for a place-based systems change programme aiming to improve education, employment, and training (EET) outcomes for South Asian young people, particularly those from Pakistani and Bangladeshi backgrounds who face persistent barriers to good-quality work.
As action researchers embedded in the programme, we worked closely with a wide range of stakeholders, including young people, youth ambassadors, local partners, employers, communities, and funders, to co-create a ToC that valued and reflected multiple perspectives. While the process sought to ensure transparency and equity, it surfaced significant tensions around which forms of knowledge (e.g. practitioner, lived, research) most strongly shaped the ToC, whose ideas were prioritised or left out, and how divergent views on success could be brought together without losing evaluation focus or feasibility.
We documented these tensions by keeping a reflective learning log, tracking key decision-making points, rationales, and trade-offs throughout the development of the ToC. This helped embed evaluative thinking and reflection into the programme’s ongoing learning and delivery to address systemic barriers influencing local youth employment.
Through the presentation, we will invite the audience to reflect on the “equity knots” in their own evaluation practice and to see equity not as an endpoint or a tick-box item, but as a continuous process of negotiation, reflection, and adaptation – an integral part of embedding evaluative learning in complex systems change initiatives.
Paper short abstract
This poster highlights a case study from public engagement with research at festivals to demonstrate how evaluation findings can be shared to increase understanding and drive practical action.
Paper long abstract
This poster highlights a case study from public engagement with research at festivals to demonstrate how evaluation findings can be shared to increase understanding and drive practical action in the university sector and beyond.
Delivering interactive festival activities is a popular way for researchers to engage with a wide range of people. Many researchers and engagement professionals believe there is great value in such engagement, not just for festival-goers, but for researchers too. However, there is little published data backing this up. This case study highlights the benefits, challenges, and learning gained from 8 years of the FUTURES Festival. Our extensive evaluation evidence identifies the ingredients and support which ensure researchers can deliver high-quality public engagement; demonstrates the skills they gain from taking research to festivals; and suggests how to articulate, enable and reward researchers’ development.
High-quality public engagement helps researchers to understand, increase, and demonstrate the impact of their research outside of universities. It encourages academics to involve the public in their work, and consequently creates a more accessible research culture, generating meaningful, impactful research. Festivals in particular help researchers engage with communities of place and interest, inclusively sharing and situating their research with relevance, democratising knowledge and including diverse voices. Our soon-to-be-published journal article shows how researchers and engagement professionals can use this evidence themselves when organising festival-style events or advocating for them in their own institutions, to ensure maximum value for all concerned.
The FUTURES Festival of Discovery has been bringing research to life across the South West of England since 2018. The extensive free programme of public events exploring the worlds of science, culture and research has been funded by UKRI and the European Commission and delivered by a consortium of the Universities of Bath, Bath Spa, Bristol, Exeter and Plymouth. For more details see: https://futuresnight.co.uk/about/
Paper short abstract
A story of how a culture of continuous reflection and regular evaluation can embed multiple principles in research and related projects in ways which maximise their synergy and tackle their tension.
Paper long abstract
We report on an integrated framework for embedding principles of sustainability, inclusion and co-production in research and related practices. Applicable to policy, practice and academic settings, the ACCESS Guiding Principles Framework fosters a culture of continuous reflection and evaluation, creating pathways for learning and change.
There are increasing pressures on researchers working in all settings to explicitly underpin their professional practice with fundamental principles relating to people and the environment. Some of these pressures are internal – driven by our own values. Others are external – expected by our institutions, funders or partners.
The ACCESS Network*, which foregrounds the critical importance of social sciences for addressing environmental challenges, set out in 2022 to underpin all of its work with three fundamental principles: environmental sustainability (ES), equality, diversity and inclusion (EDI) and knowledge co-production (KCP). However, the team quickly realised these principles bump into one another, moving dynamically between synergy and tension in different contexts, and requiring frequent deliberation. This was particularly evident because ACCESS has such a wide remit encompassing delivery of training and networking events, flexible fund management and policy-facing environmental social science research.
To address this, the separate principles were reimagined as an integrated framework within which users are encouraged to continuously reflect on how sustainability, inclusion and co-production intersect. In this approach, which recognises the dynamic and context-specific nature of value-driven research and related work, reflective prompts replace fixed rules as the key tools for practice. And, where possible, partners and participants are invited to join the conversation through informal or formal channels. This supports open, evidence-based and thoughtful decision-making, rendering any trade-offs, compromises, prioritisations or innovations amongst the principles conscious and visible.
An overarching evaluation of the Guiding Principles Framework, based on 26 interviews with users and workshops with 65 members of the wider ACCESS Network, has uncovered stories of unexpected synergies between principles, as well as sticky situations where principles have seemed impossible to reconcile. In both of these circumstances, users of the Framework highlight the value of continuous reflection and regular evaluation. These practices create spaces and evidence for transparent and shared deliberation with partners and participants about what works and what can be improved, paving the way for sustainable, inclusive and co-productive change.
Paper short abstract
What happens when organisations embrace a “learning first” approach in place of results-oriented programming? In this session, Westminster Foundation for Democracy (WFD) will share its “hypothesis testing” approach – a real-time learning framework ideally suited to working with uncertainty.
Paper long abstract
Digital technologies are reshaping how societies communicate, govern, and engage with power – but democratic actors are often playing catch-up. The risks are clear: unchecked misinformation, exclusion by design, manipulative AI use, and civic spaces under pressure. But the opportunities are real too: stronger political inclusion, more responsive institutions, and new tools for accountability.
But realising these benefits is hard – especially when evidence of “what works” might be hard to find. So how do we learn what actually works? And how do we do that before scaling up unproven ideas?
This session explores what happens when a democracy support organisation embraces a “learning first” approach in place of traditional results-oriented programming. Join Westminster Foundation for Democracy (WFD) to learn about hypothesis testing – a practical learning framework for generating real-time programme- and portfolio-level insights when working with high levels of uncertainty.
WFD’s “Democratic Resilience in a Digital World” programme was a one-year pilot programme designed not to deliver big results, but to generate lessons that WFD – and the wider democracy community – could use. It included:
1. Testing digital tools and interventions through pilot projects in Kenya, Bosnia & Herzegovina, and Sri Lanka;
2. Real-time learning through structured knowledge exchange and reflection between pilot projects and other organisational work on digital democracy;
3. Purposeful research to build WFD’s evidence base on promising digital approaches for democracy support.
We’ll share more about the key hypotheses and questions that underpinned the programme’s learning approach: Can digital tools support more inclusive governance? Can AI tools be used effectively to enhance public participation? What combination of human and AI inputs does it take to build a public interest Wiki on election candidates?
We’ll share how WFD’s hypothesis testing approach helped to surface honest answers to these questions in complex, fast-moving contexts – and what that means for others trying to deliver quality programming in the face of high levels of uncertainty. You’ll hear about what worked, what didn’t, and how intentional learning created space for more adaptive, resilient programming. We’ll also share details of how this approach helped to generate relevant portfolio-level learning to help inform the design of future programmes.
This session is especially relevant for:
• Programme managers and implementers seeking ways to generate practical and useful evidence of what’s working, what’s not, and why
• Grant managers seeking learning frameworks capable of delivering relevant programme- and portfolio-level insights
• Civic tech or democracy support innovators looking to better understand change
• Policymakers and donors looking for adaptive, evidence-driven approaches
• Researchers and technologists interested in co-creating with democracy actors
• Anyone seeking smarter, more humble ways to navigate digital transformation
Come ready to challenge assumptions, ask questions, and take away practical ideas to apply in your own work.
For more information, please see:
Alex Scales, Seyi Akiwowo, Adrienne Joy and Charlotte Egan, 2025. Using digital technology for democratic resilience, transformation and impact – learning paper. Westminster Foundation for Democracy. June 2025. Available online here: https://www.wfd.org/what-we-do/resources/using-digital-technology-democratic-resilience-transformation-and-impact
Paper short abstract
We piloted a collaborative qualitative approach to explore student experience of a new, complex master's unit in its first year of delivery, identifying strengths and areas for improvement. We supported students to contribute to data analysis, and they surfaced unexpected insights for the next iteration of the unit.
Paper long abstract
Introduction:
Fashion Practices for Social Change is an elective unit embedded within the Master's programme at London College of Fashion, UAL. Designed by Unit Leader, Dr Mazzarella, the unit delivery entails a combination of taught lectures, seminars and group project work responding to live creative briefs set by external partners. Students are asked to consider key principles and concepts relating to climate, racial, and social justice, and embed relevant practices into their work.
Evaluation approach:
In the academic year 2024-2025, we piloted a collaborative evaluation approach to explore how students experienced the unit during its first year of delivery. We drew on Ward et al.’s (2021) work on embedded research in health care settings to pilot a collaborative approach to embedding a culture of evaluation in universities delivering creative education. Our key aim in this evaluation was to identify strengths and weaknesses in the content and delivery of the unit to provide timely constructive feedback that would enable effective improvements in the next academic cycle and beyond. Through conversations between the Evaluation Lead (Dr Thompson) and the Unit Leader, we agreed on a qualitative approach that would sit alongside the delivery of the unit while causing minimal disruption. The core methodology combined observations with one-to-one semi-structured interviews with staff, students and external partners who were involved in setting live creative briefs.
Key adaptations:
The crucial adaptations within our approach concerned data analysis and feedback. We employed a small team of UAL students who had held student advocate roles relating to climate, racial and social justice to contribute to the analysis and interpretation of the data from the interviews. They did this through a guided series of thematic analysis meetings supported by the Evaluation Lead. This analysis was then curated onto a Miro board and fed back to the Unit Leader via a one-to-one meeting at the point at which he began planning for the next academic cycle, with later meetings scheduled with other members of the delivery team. The Unit Leader and Evaluation Lead agreed on an action plan for writing up the findings, in which the latter would write a first draft, and the former would then layer in his team’s reflections and correct key details around unit development and delivery through iterative discussions.
Key learnings:
• Student researchers identified several themes as being important to student experience that did not come to the immediate attention of the Evaluation Lead, enriching the analysis and feedback.
• The timing of feedback allowed the Unit Leader to have access to, and reflect on, the key messages in time to embed relevant changes into the curriculum for the planned unit delivery in the next academic year.
• This collaborative approach, that centred staff and student voice, was felt to be supportive and constructive.
Conclusion:
This unit and collaborative evaluation demonstrate how staff and student voice can be embedded in the curriculum and support student learning, through producing timely and constructive feedback resulting in effective iterative curriculum change.
Paper short abstract
Traditional dissemination often lacks accountability and follow-through. Evidence-to-Action workshops use co-production to create evidence-based actions to improve interventions. We use a grassroots sports case study to demonstrate how these workshops drive accountability in evidence-based policy.
Paper long abstract
Effective dissemination moves beyond typical one-way presentations and instead uses active engagement with stakeholders. Evidence-to-Action workshops aim to bridge the gap between evaluation findings and practical action. These workshops provide an opportunity to discuss key evaluation findings, explore their implications, and co-produce action-orientated recommendations with stakeholders to improve policy and practice.
Whilst traditional dissemination (a presentation followed by a Q&A) can raise awareness of evaluation evidence, it often fails to secure ownership or follow-through for improving policy and practice, with findings often 'sitting on the shelf'. Failing to change policy and practice using evidence does not achieve true value for money. Studies suggest limited effectiveness of passive methods on their own, and point to greater impact where activities involve two-way engagement. Active dissemination using co-production delivers practical benefits: for example, more efficient translation of evidence into actions, stronger stakeholder buy-in and accountability for using evidence, identification of context-specific adaptations, and an improved culture for making evidence-based decisions. Evidence-to-Action workshops increase the likelihood that findings will be operationalised rather than archived.
After an evidence-based summary of the evaluation, participants break into facilitated, mixed-stakeholder groups to test the implications of key findings for policy and delivery, identify barriers and enablers, and co-produce time-bound, evidence-based recommendations that stakeholders can integrate into decision-making. The outcome is a short implementation plan with assigned action leads and deadlines.
We will use DCMS' Multi-Sport Grassroots Facilities (MSGF) programme as a case study to demonstrate how Evidence-to-Action workshops can actively translate findings into policy and practice. The MSGF programme allocates funding for the improvement of multi-sport grassroots facilities across the four Home Nations. This aims to boost activity levels and sports participation amongst local communities. The programme focuses on delivering projects in areas where there are under-represented groups and higher levels of deprivation to ensure physical activity is accessible to all, regardless of background or location. We will discuss our lessons learned from delivering an Evidence-to-Action workshop with key programme stakeholders, highlighting how this has resulted in tangible improvements to programme delivery and offering reflections on how this approach could be applied elsewhere in DCMS and across government.
Evidence-to-Action workshops aim not merely to inform, but to catalyse change - turning evaluation findings into owned, implementable policy and practice.
Paper short abstract
This session shares learning from a developmental evaluation of the £2m Tech for Better Care Programme, showing how theory-based approaches and real-time evidence informed programme adaptation, strengthened local evaluation capacity, and advanced innovative evaluation practice.
Paper long abstract
This presentation will share learning on the role of a developmental evaluation approach in informing programmatic changes and decision-making within the Tech for Better Care Programme.
The Tech for Better Care Programme is a £2 million innovation programme exploring the potential for using digital technology to enable proactive and relational care at home and in the community. The programme adopted a ‘test and adapt’ approach, whereby funding was provided to teams to develop, test and pilot innovative approaches to tech-enabled service change between October 2023 and March 2026. This was an innovative programme design developed at the Health Foundation, which positioned evidence-based iteration at the core of its way of working.
During the programme, a process and impact evaluation was undertaken to capture the programme process and experience, as well as the impact of the local interventions implemented. Specifically, a developmental evaluation was chosen to enable iterative development of the funded programme and local interventions in real-time using evaluation evidence and learning. This was underpinned by the Contribution Analysis theory-based evaluation methodology, which focused on testing the validity of and strength of support for eight core programme hypotheses in the Theory of Change. Data triangulation was also a characteristic of the evaluation methodology, with local project impact and learning data (e.g., on user experience and outcomes) combined with workshop and interview data collected at the programme level to generate findings. Evaluation activities also involved working closely with local implementing teams who were conducting local evaluations to feed into decision-making at the intervention and programme level. Thus, the evaluation also sought to directly encourage the development of evaluation practice and evidence use in local teams.
Our presentation will begin with a concise outline of the Tech for Better Care Programme, including its Theory of Change and evaluation approach. Thereafter, we will focus on the key learning obtained by the programme funder, evaluation team and local teams on the programme. This will allow attendees to learn about:
- How to apply theory-based evaluation approaches to support iterative developmental programmes;
- The role of different programme actors in effectively using evaluation to bring about programme change;
- The opportunities and challenges inherent in an iterative developmental programme;
- Practical tips for effectively embedding evaluation at different levels of decision-making.
Therefore, the session offers broad appeal to the evaluation community, but most notably to those interested in developmental evaluation, contribution analysis, and the use of evaluation and evidence in the development of digital intervention in healthcare.
Paper short abstract
How do teams become places where evidence is valued and used? Drawing on Pause & Reflect practice across 20+ humanitarian and development programs, this session shares practical lessons on the behaviours, conditions, and relationships that build real evaluation cultures.
Paper long abstract
What truly enables teams to value and use evidence in their everyday decisions? In fast-paced humanitarian and development settings, MEL systems often generate data, yet the cultural, relational, and organisational conditions required for evidence use are far less understood. This session offers practice-based insights from Mercy Corps’ experience designing and facilitating structured Pause & Reflect processes across more than twenty programs in Africa, the Middle East, and Asia—spanning emergency food security, cash assistance, protection, resilience, and market systems development.
Rather than framing evidence use as a technical gap, this work positions it as a cultural one. Through cross-functional reflection sessions—supported by learning questions, participatory dialogue, consolidated data sets, and SOAR analysis—teams begin to establish the norms, habits, and relationships that allow evidence to inform everyday decisions. While the USAID-funded Pause & Reflect toolkit provides a helpful structure, this session focuses on what enables the approach to work rather than on the tool itself.
Three insights consistently emerge across humanitarian and development programmes.
First, evidence is used when teams have protected spaces for sensemaking. Staff in emergency responses often move from one urgent priority to the next, with little room to interpret data collectively. When teams pause—away from immediate delivery pressures—they can identify trends, challenge assumptions, reflect on participant feedback, and generate shared interpretations. This strengthens both learning and decision ownership.
Second, evidence use increases when power dynamics are intentionally disrupted. In many teams, hierarchical routines shape whose interpretation is accepted and whose evidence counts. Creating inclusive, participatory spaces where diverse staff voices, local partners, and community insights are elevated proved essential. This redistribution of interpretive authority strengthens localisation and builds environments where evidence is collectively valued.
Third, evidence becomes actionable when learning is tied to feasible adaptation. Teams engaged more deeply with evidence when reflections led to clear next steps—adjusting transfer values, refining accountability mechanisms, improving market monitoring tools, or strengthening targeting approaches. Learning that remains abstract rarely shifts behaviour; learning that leads to adaptation does.
The session will also highlight challenges: building psychological safety in politically sensitive environments, addressing imperfect or fragmented data, sustaining learning amidst staff turnover, and balancing structured reflection with delivery demands. Examples will illustrate how similar enabling conditions—shared purpose, inclusive dialogue, and structured reflection—support evidence use across both humanitarian and development contexts.
Participants will leave with a nuanced understanding of what helps create environments where evidence is genuinely valued: collective reflection rituals, inclusive sensemaking, reduced hierarchy in evidence interpretation, and practical links to action. This session offers evaluators, practitioners, and programme leaders insights for embedding evaluation into everyday work, regardless of context.
Paper short abstract
We will introduce two digital tools recently developed by the Centre for Transforming Access and Outcomes in Higher Education (TASO) and share how the Theory of Change Builder and the Higher Education Evaluation Library support higher education institutions to embed evaluation.
Paper long abstract
In the UK, inequalities persist between who accesses, succeeds at, and successfully progresses from higher education. Higher education institutions run a variety of interventions aimed at addressing these inequalities, often targeted at people from socioeconomic backgrounds that are underrepresented in higher education. These interventions range from information sessions on applying to university, to wellbeing and academic skills support provision once at university, to career guidance supporting the progression from university into employment or further study. As a government What Works Centre, our role at the Centre for Transforming Access and Outcomes in Higher Education (TASO) is to support efforts to evaluate the impact and implementation of these interventions. We do this by commissioning evaluations and supporting the higher education sector to run their own evaluations, in line with requirements set by the higher education regulator.
A key driver of higher education institutions evaluating their own interventions is to embed a culture of evaluation across teams and departments to ensure that the effectiveness of all interventions is assessed and this evidence is used to inform practice. Similarly, across the higher education sector, we encourage sharing evaluation findings to collectively develop a better understanding of what works to address inequalities in higher education. However, in practice, the individuals tasked with evaluation often lack adequate resources to evaluate interventions and make evidence-informed decisions.
In response to these challenges, TASO has developed two freely available digital tools. These tools support two key points in the evaluation process: developing a theory of change (ToC) and disseminating evaluation findings.
The ToC Builder is an online tool that walks the user through creating a ToC for their intervention. It supports those with little prior evaluation experience by including guidance and examples from the higher education sector at each step of the way. The tool produces a ready-to-export ToC in diagram and narrative format, compliant with accessibility standards.
The Higher Education Evaluation Library (HEEL, launching in spring 2026) is a freely accessible, searchable database of evaluations focused on interventions supporting access, success and progression in higher education. The HEEL will enable knowledge exchange, foster collaboration, and support the dissemination of evaluation evidence on what works to reduce inequalities in higher education. It will also help identify trends and gaps in evaluation practice across the sector.
Both digital tools build on existing TASO resources, including a framework for coding interventions, and make planning evaluations and reporting findings more interactive and accessible to non-specialists. In this session, we will introduce the ToC Builder and the HEEL, including how we developed them with input from prospective users, and how they embed evaluation in higher education providers. We will explore both the practical and ideological aspects of developing digital tools for evaluation, and how digital tools can be used to expand evaluation capacity and impact in response to sector needs.
Paper short abstract
‘Problems’ do not exist independently of our knowledge of them, but instead take shape through our efforts to study and evaluate them. This poster presents findings from a qualitative study that used visual metaphor to explore the implications of this epistemic insight for evaluation practice.
Paper long abstract
Despite significant research, intervention, and evaluation to inform policy and practice responses to health inequalities, improvements have been slow to follow, with some metrics even suggesting that health inequalities are widening. While the reasons for this are multiple and complex, there is increasing recognition that the ways in which inequalities get framed for action might be contributing to the challenge.
Scholarly insights from fields such as the sociology of social problems, and from novel Foucauldian-inspired approaches to policy and discourse analysis, have demonstrated the importance of attending to the forces that give shape to complex problems in policy and practice, and how these forces can open up or close down the scope of possibility for action.
Inspired by these insights and their practical application in the fields of health, early childhood, and youth justice, I created a resource of visual metaphors that is designed to assist practitioners in undertaking this form of critical analysis, and to reflect on how dominant approaches to evaluation may inadvertently lead to inequalities being framed in narrow and limiting ways. I undertook extensive engagement and data collection with people working to implement and evaluate action on inequalities across the health system in England to explore their perspectives on the role and value of this kind of creative resource in their work.
On the whole, health system actors were positive about the visual resource and appreciated the ways in which it distilled complex and often abstract or theoretical ideas into a digestible narrative with supporting imagery in the form of visual metaphors. They offered constructive critique on aspects of the resource that could be further developed and clarified. However, they also expressed a degree of pessimism about the extent to which institutions can be reshaped and felt additional tools and resources would be required to help operationalise the insights presented. While the resource does offer a useful tool for collective reflection and dialogue, more prescriptive guidance is needed on the ‘how’ of realising the deep institutional change required to engage with and value alternative approaches to evaluation.
Paper short abstract
Most evaluations end with a report. The best evaluations end with action. Understanding Attendance worked with 400 schools and 300,000 pupils to discover what bridges that gap. This session shares four transferable strategies for designing evaluations where action is built in from the start.
Paper long abstract
Most evaluations end with a report. The best evaluations end with action. This presentation explores what sits in between—the deliberate infrastructure needed to turn findings into change at scale.
When almost 20% of pupils in England are persistently absent, schools desperately need insights they can act on—not just more data to monitor. This presentation shares lessons from Understanding Attendance, a national action research project led by ImpactEd Evaluation spanning over 400 schools, 10,000 parents, 300,000 pupils and three academic years, demonstrating how evaluation can be designed for action from the outset.
Traditional attendance evaluation tracks and compares absence rates between schools or pupil groups. Understanding Attendance took a different approach: what if we evaluated social, emotional and behavioural factors schools can actually influence? Utilising existing validated measures, our own data and learning, and working with school leaders, we designed a diagnostic exploring sense of belonging, relationships, attitudes towards attendance, and practicalities such as routine and sleep, alongside attendance data. The question wasn't just "who's absent?" but "what's driving absence for pupils in your specific context, and what can you do about it?"
This presentation shares four critical innovations that helped bridge evaluation into action, with practical implications for evaluators across sectors.
First: Make benchmarking meaningful. Our initial national benchmarks seemed helpful, but context-sensitive comparison—by time of year, pupil characteristics, and attendance distribution—dramatically increased actionability. Evaluators will explore how granular benchmarking makes comparative data genuinely relevant; a minimal illustrative sketch follows the fourth point below.
Second: Align timings of findings and decision-making. Rather than end-of-year reports, we built iterative data windows aligned with schools' natural planning cycles with automated reporting enabling quick turnaround. Autumn insights inform spring interventions; summer data shapes next year's strategy. Building stakeholder decision cycles into evaluation design from the start increases genuine use of findings.
Third: Create spaces for peer learning, not just individual reports. Our work at Trust level, as well as half-termly community webinars and research insight sessions, brings schools together to explore emerging findings, hear from sector speakers, and discuss challenges with peers. When one school shares how they're building belonging, fifty others gain practical ideas. Evaluators can play a convening role, not just a reporting role—creating communities where stakeholders learn together rather than reading alone.
Fourth: Differentiate insights for different users. Senior leaders need strategic overview; attendance leads need diagnostic detail; SENDCos and PP-leads need subgroup-specific benchmarking; classroom teachers need pupil-level insights. Hear how we created layered reporting for multiple audiences from a single dataset, ensuring findings reach beyond the commissioning stakeholder.
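Picking up the first point, the following is a minimal sketch of what context-sensitive benchmarking can look like in code. It is illustrative only, not ImpactEd's actual tooling: the column names, the grouping dimensions (term and year group), and the 1-5 belonging scale are assumptions made for the example.

```python
import pandas as pd

# Hypothetical pupil survey data, one row per response; all names and
# values are invented for illustration.
df = pd.DataFrame({
    "school":     ["A", "A", "B", "B", "C", "C"],
    "term":       ["autumn", "spring", "autumn", "spring", "autumn", "spring"],
    "year_group": [7, 9, 7, 9, 7, 9],
    "belonging":  [3.8, 3.1, 4.0, 3.4, 3.5, 2.9],  # assumed 1-5 survey scale
})

# A single national average hides context, so compute one benchmark per
# context cell and compare each school with like-for-like peers.
benchmark = (
    df.groupby(["term", "year_group"])["belonging"]
      .mean()
      .rename("benchmark_belonging")
      .reset_index()
)

# Attach the matching benchmark to every response and report the gap.
scored = df.merge(benchmark, on=["term", "year_group"])
scored["gap"] = scored["belonging"] - scored["benchmark_belonging"]
print(scored[["school", "term", "year_group", "belonging", "gap"]])
```

The same pattern extends to any further dimension (for example, attendance band), at the cost of needing enough responses per cell for each benchmark to stay stable.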
The presentation also addresses important tensions: balancing rigorous methodology with accessible reporting for non-technical audiences; supporting individual schools while maintaining research integrity across the cohort; and sustaining engagement when findings reveal uncomfortable truths about systemic barriers schools cannot easily address.
Attendees will leave with practical strategies for designing evaluations that bridge insight to action, whether working with schools, charities, government, or other complex environments. The presentation draws on case studies showing successes (interventions directly shaped by diagnostic findings) and honest reflections on where the evaluation-to-action bridge still needs strengthening.
Paper short abstract
A visual approach to evaluating England’s marine plans, using radial diagrams built from evaluation triangles to communicate progress and interactions across policy areas, making marine plan monitoring and evaluation more accessible, engaging, and actionable for planners and policymakers.
Paper long abstract
Marine plans are a central part of how England manages the sustainable use of its seas, balancing environmental, economic and social priorities across different marine sectors. Each plan contains a set of policies and objectives designed to guide decision-making and deliver multiple outcomes, from supporting blue growth to protecting marine ecosystems and enhancing community wellbeing. To assess their effectiveness, the Marine Management Organisation (MMO) monitors data on policy implementation and environmental and socio-economic indicators. However, this data is not yet systematically evaluated or presented in an accessible way, limiting understanding of whether plan policies are achieving their intended objectives, and of how progress in one policy area may influence outcomes in others.
To address this challenge, ICF developed a contribution analysis framework for marine plans through an MMO-commissioned project, providing a structured way to assess how plan policies contribute to outcomes across the complex marine and coastal system. Building on this, a joint ICF/MMO CECAN Fellowship research project has been exploring how to organise marine plan monitoring data to enable more systems-based evaluation and more effective communication of findings. Central to this work is the development of a visual approach that helps represent the progress of policies and objectives in a clear, engaging, and holistic way.
Inspired by established visual frameworks such as Planetary Boundaries (Rockström et al., 2009) and Doughnut Economics (Raworth, 2018), radial diagrams built from evaluation triangles show how each marine plan policy is performing relative to its intended outcomes and acceptable system limits. This visualisation makes it easier for planners, policymakers, and stakeholders to understand how progress is distributed across different outcomes, where synergies or trade-offs may exist, and which areas may require adaptive management.
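To make the general idea concrete, here is a minimal Python sketch of a radial progress diagram. It is not the ICF/MMO tool itself: the policy areas and scores are invented, simple polar bar segments stand in for the evaluation triangles, and a dashed outer ring marks an assumed acceptable system limit.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical policy areas and progress scores (fraction of the intended
# outcome achieved); values are invented for illustration.
policies = ["Fishing", "Energy", "Heritage", "Recreation", "Ports", "Ecology"]
scores = np.array([0.8, 0.55, 0.35, 0.6, 0.45, 0.7])

angles = np.linspace(0, 2 * np.pi, len(policies), endpoint=False)
width = 2 * np.pi / len(policies) * 0.9  # slim gaps between segments

ax = plt.subplot(projection="polar")
ax.bar(angles, scores, width=width, alpha=0.6)

# Dashed outer ring standing in for the acceptable system limit.
theta = np.linspace(0, 2 * np.pi, 200)
ax.plot(theta, np.ones_like(theta), "r--", label="acceptable system limit")

ax.set_xticks(angles)
ax.set_xticklabels(policies)
ax.set_yticks([])  # hide radial ticks for a cleaner, at-a-glance read
ax.set_ylim(0, 1.2)
ax.legend(loc="lower right")
ax.set_title("Illustrative radial progress diagram")
plt.show()
```

Read this way, segments falling well inside the dashed ring flag policy areas that may need adaptive management, mirroring the at-a-glance logic of the Planetary Boundaries and Doughnut visuals.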
This approach directly supports the conference theme “Communicating Evaluation for Action” by translating complex evaluation findings into intuitive visual narratives that promote shared understanding and dialogue. The evaluation triangles and radial diagrams are scalable, meaning they can be applied at the level of individual policies, plan objectives, or even across multiple marine plans. This scalability enhances the accessibility of evaluation findings, making contribution analysis more transparent, easier to interpret, and more actionable for policymakers and delivery partners.
In the joint presentation, delivered by Dr Rachel Holtby (ICF) and Victor Owoyomi (MMO), we will share: the methodological steps for linking monitoring data to the visual triangles; examples of how visual tools help clarify progress and interdependencies across marine policy areas; reflections on how the approach facilitates faster analysis and more engaging communication; and insights on how this technique could be adapted to other policy domains facing similar systems challenges.
Ultimately, this work aims to stimulate discussion around how evaluators can use visual tools to communicate complexity effectively, promote adaptive learning, and strengthen collaboration between evaluators, analysts, and decision-makers. By demonstrating the potential of evaluation triangles, we invite participants to consider how similar techniques might enhance evaluation reporting and action across diverse policy areas.
Paper short abstract
This study outlines an innovative qualitative approach using 'Stories of Change' to evaluate the learning of place partnerships in implementing a whole systems approach.
Paper long abstract
Over the past four years, the National Evaluation and Learning Partnership has supported place partnerships across England to evaluate their whole systems approach to tackling physical inactivity. This process often involves a change in leadership strategy, grappling with complexity, and advocating for how and why changes occur, which can encourage organisational change. Place Partnership Leaders (PPLs) must often navigate personal and organisational barriers, such as staff adapting to new responsibilities and willingness to change. This study explores what and how PPLs learn about evaluating their work in a whole systems environment, using hindsight to write a Story of Change as a ‘Letter To Your Younger Self’. We invited 10 PPLs to write their story, providing guiding prompts such as “How did you promote the new vision?” and “How did you try to motivate partners (staff, partners, and residents) to adopt or react to the new approach?” These letters were then followed up with semi-structured interviews. We analysed the data using a dialogical narrative analysis informed by realist theories of change and underpinned by transformational and servant leadership concepts. Our results identify composite theory-driven narrative profiles (i.e., how and why leaders made decisions, and what approach worked and why) characterised by applied suggestions. These profiles could be used as a reflective learning tool for emergent PPLs seeking to implement a whole systems approach effectively.
Paper short abstract
The session shows how an internal evaluation model delivers timely and iterative insights to improve an SEL-based sports program for children, using collaborative sensemaking and tailored communication. It highlights ways to reach diverse audiences and make findings truly actionable.
Paper long abstract
At Fundación Luksic, the Evaluation Department has developed an internal evaluation approach inspired by utilisation-focused principles, aiming to produce insights that are practical, timely, and genuinely useful for program improvement. Our evaluation cycle includes design, implementation, results, and—when feasible—impact assessments, each offering evidence at key stages of a program’s development. These evaluations are conducted collaboratively with the Foundation’s implementation teams, fostering shared learning and strengthening the translation of evidence into concrete improvements.
We draw on the case of a Socio-Emotional Learning (SEL)-based sports program that has been running for the past two years to illustrate how an internal evaluation model can generate early and iterative insights, enabling informed decision-making.
The program seeks to nurture social and emotional skills in children aged 6 to 13 through formative after-school sports workshops inspired by the SEL framework, complemented by a positive parenting program for caregivers.
Since the program’s inception, the Evaluation Department has supported implementation through a series of implementation and results evaluations. During its first two years, these assessments informed several important design and delivery decisions. Data collection combined different methods—interviews, direct observation, surveys, and analysis of administrative records—applied to different actors, including sport instructors, caregivers, children, and program staff. We have been particularly challenged to innovate with participatory methods adapted for children.
Evaluating the sports program has also offered opportunities to refine how we communicate findings and adapt them to different audiences. Internally, results are shared through workshop-style sessions that encourage implementers to reflect on the evidence, discuss its implications, and agree on the most relevant and feasible areas for improvement. Externally, insights from the evaluation—particularly those gathered from work with participating children in 2024—were presented to other NGOs working in child development and sports as part of the 2025 Evaluation Week organized by the Global Evaluation Initiative. This experience broadened our reach and highlighted the value of clear reporting, visual storytelling, and strategic framing in making evaluation findings accessible and actionable.
Paper short abstract
We demonstrate how ripple effects mapping, as a participatory approach to evaluation, delivers rich and insightful data about a local authority research training programme. Findings reveal intended and unintended impacts on individual skills and organisational research activity.
Paper long abstract
Background
Local authorities regularly make decisions that impact the determinants of health and, subsequently, health inequalities. To ensure that policy and service decisions are optimal, it is important that local authorities access and use the most contemporary and wide-ranging evidence. In response, Blackpool Researching Together co-designed and delivered a research training programme to help build local authority and voluntary sector staff capabilities to use and conduct research. The aim of the current study was to evaluate the intended and unintended outcomes of the research training programme.
Methods
We used ripple effects mapping (REM) as a participatory qualitative approach to evaluation. This interactive approach encouraged previous research training participants and facilitators to reflect on how the programme had influenced individual skill development and contributed to changes in research activities and outputs in their respective organisations. All participants and programme facilitators from cohorts 1 and 2 were invited to map outcomes. Two workshops were held in December 2025, capturing 12 months of post-programme outcomes for cohort 1 and six months for cohort 2; the two cohorts' workshops were held separately. Small-group guided discussion and flip charts were used to capture participant insights. Discussions focused on links between actions and impacts, the most and least significant outcomes, and who was impacted. Workshops focused on sense-making, gathering both individual and collaborative insights, and resulted in two co-created visual maps. The maps depict a timeline capturing all of the outcomes reported by participants.
Results
Although formal results are not yet available, informal anecdotes gathered prior to the study suggested that the research training programme was successful in improving participants’ confidence in research and had positive impacts on career trajectories. At the conference, we will present the results of the REM workshops, including a summary of the intended and unintended outcomes of the programme, and the co-produced visual maps. We will discuss the evaluation’s implications for practice, turning evidence into action.
Conclusion
This presentation will demonstrate how participatory and reflective approaches to evaluation, such as ripple effects mapping, can deliver rich and insightful data that can be translated into practice. We will show how participatory methods capture nuanced impact that may otherwise be missed using traditional methods. We will also highlight how REM workshops help to break down traditional power dynamics in research.
Paper short abstract
Robustly assessing how government funding achieves policy goals remains a challenge. This paper synthesises our experience of using Contribution Analysis (CA) in energy and environment evaluations. Drawing on practitioner insights, we explore the challenges and lessons learned through using CA.
Paper long abstract
This paper synthesises learnings from recent evaluations in the energy space that have applied Contribution Analysis (CA) to assess the contribution of UK government interventions towards net-zero policy goals across projects, programmes, and market mechanisms. For example, this includes evaluations of innovation funding schemes (such as Heat Pump Ready), retrofit funding schemes (such as the Social Housing Decarbonisation Fund), and market interventions (such as the Capacity Market). Drawing on practical experience in delivering evaluations with a public policy consultancy, we explore how Contribution Analysis, often combined with Process Tracing (PT) and/or informed by the work of Delahais and Toulemonde, has been delivered in live evaluation contexts.
Our synthesis highlights methodological challenges encountered when applying CA in live policy environments. Specifically, we critically assess the use of CA as a tool to assess the contribution of interventions when evaluation projects specify the use of CA but it is not wholly fit for purpose: 1) complex and interlinked multi-programme evaluations, 2) evaluations with poor-quality data sources, 3) evaluations using CA in parallel with programme delivery, where there is insufficient time for contribution to become clear, and 4) smaller-scale evaluations without the resources to collect sufficient data to exploit CA’s evaluative power.
We examine how we as evaluators have navigated these challenges to produce credible conclusions on contribution stories that have informed decision-making, and how this has refined our approach to theory-based evaluation to ensure impact for policymakers. The analysis identifies lessons from our experience on how best to utilise Contribution Analysis, Process Tracing and the work of Delahais and Toulemonde to develop and test compelling theory-based contribution narratives: namely, whether programme objectives are met, whether market mechanisms adequately incentivise participants, and whether project teams successfully deliver innovation through their grant-funded projects.
By combining the power of Contribution Analysis, Process Tracing, and the approach of Delahais and Toulemonde, our frameworks have evolved to allow us to deliver two elements in parallel: robust assessments of whether an intervention is necessary and/or sufficient to achieve its objectives, and clear commentary on the strength of the evidence used to form those judgements.
Beyond our methodological insights, this paper also demonstrates how CA can bridge the gap between evidence and action in the energy sector by clarifying the role of government funding in achieving policy outcomes. We identify where CA has, and has not, effectively leveraged insights for future policy design and the reasons for these successes or failures.
We argue that CA’s structured approach to causal inference strengthens accountability. By linking the use of CA with PT and elements of the work of Delahais and Toulemonde, programmes can be robustly assessed against their original intention, rather than their actual out-turn.
By sharing practitioner perspectives and cross-case lessons centred on the energy sector, this session will contribute to the conference themes of influencing policy and programme change, building evaluation cultures, and communicating evaluation for action.
Paper short abstract
This session explores how Participatory Video Most Significant Change (PVMSC) builds evaluation cultures by co-producing evidence with adolescents, embedding reflection and learning in decision-making, and using ethical, user-led storytelling to value diverse voices.
Paper long abstract
Healthy Cities for Adolescents (HCA) is Fondation Botnar’s flagship initiative, managed by Ecorys, to create cities that are fit for adolescents. Now in its second phase (2022–2026), HCA operates in six countries, supporting projects that address adolescent health and wellbeing in diverse urban contexts.
This session introduces an innovative, participatory, and user-centred method used in HCA to enhance learning and reflection. Drawing on implementation experience, we will demonstrate the transformative potential of this approach, share a practical example, and reflect on lessons learned from both evaluator and implementor (InsightShare) perspectives on fostering an evaluation culture.
Aligned with Theme 2, the session will present PVMSC, a method combining Participatory Video (PV) with the story-based Most Significant Change (MSC) technique. Grounded in equitable evaluation and participatory action research, this approach enables adolescents to film, edit, and share their own stories of change, taking the lead in identifying and analysing what matters most to them. Through visual storytelling, they generate evidence for local action, learning, and advocacy.
Our experience with PVMSC illustrates its value, relevance and feasibility in complex programmes.
1. Ethical storytelling: PVMSC shifts away from extractive practices in which external actors (often in the Global North) interpret service user data. Instead, it co-produces evidence that centres adolescents in MEL, redefining what counts as evidence from externally set indicators to locally defined significance.
2. Adolescent agency: It exemplifies evaluation as empowerment, with adolescents acting as co-evaluators and co-communicators, generating evidence in their own words through creative expression.
3. Capacity strengthening: The process builds lasting skills among grantee organisations and young people in digital media, storytelling, participatory research, MEL, civic engagement, and facilitation.
4. Local ownership: Participatory analysis, collaborative reflection, and community film screenings become platforms for local sense-making, dialogue, and advocacy that inform ongoing learning and policy action.
The session will showcase examples of PVMSC in action and share lessons on creating environments where evidence is valued and used for local action beyond traditional reporting and accountability.
Please note, we are aware of Social Development Direct’s abstract submission on YET, and confirm that this is a different approach and that there is no overlap between the sessions’ content.
Paper short abstract
In recent years, evaluation quality has improved, but evaluation directors still lament the limited use of evaluation results. This presentation considers one of the main reasons this gap persists: our collective failure to read the organisational environment and influence key decision makers.
Paper long abstract
We don't know how much taxpayer funding goes to the evaluation of international development and humanitarian programmes, but it is comfortably over US$100 million a year. Some of this investment is wasted, but we don't know how much: 20%? 30%? In any scenario, it’s millions of USD/GBP.
We’ve been asking for a long time why some evaluations catalyse change while others gather dust on the shelf. In response, in the last 25 years we have made real progress in the professionalisation of evaluation including the development of norms, standards, methods, ethics and in the communication of results. We’ve become better at engaging stakeholders before we complete evaluation reports.
Overall, evaluation quality has gone up. Why, then, do evaluation directors still lament the limited uptake of evaluation results by their organisations and the managers most concerned? It appears that the gap between technically sound evaluation and genuine uptake of evaluation results remains frustratingly wide.
In this brief presentation, we will talk about one of the main reasons this gap persists: our collective failure to properly read the political and organisational environment around each evaluation and to influence the key individuals who decide if and how evaluation results are used.
Certainly, this requires skill, and because we cannot control the outcome of these interactions, there are no guarantees of success. However, we can learn from experience.
In this session, we will cover a few key points concerning how to influence, using practical examples:
• Why high quality in evaluation removes reasons for not using evaluations but, by itself, cannot drive utilisation.
• The importance of shaping organisational connections and fitting evaluations into organisational decision-making
• Understanding managers’ perspectives: how evaluations interact with clients’ decision-making in light of the incentives, opportunities and risks they see in evaluation
• Building trust and credibility in the evaluation with clients through listening, impartiality and competence
• Allowing stakeholders to debate results and agree actions in the evaluation while maintaining the integrity of the evaluation process.
• Closing out the evaluation; managing the critical, high-risk transition from evaluation completion to organisational action
Some evaluation directors still think that influencing is not the business of evaluators and evaluation managers: ‘Deliver good evaluations and leave the rest to management’, they say. This presentation will discuss why this is a mistake and why learning how to influence the environment around any evaluation is critical to delivering on the conference theme of ‘Bridging the Gap: Evaluation into Action’.
Paper short abstract
We present two main EEF approaches to longitudinal analysis: (1) routine tracking via the EEF Archive, and (2) pre-specified longitudinal analysis built into evaluation design when delayed or sustained impact is plausible. We also discuss and compare the advantages and limitations of the two approaches.
Paper long abstract
Longitudinal analysis is a key component of the Education Endowment Foundation’s (EEF) mission to understand whether the effects of educational interventions persist, diminish, or emerge over time. By providing evidence on the durability of outcomes beyond the initial evaluation period, longitudinal analysis informs educational practice, supports evidence-based improvements, and guides decisions about regranting funding to programmes.
This presentation aims to introduce the two main approaches EEF uses for longitudinal analysis. The first is routine tracking through the EEF Archive, which leverages National Pupil Database data and analysis by Durham University. The second is pre-specified longitudinal analysis built into evaluation design when delayed or sustained impact is plausible. We also aim to compare the advantages and limitations of these approaches, considering factors such as cost, data completeness, theoretical interpretation, and the burden on schools. A further objective is to demonstrate how learning from challenges can refine educational approaches and how EEF’s evidence can help policymakers and funders decide which programmes to scale, adapt, or regrant. EEF is currently revising its approach to longitudinal analysis to consider how the analysis can best support its overall mission. Flexible models are being considered, including longitudinal analysis of all archived projects, incorporating longitudinal outcomes into original trial designs, or focusing on programmes in the funding pipeline, to maximise the practical value and impact of longitudinal evidence for education. By connecting these insights to the conference theme, we aim to show how evaluation can drive real-world change and improve outcomes for learners and the education sector.
Attendees will gain practical insights into two viable approaches to longitudinal analysis that are highly relevant to the educational sector. They will learn how EEF has applied these methods to strengthen its strategy and mission, ensuring evidence-based improvements and sustained impact. The session will provide an opportunity to explore the EEF's Archive and understand EEF’s longitudinal research methodology in depth. Attendees from evaluation teams in the charity sector, other What Works Centres, and consultancy evaluation teams will leave with a clearer understanding of how longitudinal analysis can inform decision-making and enhance evaluation practice. There will also be time for questions and discussion to support knowledge sharing and application.
EEF uses longitudinal analysis to identify which programmes are worth scaling, adapting, or regranting. This ensures that the education sector can make more evidence-informed decisions when investing time and resources in interventions that are more likely to deliver a positive, lasting impact.
EEF’s mission is to break the link between family income and educational achievement by supporting the education sector to transform outcomes for socio-economically disadvantaged children and young people. To achieve this, EEF, through its wide-ranging work and longitudinal follow-up research, enables practitioners to focus on evidence and what works best in practice, ensuring that early years providers, schools, and colleges have access to accurate, accessible, and actionable evidence to improve teaching and learning.
Paper short abstract
This session explores Uganda's 20-year journey to build a government monitoring and evaluation system. We analyse the fragile interplay of politics, administration, and donor influence, share critical lessons, and offer insights for any country navigating the "unsteady pulse" of evidence.
Paper long abstract
Building a sustainable national M&E system is a complex, political endeavour, not merely a technical exercise. This session revisits and updates a seminal 2016 study (DOI:10.1057/9781137376374_10) on the supply and demand for evaluation in Uganda's public sector. We trace the two-decade arc of this effort, from the ambitious creation of a cross-Government results system (GAPR) and a Government Evaluation Facility (GEF), through their challenges and questions of sustainability, set against the backdrop of National Development Plans and shifting political and donor dynamics.
The session is structured to move from analysis to actionable insights. We begin by framing the issue using the supply-demand framework, explaining how the equilibrium between the production of evidence and the political will to use it has shifted over time. Co-presenters will then provide an updated analysis, detailing "what happened next" and describing the "new equilibrium" shaped by non-state actors and conditional political demand.
The core of the session lies in critical reflections and a facilitated discussion. We will distil hard-won lessons about the fragility of institutionalisation and the double-edged sword of donor support. We then engage the audience with provocative questions on key dilemmas: supplying evidence in shrinking political spaces, achieving genuine government ownership, marketing evidence effectively, and re-imagining the future of government-led evaluation facilities. This session is essential for evaluators, policymakers, and M&E champions committed to making evidence matter in the real world.
Paper short abstract
Meet-the-Author session on Ukraine’s emerging national evaluation system, showing how evaluation architectures, capacities and incentives shape real policy and programme decisions in fragile, crisis-affected contexts, and what this means for influencing change elsewhere.
Paper long abstract
This Meet the Author session will explore how a national evaluation system can support – and sometimes struggle to support – evidence-informed policy and programme change in a highly fragile, rapidly evolving context. Drawing on the discussion paper Strengthening Evidence-Based Decision Making in Fragile and Conflict-Affected States: Insights from Ukraine’s National Evaluation System (DEval Discussion Paper 3/2025), the session uses Ukraine as a case study to examine how evaluation infrastructure, incentives and capacities shape real-world decisions on recovery, reconstruction and EU accession.
The paper finds that monitoring and evaluation of public policy in Ukraine is widely referenced in regulations and strategies, yet actual evaluation practice remains patchy and weakly institutionalised. Fragmented responsibilities, limited political demand for evaluation, and the absence of a coherent legal framework constrain the systematic use of evidence. At the same time, war-related recovery and reconstruction, and the EU accession process, have created powerful external pressures and opportunities to build stronger evaluation systems. Civil society organisations, evaluation associations and international partners are emerging as important actors, piloting practices and norms that can influence how public institutions generate and use evaluative knowledge.
In the session, the author will briefly present the paper’s conceptual framing and qualitative case study methodology, and then focus on the entry points it identifies for strengthening policy and programme change through evaluation. These include: clarifying mandates and co-ordination structures for government-led evaluation; using EU accession and reconstruction funding as levers for institutionalising evaluation; investing in evaluation capacity development across state and non-state actors; and fostering a culture that values learning alongside accountability. The session will highlight the tensions between urgent decision-making in crisis and the longer-term, systemic work of building a national evaluation system.
A facilitated discussion will invite participants to interrogate the transferability of these insights beyond Ukraine: How can evaluators and policymakers in other fragile or politically contested settings use windows of opportunity (e.g. reform processes, external funding, crises) to embed evaluation more deeply? What kinds of adaptive approaches and partnerships help ensure that evaluation findings travel into policy and programme decisions, rather than remaining at the margins?
The session will be of interest to evaluators, commissioners, policymakers and funders concerned with Theme 1 of the conference. Participants will leave with a richer understanding of how national evaluation architectures interact with politics, conflict and reform – and with practical ideas for leveraging evaluation to influence policy and programme change in their own contexts.
Paper short abstract
Through strategic design, the interim evaluation of DSIT's 5G Innovation Regions programme provided actionable insights for ongoing adaptive design. The evaluation shows how theory-based methods can be used to drive programme improvement, inform policy, and accelerate tech adoption.
Paper long abstract
Too often, evaluation insights arrive too late in programme delivery to meaningfully influence performance and subsequent impact. This presentation addresses this critical challenge by sharing learnings from KPMG’s interim process and impact evaluations of the Department for Science, Innovation and Technology’s (DSIT) 5G Innovation Regions (5GIR) programme. This programme aims to support places across the UK in adopting advanced wireless technologies, accelerating commercial investment in 5G, and fostering the 5G ecosystem, ultimately driving economic growth.
The core of this presentation focuses on the importance of strategic timing of evaluation components and effective application of theory-based evaluation methods. We will illustrate how these methods and approaches were used to identify what has and hasn’t worked well, and the specific actions needed to either enhance or secure intended outcomes. Crucially, we will also detail how these findings were effectively communicated to relevant stakeholders in a timely way to enable tangible change within the programme.
Through this, the presentation offers tangible examples of how evaluations can be designed to provide timely and actionable insights that actively support adaptive programme design and ultimately improve programme outcomes, particularly for novel and innovative programmes. We will highlight specific lessons from the interim evaluation of 5GIR that may be relevant for other similar programmes, especially those involving technological projects. These findings include insights into novel funding mechanisms, appropriate programme timelines, and the mechanisms needed to effectively drive technology adoption. Finally, we will provide evidence of the policy impact of these findings, demonstrating how lessons learned from the evaluation have directly informed subsequent decision-making by DSIT. This showcases a practical approach to embedding evaluation as a dynamic tool for continuous improvement and strategic adaptation, as well as demonstrating progress towards achieving ultimate impacts.
Paper short abstract
We adapted the UK’s ICF KPI 15 from a static metric into a dynamic tool that tracks signals of transformational change. Applied under FCDO’s ARCAN programme, we demonstrate our experience of how this can enable actionable insights and future-aware decisions to accelerate systemic transformation.
Paper long abstract
Transformational change is central not only to climate and nature agendas but also to sectors where systemic shifts are critical. Yet measuring this change meaningfully, and using that evidence to shape decisions, remains an evaluative frontier. The UK’s International Climate Finance (ICF) Key Performance Indicator (KPI) 15 was designed to assess the likelihood of transformational change, but its original design was too static, linear and output-focused to capture the realities of complex, adaptive programmes or to inform forward-looking decisions. The urgency of the climate and nature crisis, and the growing scale of international climate finance, demands tools that do more than monitor progress: they must actively guide strategies and policies towards transformative action.
Drawing on our experience as the Monitoring, Evaluation and Learning (MEL) Unit of the FCDO’s Africa Regional Climate and Nature (ARCAN) programme, this session will share how we developed, tested, and implemented an adjusted KPI 15 methodology, the lessons learned from doing so, and what this means for future applications across complex portfolios. With the endorsement of ARCAN’s management team, our goal was not only to adapt the tool but to shift our team’s mindset towards transformational thinking, encouraging both evaluators and policymakers to move beyond compliance and adopt a reflective, future-oriented approach that asks “So what?” and “Now what?” in order to turn evidence into action.
At its centre is a Signals of Transformational Change framework, adapted from the Climate Investment Funds, which recognises that transformation unfolds in stages. Evaluators identify no, early, interim and advanced signals across criteria such as capacity, incentives and scalability to track cumulative progress and surface emerging momentum. We trialled additional dimensions such as gender equality and synergies, embedded sustainability as a cross-cutting judgement, and introduced a dual scoring system linking likelihood of change with strength of evidence. By moving KPI 15 from a static monitoring metric to an evaluative, future-aware sense-making tool, we aimed to strengthen evaluative insight, enable more nuanced judgements about where and how change is emerging, and better connect evidence to real learning and action.
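The staged signals and dual scoring can be sketched in a few lines of code. This is a hedged illustration rather than the ARCAN MEL Unit's implementation: the criterion names, the 0-3 likelihood mapping, and the 1-3 evidence scale are assumptions; the one design point taken from the text is that likelihood of change and strength of evidence are reported side by side rather than blended into a single number.

```python
from dataclasses import dataclass

# Ordered signal stages, following the staged logic described above.
STAGES = ["no signal", "early", "interim", "advanced"]

@dataclass
class CriterionAssessment:
    criterion: str          # e.g. "capacity", "incentives", "scalability"
    stage: str              # one of STAGES
    evidence_strength: int  # 1 (weak) to 3 (strong); hypothetical scale

def report(assessments):
    """Print likelihood-of-change and strength-of-evidence side by side."""
    for a in assessments:
        likelihood = STAGES.index(a.stage)  # 0..3, from the stage reached
        print(f"{a.criterion:12s} likelihood={likelihood}/3 "
              f"evidence={a.evidence_strength}/3")

report([
    CriterionAssessment("capacity", "interim", 3),
    CriterionAssessment("incentives", "early", 1),
    CriterionAssessment("scalability", "advanced", 1),  # weakly evidenced claim stays visible
])
```

Keeping the two scores separate is what lets a reader ask "So what?" of an advanced signal that rests on thin evidence.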
Our session will begin with an introduction to transformational change concepts and the ICF KPI 15 methodology. We will then briefly introduce the ARCAN portfolio and the deficiencies of the existing ICF KPI 15 framework in our context. Next, we will provide a practical walkthrough of our adaptations and examples of their application. This includes a set of guiding principles we developed that encourage critical and contextual thinking, such as questioning the significance of change, recognising weak or negative signals, and examining how change connects across levels (local to regional). Finally, we will share joint lessons learned by our team and FCDO, including challenges, along with forward-looking insights for evaluators, managers of complex portfolios (including the UK’s ICF portfolio team), and policymakers. While our focus is on climate and nature, these insights are relevant to other sectoral evaluators and MEL professionals seeking approaches to measurement that help programmes and policies move towards transformational change.
Paper short abstract
Building an evaluation culture in a national volunteer-led widening participation charity. A case study of how participatory, reflective practices drive learning, inclusion, and evidence-based decision-making in widening access to medical education.
Paper long abstract
In2MedSchool is a UK-registered charity founded in 2020 to widen participation in medicine by connecting aspiring medical students from disadvantaged backgrounds with volunteer mentors—doctors and medical students—across more than 100 schools. With over 3,000 mentors and 2,000 mentees, the organisation has achieved remarkable reach but faced the challenge of sustaining evidence-based practice without paid evaluators or formal research infrastructure.
This presentation explores how In2MedSchool embedded evaluation as a collective learning process rather than a compliance exercise. Using participatory and mixed-methods evaluation, including annual mentee and mentor surveys, regional focus groups, and feedback loops with schools, the organisation developed a “learning culture” that informs every level of decision-making—from safeguarding training to programme design.
The session will share findings from three years of practice:
• Quantitative impact: 78% of mentees reported at least one medical-school offer, compared with 40% nationally.
• Qualitative insight: mentors and mentees describe increased confidence, belonging, and professional fulfilment.
• Cultural learning: regular “evaluation huddles” and trustee learning meetings embed reflection in governance.
Lessons include the importance of simple, iterative methods that build ownership; co-producing evaluation questions with stakeholders; and prioritising learning over reporting. The approach demonstrates how small organisations can democratise evaluation and turn it into a driver of inclusion, transparency, and strategic improvement.
By illustrating how volunteer communities can create meaningful, ethical evaluation cultures, this session contributes to Theme 2: Building Evaluation Cultures, offering a replicable model for evidence-led, community-based educational change.
Paper short abstract
A panel session of evaluators working in different strata of evaluation in programmes designed to promote the development of innovative place-based interventions. Each will discuss how their work is shaped by their stratum and how this corresponds to each of this year’s conference subthemes.
Paper long abstract
It is not uncommon for there to be different strata of evaluation in programmes designed to promote the development of innovative place-based policy and practice through inclusive collaboration between different stakeholders and communities. Each stratum has a specific role, responding to each of this year’s conference subthemes in its own way. One example is the Local Policy Innovation Partnerships (LPIP) programme, funded by the Economic and Social Research Council (ESRC), the Arts and Humanities Research Council (AHRC), and Innovate UK. The programme includes four local partnerships based in each of the four nations and a national coordinating hub. Its aim is to create a step change in the quality and impact of the evidence created by universities and their local place partners to support place-based policy and practice innovation.
In this round table discussion we bring together evaluators involved in the Local Policy Innovation Partnerships working in different roles including:
• National independent evaluator responsible for developing and implementing the overarching evaluation framework and programme assessment
• Experienced evaluator based in the Strategic Coordinating Hub for LPIPs, whose role is to support the development of evaluation capabilities and understanding of the distinctiveness of place-based evaluation.
• LPIP leader and evaluator who is closely involved in building capability within local policymaking, co-producing evidence with communities and service users, and using participatory and user-centred methods that support reflection and learning.
Panel members will provide their reflections on:
• The importance of building trust, both professionally and methodologically, and how to measure it as a key intermediate outcome and potential indicator of future sustainability
• Trust is an enabling condition to explore policy and programme effectiveness. Trust between partners, trust between evaluators and participants, and trust within communities shaped:
o Data quality
o Engagement levels
o Partnership stability
o The credibility of findings
• Alignment of, and choices around, the perspective and approach adopted, depending on where you are positioned as an evaluator
• The role of evaluation in supporting adaptive programme management and learning in place-based innovation programmes
• The necessity of contextualised evaluation for place-based systems?
• Evaluators as learning partners enabling sensemaking, rather than as auditors?
• The need to build evaluative capacity as a core output. Evaluation activity should be:
o A capacity building exercise
o A route to strengthen strategic clarity
o A way to improve day to day decision making
o A tool for building internal cultures of reflection
• How to overcome challenges related to data quality, infrastructure gaps, and the limits of measurement. In place-based work focused on sub-regions, evaluators often have to:
o Work creatively with incomplete data
o Build new baseline measures
o Use qualitative insights to compensate for gaps
o Advocate for long term infrastructure strengthening
• Supporting and engendering evaluative thinking when engaging with different communities and stakeholders, including preparing to evidence the impact of co-produced initiatives.
The format will be for each panel member to give a brief presentation of no more than 3 minutes, to maximise time for a chaired discussion.
Paper short abstract
PAICE (Policy and Implementation for Climate & Health Equity) explores how evidence on climate action, health, and health equity can be translated into UK policy and practice. We outline a replicable model for integrating evaluation into complex, multi-stakeholder research-policy initiatives.
Paper long abstract
The PAICE project (Policy and Implementation for Climate & Health Equity) explores how evidence on climate action, health, and health equity can be translated into UK policy and practice. Addressing these systemic links requires approaches that integrate diverse disciplines and stakeholder knowledge. PAICE adopts a transdisciplinary research framework as its guiding theory for project design, delivery, and evaluation.
PAICE brings together researchers in systems thinking, modelling, epidemiology, building physics, and members of the Climate Change Committee and regional government (the Greater London Authority). A dedicated workstream has led the evaluation approach of the project by developing a monitoring, evaluation and learning plan (MELP). This plan aims to apply evaluation principles to derive criteria and indicators with which to evaluate across four project phases: formation, formulation, investigation and translation. Across these phases, transdisciplinary research processes, outputs and outcomes are evaluated using participatory qualitative and quantitative methods.
This poster presents:
• Evaluation principles and criteria adopted to evaluate process, outputs and outcomes
• Suggested methods for monitoring progress and facilitating reflexive learning
• Alignment with the governing program theory, including the intended action model and anticipated project impacts.
• Impact pathways for research and policy practice
Emerging insights include:
• Challenges and opportunities of mid-term evaluations
• Lessons from working with resource-constrained societal partners
• Strategies for fostering discipline-specific learning within a climate-health context, including community engagement and systems thinking.
Few UK climate-health research projects embed evaluation activities into projects from the beginning. We hope that the MELP offers a replicable model for integrating evaluation into complex, multi-stakeholder research-policy initiatives. By embedding evaluation in a transdisciplinary framework, PAICE demonstrates how adaptive, participatory approaches can strengthen evidence translation and inform policy in complex, uncertain domains.
Paper short abstract
We used contribution analysis and a 9-stage reform value chain to assess how PLANE shaped seven education reforms across five Nigerian states, identifying insider-led and institution-embedded pathways that moved policies from drafting to budgeted implementation.
Paper long abstract
Background & alignment to UKES themes. This study examines how an FCDO-funded programme (PLANE) influenced systemic education reforms in Nigeria and the conditions under which influence translated into adoption, budget execution and early institutionalisation—squarely addressing UKES Theme 1 (policy influence) and, through utilisation-focused design and iterative insight sharing, Theme 3 (communicating evaluation for action).
Methods. We applied contribution analysis as the primary approach, structured around a 9-stage reform value chain (from gap analysis to sustained results). Evidence combined document review and 82 key informant interviews (17 PLANE staff; 65 stakeholders, of whom 9 were women), coded in MAXQDA against a pre-specified analytical framework. Reforms spanned seven processes across federal and state levels: Teacher Recruitment/Deployment/Replacement (Jigawa, Kano), Education Quality Assurance law (Jigawa), Girls’ Education Policy (Kano), Domestication of the National Policy on Almajiri (Kaduna), UBEC/Intervention Fund law reform (federal), School Safety policy (Jigawa), and TaRL sustainability (Borno, Yobe).
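As a minimal sketch of how progress along such a value chain might be recorded and summarised, consider the following. The abstract names only the first stage (gap analysis) and the last (sustained results), so the intermediate stage labels below are assumptions; the Kano movement echoes the stage 4 to 6 result reported later, while the Kaduna numbers are invented for illustration.

```python
# Hypothetical labels for the 9-stage reform value chain; only stages 1
# and 9 are named in the study, the rest are invented for illustration.
VALUE_CHAIN = [
    "gap analysis", "drafting", "consultation", "adoption",
    "budgeting", "budget execution", "governance activated",
    "early institutionalisation", "sustained results",
]

# (before, after) stage positions per reform; Kano mirrors the reported
# stage 4 -> 6 movement, the Kaduna figures are purely illustrative.
reforms = {
    "Girls' Education Policy (Kano)": (4, 6),
    "Almajiri domestication (Kaduna)": (1, 4),
}

for reform, (before, after) in reforms.items():
    moved = after - before
    verdict = ("advanced" if moved > 0
               else "no regression" if moved == 0 else "regressed")
    print(f"{reform}: stage {before} -> {after} "
          f"({verdict}, now at '{VALUE_CHAIN[after - 1]}')")
```

Recording each reform as a position on a shared rubric is what makes cross-case claims such as "no supported reform regressed" auditable rather than impressionistic.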
Systemic pathways (what worked). Evidence shows PLANE’s influence operated through four mutually reinforcing pathways:
• Insider-led brokerage: mobilising respected government technocrats and trusted intermediaries as champions to navigate ideological sensitivities (e.g., reframing gender language to “inclusive access”), keeping opponents engaged and approvals moving.
• Multi-tier convening and coalitioning: from governor-level dialogues to Technical Working Groups and civil society coalitions (e.g., K-SAFE), enabling consensus-building and de-politicised problem-solving.
• Legal-institutional embedding with budgets: creating units, mandates and budget codes prior to full passage—e.g., a Girls’ Education Unit and NGN 402.6m 2025 allocations across MDAs in Kano—so policies did not stall at adoption.
• Peer learning and vertical alignment: brokering federal–state linkages (e.g., UBEC reform dialogue) and cross-state diffusion (e.g., Jigawa TRDR influencing Kano/national; Kaduna’s Almajiri domestication inspiring neighbours).
Results (what changed). Across cases, no supported reform regressed; several advanced multiple stages in 2025. Examples include Kano Girls’ Education Policy (stage 4→6) with early operational uptake; and Kaduna’s Almajiri domestication moving from stage 1 to a multilateral implementation platform (ROOSC). Teacher reforms were associated with tangible staffing actions (e.g., 2,400 recruits in Jigawa; >23,000 volunteer teachers absorbed in Kano, 2023–2025). These movements reflect budget execution and governance arrangements activated (stages 6–7), not just paper progress.
What didn’t work (and why). Reform pace was constrained by elite religious/political sensitivities (notably on girls’ education), political turnover, under-attention to the non-formal sector, and MEL capacity gaps that limited feedback loops—pointing to where influence needs earlier elite engagement, broader actor inclusion, and embedded MEL to sustain momentum.
Contributions to evaluative practice. Methodologically, combining contribution analysis with a staged reform rubric offered a transparent, theory-linked account of influence and decision-relevant evidence for adaptive management—translating directly into approvals, budget lines, and units within ministries.
Implications. To turn influence into durable results, evaluative practice should: (i) plan for elite-sensitivity management and multi-party continuity pre-approval; (ii) institutionalise capacity and learning platforms; and (iii) integrate policy-tracking MEL into government performance systems from the outset.
Paper short abstract
Learn how formative evaluation shaped Fleming Fund design, adaptation and policy influence amid global uncertainty. Gain practical lessons on embedding evaluation for impact, balancing timeliness with rigour, and strategies for influencing policy—ideal for evaluators seeking actionable insights.
Paper long abstract
Introduction
This panel will share lessons from using formative evaluation to strengthen Fleming Fund programme outcomes and influence policy at global, regional, and national levels.
Context
The Fleming Fund was established in 2017 to strengthen Antimicrobial Resistance (AMR) surveillance as a key pillar in global efforts to tackle AMR. Through a portfolio of country-, regional- and global-level grants, the Fund has generated country-level analyses of AMR and shared them with decision makers to influence national and global policy and regulation.
Itad has been the Fund’s independent evaluation partner since 2017, delivering a range of evaluation products from the start-up of the Fund. These have informed substantial programme change (including securing a second phase of support for the Fund, with an evolving focus based on experience in phase 1) and policy and regulation change at national and global level, including at the UN High Level Meeting on AMR in September 2024.
The Fund has been implemented throughout a period of significant uncertainty. The programme has adapted to respond to the challenges of COVID-19, Brexit, changes of UK government, multiple short-term spending reviews and associated replanning exercises. Uncertainty looks set to influence the design and implementation of ODA programmes for the foreseeable future, particularly in the context of recent and ongoing US and UK cuts to ODA. The Fund’s incorporation of evaluation from the design stage and its use of evaluation outputs for timely support to key decisions offers valuable lessons to any evaluators or decision makers seeking effective, sustainable ODA programmes.
Objectives
The panel will show how DHSC and Itad structured and used evaluation to guide programme design, adaptation and decision-making in a complex, multi-stakeholder context.
Plan for panel
Two speakers will present for 10 minutes each:
• Milena von und zer Muhlen (DHSC) will provide an overview of how DHSC structured the evaluation to maximise value and relevance, including how evaluative thinking was used to inform programme design and maximise effectiveness, for example by incorporating evidence on best practice in policy influencing and agenda setting.
• Jon Cooper (Itad) will outline how the evaluation adapted its approach to respond to DHSC’s changing needs and to uncertainty, whilst maintaining methodologically robust evaluative insights.
Both will discuss challenges and strategies for maximising effectiveness, including:
• DHSC adapted evaluation timeframes and deliverables to ensure timely, relevant insights.
• A supportive culture for learning and evidence use is critical.
• Rigid portfolio and contract systems hinder adaptation.
• Multiple tailored mechanisms are needed to engage decision-makers; findings must be simple yet substantive, and striking the balance is not easy.
• Policy influence and sustainability require time, strategic action, and political awareness—evidence alone is insufficient.
• Sustainability must be embedded from the outset, not left to later stages.
Q&A facilitated by Tim Shorten, with potential questions such as:
1. Which evaluation design features were more or less influential for DHSC decisions?
2. How did the Fund engage national decision-makers, and to what extent were evaluation findings necessary or sufficient?
3. How to balance timeliness vs robustness, pragmatism vs perfection?
Paper short abstract
We present two evaluation case studies which show how AI‑assisted causal coding can turn large volumes of interviews and reports into theory-driven or theory-free causal visuals with traceable evidence. We share workflows, accuracy checks and design choices to make the maps useful in evaluations.
Paper long abstract
Evaluators often struggle to process and communicate thick qualitative evidence quickly and convincingly. Causal mapping offers a concise visual language - outcomes, drivers and intermediate steps linked into a causal map with supporting quotes - but building reliable causal maps at scale used to require weeks of manual coding. This talk shows how AI‑assisted coding accelerates coding and synthesis while making sure that every visual element is traceable to verbatim text in context.
We present two recent evaluations:
Case A – Using causal mapping and contribution analysis for the final evaluation of a large multi-country programme (Dena). We will explain how hundreds of interview transcripts and internal reports were uploaded for causal coding with a “verifiable AI” technique, and how the causal mapping fed into the Contribution Analysis step.
We address two challenges:
How to agree on a vocabulary for the common causal elements across the programme components: what to do when the language in the Theory of Change itself is ambiguous and terminology varies across contexts? We will explain how analysts validated suggestions and managed the merging of terms (e.g., “coalition-building”/“alliance work”).
How to narrate the story of change that the maps show in an accessible way.
Case B – Making more sense of masses of Outcome Harvest data (Alastair).
This was a multi-country, multi-year project, with large amounts of outcome harvest data from 692 individual sources.
Both evaluations involved highly sensitive data, and partners were understandably concerned about automated processing procedures. In this case, we were able to secure approval even from partners who were initially hesitant, mainly through the use of automated offline anonymisation of data before further processing.
Even more than Case A, this was a very complex programme with many partners: each country had its own Theory of Change (ToC) as well as a global programme ToC, with different outcomes for different countries, programmes and donors, plus learning questions and hypotheses that the client wanted checked against the data. They were finding it hard to grasp the big picture. Causal mapping enabled them to triangulate with the other methods the team were using to evaluate the programme, and to articulate causal chains clearly at different levels. Feedback from the client was very positive.
Across both cases we will demonstrate: (1) a reproducible workflow from corpus → verifiable coding by AI → iterative refinement of labels → application of standard algorithms to answer evaluation questions → maps/tables; (2) validation; (3) supervised use of AI to create accessible text summaries of the data contained in the maps.
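To make step (1) of this workflow concrete, the sketch below shows one way AI-coded causal links could be stored with verbatim provenance and aggregated into map edges, so that every link can be opened to reveal its supporting quotes. The data structures, labels and quotes are invented for illustration; this is not the tooling used in these evaluations.

```python
# Hypothetical sketch: storing causal links with verbatim provenance,
# then aggregating them into the edges of a causal map.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CausalLink:
    cause: str       # normalised causal-factor label
    effect: str      # normalised outcome label
    quote: str       # verbatim text supporting the link
    source_id: str   # interview or report the quote came from

links = [
    CausalLink("coalition-building", "policy adoption",
               "the alliance work kept the bill moving", "interview_12"),
    CausalLink("coalition-building", "policy adoption",
               "coalition pressure got it onto the agenda", "report_03"),
]

# Each edge keeps every quote behind it, so any node or link in the
# rendered map can be traced back to its evidence in context.
edges = defaultdict(list)
for link in links:
    edges[(link.cause, link.effect)].append((link.source_id, link.quote))

for (cause, effect), evidence in edges.items():
    print(f"{cause} -> {effect}  [{len(evidence)} sources]")
    for source, quote in evidence:
        print(f"  {source}: {quote}")
```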
Why Theme 3? Because the product is not the map or the algorithm - the aim is to strengthen shared understanding. Visuals with transparent provenance (every node/link opens the quotes behind it), are intended to help communicate complex findings in a way that promotes discussion. We will close with a compact checklist of do’s and don’ts to help others pilot AI‑assisted causal mapping responsibly.
Paper short abstract
A theory of change (ToC) in Social Return on Investment (SROI) is a useful tool to identify, measure and value programme impact. Gaps exist in how ToCs have been developed. The development of a robust ToC is essential in SROI evaluation, with effective stakeholder engagement central to this.
Paper long abstract
A theory of change (ToC) is critical for understanding the relationship between the activities, inputs, outputs and outcomes of programmes. In Social Return on Investment (SROI), ToC has been advocated as a useful process to assist in the identification and subsequent measurement and valuation of programme activities, outcomes and impact. Despite its utility within SROI, gaps exist in how theories of change have been developed, as reporting tends to be brief and scant, and stakeholder engagement methods are seldom clearly documented and integrated in the reporting of SROI analyses. We used ToC in an SROI evaluation of two programmes delivered by a football foundation aimed at young people. In this study, focus group discussions and semi-structured interviews were used to collect data to inform the development of the ToC and to ensure all activities and outcomes were captured from the perspective of all stakeholders. Data collection involved four focus groups of ten participants each and fifteen interviews with delivery staff; organisation and service leads; and funders and partners. Focus groups and interviews were audio recorded and transcribed, and reflexive thematic analysis was used to develop codes and themes. The analysis identified similar themes across the two programmes, including social skills, friendship, health and wellbeing, personal development, participation, lifestyle changes and programme structure, with a few unique to each programme. Stakeholder engagement was found to enhance the process of theory of change development by prioritising the input of those involved in the participation, delivery and support of the programmes under evaluation. We conclude that the development of a robust ToC is essential in any SROI evaluation and that effective stakeholder engagement is central to this. The ToC will also act as a tool to illustrate how the programmes create change, assess their effectiveness and communicate this to stakeholders.
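As an editorial illustration of the valuation step that follows ToC development in an SROI analysis, the sketch below applies the standard adjustments described in Social Value UK guidance (deadweight, attribution, drop-off, discounting; 3.5% is the HM Treasury Green Book discount rate) before comparing social value with investment. All figures are invented and are not results from the programmes above.

```python
# Illustrative SROI arithmetic with invented numbers.
def present_value(gross_value, deadweight, attribution, drop_off, years, rate):
    """Discounted value of one outcome, net of deadweight and attribution."""
    annual = gross_value * (1 - deadweight) * (1 - attribution)
    total = 0.0
    for year in range(1, years + 1):
        total += annual / (1 + rate) ** year
        annual *= (1 - drop_off)  # benefit decays in later years
    return total

investment = 100_000  # hypothetical programme cost
outcomes = [
    # (gross value, deadweight, attribution, annual drop-off, duration in years)
    (60_000, 0.20, 0.25, 0.10, 3),  # e.g. improved wellbeing
    (40_000, 0.30, 0.30, 0.15, 2),  # e.g. increased participation
]
social_value = sum(present_value(v, dw, at, do, yrs, rate=0.035)
                   for v, dw, at, do, yrs in outcomes)
print(f"SROI ratio = {social_value / investment:.2f} : 1")
```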
Paper short abstract
This abstract features an evaluation of Barnardo’s social prescribing service (Cumbria LINK) and how it prompted key refinements in programme delivery. The findings highlight how a learning partnership approach, including relational practice and iterative feedback loops, helped refine the programme.
Paper long abstract
The Barnardo’s LINK social prescribing programme was established in the Northwest of England to address a critical gap in support for children and young people (CYP) experiencing social, emotional, and mental health difficulties (that fell below clinical thresholds). Designed within a social prescribing model, the programme positioned social support and community engagement as therapeutic mechanisms to improve wellbeing.
The programme was independently evaluated by Edge Hill University over three years, generating evidence that measured impact and refined programme delivery. The evaluation adopted a learning partnership approach (alongside a process and impact evaluation) that embedded iterative reflection between researchers and LINK practitioners. This design transformed the role of evaluation from a retrospective assessment into a live process of co-inquiry and service improvement. Through multiple methods (e.g., qualitative interviews, outcome monitoring, and development of a Theory of Change), the evaluation gathered insights into how relational practice and adaptability were key drivers of success. In turn, these insights catalysed a series of refinements to programme delivery and system integration (particularly in health, education and social care).
The evaluation’s impact was evident in how the learning partnership refined programme delivery. Evaluation findings that highlighted inconsistent referral pathways prompted the development of clearer guidance to help families better understand the offer. At the organisational level, evaluation insights informed a shift from an initial pilot to a mature, system-embedded programme, encouraging efforts to improve visibility (across systems such as social care). Findings from evaluation activities also highlighted practical challenges faced by families (e.g., difficulties with transport), which led to programme refinements aimed at making the service easier to access. In addition, the evaluation team identified key data inefficiencies within the programme’s existing monitoring systems, highlighting gaps that limited the ability to evidence outcomes. These insights re-shaped data collection processes to capture more meaningful evidence, strengthening the programme’s capacity to demonstrate robust impact to commissioners.
The evaluation process also revealed barriers that hindered the translation of evaluation into decision-making (e.g., workforce capacity pressures and constraints in funding), whilst also highlighting how an adaptive, learning partnership-based approach could overcome these hurdles. By maintaining regular dialogue between researchers and LINK practitioners, evaluation findings were mobilised in real time, promoting a culture of reflection and shared ownership of change. Ultimately, the evaluation of the programme did not simply describe what worked or outline its impact; it became a crucial mechanism for programme refinement. This case illustrates how evaluation, when relational and adaptive, can bridge the gap between evidence and action, not as an endpoint, but as an evolving process of learning.
Paper short abstract
This session presents an initiative to co-design an evidence-informed framework for evaluating sustainability interventions, drawing on a UK–Brazil project focused on building and empowering communities of evidence.
Paper long abstract
Municipalities in Brazil face persistent challenges in managing sustainability interventions due to limited resources, complex policy interdependencies, and competing political and administrative demands. These conditions frequently result in fragmented planning, inefficient implementation, and inadequate evaluation practices. Addressing such challenges requires structured evaluation frameworks that help municipalities design, implement, and assess sustainability interventions systematically, while balancing short-term constraints with long-term environmental goals.
Our project responds to this need by co-designing an evidence-informed evaluation framework (EF) that strengthens municipal capacity to integrate evidence throughout the policy cycle. The initiative focuses on sustainability interventions in solid waste management within municipalities in the State of São Paulo—one of Brazil’s 27 federal units and responsible for roughly one-third of the national GDP. Solid waste management is a critical policy area where behavioural change, stakeholder engagement, and cross-sector coordination are essential for effective and sustainable results.
The co-designed EF is grounded in behavioural change principles and in empowering communities of evidence. It recognises contextual factors shaping local decision-making and provides municipalities with practical tools to plan, implement, measure, and refine sustainability interventions. By integrating evidence use into routine processes, the framework supports improved governance, enhances transparency, and enables adaptive learning. In doing so, it contributes to advancing the Sustainable Development Goals at the municipal level.
This initiative emerges from a collaboration between the University of Portsmouth (UK) and the University of São Paulo (Brazil), with strong emphasis on knowledge mobilisation across international partners. The project also includes exchanges between researchers and municipal practitioners, enabling co-production of tools that are both context-sensitive and operationally feasible for local governments.
Through this session, we will share the framework, key components of the toolkits, and reflections from the co-design process. We will also discuss how building communities of evidence can strengthen municipal evaluation culture and contribute to more effective sustainability outcomes.
Paper short abstract
This evaluation of the Hertfordshire Suicide Prevention Pathway combined the Implementation Research Logic Model and Theoretical Domains Framework within a developmental evaluation to enable real-time adaptations and improved adoption and implementation across acute healthcare settings.
Paper long abstract
Background
We present an evaluation of the early implementation of a novel healthcare pathway and use of a developmental evaluation approach to influence pathway development and improvements in implementation.
In 2024, Hertfordshire Partnership University NHS Foundation Trust, in collaboration with Hertfordshire Mental Health, Learning Disability and Autism Health Care Partnership, launched the Hertfordshire Suicide Prevention Pathway (HSPP). Based on scientific evidence, the HSPP aims to enhance early identification, safety planning, and continuity of care across services. The HSPP was developed in response to the local and international evidence base relating to rising suicide rates and to acute healthcare settings as critical intervention points for suicide prevention.
The evaluation employed the Implementation Research Logic Model (IRLM) and Theoretical Domains Framework (TDF) to explore both organisational and behavioural determinants to understand how the pathway was adopted, adapted, and embedded across a multi-agency system.
Methods
A developmental evaluation approach was used to support real-time learning and adaptation between April 2024 and February 2025. Qualitative data were gathered through stakeholder workshops and individual conversations. Purposive sampling included clinicians and senior leaders from acute and mental health services. The IRLM guided the mapping of determinants, strategies, mechanisms, and outcomes, while the TDF informed topic guides and coding, enabling analysis of behavioural factors such as knowledge, skills, beliefs, and motivation. Thematic analysis was applied to transcripts and detailed notes.
Results
Implementation was iterative and adaptive. The IRLM helped to identify key strategies, including face-to-face and simulation-based training, e-learning modules, promotional materials, leadership-led communication and IT system improvements. The evidence-based structure of the intervention was well received, although some training content and terminology were less relevant. Workload pressures and inconsistent understanding created barriers to adoption. At the individual level, leadership at both senior and team levels acted as a key enabler, while confidence varied according to clinical experience. Emotional factors, e.g. fear of making mistakes, also influenced uptake. In terms of process, promotional activities, IT optimisation, and flexible training formats helped support engagement.
Mechanisms of change included strengthened shared language around suicide prevention, increased staff confidence following simulation training, and improved visibility of the pathway in high-risk settings. Early outcomes included increasing numbers of staff trained and referrals to the pathway, greater awareness among acute teams, and improvements in signposting to external services.
The TDF highlighted variability in staff knowledge, confidence, motivation, and emotional resilience. Whilst variability in staff understanding and engagement remained a challenge, changes to training improved confidence, and closing communication gaps and improving IT integration enhanced adoption and acceptability.
The combined IRLM-TDF analysis and regular feedback workshops to share emerging findings enabled real-time adaptations: targeted communication strategies, expanded training formats, and improved electronic system integration.
Conclusion
Early implementation of the HSPP reflects strong organisational commitment, iterative adaptation, and growing cross-sector engagement. Integrating IRLM-TDF provided a robust, theory-driven framework to identify actionable improvements and behavioural drivers. The developmental evaluation approach ensured findings were rapidly translated into implementation. This combined methodology offers a transferable model for evaluating complex, multi-agency health service innovations that require quick translation of findings into actions.
Paper short abstract
Lessons from the ICO’s impact reporting of an ambitious and fast-paced project focused on advertising cookies. We will showcase a theory-based approach that can be deployed in any situation to generate real-time actionable insights from evaluation findings as well as post-project reporting.
Paper long abstract
How do we make insights from regulatory impact measurement as appealing and easy to digest as a chocolate chip cookie? The Information Commissioner’s Office (ICO) initiated the ‘Cookie banner project’ to improve compliance in the online advertising industry, analysing the cookie banners of the 1,000 most popular websites in the UK. We, the ICO’s Impact and Evaluation team, were tasked with writing a recipe (dropping the cookie metaphor now…) for measuring and reporting on the project’s impact.
The session will walk you through our approach from collaborative theory of change design to influencing decision-makers. We will cover the lessons we learned and the tools we implemented to provide actionable insights that can be applied by any evaluation team, including:
- Combining tools to enable visual storytelling and clear reporting: where audiences have varying degrees of experience, it is important to have a varied toolkit ready for engaging with them. We will show how tools like interactive whiteboards, dashboards, and data entry platforms can be combined with an organisation’s existing board reporting arrangements to catalyse the use of insights.
- Delivering real-time insights early on to gain buy-in: winning project delivery team members over early on pays dividends when it comes to drawing on their time for evidence collection later. Our approach involved delivering small quick wins to colleagues at all levels, providing them with insights and time-saving tools to improve buy-in. This included reporting tools, data input tools, automated processes, feedback mechanisms and advice and guidance on case-making and communications. It was often more about demonstrating how common evaluation tools could also be drawn on to inform project delivery than it was about designing whole new processes and products to meet their needs.
- Theory of change (ToC) for the masses: Getting colleagues to engage with and own the project’s ToC requires involvement throughout the process. At inception, we set up design workshops using whiteboards so that colleagues could take a hands-on approach to shaping the ToC. We then layered a real-time dashboard on top of the ToC to bring the theory-based evaluation to life, demonstrating how activities, outputs and outcomes were being delivered as the project progressed. This, coupled with our organisation-wide theory of change training initiative, brought about a step-change in outcomes and impact-based decision-making for the project.
We would also love to hear from you if you have experience using any of the tools and approaches we cover during the session, or any alternatives. We welcome engagement during the Q&A and after the session so that we can learn from your experiences.
Paper short abstract
A feasibility study of Restart, a multi-agency domestic abuse programme, showed how evaluators can add value even without full recruitment by supporting system learning, strengthening programme design and improving conditions for future evaluation.
Paper long abstract
Feasibility studies rarely unfold as intended, particularly for complex domestic abuse programmes operating within dynamic multi-agency systems. This presentation uses the evaluation of Restart, delivered by the Drive Partnership and Cranstoun, to show how meaningful insights and value can still be generated even when formal study recruitment is not achieved.
Restart takes a multi-agency, whole-family approach to hold perpetrators accountable for change, prevent escalation of risk, and help (ex-)partner and child victim-survivors remain safe and together at home. It brings together professionals from Children’s Social Care, housing, and domestic abuse sector services to identify, change, and disrupt patterns of harmful behaviour at an early stage.
Our role evolved during the study from testing feasibility for a traditional impact design to acting as a learning partner, focused on understanding what was helping or hindering delivery, strengthening programme design, and identifying the conditions needed for robust future evaluation. Working closely with delivery partners, we explored where Restart sat within local systems, what was enabling or constraining implementation, and which model adaptations were needed before progressing to an outcomes-focused study. Key insights from our study included:
• Mission drift around Restart’s aims and adaptations to its model, highlighting the need to refine the Theory of Change, clarify the programme’s vision, codify its core delivery elements, and consider extending the intervention’s timeframe.
• Engagement and retention barriers, including practitioner hesitancy around thresholds, competing priorities and uncertainty about Restart’s place within local pathways. Inconsistent understanding of eligibility criteria also affected how the target cohort was communicated to referrers.
• Challenges in embedding evaluation processes within delivery, and barriers to recruiting participants to the research study.
• System-level enablers and barriers, especially the influence of strategic priorities, leadership visibility and alignment across CSC, Early Help, Housing and VAWG teams.
Rather than treating these issues as limitations, we used them as catalysts for programme and system learning. We brought together insights from multiple stakeholders to help partners understand the wider conditions shaping delivery. Our final report set out what is, and is not, currently feasible to evaluate, alongside clear recommendations for strengthening programme clarity, referral pathways, data systems and practitioner confidence in using validated tools.
While formal funding for Restart has now ended, the feasibility study has laid strong foundations for future collaboration between evaluators and practitioners. Partners identified several areas that would benefit from further development, including:
refining programme aims and mechanisms through further Theory of Change work;
strengthening eligibility and informed consent processes;
improving data monitoring systems to support future evaluability;
piloting outcomes measurement tools; and
further embedding meaningful Expert by Experience involvement in programme design and practitioner support.
This case study illustrates how evaluators can add significant value, even when recruitment to research is limited, by taking an adaptive approach that prioritises system learning, programme development and evaluability. It highlights the importance of evaluation as a developmental tool for shaping policy and practice within complex social systems.
Paper short abstract
The West Midlands Combined Authority and Sport England are investing in new Policy Officer posts to lead on the development of a Learning Evaluation and Evidence Plan to capture the impact of Sport England’s investment in whole-systems and place-based approaches.
Paper long abstract
The West Midlands Combined Authority and Sport England have taken innovative steps towards reducing physical inactivity in the West Midlands by investing in new Policy Officer - Health Inequalities (Monitoring and Evaluation) posts to lead on the development of a Learning Evaluation and Evidence Plan (LEEP) to capture the impact of Sport England’s investment in whole-systems and place-based approaches. The LEEP uses realist evaluation methods to enable places to listen to communities, understand local priorities and identify where investment is needed most.
While the Sport and Physical Activity sector has historically measured success with traditional monitoring mechanisms like programme participation numbers and budget profiles, the West Midlands LEEP pulls focus towards identifying the commonalities, strengths, weaknesses and failures that influence the conditions needed in a place to shift physical activity levels and enable sustained behaviour change in place-based contexts. This approach aligns with and supports several key UK Government initiatives including the WMCA’s West Midlands On The Move strategic framework, the Get Active strategy, and Office for Health Improvement and Disparities (OHID) objectives, which embed physical activity into public health workforce practice.
This work is currently being applied to a range of projects including the Birmingham 2022 Commonwealth Games legacy work funded via a WMCA and Sport England MoU grant and Sport England’s Place Expansion Partnerships to extend the LEEP approach in four Commonwealth Active Communities. The Commonwealth Active Communities were co-created with local people to address physical inactivity levels through a £4 million investment from Sport England into the West Midlands.
The Policy Officers are responsible for guiding local staff through evaluation and learning processes, unpicking the complexity of place-based working by uncovering how, why and for whom approaches to reducing physical inactivity are working in a place, and building a knowledge base of actionable insights that can inform future projects, making them more effective, sustainable and replicable.
This work is supported by academic partners from the National Evaluation and Learning Partnership, also funded via the WMCA and Sport England MoU grant, who validate and support the evaluation process and coach the Policy Officers through providing guidance and resources on realist evaluation. By working closely with community partners, local authorities, universities, and system stakeholders, the LEEP ensures that learning from place-based approaches inform region-wide projects, processes, and strategies.
This new approach to evaluation not only provides the Commonwealth Active Communities with dedicated time and capacity to embed learning processes in work to reduce physical inactivity; it also demonstrates to organisations like the West Midlands Combined Authority that adopting a realist evaluation approach can generate the evidence, learning and insight needed to create and embed conditions that will enable policy formation and system-wide change.
Paper short abstract
Presentation to discuss DBT's new Digital Evaluator role, which applies multidisciplinary research methods to evaluate digital interventions and enable smarter decisions. We will share lessons, best practices and case studies to showcase how the role was developed and the benefits it has achieved.
Paper long abstract
The UK public sector spends over £26bn annually on digital technology (gov.uk, 2025). While the Magenta Book underlines the importance of evaluating government interventions, the digital aspect of public service delivery has thus far been under-evaluated. Addressing this gap in best practice, the Department for Business and Trade is the forerunner in establishing a team of evaluators at the heart of digital services. This approach enables smart policy-making and embeds data-driven insights as a core value, shaping behaviours and decision-making across the organisation. Analysts in the department are embedded within digital teams, providing insightful and impactful evaluation to shape Government’s digital landscape.
Following the success of this initiative, DBT led a cross-government team, including members of the Evaluation Task Force and Government Economic and Social Research, to formally develop and launch a GDS "digital evaluator" role. Combining elements of social, statistical and economic research, the role codifies the unique skills required to effectively evaluate digital projects and tools. Compared to traditional evaluators, the digital evaluator is embedded in Agile teams to provide continuous insight that informs ongoing improvement and enables measurement of the impacts and value for money of digital tools. The digital evaluator integrates ROAMEF principles with the product cycle, as illustrated by DBT’s Digital Evaluation Strategy, enabling smart, reactive policy delivery in a fast-paced environment.
This cultural shift will be demonstrated through case studies; notable success stories include the team’s evaluation of AI tools as well as public-facing digital services. By working closely with product teams, communications and senior stakeholders, DBT’s digital evaluators have conducted comprehensive evaluations of two AI tools to understand their impacts, risks and the attitudes of colleagues. These evaluations have been crucial in enabling senior leaders to make informed decisions on the future of AI across the department.
For this presentation, we will explain how the digital evaluator works in digital teams, the key capabilities it covers and how it fosters a culture of evaluation in digital and technology settings – as well as the challenges specific to working within digital delivery. We will then cover case studies from our team, discussing how we have applied various evaluation methods to shape decisions and achieve stronger outcomes. To conclude, we will share lessons learned and practical advice for evaluators seeking to establish similar teams in their own organisation.
Paper short abstract
AI adoption in public services is growing fast. The homelessness sector needs capacity to both use these tools and evaluate them rigorously. We share early insights from two randomised trials testing predictive machine learning and generative AI interventions that aim to reduce homelessness.
Paper long abstract
The use of AI is rapidly expanding across public services, but evidence struggles to keep pace with adoption. In homelessness services, where AI tools hold promise, we risk scaling interventions that seem promising without testing their impacts. As with other interventions, AI tools should be robustly tested to understand their impact on the outcomes we care about and identify any unintended consequences.
The Centre for Homelessness Impact is conducting two complementary randomised controlled trials, one funded through MHCLG's Test & Learn programme - the first globally to invest in robust evidence of homelessness intervention impact - and one funded through the Cabinet Office’s Evaluation Accelerator Fund:
Trial 1: Predictive machine learning for upstream prevention (4 Local Authorities, ~2,000 households)
Testing whether machine learning models can identify households at risk of homelessness, and whether proactive phone calls to at-risk households reduce homelessness. Building on promising pilots, this addresses important questions about data quality, practical applications of predictive models, and scalability across local authorities with varying levels of data maturity. This trial is part of the groundbreaking £15m Test & Learn and Systems-wide Evaluation Programme, the first of its kind in the world.
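To illustrate the kind of model Trial 1 tests (the trial's actual features, data and algorithm are not described here), a minimal risk-scoring sketch on synthetic data could look like the following; every feature and coefficient is a hypothetical stand-in.

```python
# Hypothetical sketch: score households for homelessness risk and flag
# the highest-risk decile for proactive outreach. Synthetic data only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000
X = np.column_stack([
    rng.poisson(1.5, n),    # months of rent arrears (invented feature)
    rng.poisson(0.3, n),    # prior homelessness presentations (invented)
    rng.integers(0, 2, n),  # housing-related benefit flag (invented)
])
# Synthetic outcome loosely driven by the features above
logit = 0.6 * X[:, 0] + 1.2 * X[:, 1] + 0.4 * X[:, 2] - 3.0
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

risk = model.predict_proba(X_test)[:, 1]
threshold = np.quantile(risk, 0.9)
# In a trial, flagged households would be randomised between
# proactive phone calls and business as usual.
print(f"{(risk >= threshold).sum()} households flagged for outreach")
```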
Trial 2: Generative AI for housing advice (Southwark Council, ~9,000 households)
Evaluating an AI chatbot providing personalised housing advice from trusted sources (Shelter, Citizens Advice, government guidance). Unlike general-purpose AI tools like ChatGPT, this chatbot is specifically designed to assess someone's housing situation, offer tailored advice, and draft letters to landlords or councils. The intervention addresses a crucial gap - people often don't seek help until crisis point and can find advice difficult to access - by proactively reaching out to at-risk households and offering 24/7 accessible guidance before households reach statutory thresholds. This trial is funded by the Cabinet Office's Evaluation Accelerator Fund.
This presentation offers methodological insights and implementation learning from setting up trials to evaluate the use of AI. We share learning on:
Embedding rigorous evaluation in fast-moving tech contexts: Pre-registration protocols, ethical oversight, and adaptive designs that balance flexibility and methodological rigour.
Navigating data governance: Practical lessons from data-sharing agreements and concerns around algorithmic decision-making across multiple partners and local authorities.
Building organisational capacity: Understanding variation in data maturity and implications for scaling data-driven approaches.
Addressing ethical dimensions: Considering questions of algorithmic fairness and consent within the context of randomised trials.
These trials show how evaluation needs to keep pace with technology, and that successful adoption requires simultaneously building technical capacity and addressing ethical concerns. Overcoming these challenges builds strong evaluation practices that can test innovations while generating robust evidence to inform decision-making.
This directly addresses exploring "fast-emerging areas such as AI and new ways of working". By sharing learning from these groundbreaking evaluations, we support evaluators and policymakers considering: How do we test AI tools? What conditions enable adoption? How do we ensure these technologies serve vulnerable populations?
In a sector where neither AI applications nor rigorous trials are yet commonplace, these evaluations are building both the capacity and acceptance needed for evidence-based innovation.
Paper short abstract
Attendees will gain insights into evaluating complex, place-based systemic approaches to reducing physical activity inequalities. The session covers innovative methods, evidence on what drives change, and how findings shaped national policy and £250m investment.
Paper long abstract
Background and aims
Sport England have, over several strategy cycles, invested in place-based systemic approaches to tackle physical activity inequalities. Place-based systemic approaches are, by nature, complex interventions. They have multiple interacting parts, are based on local characteristics and aim to influence local conditions for physical activity, as opposed to delivering programmes alone. To support the evaluation of this investment, Sport England commissioned a National Evaluation and Learning Partnership with two aims: to build capacity for evaluation and learning across “Places” and to generate evidence about what meaningfully changes states in systems towards a narrowing of physical activity inequalities, for whom, in what circumstances and why?
Method
The evaluation is developmental, participatory and longitudinal. It uses mixed methods and is underpinned by realist evaluation and a set-theoretic modelling approach, configurational comparative analysis, supported by EvalC3 software. This presentation draws on data from documents (n=48), workshops (n=24) and online evidence submissions (n=150). Evaluation outputs are orientated to support a variety of stakeholders to learn and adapt their approach in real time.
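For readers unfamiliar with the set-theoretic measures that configurational comparative analysis reports (in EvalC3 and similar tools), the sketch below computes consistency and coverage for one candidate configuration on invented case data; it illustrates the underlying logic only, not this evaluation's actual models.

```python
# Invented cases: (leadership present, funding present, outcome achieved)
cases = [
    (True,  True,  True),
    (True,  True,  True),
    (True,  False, False),
    (False, True,  False),
    (True,  True,  False),
    (False, False, False),
    (True,  True,  True),
]

# Candidate configuration: leadership AND funding
config  = [lead and fund for lead, fund, _ in cases]
outcome = [o for *_, o in cases]

overlap = sum(c and o for c, o in zip(config, outcome))
consistency = overlap / sum(config)   # of configured cases, share with outcome
coverage    = overlap / sum(outcome)  # of outcome cases, share explained

print(f"consistency = {consistency:.2f}, coverage = {coverage:.2f}")
```

High consistency suggests the configuration is close to sufficient for the outcome; high coverage suggests it accounts for most instances of it.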
Results
Findings will illuminate not just what has changed, but how change happens. We will highlight specific findings relating to the foundational work to operationalise place-based systemic approaches which has impacted on local and national decision making. This includes capacity building for place-based systemic approaches, generation and sharing of insight about underlying barriers to physical activity and organisational processes that enable or limit systemic ways of working.
Influence
The evaluation has supported the case for a further £250m investment to scale place-based systemic approaches from 12 to 90 places in England, and to formalise the way of evaluating and learning from them. More specific contributions include informing the strategy for expanding the work, including investment guidance and support to places in understanding and developing their place-based systemic approach and the capacity to do it. Within Sport England, the evaluation has influenced understanding of how change happens and therefore how to organise the delivery of place expansion as a ‘programme’ of investment, reframing accountability and performance in ways that are in line with complexity.
Paper short abstract
Demonstrated through humanitarian programmes in Sudan and Somalia, this presentation covers MEL Systems Reviews, a document-based methodology that assesses portfolio-level MEL strengths and weaknesses and supports evidence-informed decision-making for more effective programming.
Paper long abstract
The Sudan Independent Monitoring and Analysis Programme (SIMAP) and Humanitarian and Health Evaluation, Learning and Monitoring in Somalia (HHELMS) are two multi-year, whole country humanitarian portfolios of the UK’s Foreign, Commonwealth and Development Office. Oxford Policy Management, the Monitoring, Evaluation and Learning partner for these two programmes, has supported FCDO through a variety of approaches on these two programmes, including through TPM, learning strategy and events, research support, and strategic programme design. One tool used on both projects is a MEL Systems Review.
A MEL Systems Review is a document-based review of programme material. It uses a series of co-developed definitions of different MEL elements (e.g., a theory of change or a reporting strategy) and a rubric to assess the strength and integration of these different elements, as they are reflected in programme material. This process helps to identify which parts of a MEL system are strongest in a particular programme, and which could benefit from strengthening. The tool is an entry point for identifying and designing, in collaboration with implementing partners and donors, tailored technical tools to strengthen overall MEL systems.
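To illustrate how such a rubric might surface priorities (the elements and scale below are hypothetical, not OPM's actual rubric), a minimal scoring sketch:

```python
# Hypothetical rubric scoring for a MEL Systems Review.
SCALE = {0: "absent", 1: "emerging", 2: "established", 3: "integrated"}

scores = {
    "theory of change":    3,
    "indicator framework": 1,
    "reporting strategy":  2,
    "learning processes":  1,
}

# List elements weakest-first, then flag candidates for strengthening.
for element, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{element:20s} {score}/3 ({SCALE[score]})")

weakest = [e for e, s in scores.items() if s <= 1]
print("priority for strengthening:", ", ".join(weakest))
```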
Throughout our work, we have found that both implementing partners and FCDO have used evidence produced by the MEL Systems Reviews for systems strengthening, and that this evidence has influenced decisions and outcomes at both the partner and funder level. One example is around future programme design: using findings from the MEL Systems Review, an iteration of one of the programmes supported by OPM will now include more systematic learning. Another is around methodological training: FCDO has now adopted an indicator strength testing approach (produced as a by-product of MEL Systems Reviews) for FCDO internal training.
We have found that MEL Systems Reviews have been highly influential and very positively received by FCDO; so much so that a new technical workstream dedicated exclusively to acting on the findings from the MEL Systems Review has been created for one of the programmes. For partners, this work has also increased collaboration between the IPs and MEL partners, ensuring that MEL support work is co-developed and ultimately owned by implementing partners.
With both SIMAP and HHELMS operating in conflict affected states, the MEL Systems Reviews have been important in taking stock of what systems are in place at the donor and implementing partner level. They have also been important in identifying how targeted tweaks to MEL systems can have the biggest impact across monitoring, evaluation, and learning.
Our proposed session will address the factors that affect the use or non-use of evidence produced from MEL Systems Reviews. It will also speak to some of the complexities of supporting whole country portfolios of humanitarian work in fragile and conflict affected states, and how tailored tools can best support both the donor and the implementing partners.
Paper short abstract
Learning journeys embed insights into policy and practice through co-production, participatory design, and systems thinking. This session explores engagement strategies, inclusion and use of technology to turn evidence into actionable insights in multiple learning partnerships.
Paper long abstract
Embedding evaluative evidence into everyday decision-making and programme delivery requires approaches that foster reflection, ownership, and actionable insights. Drawing on the Institute of Development Studies’ experience with Accompanied Learning processes alongside partners such as FCDO, IDRC, GIZ, WFP, and AHRC, this paper explores how co-produced evidence and participatory methods create environments where evidence is valued and used. Through co-design workshops, facilitated reflection spaces, and iterative feedback loops, these approaches integrate systems thinking and user-centred design to strengthen capability within policymaking and delivery.
Our work demonstrates how Learning Journeys—structured, collaborative processes—help organisations identify key learning questions, surface tacit knowledge, and synthesise evidence for real-time decisions. Examples include participatory online workshops with the AHRC Disability-Inclusive Development Network to address power dynamics and equity in funding, and feedback-driven processes with IDRC CORE to amplify Southern-led policy voices during global crises. In contrast, FCDO’s K4D Learning Journeys generally build upon rapid evidence reviews and then use facilitated spaces to explore the practical application of the existing explicit knowledge.
Technology and digital platforms have enabled global participation, while raising ethical questions around accessibility, bias, and consent. Emerging AI tools offer new opportunities for synthesis and sensemaking, but require careful attention to transparency and equity to avoid reinforcing existing power imbalances. By working at the interface of evaluation, communication, and technology, these approaches transform technical findings into actionable insights that influence policy and practice. Ultimately, embedding evaluative thinking through participatory and co-produced methods can catalyse organisational learning, strengthen systems, and ensure diverse voices shape decisions.
Paper short abstract
Evaluating a mega-portfolio of interventions? We’ve been there. We unpack how we tackled ill-suited criteria, abstraction, data access hurdles, and non-evaluator audiences from our work evaluating cyber portfolios, sparking debate on what credible evaluation really means at the mega-portfolio scale.
Paper long abstract
We are delivering a Portfolio Monitoring, Evaluation, and Learning programme focused on a portfolio of cyber interventions. This 'portfolio' is in fact a portfolio in name only, as it houses multiple sub-portfolios, all of which have multiple programmes that house projects. This is challenging evaluatively, as our role involves portfolio-level evaluations and reviews. These pieces of work must cut across a wide range of interventions, delivery bodies, and actors, all operating at different levels of society.
Delivering useful and actionable evaluation at this level of abstraction (i.e. cross-portfolio) is difficult, and cyber and security sector programming brings acute information access restrictions. Furthermore, security-sector evaluation commissioners often have no MEL or evaluation background, and require very different evaluation products and decision-making support to translate evaluation into action.
In this session we aim to share our experience and spark discussion with others evaluating mega-portfolios or otherwise evaluating the security sector. We will focus on two evaluative reviews we recently conducted, focused on coherence and on Gender Equality and Social Inclusion (GESI). We will speak to how we created analytical frameworks and evaluative practice that were practical, defensible, and useful despite operating at a mega-portfolio level. We will also speak to how we delivered these reviews and created useful and actionable findings to overcome the barriers stated above. We hope to show how these solutions might translate into others’ contexts.
In running this session, we will outline our context, the barriers, and how we overcame them. We then wish to spark discussion with the audience on important questions facing evaluators in our position:
• Can and should government take an OECD-DAC approach to evaluation and reviews in these thematics, or when operating at a mega-portfolio level?
• How do we defensibly but flexibly assess security sector topics at a portfolio level without over-engineering new criteria that face the same problems?
• How do we define evidence, success, and credibility in these types of reviews and evaluations that operate at a mega-portfolio level? Do you think we got it right?
• How do you evaluate for non-evaluation clients given the above questions?
Relevance to the theme: this is relevant to ‘building evaluation cultures’ as our journey focuses not just on methods and approaches alone, but how evaluative practice is designed to a unique programming culture. We speak to how we created something useful and were a part of fostering a culture of commissioning, participating in, and using evaluation (as well as what went less well here).
Paper short abstract
How Natural England combined evaluation insights, futures thinking, and staff co-design to create its new Science, Evidence and Analysis Framework—embedding evidence at the heart of decisions for nature recovery. Learn what worked, what didn’t, and lessons for others.
Paper long abstract
This session will share how Natural England used evaluation insights, futures thinking, and staff co-design to create its new Science, Evidence and Analysis Framework (SEAF), a practical step toward embedding science, evidence, and evaluation at the heart of decision-making for people and nature.
Evaluation of our previous science strategy revealed significant challenges: fragmented governance, stretched capacity, and a culture that struggled to turn learning into action. At the same time, horizon scanning highlighted a future of accelerating environmental change and uncertainty, raising critical questions about how we could truly become evidence-led. By combining evaluation findings with foresight, we identified priority areas for investment and transformation.
Key elements of our approach:
• Developmental Evaluation in Action – Using “What? So what? Now what?” cycles to feed real-time insights into design decisions.
• Blending Foresight with Evidence – How horizon scanning and scenario planning shaped priorities and governance structures.
• From Theory to Tools – Translating evaluation into practical solutions like the Scientific Hive for evidence access and the Evidence Buddy network for capability building.
• Staff co-design – Involving staff as part of the process, understanding the nature of the problem and co-designing solutions.
The new SEAF provides a blueprint for how we use science and evidence to deliver nature recovery at scale through five themes: Data Science and Digital Innovations, Building Strategic Science Partnerships, Growing Scientific Capability, Science Communication and Impact, and Learning What Works.
We will share what worked well, what didn’t, and the lessons learned along the way. For other public sector organisations, this case offers a replicable approach: combining evaluation and futures thinking to design strategies that are both evidence-informed and resilient.
Paper short abstract
RAND Europe (RE) and the Education Endowment Foundation propose a joint presentation on AI in education evaluation. EEF will share their priorities for using AI in evaluation, RE will present insights from developing an AI assessment marking tool. We discuss scaling, ethics, and lessons learned.
Paper long abstract
The role of AI in the evaluation of education programmes is rapidly expanding, offering opportunities to enhance efficiency, accuracy, and insight throughout the research process. Education systems are increasingly seeking innovative solutions to help improve learning outcomes; however, those commissioning and evaluating education interventions face the challenge of providing timely, actionable evidence while maintaining methodological rigour. AI technologies, particularly Generative AI, present new possibilities for addressing these challenges, particularly around the automation of routine tasks which are required throughout the evaluation process.
We propose a joint presentation by the EEF and RE, exploring AI’s role in evaluation from two complementary perspectives: commissioning and implementation. This talk would fit well within Theme 4: Evaluation in Action, as it directly explores the considerations required to integrate AI tools into live evaluations.
EEF will share their emerging interest in the application of AI within education evaluation, highlighting strategic priorities for integrating these technologies into evidence generation. This includes considerations around cost-effectiveness, scalability, and methodological integrity. EEF will also reflect on the implications for commissioners in ensuring that AI-driven approaches align with ethical standards and maintain transparency in decision-making.
RE will present insights from their current Writing Roots evaluation, an English writing intervention commissioned by EEF. Within this evaluation, RAND Europe is piloting an innovative AI-driven tool designed to mark handwritten assessments produced by children responding to the Writing Assessment Measure prompt. Marking this type of assessment has traditionally been resource-intensive and time-consuming. The AI tool RE has developed interprets and scores handwritten assessments, and outputs a score for each script in a format that can be analysed directly. This tool aims to reduce evaluator burden while maintaining reliability and validity. We will share lessons learned from the development process, validation results, and practical challenges encountered in integrating AI into a live evaluation project. These include technical and operational issues, such as ensuring fairness and avoiding bias in automated scoring, as well as ethical considerations and the information provided to participants.
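To give a sense of how such a marking pipeline can be orchestrated (the recognition and scoring calls below are placeholders, not RAND Europe's actual tool), a hedged sketch:

```python
# Hypothetical marking pipeline: transcribe each scanned script, score it,
# and flag low-confidence scripts for human review.
import csv
from pathlib import Path

def transcribe(image_path: Path) -> str:
    """Placeholder for a handwriting-recognition model call (hypothetical)."""
    return "transcribed pupil response"  # stand-in output

def score_against_rubric(text: str) -> tuple[int, float]:
    """Placeholder returning (rubric score, model confidence) (hypothetical)."""
    return 12, 0.85  # stand-in values

def mark_scripts(script_dir: Path, out_csv: Path, review_threshold: float = 0.8) -> None:
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["script", "score", "confidence", "needs_human_review"])
        for image in sorted(script_dir.glob("*.png")):
            text = transcribe(image)
            score, confidence = score_against_rubric(text)
            # Routing low-confidence scripts to a human marker is one way to
            # protect reliability while still reducing marking burden.
            writer.writerow([image.name, score, confidence,
                             confidence < review_threshold])

mark_scripts(Path("scripts"), Path("scores.csv"))
```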
Together, EEF and RE will discuss broader implications for scaling AI-enabled approaches in educational evaluation. Key themes will include ethical considerations, such as data privacy and transparency, alongside reflections about how AI can support evaluators in delivering timely insights for policy and practice. Our presentation will also consider the future research agenda and explore questions around the evidence needed to build confidence in AI-driven evaluation methods, and how commissioners and evaluators can collaborate to ensure these tools are deployed responsibly.
Paper short abstract
The presentation demonstrates how a rigorous impact and process evaluation can generate robust evidence and drive real-time improvements. Oxford MeasurEd will present how they designed an evaluation for learning, while Right to Play will show how findings are already shaping their programme.
Paper long abstract
Enhancing Quality and Inclusive Education (EQIE) 2.0 programme is a multi-country initiative delivered by Right to Play and funded by NORAD. It aims to improve foundational literacy and socio-emotional learning (SEL) through play-based pedagogy in Ethiopia, Tanzania, Lebanon and Palestine. In Ethiopia, the programme involves in-service teacher training and training for the head teachers and District Education Officials who will support teachers to change their practice.
Right to Play have commissioned Oxford MeasurEd to deliver an independent evaluation of EQIE 2.0 in Ethiopia and to support their internal monitoring and learning throughout the five-year programme. This presentation focuses on how the evaluation design combines generating rigorous evidence and supporting real-time programme improvement.
Oxford MeasurEd will present how they have designed a robust efficacy trial to assess the programme’s impact on literacy and SEL outcomes, along with a mixed-methods process evaluation to provide timely insights into whether and how the programme is working and classroom practice is changing. This integrated design enables the evaluation to produce credible impact evidence for funders and policymakers while continuously informing programme delivery and adaptation.
Right to Play will present how baseline findings have guided adjustments in design, including refining teacher training content, adapting coaching strategies and prioritising what to monitor. They will discuss how – supported by Oxford MeasurEd – they have embedded a culture of “evidence into action”, ensuring that learning and reflection are integrated throughout the intervention. RTP will also share practical insights from the baseline that have supported adaptations to design.
The presentation will demonstrate how robust evaluation can function as both an accountability mechanism and a driver of adaptive practice. By combining a randomised trial with embedded feedback loops, EQIE 2.0 offers a model for how evaluation partnerships can generate actionable evidence, strengthen learning systems, and contribute to sustained improvements in education quality.
Paper short abstract
What can policy makers and evaluators learn from each other? An examination of six evaluations and research projects in the same sector, for the same client, demonstrating how the policy/evaluator approach can evolve over time.
Paper long abstract
A co-presented panel discussion between evaluators at Verian and the Ministry of Housing, Communities and Local Government (MHCLG), looking at policy evaluation through six collaborative evaluation and research projects on the private rented sector. The session will cover:
• How does evaluation of policies differ from evaluation of programmes or projects? Policy evaluation demands an even greater focus on long-term impact and on strategy over operations, as well as recognition of the political dimension.
• How does the evolution of evaluation and research design deliver evidence to support housing policy in the private rented sector? This will draw on six projects that Verian has delivered, or is currently delivering, for MHCLG, influencing how policy is evaluated and supporting decisions to improve implementation.
• When evaluating policy, why we need to think about evaluation as an ongoing process (integrated with monitoring) rather than a series of set-piece snapshots.
Paper short abstract
In this session, we discuss the criteria for classifying a city as “smart” and propose an integrated evaluation of smart city rankings that reveals technocentric limits and improves sustainability outcomes aligned with the UN 2030 Agenda, demonstrating this through a case study in Brazil.
Paper long abstract
Brazil presents a highly unequal urban landscape marked by deep regional disparities, heterogeneous levels of digital infrastructure, and long-standing inequities in access to public services. In this context, smart city initiatives have expanded rapidly and gained visibility through rankings that reward digitalization, innovation ecosystems and technological sophistication. However, these rankings often influence policy priorities by signaling prestige and competitiveness, even though they may not reflect the social and environmental realities of most municipalities. This creates a unique environment to investigate how evaluation frameworks can reinforce—or challenge—policy agendas in complex and unequal urban systems typical of the Global South.
The SDGs offer a comprehensive and normative framework for evaluating urban development by integrating social justice, environmental protection, economic resilience and inclusive governance. However, despite their global adoption, the extent to which SDG principles are incorporated into local policy instruments varies widely. In Brazil, many municipalities face difficulties aligning technological innovation with social and environmental priorities. SDGs related to health, education, gender equality, climate action, inequality reduction and sustainable urban development provide a robust lens to assess whether the “smartness” promoted by rankings effectively contributes to broad-based well-being. By grounding the evaluation in the SDGs, this study positions sustainability not as an optional component of urban intelligence, but as its ethical and developmental foundation.
The proposal compares indicators from the Connected Smart Cities ranking (Urban Systems, 2024) with municipal performance on all 17 SDGs using the Sustainable Development Index of Cities (IDSC). Through this evaluation perspective, we identify that being ranked as a smart city does not guarantee superior SDG performance, particularly in social SDGs such as health, education, gender equality, and inequality reduction. Several non-ranked municipalities outperform ranked ones in these domains. Only SDGs related to innovation, infrastructure and environmental management show partial alignment with smart city indicators.
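As a hedged illustration of the kind of ranked-versus-non-ranked comparison described here (this is not the study’s actual pipeline; the dataset, column names and SDG selection are assumptions for the sketch):

```python
# Illustrative sketch: do municipalities in the smart city ranking score
# higher on social SDGs? Column names ("in_csc_ranking" as a boolean flag,
# "idsc_sdg3", ...) and the merged CSV are hypothetical.
import pandas as pd
from scipy.stats import mannwhitneyu

df = pd.read_csv("municipalities.csv")  # hypothetical merged CSC + IDSC dataset

social_sdgs = ["idsc_sdg3", "idsc_sdg4", "idsc_sdg5", "idsc_sdg10"]  # health, education, gender, inequality
for col in social_sdgs:
    ranked = df.loc[df["in_csc_ranking"], col].dropna()
    unranked = df.loc[~df["in_csc_ranking"], col].dropna()
    stat, p = mannwhitneyu(ranked, unranked, alternative="two-sided")
    print(f"{col}: ranked median={ranked.median():.1f}, "
          f"non-ranked median={unranked.median():.1f}, p={p:.3f}")
```

A non-significant or negative gap on these indicators would be consistent with the finding that the “smart” label does not guarantee superior SDG performance.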
These results reveal structural weaknesses in the evaluative models used to guide public policies in Brazil. Current rankings are dominated by infrastructure and technocentric indicators, which provide an incomplete basis for policymaking. From an evaluation-use perspective, the findings highlight three barriers:
(1) misaligned incentives created by reputational rankings;
(2) contextual disparities that undermine cross-municipal comparability; and
(3) fragmentation between evaluation domains that results in weak or misleading policy signals.
Despite these barriers, the evaluation also identifies opportunities for policy improvement. By exposing inconsistencies between the “smart” label and actual SDG performance, the study supports the development of adaptive, integrative evaluation tools that align technological innovation with social and environmental goals. The proposed model, inspired by the SDG “wedding cake,” integrates urban intelligence indicators with sustainability outcomes, offering municipalities a path to revise priorities and strengthen public policy coherence.
Overall, the work argues that evaluations capable of influencing policy must go beyond technology-based performance measures and incorporate multidimensional, territorially informed perspectives that reflect the complexity of urban systems. Such approaches enable more just, sustainable and evidence-based urban policymaking.
Paper short abstract
This session shares early insights from the Refugee Employability Programme evaluation, showing how evidence is driving adaptive delivery and policy learning to improve refugee integration outcomes.
Paper long abstract
Background and aims
The Refugee Employability Programme (REP) was a major Home Office initiative supporting refugees in England to integrate and progress towards sustainable employment. The independent evaluation, led by Ipsos UK with RAND Europe and Renaisi, examined how programme design, local partnerships, and delivery models influenced employability and integration outcomes. This session shares interim findings and explores how evaluation evidence informed real-time programme learning and policy adaptation.
Methods and approach
Using mixed-methods, theory-based and quasi-experimental designs (QED), the evaluation combines process and impact analysis, fieldwork across multiple regions, and a review of monitoring data. Interviews with delivery partners, local authorities, and refugees illuminate variations in delivery, coordination mechanisms, and contextual challenges. To estimate programme impacts, the evaluation applies a QED using HMRC administrative data. This approach enables the assessment of employment entry and progression patterns over time, providing credible counterfactual evidence on the programme’s contribution to labour-market integration.
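The abstract does not name the estimator, but as a minimal sketch of one common QED of this kind, assuming a person-quarter employment panel derived from administrative records (all variable names are hypothetical), a difference-in-differences specification could look like:

```python
# Minimal difference-in-differences sketch on an assumed person-quarter panel.
# "in_employment" (0/1), "rep_participant" (treated group), "post" (after
# programme start) and "person_id" are illustrative names, not the actual data.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("employment_panel.csv")  # hypothetical extract

model = smf.ols("in_employment ~ rep_participant * post", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["person_id"]}  # cluster by person
)
print(model.params["rep_participant:post"])  # DiD estimate of programme impact
```

The actual evaluation may use matching or other counterfactual strategies; the point of the sketch is simply how administrative panel data supports a before/after, treated/comparison contrast.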
Key findings and insights
Early findings highlight that strong partnerships, co-location of employability and integration services, and flexible employer engagement underpin success. Data-sharing barriers and regional labour market differences create challenges. The evaluation demonstrates how participatory, data-informed learning loops can support adaptive implementation and continuous improvement.
Conclusions and implications / potential for impact
This evaluation has the potential to generate actionable insights that will support the development of future refugee employment programmes. By triangulating multi-region implementation findings with robust QED evidence on employment outcomes, the study can identify which delivery models, partnership arrangements and support components are most effective in enabling sustained labour-market integration. It also has the potential to illuminate system-level barriers - such as coordination challenges, caseload pressures and access to wider services - that shape programme performance, thereby providing a stronger evidence base for enhancing delivery coherence and operational resilience. In doing so, the evaluation can inform future policy decisions on scaling, targeting and commissioning, ensuring that forthcoming programmes build on what works and are better equipped to address the structural and contextual challenges faced by refugees entering the UK labour market. This session will feature partners’ reflections on how evidence informed the REP’s delivery and broader policy on refugee employability. It offers practical insights into bridging evaluation and action within complex, multi-stakeholder programmes.
Paper short abstract
This presentation uses principles of utilization-focused evaluation to examine two real-world evaluations of programmes addressing serious youth violence in England. Drawing on these cases, we identify practices that meaningfully involve stakeholders throughout all stages of the evaluation process.
Paper long abstract
While evaluations, especially impact evaluations, are now standard practice in the delivery of social programmes, many are characterised by the limited engagement of the programme actors (members of organisations or programmes that are being evaluated). Rarely, and only in specific evaluation models, such as participatory approaches, is co-production treated as fundamental to the evaluation process. Whether in impact, outcome, or process evaluations, there is minimal consideration of how programme actors actually engage with the evaluation itself, its outputs, and the implementation of findings. Yet meaningful engagement throughout the evaluation cycle signals trust in the data, relevance of insights, and practical utility - hallmarks of evaluation that creates genuine value.
This presentation critically examines two real-world evaluations of programmes addressing serious youth violence in England, exploring two fundamental questions: ‘What makes an evaluation design and its implementation truly engaging?’ and ‘What value does this engagement add to the relationship between evaluators and those involved in delivery, and ultimately to the successful implementation of evaluation outputs?’
We argue that engagement, measured through active participation in design, data collection, sense-making discussions, and uptake into organisational decision-making, should be recognised as a core indicator of evaluation success, not merely a desirable by-product. Drawing on the theoretical framework of utilization-focused evaluation, which emphasises making evaluations useful and relevant to stakeholders (Patton, 2008), and on the principles of bottom-up evaluation, we also demonstrate how such processes create opportunities to embed evaluative thinking and an evaluative mindset among diverse stakeholders, ultimately cultivating sustainable evaluation cultures that remain helpful even after the evaluation is completed.
The presentation offers practical insights into designing for an engagement-based evaluation delivery from inception, including participatory approaches to evaluation design, co-creation of evaluation questions, collaborative data collection processes, and structured mechanisms for ongoing dialogue around emerging findings, including their implementation. We also consider how evaluators navigate the tension between engaging programme stakeholders on one hand and maintaining independence on the other, examining how this balance influences evaluator-stakeholder relationships in practice. We conclude by challenging the field to expand evaluation success metrics beyond methodological rigour and timely delivery to include the quality and depth of stakeholder engagement at critical junctures.
Paper short abstract
This panel explores how embedded learning partnerships shift evaluation from rigid frameworks to adaptive, trust-based approaches — supporting participatory design, real-time decision-making, and inclusive learning cultures across donor and grantee organisations.
Paper long abstract
Evaluation is widely promoted as essential for accountability and learning. However, in practice, traditional approaches—often rigid, judgement-oriented, and externally driven—tend to reinforce a culture of compliance rather than curiosity. Many funders are now deliberately shifting towards a learning-oriented view of evaluation: seeing it not as a mechanism for judgement, but as a strategic opportunity to strengthen programmes. This shift is prompting greater demand for more embedded evaluation roles, where evaluators work as learning partners. This closer relationship helps funders and partners more meaningfully interpret evidence and make real-time, evidence-based decisions that support programme adaptation for greater impact. Such roles create the conditions for staff to engage openly with feedback and to view evaluation as a supportive, adaptive learning process rather than an accountability exercise that occurs at the end of the programme.
In response to this shift, Triple Line has been working with Porticus since 2023 as an embedded Learning Partner. This collaboration spans Porticus’ global education portfolio, engaging both programme staff and grantee partners to co-design programmes, facilitate learning processes and foster an organisational ethos of continuous learning, reflection and adaptation. The partnership aims to move all players (Porticus, Triple Line, and grantee partners) beyond compliance-driven monitoring and evaluation, to cultivate a culture of inquiry grounded in trust, curiosity and shared purpose.
Drawing on their experience, speakers from Triple Line and Porticus will share practical insights into how embedded learning practices are being integrated into day-to-day work at both programme and organisational levels. The session will explore participatory programme and MEL framework co-creation, iterative reflection tools, and strategies for embedding evidence-informed decision-making within complex systems. It will also examine the challenges of cultivating a learning culture within donor agencies and across grantee organisations—including navigating power dynamics and enabling genuine collaboration.
Importantly, the panel will include short recorded contributions from two grantee partner organisations working on the ground. These voices will highlight how embedded evaluative activities have supported their own learning and real-time decision making, and the challenges they have encountered along the way.
We argue that cultivating a learning culture demands a fundamental shift in how evaluation is conceived and practiced. Rather than focusing solely on metrics and outcomes, evaluative processes must enable collaborative sense-making, support emergent learning and be responsive to context. This approach not only strengthens programme effectiveness but also contributes to more equitable and inclusive development practice.
Paper short abstract
This presentation discusses the design of the Behaviour Hubs programme evaluation, outlining the opportunities and challenges encountered at each step. The innovative design combined Realist Evaluation and Qualitative Comparative Analysis (QCA) with a survey designed specifically for QCA analysis.
Paper long abstract
This presentation discusses the design of the Behaviour Hubs programme evaluation. The Behaviour Hubs programme was launched to support schools and Multi-Academy Trusts (MATs) in improving pupil behaviour. The programme encouraged 'lead' schools and MATs with exemplary behaviour cultures to collaborate closely with 'partner' schools seeking to improve their pupil behaviour. Its objectives were to ensure that more teachers felt supported by senior leaders in managing misbehaviour, and understood and consistently applied their school's behaviour policy, ultimately leading to fewer incidents of disruptive behaviour.
The programme, which supported over 650 schools, was built on centrally organised bespoke resources and a taskforce of behaviour advisers, delivering customised specialist training, networking events and open days, and encouraging the building of relationships between schools.
The evaluation aims were to: a) determine whether the programme had met its strategic objectives and achieved its projected outcomes for schools, staff, and pupils; b) understand how and why the intervention did (or did not) meet its objectives; and c) investigate the change mechanisms triggered by the programme that produced the observed outcomes and impacts, examining variation across different schools and respondent groups.
The combination of Realist Evaluation and Qualitative Comparative Analysis (QCA) was considered the most appropriate design because of its focus on change mechanisms and contextual variation, as well as its ability to generalise findings to medium and large numbers of cases (the survey received responses from 105 of the 650+ participating schools).
The design was innovative because while there are relatively few examples of QCA applications to large N datasets and survey data, there are almost none in evaluation. The presentation outlines the opportunities and challenges encountered, going through each step, from model specification following exploratory case study work, to the design of a bespoke QCA survey to obtain a dataset of consistently comparable cases, through to calibration, running the QCA algorithms, and interpreting and presenting the findings.
It shows the kind of causal patterns QCA is able to discover, their fit to the impact evaluation questions, and the transparency and repeatability of analysis procedures.
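As a hedged illustration of the calibration step mentioned above, assuming Ragin’s direct method of transforming raw survey scores into fuzzy-set membership (the 1–10 scale and the three anchors below are invented for the sketch, not those used in the evaluation):

```python
# Direct calibration: raw scores become fuzzy-set membership (0-1) using three
# anchors, with log-odds of +/-3 at the full membership / full non-membership
# thresholds, so the crossover point maps to membership 0.5.
import math

def calibrate(x, full_non, crossover, full_in):
    """Return fuzzy-set membership of x given three calibration anchors."""
    if x >= crossover:
        scalar = 3.0 * (x - crossover) / (full_in - crossover)
    else:
        scalar = -3.0 * (crossover - x) / (crossover - full_non)
    return math.exp(scalar) / (1 + math.exp(scalar))

# e.g. a 1-10 survey scale: 2 = fully out, 5.5 = crossover, 9 = fully in
for raw in [2, 4, 5.5, 7, 9]:
    print(raw, round(calibrate(raw, 2, 5.5, 9), 3))
```

Calibrated memberships like these are what the QCA algorithms then minimise into the configurational patterns the presentation describes.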
Paper short abstract
Systems change takes more than strategy, it needs a culture of learning, reflection and adaptation. We’ll share how we’re collectively building an embedded evaluation culture in a multi-year programme, explore what we’ve learned, and hear from partners who’ve been part of the journey.
Paper long abstract
Connected Futures is a place-based systems change programme designed to transform the journey from education to employment for young people facing exclusion and disadvantage. Tackling complex, cross-cutting challenges, the programme supports the development of locally tailored approaches to youth employment by placing young people at the heart of the system, from schools and employers to housing, health, and care.
We’re excited to share what we’ve learnt from Connected Futures. As we continue to deliver this programme, we remain acutely aware of the challenges inherent in this approach, particularly the tension between empowering local partnerships to lead learning and responding to expectations for evidence that can influence wider policy. We’ll explore these trade-offs, including navigating power dynamics, working within fixed funding structures while supporting adaptive learning, and sustaining engagement across complex systems. Partners from the programme will join us to share their experiences of integrating evaluation into their systems change activity and decision making.
We’ve cultivated an evaluation culture from the outset, adopting a developmental evaluation approach to support real-time learning in a dynamic systems environment, including rapid feedback loops and iterative sensemaking. This enabled partnerships to adapt to emergent conditions and shifting priorities. A dedicated learning partner provided strategic oversight, building capacity across local partnerships to use evidence, challenge assumptions, and strengthen coherence across diverse workstreams.
As partnerships deepened their understanding of local barriers and systemic opportunities, they began testing different approaches and looking for early signs of traction with stakeholders across the system. To support this, we commissioned embedded action researchers in each local area to facilitate reflective practice, capture emergent insights, and strengthen local learning loops.
Commissioning was intentionally collaborative and co-designed with local areas to build trust, leverage local knowledge, and foster a sense of joint ownership over learning and adaptation. Together, the learning partner and action researchers helped shift evaluation from a reporting function to a shared practice of inquiry, prioritising participation, building capacity, strengthening relationships, and driving more coherent and impactful systems change. Theories of change were co-created with local partnerships as evolving tools for sensemaking, alignment, and adaptation.
We’re still learning what it takes to embed an evaluation culture within complex local systems. This session shares what’s worked, what’s been hard, and what we’re still figuring out, with voices from those who’ve lived and shaped the work.
Paper short abstract
Participatory research exploring how services collaborate to support people facing Multiple Disadvantage, fostering reflection, shared learning, and evidence-informed decision-making across government and third-sector partners.
Paper long abstract
This presentation shares insights and lessons learned from a national research programme exploring how services can work together more effectively to support people experiencing Multiple Disadvantage. Across multiple workstreams, we delivered participatory research activities designed to foster learning, reflection, and collaborative decision-making across central government, local authorities, and third-sector partners.
Key activities and workstreams included:
• Workstream 1: Systems mapping and theory of change workshops to align stakeholders around shared goals.
• Workstream 2: Co-designing thematic research priorities and local area knowledge products to ensure evidence generation meets practical needs.
• Workstream 3: Collaborative development of beneficiary monitoring and identification approaches.
• Workstream 4: Value-for-Investment frameworks to support understanding of service outcomes.
These activities embedded reflective practice, collaborative inquiry, and the use of evidence into everyday decision-making.
The presentation will highlight:
• How participatory research fosters shared understanding and alignment across services, drawing on lessons learned from Workstreams 1–4 above.
• Practical lessons from designing and coordinating multiple interlinked workshops and engagement sessions across complex systems.
• How responsive, collaborative client relationships enable iterative learning, adaptation, and actionable insights.
By reflecting on these experiences, the session demonstrates how participatory research can strengthen multi-agency collaboration, embed learning into decision-making, and support evidence-informed approaches to complex social issues like Multiple Disadvantage.
Paper short abstract
Fast, practical workshop: build a causal map from interviews in the Causal Map app. Learn manual coding, option to try AI-assisted suggestions. Leave with an understanding of what causal mapping can offer as an approach, an ethics/quality checklist and basic knowledge of how to use the app.
Paper long abstract
Evaluators need to turn rich qualitative material into visuals to communicate findings and assist with evaluative judgements about pertinent drivers and outcomes. In this 40-minute, laptop-open workshop we guide participants through creating a credible causal map from interview excerpts using the Causal Map app, combining manual and optional AI-assisted coding. Causal Map is free to use for manual coding and public projects.
What we’ll do:
• 00–05: Why causal mapping? Quick examples of causal links and how maps support shared sense-making.
• 05–10: Interface tour and setup. Load a small, anonymised dataset (provided), skim transcripts, and discuss coding rules such as “no link without a quote”.
• 10–20: Manual first. Participants manually code 3–5 links; we merge near-synonyms.
• 20–30: Optional AI-assist. Instruct the AI to continue the coding (or continue with manual coding). We review the output, refine the instructions, regenerate suggestions, and then accept, modify, or reject individual causal links.
• 30–35: Make it communicable. Create a filtered map for a target audience to answer a specific question, add automatic narrative summaries, and export and share with clickable evidence.
• 35–40: Debrief: limits, mitigations, and next steps.
Learning outcomes: how to…
- Build a causal map that enables tracing from every node/link to the original quotes (see the sketch after this list).
- Optionally, use AI-assisted coding safely with human oversight and clear acceptance rules.
- Produce narrative vignettes and filtered causal maps which support specific research questions and evaluative judgements.
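A minimal sketch of the data structure behind these outcomes, assuming a simple link-with-quote representation (this is illustrative, not the Causal Map app’s internal model):

```python
# Every causal link carries its source quote, so the "no link without a quote"
# rule holds by construction and provenance stays traceable.
from dataclasses import dataclass

@dataclass
class CausalLink:
    cause: str
    effect: str
    quote: str       # verbatim evidence, required for every link
    source_id: str   # interview/transcript identifier

links = [
    CausalLink("training attended", "confidence", "The course made me braver...", "INT-03"),
    CausalLink("confidence", "applied for job", "...so I finally applied.", "INT-03"),
]

def filtered_map(links, focus):
    """Return only links touching a focus factor, e.g. for a target audience."""
    return [l for l in links if focus in (l.cause, l.effect)]

for link in filtered_map(links, "confidence"):
    print(f'{link.cause} -> {link.effect}  [{link.source_id}: "{link.quote}"]')
```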
Who it’s for:
Evaluators, analysts, and commissioners who work with qualitative data, including interviews and reports, and who need fast, transparent synthesis of causal links in the text. No prior mapping experience required.
What to bring / setup:
Laptop + browser. We provide sample data (anonymised) and a one-page crib sheet. Participants will also find out how to upload their own material.
Why UKES Theme 3?
The focus is communication: turning transcripts into visuals with transparent provenance supports evaluative judgements and means evaluators can answer evaluation-relevant questions in an easily communicable way, maintaining rigour and traceability.
Paper short abstract
How can we ethically and robustly evaluate domestic abuse recovery services for children? This roundtable brings together evaluators, delivery partners, and lived experience experts to explore barriers, solutions, and lessons from two pioneering UK pilot RCTs.
Paper long abstract
Domestic abuse affects one in five children in England. The consequences can be profound and enduring, from poor mental and physical wellbeing to difficulties with building healthy relationships in the future. Only 29% of parents seeking support for their child(ren) are able to access it. Robust evidence on what works to improve outcomes for children affected by domestic abuse is lacking because domestic abuse recovery services for children remain under-evaluated. Without evidence on what shifts the dial on outcomes, policymakers and funders lack the confidence to sustainably invest in services that could transform the lives of many children and young people if delivered at scale. At the same time, evaluating these services is complex and requires thoughtful consideration of the ethical concerns around some methodologies and appropriately mitigating these concerns in the evaluation design.
This roundtable will explore how impact evaluation can be done ethically and effectively in this complex policy area, drawing on pioneering randomised controlled trials (RCTs) of two recovery programmes: 1) Bounce Back 4 Kids, a trauma-informed programme for children aged 3 to 11 years and their non-abusive parents, and 2) WeMatter, an online group recovery programme for children and young people aged 8 to 17 years.
The discussion will bring together a small group of evaluators, programme facilitators, and lived experience experts across the two projects to share practical lessons and discuss key questions such as:
- How can evaluators and service providers collaborate effectively to maintain programme quality, participant wellbeing, and methodological rigour?
- What practical advice does the panel have for ensuring service providers, evaluators and those with lived experience work in genuine partnership?
- How can we balance methodological rigour with ethical concerns in evaluations involving vulnerable children and families?
- What strategies help to overcome barriers to recruitment, retention, and resource constraints?
- What role do evaluators and commissioners play in supporting service providers to build their evaluation capacity?
- How might evidence help make the case to secure sustainable funding and why is this important?
At the time of the conference, both projects will be well into the delivery of the full-scale trial, offering a unique opportunity to reflect on early learning and the transition between pilot and full-scale phases. Attendees will leave with insights into collaborative and iterative evaluation approaches, ethical design, and strategies for embedding evaluation cultures in under-evaluated policy areas. This session will demonstrate how generating evidence on what works, for whom, and in what context can shape policy and funding decisions, ensuring more children can receive the support they need.
(Note to abstract reviewers: We have not confirmed exactly who from the project teams will participate, as it has been a very busy time for them wrapping up the pilots of these evaluations this month. But there is a lot of interest across both project teams in being involved in this roundtable discussion. We are also proposing an independent academic expert to moderate the discussion, who is from neither Foundations nor any of our partnering organisations.)
Paper short abstract
Evaluation consulting in the impact sector presents a paradox, as evaluators must serve as both objective 'outsiders' and collaborative co-designers. This paper advocates for learning partnerships in education and youth services to mediate this tension.
Paper long abstract
Evaluation consulting in the social impact sector embodies a paradox. Independent evaluators, influenced in part by the culture and ethos of private management consultancies such as the 'Big Three' and in part by the need to ensure objectivity, are encouraged to assume the role of the 'other' or 'outsider' in client-consultant relationships. Paradoxically, this very distance can foster transactional dynamics that undermine the collaborative conditions necessary for meaningful evaluation of social programmes. Evaluators systematically evaluate organisational practice but rarely face scrutiny of their own methodological assumptions, positional power, or contextual understanding. This one-way accountability becomes increasingly problematic as policy demands for evidence-based practice intensify. Without critical reflection on these consulting models, we risk institutionalising transactional rather than transformative approaches to evaluation.
This presentation draws on a critical literature review (Grant and Booth, 2009) of academic and grey literature to examine prevailing evaluation consulting models in the UK social impact sector. Anchored in Gaventa's Power Cube framework (Gaventa, 2006) and Blyde's consultant-client relationship typology (Blyde, 2008), it addresses two core questions: (a) What are the limitations and systemic risks of standard evaluation consulting arrangements, particularly regarding accountability gaps and epistemic asymmetries? and (b) Can “learning partnerships” offer a transformative alternative, redistributing power, embedding mutual accountability, and prioritising organisational learning alongside evaluative judgement?
I argue that learning partnerships, characterised by transparent negotiation of evaluator positionality and explicit capacity-building commitments, can address the fundamental power imbalances inherent in traditional consulting relationships. These partnerships are especially promising in sectors where power dynamics critically shape service quality, such as education, social care, and youth services. The presentation concludes by exploring why learning partnerships remain rare, despite their theoretical appeal, and examining the structural barriers to their design and implementation. It then proposes practical suggestions for evaluation commissioners and practitioners seeking to operationalise more equitable evaluation consulting approaches.
Paper short abstract
Automating Indicator Targets vs Actuals reporting through real-time MEL Technologies enables live tracking and analysis of program performance. This approach supports timely reflection, collaborative learning, and evidence-informed decisions throughout the program cycle.
Paper long abstract
Evaluation findings often arrive too late to influence decision-making, limiting their utility for adaptive management. Manual data collection and reporting introduce delays and inconsistencies, creating a disconnect between technical findings and actionable insights that undermines evaluation goals of supporting learning and improving effectiveness.
To address this, Mercy Corps is rolling out MEL Technologies globally, including CommCare for case management and offline data collection, Microsoft Azure for cloud data engineering and storage, and Power BI for interactive analysis and visualization. A key component of this rollout is a standardized approach to automating Indicator Targets vs Actuals reporting across all Mercy Corps country offices, enabling MEL champions to design and deploy dashboards that transform raw data into timely, actionable insights. As a result, country teams are well equipped to monitor performance in real time, identify gaps, and take corrective action long before final reports and evaluations are produced.
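For readers unfamiliar with the mechanics, here is a minimal sketch of the core Targets vs Actuals computation such dashboards automate. The real pipeline runs through CommCare, Azure and Power BI; the CSV inputs, column names and 80% threshold below are illustrative assumptions only:

```python
# Aggregate monitoring actuals, join to targets, and flag off-track indicators.
import pandas as pd

monitoring = pd.read_csv("indicator_actuals.csv")   # hypothetical extract
targets = pd.read_csv("indicator_targets.csv")      # hypothetical extract

actuals = monitoring.groupby(["indicator_id", "quarter"], as_index=False)["value"].sum()
report = actuals.merge(targets, on=["indicator_id", "quarter"],
                       suffixes=("_actual", "_target"))
report["achievement_pct"] = 100 * report["value_actual"] / report["value_target"]
report["off_track"] = report["achievement_pct"] < 80  # illustrative threshold

print(report.sort_values("achievement_pct").head())  # surfaces gaps for corrective action
```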
The approach aligns with utilization-focused evaluation principles, emphasizing practical application, stakeholder engagement, and actionable evidence. Automated dashboards are designed not only to display performance metrics but also to support interpretation and reflection across diverse audiences. Through visual storytelling, interactive features, and co-created analytics, program teams and decision-makers can collectively understand trends, co-design solutions, and embed learning throughout the program cycle.
A key strength of Mercy Corps’ approach is its capacity-building pathway, which moves from basic to advanced automation techniques. Recent MEL Tech trainings (delivered in English, French, and Spanish) reached over 130 MEL champions across 30+ countries in Latin America, Africa, the Middle East, Asia, and Eastern Europe, equipping them to develop automated data engineering processes and dashboards that establish consistent Indicator Targets vs Actuals analysis, as well as to conduct country portfolio-level analysis, organizational outcome measurement, and automated participant counts. This structured training approach ensures consistency, promotes shared understanding, and enables scaling of automated MEL Tech systems across diverse programs and countries.
This session will share Mercy Corps’ experience in standardizing Indicator Targets vs Actuals automation, highlighting technical design, training approach, and lessons learned. Participants will explore how combining MEL technologies, capacity building, and utilization-focused evaluation can transform evaluation from a static reporting exercise into a dynamic, collaborative practice and turn data into actionable insights that drive real-time program improvement and learning across multiple contexts.
Paper short abstract
Recognising that many academic studies focus on a narrow band of outputs, this in-depth study uses a participatory approach to co-create and develop a new set of dimensions through which to view and evaluate collaborative capital building and value co-creation in RDI contexts.
Paper long abstract
As well as aligning to ‘Bridging the Gap: evaluation to action’, this paper also relates to the conference theme of 'Building evaluation cultures'. Through an in-depth study of a successful university-based cooperative research centre (a longitudinal study over a 10-year period conducted as a part-time PhD), this paper unpacks the lived experiences of a range of stakeholders (including academics and a diverse range of industry participants, from SMEs to large industry primes) involved in collaborative Research, Development & Innovation (RDI). The partners share a long-term goal of achieving a paradigm shift in the way pharmaceuticals are manufactured, from current batch manufacturing to more efficient and sustainable continuous manufacturing technology, systems and processes. In this collaborative RDI context, the co-production of evidence was generated using participatory and user-centred methods, supporting reflection and learning as an integral part of the evolution of the technology, products, processes and the evolving partnerships across the innovation ecosystem.
This area of advanced manufacturing is critically important to the UK economy (one of the key sectors in the UK Industrial Strategy), and the study of a successful case demonstrates how effective monitoring and evaluation has been embedded in such a way as to ensure a focus on delivering impacts from the outset. Having an agreed shared goal is considered critical to maintaining focus and driving interaction, creativity and collaboration amongst partners.
This study has highlighted a broader range of important metrics being used in the monitoring, evaluation and management of collaborative RDI, and this paper demonstrates how this approach has enabled evaluation to play a key role in everyday decision-making and delivery, contributing to a culture of entrepreneurial action and the evolution of the new technology, systems and processes.
Paper short abstract
This session explores developmental evaluation (DE) as a tool for driving systemic change at the intersection of climate, social justice, and development. Through case studies, we show how DE enables real-time learning, adaptation, and policy influence in complex, multi-stakeholder contexts.
Paper long abstract
Systemic change in climate and development programming requires adaptive, learning-oriented approaches that go beyond traditional evaluation models. This panel presentation explores developmental evaluation (DE) as a tool for influencing policy and programme change at the nexus of the climate crises, social justice and development. DE emphasizes real-time learning, iterative adaptation, and stakeholder engagement—critical elements for navigating complexity and uncertainty. Drawing on two case studies, we illustrate how DE has informed strategic shifts and strengthened resilience in diverse contexts:
• Ford Foundation’s BUILD Programme: A global initiative to enhance the institutional capacity of social justice organizations. The evaluation demonstrated how DE can support long-term systems change by embedding learning into organizational strengthening strategies.
• Climate Ambition Support Alliance (CASA): An ongoing evaluation of a multi-country programme aimed at accelerating climate ambition in vulnerable regions. Here, DE facilitates adaptive management and policy engagement in response to evolving climate and geopolitical crises.
The session will highlight practical insights on:
• How DE fosters systemic change by influencing programme design and policy dialogue.
• Lessons for applying DE in both domestic and international development contexts.
• Challenges and opportunities in integrating DE within complex, multi-stakeholder initiatives.
Participants will gain actionable strategies for leveraging evaluation as a driver of systemic change, particularly in programmes operating at the intersection of climate, crisis, and development.
Paper short abstract
How can evaluators use third-party monitoring (TPM) evidence without treating this as “just M&E”? I argue TPM deserves its own evidence space distinct from Evaluation and M&E data. My talk draws on humanitarian TPM in Myanmar to offer shared terms and practical design tips for ethical, credible use.
Paper long abstract
Access constraints, remote management, and duty-of-care risks have made third-party monitoring (TPM) a defining feature of development and humanitarian delivery in many contexts. Yet evaluators often inherit TPM datasets late in the cycle, misread verification of delivery, quality and use as routine monitoring, or discount TPM evidence because its findings, methods, and governance do not map neatly onto evaluation practice.
This presentation argues that Evaluation, Monitoring & Evaluation (M&E), and TPM overlap—but none is a subset of the others. Each is shaped by different purposes and incentives: evaluators are commissioned to make defensible claims about merit, worth, and contribution; M&E systems prioritise performance reporting; and TPM is organised around independent verification, risk management, and operational accountability. When evaluators treat TPM as “just M&E”, they miss TPM’s distinctive evidentiary value, which puts their findings at risk of either overconfidence (e.g., treating verification as impact) or unwarranted scepticism (e.g., discarding useful findings).
Crucially, any evaluator of humanitarian and development assistance will eventually confront diversion and fraud, waste and abuse (FWA). These are not rare exceptions; they are predictable risks. Rather than treating FWA as taboo or as “audit-only” topics, this talk offers practical, ethical ways to detect, test, and communicate possible diversion/FWA without turning evaluation into an investigation or putting people at risk: asking questions and making observations that identify warning signs without prompting accusations; cross-checking claims across sources (including patterns in micro-narratives); being clear about where the data came from and how it was handled; agreeing in advance what counts as a serious concern; and reporting uncertainty calmly and proportionately.
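As a hedged sketch of one cross-checking idea from this talk, consider comparing partner-reported deliveries with TPM-verified observations per site and flagging discrepancies above a pre-agreed threshold as risk signals (the site names, figures and 15% threshold are all illustrative assumptions):

```python
# Compare reported vs verified delivery counts per site; gaps above the
# pre-agreed threshold become risk signals for follow-up, not accusations.
reported = {"site_a": 1200, "site_b": 800, "site_c": 950}   # partner reports
verified = {"site_a": 1150, "site_b": 510, "site_c": 930}   # TPM spot checks

THRESHOLD = 0.15  # agreed in advance: what counts as a serious concern

for site, r in reported.items():
    v = verified[site]
    gap = (r - v) / r
    status = "RISK SIGNAL" if gap > THRESHOLD else "within tolerance"
    print(f"{site}: reported={r}, verified={v}, gap={gap:.0%} -> {status}")
```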
Using a case example of TPM of humanitarian assistance across Myanmar, I show how embracing the TPM paradigm opens pathways of inquiry that conventional evaluation designs underuse: (1) evidence about implementation fidelity, who was “reached” and who was not; (2) deliberate use of “negative evidence” (non-delivery, substitution, or obstruction) to test causal claims; and (3) fast feedback loops that can inform decisions before a final report. The case example draws upon short, structured stories from participants (micro-narratives, in the spirit of SenseMaker) to complement checklists and numbers, and to help explain context and unintended effects. Used well, these stories strengthen cross-checking across sources and help surface issues that people may not name directly.
The talk sits squarely within Theme 2: how evaluation can be embedded into everyday decision-making, learning, and delivery; what helps create environments where evidence is valued and used; and how ethical considerations and power dynamics shape whose voices are heard. I address practical safeguards for voice, safety, and bias when “independence” is contractual and access is uneven, including managing gatekeeper influence and being transparent about who collects, controls, and interprets the information.
I conclude with a shared vocabulary, e.g., verification, validation, triangulation, fidelity, reach, risk signals, and evaluative claims, to make TPM data more interpretable, more comparable across time and areas, easier to use responsibly, and better able to support evaluative reasoning.
Paper short abstract
Get guided, hands-on practice on AI tools and workflows used in UN case studies where AI achieved >90% validated accuracy. Activities include AI analysis of interviews, reports, and survey responses, as well as practice with features like AI avatar interviewers, visualizations and chatbots.
Paper long abstract
As evaluation teams face growing volumes of qualitative data, tighter timelines, and rising expectations for timely learning and use, AI is increasingly positioned as part of everyday evaluative practice. Yet many evaluators remain rightly cautious: How accurate is AI compared to human analysts? Where does it genuinely add value? And how can it be used ethically, transparently, and without reinforcing bias or hallucinations?
This interactive workshop addresses these questions through real-world UN evaluation case studies, where AI methods were systematically benchmarked against human evaluators and independently validated at >90% accuracy. Rather than focusing on theory or speculative futures, the session emphasises practical workflows, governance approaches, and hands-on application that evaluators can immediately translate into their own work.
Participants will explore three applied case studies drawn from UN evaluations:
1) AI Interview Transcript Analysis (UNHCR):
AI was used to analyse 50 qualitative interview transcripts, generating thematic, subgroup, and segment-specific insights aligned with evaluation questions. Results were benchmarked against human coding and validation processes, demonstrating how AI can support rigorous qualitative analysis while dramatically reducing time and cost.
2) AI Avatar Interviewers (UNESCO):
AI avatars were deployed to conduct 50 interviews in two days, enabling multilingual, culturally sensitive data collection at scale. This case illustrates how AI can expand reach to under-represented groups, reduce interviewer burden, and support more inclusive and adaptive evaluation designs.
3) AI Document and Survey Analysis (UNICEF):
AI analysed over 700 management responses and survey entries across 160 evaluation reports in five languages, identifying cross-cutting barriers, enablers, and patterns that would have been impractical to detect manually. The case demonstrates how AI can support synthesis, learning, and utilisation across portfolios.
Beyond showcasing results, the workshop focuses on how these outcomes were achieved responsibly. Participants will learn how human-AI benchmarking was conducted, how hallucination risks were mitigated, and how ethical safeguards, such as human-in-the-loop review, bias checks, and transparent documentation, were embedded into evaluation workflows.
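As a hedged illustration of the benchmarking step, assuming AI and human analysts coded the same excerpts (the labels and data below are invented; the case studies’ validation protocols may differ), percent agreement and Cohen’s kappa can be computed as:

```python
# Benchmark AI coding against human coding on the same items: raw agreement
# plus Cohen's kappa, which corrects for agreement expected by chance.
human = ["protection", "livelihoods", "shelter", "protection", "livelihoods"]
ai    = ["protection", "livelihoods", "protection", "protection", "livelihoods"]

agreement = sum(h == a for h, a in zip(human, ai)) / len(human)

labels = set(human) | set(ai)
p_e = sum(
    (human.count(l) / len(human)) * (ai.count(l) / len(ai))
    for l in labels
)  # chance agreement
kappa = (agreement - p_e) / (1 - p_e)
print(f"agreement={agreement:.0%}, kappa={kappa:.2f}")
```

Reporting kappa alongside raw agreement is one common way to make a “>90% validated accuracy” claim auditable.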
A core feature of the session is hands-on participation. All attendees will be provided with complimentary access to the AI tools used in the case studies and guided through live exercises. Participants will have the option to work with their own evaluation data or with provided sample interviews, reports, and survey responses. Activities include:
- Analysing qualitative data and survey responses using AI-assisted workflows
- Creating AI avatar interviewers tailored to specific evaluation contexts
- Generating visualisations and dashboards for sensemaking and communication
- Interacting with an AI chatbot to query evaluation findings
By the end of the session, participants will leave with a clear understanding of where AI meaningfully strengthens evaluation practice, how to apply it ethically, and how it can help bridge the persistent gap between evidence generation and action. The workshop directly contributes to building evaluation cultures that value learning, timeliness, inclusion, and responsible innovation, aligning with the conference theme of “Bridging the Gap: Evaluation to Action.”
Paper short abstract
This research piece reviewed the impacts of Gender Equality and Social Inclusion in Foreign, Commonwealth and Development Office programming in Somalia. Evidence was provided to inform adaptive programming in a complex context and under policy shifts of reduced Overseas Development Assistance.
Paper long abstract
The Equalities research piece provides lessons on Gender Equality and Social Inclusion (GESI) mainstreaming to inform an adaptive approach to programming within the context of reduced Overseas Development Assistance (ODA) in Somalia. Its findings and recommendations provide lessons on the use of evaluative review to adapt programmes based on evidence of what works, in the face of barriers and an uncertain political context and policy environment.
This research was delivered under the Foreign, Commonwealth and Development Office’s (FCDO) Somalia Monitoring Programme III (SMP III). SMP III builds FCDO understanding of development needs in Somalia by providing actionable learning to improve the design and delivery of programmes. UK ODA allocations to Somalia have fluctuated substantially in recent years; these shifts affect spending and programmatic activities targeted towards women and other marginalised groups. In 2026, the UK will continue to reduce the aid budget, reaching 0.3% of gross national income by 2027/28 (having reduced it from 0.7% to 0.5% in 2021). The Equalities research addresses urgent questions about who is reached by programmes and how Equality outcomes can be sustained as budgets reduce.
The Equalities piece was completed over four months. The objectives of the research were: 1) to understand the extent to which Equality was a consideration in the design of programmes, and the extent of reporting against Equity; 2) to understand how the application of the GESI Strategy advanced or maintained Equality expectations, and what lessons can be embedded into future programming; and 3) to understand the potential impact of reduced funding to FCDO programming in Somalia on Equality.
The team analysed the Equality considerations in design documents and their contextual grounding, the extent to which Programme Results Frameworks and log frames included and measured Equality indicators, data disaggregation against the nine protected characteristics, and the extent to which programme Value for Money frameworks captured Equality data. The team also conducted key informant interviews (KIIs) with key programme staff from the FCDO. The team sought to identify the Equality gains achieved, how these can be sustained after programme closure, and, crucially, how Equality is viewed in the Somali context. The outcomes of this research 1) built FCDO understanding of the current achievements of GESI mainstreaming, and 2) developed recommendations, in collaboration with the FCDO, for future programme design and adaptation in the face of reduced ODA funding.
The Equalities research offers a case study of how the SMP III team used evaluative review to generate evidence-based, actionable insights for adapting programmes. It also provides lessons on how Equalities can continue to be monitored in the face of increasingly challenging contexts and barriers to GESI programming.
Paper short abstract
This presentation demonstrates how scaling assessments enable evaluators to influence policy and programme change by assessing the viability, costs, adaptations and risks of expanding proven interventions, using practical tools and Tetra Tech case studies.
Paper long abstract
Scaling assessments are an underused but powerful approach for closing the gap between evidence and large-scale change. This presentation explains how systematic scaling assessments can influence policy and programme change by providing decision makers with clear, practical judgments about whether, how and under what conditions proven interventions can be expanded to benefit many more people.
Drawing on Tetra Tech International Development’s experience across a range of sectors, including child safety, parenting, food security, disease prevention, biodiversity preservation and water and sanitation, we will set out a pragmatic framework for assessment. Our framework examines both intrinsic features of the model and the external systems that determine whether replication or expansion is feasible. Key dimensions include credibility, observability of results, adaptability to new contexts, affordability at scale, incentives and capabilities of adopting organisations, and the policy and budget environment. The approach uses structured checklists and scoring tools to surface strengths, identify critical risks and prioritise information gaps that need to be filled before a full scale-up is attempted.
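As a hedged sketch of what such a scoring tool might look like (the dimension names follow the abstract, but the weights, 1–5 scale and risk threshold are illustrative assumptions, not Tetra Tech’s actual rubric):

```python
# Weighted scorecard over the scaling dimensions named above; low-scoring
# dimensions are surfaced as critical risks rather than hidden in the total.
WEIGHTS = {
    "credibility": 0.20,
    "observability_of_results": 0.15,
    "adaptability": 0.15,
    "affordability_at_scale": 0.20,
    "adopter_incentives_capabilities": 0.15,
    "policy_budget_environment": 0.15,
}

def assess(scores, risk_threshold=2):
    """scores: dimension -> 1 (weak) to 5 (strong). Returns total and red flags."""
    total = sum(WEIGHTS[d] * s for d, s in scores.items())
    risks = [d for d, s in scores.items() if s <= risk_threshold]
    return total, risks

total, risks = assess({
    "credibility": 4, "observability_of_results": 3, "adaptability": 2,
    "affordability_at_scale": 3, "adopter_incentives_capabilities": 2,
    "policy_budget_environment": 4,
})
print(f"weighted score {total:.1f}/5; critical risks: {risks}")
```

Keeping the red flags separate from the weighted total reflects the diagnostic, non-binary spirit of the assessments described below.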
I will show how assessment outputs can be translated into actionable guidance for policy makers and programme managers. Typical products include a concise articulation of scaling challenges, sequenced adaptations, monitoring priorities and a staged piloting plan. These outputs are deliberately diagnostic rather than prescriptive. They support governance decisions by clarifying trade-offs between fidelity and reach, and by specifying the evidence and implementation conditions required to preserve impact at scale.
The presentation will feature two or three short case studies from Tetra Tech practice. Each case will illustrate how assessments influenced decisions about organisational priorities, new delivery partners, and alternative, lower-cost delivery models. Participants will learn practical methods for integrating scaling assessments into evaluation portfolios so that evaluations move beyond measuring effect to shaping action. I will discuss timing and sequencing to ensure assessments inform policy windows, and how to present risk-balanced, politically savvy recommendations. I will also address common pitfalls, including over-reliance on pilot success without context analysis, and treating scalability as a binary judgement rather than a process.
By the end of the session attendees will be able to explain what a scaling assessment is, why it matters for influencing policy and programme change, and how to design one that delivers concise, credible advice for decision makers. In resource constrained times, funders and governments must be able to distinguish between programmes that are merely beneficial and those that can be transformative at scale. Scaling assessments are a practical evaluation tool to make that distinction and to increase the likelihood that proven interventions will be successfully expanded to produce sustained, population level impact.