DevLearn Digest 15: We read every MSD evaluation – and learned surprisingly little
Hello everyone,
Our findings from an exhaustive (and exhausting) review of MSD evaluations are below – but first, three quick announcements:
- Recruitment: We are recruiting for a Senior MEL Consultant to join our team. Given the chaos and destruction at USAID, we are extending the application deadline by two weeks, to the end of February. We are looking for senior consultants with a strong understanding of MEL in MSD programmes at a leadership level, and there is some flexibility in salary scales for the right candidate.
- Advanced training on MSD: We have some additional spaces available for our Advanced MSD Training Course, which will take place from the 7th to the 11th of April in Istanbul, following some understandable drop-outs from US-funded participants. Apply now if you want to join.
- Online training in MSD and MEL: Our introductory courses on market systems development and monitoring, evaluation, and learning take place in May 2025. The early bird discount is still available if you apply by the end of February.
Now, on with the newsletter…

We read every¹ MSD evaluation – and learned surprisingly little
For years, I’ve been feeling slightly guilty. Donors and programmes have invested tens of millions of dollars into evaluations of market systems development projects. But, with few exceptions, I’ve hardly read any of them. So with the help of the indefatigable DevLearn team, we set out to remedy this and read a total of 33 evaluations ranging from longitudinal masterpieces to quick-and-dirty project reviews.
At the end of this work, after hundreds of pages of text, you might have expected us to feel elated. We could finally distil the collective wisdom of years of evaluator time, of centuries of expertise, to reveal the answers to the big questions about MSD. Does it work? How? For whom? Under what conditions?
Instead, our review left us somewhat underwhelmed. Most evaluations had interesting lessons for the evaluated projects. Some had snippets of insight which might be helpful if you were designing a similar project elsewhere. But ultimately many of the lessons were bland, intuitive, or widely understood in the MSD community. Few, if any, of the evaluations challenged our received wisdom, or really managed to surprise us at all.
You may think we’re being harsh. If so, have a think about an evaluation that you read, for a project that you didn’t work on, that really surprised you and changed the way you work. My guess is that you can’t name one. If you can, please let us know, and I’ll feature the best responses in the next DevLearn Digest.
In the meantime, our attention turned to the obvious question: what’s going on?
- Scope: Evaluations cover a huge range of topics – one evaluation tried to answer almost a hundred different questions. Inevitably, this dilutes the resources that can be spent on any individual question. We thought the strongest evaluations focused on a handful of specific questions (and the least interesting evaluations tended to use the OECD-DAC criteria, which encourage evaluators to spread themselves too thinly).
- Methods: About a decade ago, evaluation gurus called for a movement away from an over-focus on randomized controlled trials and towards experimentation with alternative approaches. Unfortunately, what we got was mixed-methods, theory-based evaluations. Lots and lots of them. Nothing wrong with that, but by the end we were crying out for some methodological diversity.
- Incentives: Most evaluators were contracted and paid by the projects that they were evaluating. Many still do excellent work, but with the best will in the world, it is difficult to tell a client that their project was a disaster. Although we read hundreds of evaluation recommendations, almost every evaluation was either neutral or broadly positive about the programme it evaluated.
- Ecosystem: Only four of the evaluations we reviewed had citations on Google Scholar, and only one had more than five. Of course, citations are not the only way that learning can spread, but we also didn’t find many online mentions; most evaluations were published and then vanished. This is the sad fate of most reports – 30% of World Bank reports are never downloaded. But a small number of evaluations should go on to shape the field, and our mechanisms for finding, debating, and responding to them are extremely limited.
Perhaps we’re expecting too much from evaluations. Evaluators are, first and foremost, hired by a particular programme to give feedback on its performance. Broader learnings for the sector are often included in evaluation designs but seldom prioritized in practice.
Rather than commission more evaluations, donors and organisations could think in terms of learning agendas, working with researchers to look across multiple programmes with a specific question in mind. There are lots of interesting questions – what happens after a programme finishes? How additional is financing? Does crowding in ever really happen? Which pilots really scale up? How widespread are ‘donor darlings’? – that are best answered by looking across multiple programmes rather than by evaluating just one.
More generally, we need to think about incentives to be controversial and interesting. Perhaps there is something to learn from academia and think-tanks, where individuals and institutions make their name through new and surprising findings, challenging rather than supporting previous positions. What would this look like for MSD evaluators?
The DevLearn Team
¹ Never believe a click-bait newsletter headline; we didn’t actually read every MSD evaluation. A more accurate – if less catchy – headline would be “We read the executive summaries of every MSD impact evaluation listed on the BEAM Exchange evidence map, conducted since 2014 by a consulting firm or academic institution”. This came to 33 evaluations in total. Still a lot of reading, in our opinion.