25 July 2024

DevLearn Digest 13: The biggest MEL mistake you’ve never heard of

We are excited to launch registration for our November 2024 online courses on market systems development (MSD) and monitoring, evaluation and learning (MEL). There is an early bird discount of 10% until the 31st of August – but spaces are already starting to sell out, so if you want to build your skills in MSD and MEL, please register soon. 

All our marketing is through word of mouth, so if you have enjoyed a course or this newsletter, we would be very grateful if you could share the information with a colleague or friend. Click here to find out more and register. 

And now, on with the topic of this newsletter!

There’s a fundamental flaw in the way some organisations calculate their impact. It’s a subtle but important mistake, and it allows interventions that are completely useless – or even actively damaging – to report very positive results. Anecdotally, I have heard from several well-reputed MSD programmes that they apply this method, so I suspect that it is quite widespread.

Be warned, this post is more technical than normal, and might even require you to download an Excel spreadsheet and follow along at home. But don’t worry, there is no complex maths – so just take a deep breath, sip your tea, and let’s dive in.

Here’s the mistake, which I will call the Positivity Error. An MEL team surveys a target group at baseline and endline, before and after the intervention. They want to report against two indicators: first, the number of people improving their income, and second, the amount of additional income earned. To do this, they count the number of people whose income improved between baseline and endline, and then average the income increase among those people.

At this point, you might be thinking ‘that doesn’t sound so bad’. To illustrate the flaw, let’s look at an example from an intervention with just two farmers. The first farmer increases their income between baseline and endline, and the second experiences a decline. The method above would conclude that one farmer (50% of the total) increased their income, by a total of 20 USD. It looks like the intervention worked well.

Example intervention with two farmers

Farmer code | Baseline income (USD) | Endline income (USD)
A           | 25                    | 45
B           | 60                    | 40
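
To make the arithmetic concrete, here is a minimal Python sketch of the flawed calculation, applied to the two farmers above. The numbers come from the table; the function name and structure are purely illustrative.

```python
# A minimal sketch of the "Positivity Error" calculation, using the
# two-farmer example above. Names and structure are illustrative only.

baseline = {"A": 25, "B": 60}  # baseline income in USD
endline = {"A": 45, "B": 40}   # endline income in USD

def positivity_error(baseline, endline):
    """Count only the farmers whose income rose, and average their gains."""
    gains = [endline[f] - baseline[f] for f in baseline if endline[f] > baseline[f]]
    average_gain = sum(gains) / len(gains) if gains else 0
    return len(gains), average_gain

n_improved, avg_gain = positivity_error(baseline, endline)
print(f"{n_improved} farmer(s) improved their income, by {avg_gain:.0f} USD on average")
# Reports: 1 farmer(s) improved their income, by 20 USD on average
# ...even though total income across the two farmers is unchanged (85 -> 85).
```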

Hopefully, the simplicity of the example helps you see the flaw in the Positivity Error. By focusing just on positive changes, it will always give positive results. Since surveys typically have a lot of variation in responses, due to measurement error, weather, poor recall, and individual circumstances, there will always be some respondents who increase their income between baseline and endline. So regardless of the intervention, you can count on a positive impact.

Even with a control group, the Positivity Error will still give erroneous results. Organisations compare the change of each individual farmer in the treatment group to the change of the average farmer in the control group. But even if the treatment group average is worse than the control group average, there are likely to be some individual farmers who do better. So applying the Positivity Error will still give incorrectly positive results. The Positivity Error is not the same as the challenge of measuring attribution; it is a fundamental mistake in analysis, which will give wrong results regardless of the methodology you are applying.

To illustrate, I’ve put together a spreadsheet of fictional data. This shows baseline and endline income for a hundred farmers in a control group and another hundred in a treatment group. The data has been designed so that there is no difference between the treatment and control group. This is shown in the first table in the analysis tab, entitled “Correct analysis”. Using the Positivity Error, however, shown in the second table, we see that 51 people increased their income (with no control), or 21 people increased their income (using a control). Suddenly, our failed intervention is turned into an impressive-sounding success.
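
If you would rather not open Excel, the same point can be reproduced in a few lines of Python. This is a sketch under assumed distributions, not a copy of the spreadsheet: both groups are drawn from identical distributions with no real change, so the correct analysis shows nothing, while the Positivity Error still finds dozens of ‘people increasing their income’. Exact counts will vary with the random seed.

```python
import random

random.seed(1)

def simulate_group(n=100):
    """Baseline and endline incomes with no real trend: the endline is
    just the baseline plus survey noise (recall error, weather, etc.)."""
    baseline = [random.gauss(100, 20) for _ in range(n)]
    endline = [b + random.gauss(0, 15) for b in baseline]
    return baseline, endline

t_base, t_end = simulate_group()  # treatment group (no real effect)
c_base, c_end = simulate_group()  # control group

# Correct analysis: compare the average change in each group.
t_change = sum(e - b for b, e in zip(t_base, t_end)) / len(t_base)
c_change = sum(e - b for b, e in zip(c_base, c_end)) / len(c_base)
print(f"Average change: treatment {t_change:.1f} USD, control {c_change:.1f} USD")

# Positivity Error, no control: count anyone whose income went up.
improved = sum(1 for b, e in zip(t_base, t_end) if e > b)
print(f"'People increasing their income': {improved} of {len(t_base)}")

# Positivity Error, with control: count treatment farmers whose change
# beats the control group's average change.
beat_control = sum(1 for b, e in zip(t_base, t_end) if (e - b) > c_change)
print(f"'People doing better than the control average': {beat_control} of {len(t_base)}")
```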

So why does this matter? Firstly, it is important to know whether the reported successes of MSD programmes are built on shaky foundations. I only have limited anecdotal evidence of the Positivity Error, and would love to hear from other MEL staff (in confidence) whether it is widespread or just a one-off.

Secondly, and more fundamentally, donors, managers, and consultants need to think more carefully about what is realistic to ask programmes to report. In my experience, there is too much expectation on programmes to report against challenging indicators, which leads to mistakes like the Positivity Error.

For example, it is really common for MSD results frameworks to have the following indicators:

- Number of people increasing their income
- Average increase of income per person

Counter-intuitively, the second of those is actually the easier one to measure meaningfully. You could simply compare average income at baseline and endline, or compare results to a control group to assess attribution. If a control group is not possible, you could use qualitative methods or secondary data to model an approximate change in average income.

The number of people increasing their income, however, is completely unmeasurable without a good control group. For any individual respondent, there is no way of telling whether their income increased due to the intervention, or due to other factors. In any group, some respondents’ incomes will go up, others down. Consequently, any attempt to count up the number of people increasing incomes suffers from the same problems as the Positivity Error. A more accurate approach would be to count up the number of respondents who experienced an increase in income and subtract those who experienced a decrease, but that does not make for a very easy-to-communicate indicator.
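
As a small illustration of that net-count idea, here is a sketch of how it differs from the Positivity Error calculation (again with purely illustrative names, applied to the two-farmer example from earlier):

```python
def net_improvers(baseline, endline):
    """Number of respondents whose income rose minus the number whose
    income fell; on data with no real change this hovers around zero,
    rather than always coming out positive."""
    increased = sum(1 for b, e in zip(baseline, endline) if e > b)
    decreased = sum(1 for b, e in zip(baseline, endline) if e < b)
    return increased - decreased

# Two-farmer example from earlier: one up, one down, so the net is zero.
print(net_improvers([25, 60], [45, 40]))  # 0
```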

In practice, few (if any) programmes really have the capacity to run strong quasi-experimental studies at scale. Attempting to do so without strong expertise in quantitative methods, under pressure to deliver positive results, and with multiple interventions requiring different approaches, risks seriously mistaken reporting.

A better approach would be to rethink the indicators that we are reporting. Rather than the number of people increasing income, ask for the number of people using new services, accepting that there is no way to tell which of them benefit. Focus more on the differences between groups, qualitative information from users, and data on the performance of partners and support market actors. More modest indicators, linked more closely to programme efforts, are more likely to deliver credible results as well as useful information for programme management.

That concludes our newsletter for this quarter. I hope you found it interesting – and please let me know by email if you disagree (or agree) with any of the above,

Adam and the DevLearn Team