If the global development community were to put together a list of the most overused (and perhaps misused) terminology of 2013, I would advocate for the inclusion of “evidence” and “impact.” Bureaucratic groupthink has narrowed the definitions of these two words so that only certain types of evidence and impacts can be labeled as such. Let me explain. International organizations and donors have become so focused on demonstrating what works that they’ve lost sight of understanding why it works and under what circumstances. I can’t help but feel that the development industrial complex has come down with a case of “Keeping up with the Joneses.” My impact evaluation is more rigorous than yours. My evidence is more conclusive than yours. It’s a race to the top (or bottom?) to see who can drum up the most definitive answers to questions that might be asking us to “check all that apply” instead of “choose the most correct response.” We’re struggling to design multiple-choice answers for questions that might merit a narrative response.
I can’t help but question the motives behind such a movement. Sure, we’re all struggling to stay relevant in an ever-changing world. The global development community has responded to long-standing critiques of development that is done to or for communities by launching programs and policies that emphasize development with and by local communities. This is a step in the right direction. But while the international development community might claim to be transferring responsibility for technical program knowledge to local consultants and contractors, it has carefully written itself a new role: M&E (emphasis on the E). “Evidence” and “impact,” narrowly defined, are linked to contracts and consultancies that are linked to big money. It feels like a desperate attempt to keep expertise in the hands of a few, who rally support to scale up select policies and programs that have been rigorously evaluated for impact by some of the major players in the field. Let’s not forget that impact evaluation, if we maintain the narrow definition that’s usually offered, can come with a hefty price tag. There are certainly times when impact evaluations such as RCTs are the best methodological choice and the costs of conducting that evaluation are proportionate to the benefits. But we must be very careful about conflating evaluation type/purpose with methodology. And we must be even more careful about when, where, and why we are implementing impact evaluations (again, narrowly defined).
I just finished reading a great piece by Justin Sandefur at the Center for Global Development: “The Parable of the Visiting Impact Evaluation Expert.” Sandefur does an excellent job of painting an all-too-familiar picture: the development consultant who (perhaps quite innocently) has been misled to believe that conclusive findings derived in one context can be used to implement programs in completely different contexts. At the individual level, these experts might be simply misguided. The global conversation on impact and evidence leads us to believe that “rigor” matters and that programs or policies rigorously tested can be proven to work. However, as Sandefur reminds us, “there is just no substitute for local knowledge.” What works in Country A might not work in Country B, and what might not work in Country B probably will not work in Country C. It is unwise, and dangerous, to make blind assumptions about the circumstances under which impact evaluations were able to establish significant results.
I would urge anyone interested in reclaiming the conversation on evidence to check out the Big Push Forward, which held a Politics of Evidence Conference in April with more than one hundred development professionals in attendance. The conference report has just been released on their website and is full of great takeaways.
Are you pushing back on the narrow definitions of evidence and impact? How so?
I’ve been doing a lot of thinking recently about my contribution to the debate on RCTs. Several weeks after the Evaluation Conclave in Kathmandu, I’m ready to give my two cents. First things first: a little context. RCT = randomized controlled trial, an impact evaluation method that establishes “rigor” by using control and treatment group(s) to determine whether particular outcomes can be attributed to a particular program or intervention. A quick review of the literature, or attendance at enough conferences, shows that RCTs are often presented as the “gold standard” in evaluation for their ability to show statistically significant differences in outcomes while controlling for various influencing factors. Sounds good, right? Certainly many students, practitioners, and policymakers are seduced by their empirical and scientific nature. As with any subject in international development, however, it’s not so simple. Michael Quinn Patton’s keynote at the Evaluation Conclave presents a strong argument against the uncritical acceptance of RCTs as the method for showing impact. I urge you to watch it (and read his book on Developmental Evaluation while you’re at it!).
Now, some in the evaluation field may think the RCT debate is “stale,” yet the sheer proliferation of donors and implementing organizations commissioning such evaluations shows that this is not the case. In fact, graduate schools across the country are churning out impact evaluators by the dozen. On the one hand, top schools can’t be blamed for teaching skills that are in high demand; their students will surely get jobs, and with high-profile organizations at that. On the other hand, they are producing far too many “development as usual” professionals who are hesitant to engage in critiques of the way development is done and the way development projects are evaluated. Is evaluation just another manifestation of development being “done to” countries rather than “done with” countries? The trend towards RCTs surely seems to lead to this conclusion. Who are evaluations being produced for? And why? Local governments aren’t the ones begging for RCTs; the donors are asking and implementing organizations are producing! De facto policy can be made pretty quickly with enough money to incentivize it. Is there a time and a place for RCTs? Of course. (Read here for a great post about various options for impact evaluations.) RCTs in and of themselves are not “evil,” as some opponents would suggest. They have strong merits in many cases (though sometimes questionable ethics when it comes to assigning beneficiaries to life-changing programs!).
It comes down to balancing accountability with learning, and research with evaluation. Donors must make smart investments, and organizations must be accountable for funds they’ve been awarded. But there is too much pressure to take learning out of the equation. Who are the end users of evaluations? What purpose(s) do they serve? We cannot remove context when years of research and experience show that context can make or break a project. If a randomized controlled trial finds that increased school attendance in Honduras can be attributed to a specific education program, what are the implications? Will we attempt to “scale up” the project based on information that tells us little about why the program worked in a particular place and time? Can we use that evidence to justify a similar project in Cambodia? Tajikistan? Mozambique? I think we can do better at evaluating program impact in context-specific ways that provide useful information for those on the ground. This is particularly important for those who may not find highly technical RCT results readily accessible, but who need to understand why programs succeed or fail. After all, if partnering with local governments and local NGOs can lead to more successful program implementation, it can also lead to more successful (and useful) program evaluation. But the evaluations should be designed according to terms agreed to by everyone involved. Given all available options, I’d be interested to see how often an RCT would be the unanimous choice.
As the blog post title suggests, this is a debate, folks, so let me know where you fall on the “continuum” of opinions!