I’ve been doing a lot of thinking recently about my contribution to the debate on RCTs. Several weeks after the Evaluation Conclave in Kathmandu, I’m ready to give my two cents. First things first: a little context. RCT = Randomized Controlled Trial, an impact evaluation method that establishes “rigor” by using control and treatment group(s) to determine whether particular outcomes can be attributed to a particular program or intervention. A quick review of the literature, or attendance at enough conferences, shows that RCTs are often presented as the “gold standard” in evaluation for their ability to show statistically significant differences in outcomes while controlling for various influencing factors. Sounds good, right? Certainly many students, practitioners and policymakers are seduced by their empirical and scientific nature. As with any subject in international development, however, it’s not so simple. Michael Quinn Patton’s keynote at the Evaluation Conclave presents a strong argument against the uncritical acceptance of RCTs as the method for showing impact. I urge you to watch it (and read his book on Developmental Evaluation while you’re at it!).
Now, some in the evaluation field may think the RCT debate is “stale,” yet the sheer proliferation of donors and implementing organizations commissioning such evaluations suggests otherwise. In fact, graduate schools across the country are churning out impact evaluators by the dozen. On the one hand, top schools can’t be blamed for teaching skills that are in high demand; their students will surely get jobs, and with high-profile organizations at that. On the other hand, they are producing far too many “development as usual” professionals who are hesitant to critique the way development is done and the way development projects are evaluated. Is evaluation just another manifestation of development being “done to” countries rather than “done with” countries? The trend towards RCTs certainly seems to point to this conclusion. Who are evaluations being produced for? And why? Local governments aren’t the ones begging for RCTs; the donors are asking and implementing organizations are producing! De facto policy can be made pretty quickly with enough money to incentivize it. Is there a time and a place for RCTs? Of course (read here for a great post about various options for impact evaluations). RCTs in and of themselves are not “evil,” as some opponents would suggest; they have strong merits in many cases (though sometimes questionable ethics when it comes to assigning beneficiaries to life-changing programs!).
It comes down to balancing accountability with learning, and research with evaluation. Donors must make smart investments, and organizations must be accountable for the funds they’ve been awarded. But there is too much pressure to take learning out of the equation. Who are the end users of evaluations? What purpose(s) do they serve? We cannot remove context when years of research and experience show that context can make or break a project. If a randomized controlled trial finds that increased school attendance in Honduras can be attributed to a specific education program, what are the implications? Will we attempt to “scale up” the project based on information that tells us little about why the program worked in a particular place and time? Can we use that evidence to justify a similar project in Cambodia? Tajikistan? Mozambique? I think we can do better at evaluating program impact in context-specific ways that provide useful information for those on the ground. This is particularly important for those who may not find highly technical RCT results readily accessible, but who need to understand why programs succeed or fail. After all, if partnering with local governments and local NGOs can lead to more successful program implementation, it can also lead to more successful (and useful) program evaluation. But the evaluations should be designed according to terms agreed to by everyone involved. If everyone involved weighed all the available options, I’d be interested to see how often an RCT would be the unanimous choice.
As the blog post title suggests, this is a debate, folks, so let me know where you fall on the “continuum” of opinions!