Does Foreign Aid Work? Efforts to Evaluate U.S. Foreign Assistance

CRS Report for Congress
Prepared for Members and Committees of Congress
Marian Leonardo Lawson
Analyst in Foreign Assistance
November 19, 2012
Congressional Research Service
Congress’s recent focus on reducing federal spending raises questions about the relative
efficiency and effectiveness of all federal programs. In this context, evaluation of foreign
assistance programs is of growing interest to many Members of Congress as they scrutinize the
Administration’s international affairs budget request and debate foreign aid spending priorities.
Policymakers, taxpayers, and aid recipients alike want to know what impact, if any, foreign aid
dollars are having, and whether foreign aid programs are achieving their intended objectives.
In most cases, the success or failure of U.S. foreign aid programs is not entirely clear, in part
because historically, most aid programs have not been evaluated for the purpose of determining
their actual impact. The purpose and methodologies of foreign aid evaluation have varied over the
decades, responding to political and fiscal circumstances. Aid evaluation practices and policies
have variously focused on meeting program management needs, building institutional learning,
accounting for resources, informing policymakers, and building local oversight and project design
capacity. Challenges to meaningful aid evaluation have varied as well, but several are recurring.
Persistent challenges to effective evaluation include unclear aid objectives, funding and personnel
constraints, emphasis on accountability for funds, methodological challenges, compressed
timelines, country ownership and donor coordination commitments, security, and agency and
personnel incentives. As a result of these challenges, aid agencies do not undertake rigorous
evaluation for all foreign aid activities.
The U.S. government agencies managing foreign assistance each have their own distinct
evaluation policies; these policies have come into closer alignment in the last two years than in
the past. The Obama Administration’s Quadrennial Diplomacy and Development Review
(QDDR) resulted in, among other things, a stated commitment to plan foreign aid budgets “based
not on dollars spent, but on outcomes achieved.” This focus on evaluating the impact of foreign
assistance reflects an international trend. USAID put this idea into practice by introducing a new
evaluation policy in January 2011. The State Department, which began to manage a growing
portion of foreign assistance over the past decade, followed suit with a similar policy in February
2012. The Millennium Challenge Corporation, notable for its demanding but little-tested
approach to evaluation, also recently revised its policy. While differing in several respects,
including their support for impact evaluation, the policies reflect a common emphasis on
evaluation planning as a part of initial program design, transparency and accessibility of
evaluation findings, and the application of data to inform future project design and allocation
decisions. Aspects of the three evaluation policies are compared in Appendix A.
Though recent evaluation reform efforts have been agency-driven, Congress has considerable
influence over their impact. Legislators may mandate a particular approach to evaluation directly
through legislation (e.g., H.R. 3159, S. 3310), or can support or undermine Administration
policies by controlling the appropriations necessary to implement the policies. Furthermore,
Congress will largely determine how, or if, any actionable information resulting from the new
approach to evaluations will influence the nation’s foreign assistance policy priorities.
Introduction
Does Aid Work? A Brief Summary
Impact and Performance Evaluations
History of U.S. Foreign Assistance Evaluation
Evaluation Challenges
Applying Evaluation Findings to Policy
Current Agency Evaluation Policies
Issues for Congress
Conclusion
Appendix A. Select Aspects of Current USAID, State Department, and MCC Evaluation Policies
Author Contact Information
Introduction
Congress’s strong focus on reducing federal spending raises questions about the relative
efficiency and effectiveness of all federal programs, and foreign assistance is a subject often
raised in broad budget debates. Foreign assistance evaluation is one aspect of a government-wide
effort to link program effectiveness to budgeting decisions. It is also an element of broader
foreign aid reforms implemented in recent years. The 2010 Quadrennial Diplomacy and
Development Review (QDDR), the basis of many recent aid policy initiatives, called for the State
Department and the U.S. Agency for International Development (USAID) to plan foreign aid
budgets and programs “based not on dollars spent, but on outcomes achieved,” and for USAID to
become “the world leader in monitoring and evaluation.”1 Rigorous evaluation is also a
cornerstone of the Millennium Challenge Corporation (MCC), established in 2004 to promote a
new model of development assistance.2 According to USAID Administrator Rajiv Shah, global
development policies and practices are experiencing a “transformation based on absolute demand
for results.”3 That demand comes, in part, from some Members of Congress as they scrutinize the
Administration’s international affairs budget request and consider foreign aid spending priorities.4
It also comes from aid beneficiaries and American taxpayers who want to know what impact, if
any, foreign aid dollars are having and whether foreign aid programs are achieving their intended
objectives.
The current emphasis on evaluation is not new. The importance, purpose and methodologies of
foreign aid evaluation have varied over the decades since USAID was established in 1961,
responding to political and fiscal circumstances, as well as evolving development theories. There
are a number of reasons that this issue has gained prominence in recent years. For one, foreign aid
funding levels have increased over the past decade while evaluations have decreased, raising
questions about the knowledge basis for aid policy.5 Analysts have noted that after decades of aid
agencies spending billions of dollars on assistance programs, very little is known about the
impact of these programs.6 Some wonder how policymakers can develop effective foreign aid
strategies without a clear understanding of how and why prior assistance has succeeded or failed.
This report focuses primarily on U.S. bilateral assistance, and less on the work of multilateral aid
entities, such as the World Bank, to which the United States contributes. While a wide range of
federal agencies provide foreign assistance in some form,7 this report focuses on the three
1 U.S. Department of State, Quadrennial Diplomacy and Development Review, 2010, Leading Through Civilian Power,
p. 103.
2 For more information about the MCC model, see CRS Report RL32427, Millennium Challenge Corporation, by Curt
3 Statement of USAID Administrator Rajiv Shah to The Cable, as reported in The Cable, June 13, 2012.
4 While not often discussing evaluation policy per se, some Members appear to be influenced in their policy decisions
by their sense of what aid is working and what is not. For example, when introducing her subcommittee’s FY2013
proposal at full-committee mark-up on May 17, 2012, House State-Foreign Operations Appropriations Subcommittee
Chairwoman Kay Granger remarked that the legislation “only supports programs that work.” Senator Lindsey Graham
of the Senate State-Foreign Operations Appropriations Subcommittee, explaining the sharp reduction in aid for Iraq in
the Senate’s FY2013 proposal at a May 22, 2012, mark-up, said “there’s no point in throwing good money after bad.”
5 For historic information on foreign aid spending, see CRS Report R40213, Foreign Aid: An Introduction to U.S.
Programs and Policy, by Curt Tarnoff and Marian Leonardo Lawson.
6 When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working
Group, Center for Global Development, May 2006, p. 1.
7 According to U.S. Overseas Loans and Grants, 21 U.S. Government agencies reported disbursing foreign assistance in
agencies that have primary policy authority and implementation responsibility for U.S. foreign
assistance—USAID, the State Department, and the Millennium Challenge Corporation (MCC). It
discusses past efforts to improve aid evaluation, as well as ongoing issues that make evaluation
challenging in the foreign assistance context. The report also provides an overview of the current
evaluation policies of the primary implementing agencies, and discusses related issues for
Congress, including recent legislation.
Program Evaluation Government-Wide
Program evaluation is an important issue throughout the U.S. government, and foreign assistance evaluation is just
one part of a broader effort by the federal government to improve accountability and program performance through
stronger evaluation processes. With the Government Performance and Results Act (GPRA) of 1993, Congress
established unprecedented statutory requirements regarding the establishment of goals, performance measurement
indicators, and submission of related plans and reports to Congress for its potential use in policy development and
program oversight. The GPRA Modernization Act of 2010 updated the original law, requiring more frequent plan
updates and on-line posting of data.8 The agency-specific evaluation plans discussed in this report are intended to
comply with and build upon this government-wide effort. Most recently, in a May 18, 2012, memorandum, the Office
of Management and Budget (OMB) directed all federal agencies to demonstrate the use of evidence from rigorous
evaluation throughout their FY2014 budget submissions.9 While OMB has emphasized use of evidence in prior years,
this memorandum appears to take the issue to a more formal level, and suggests that evaluation data may be closely
linked to budget approval in future fiscal years.
Does Aid Work? A Brief Summary
To know whether aid is successful, one must understand its purpose. The Foreign Assistance Act
(FAA) of 1961 (P.L. 87-195), as amended, is the authorizing legislation for most modern foreign
aid programs. The FAA declared that
the principal objective of the foreign policy of the United States is the encouragement and
sustained support of the people of developing countries in their efforts to acquire the
knowledge and resources essential to development, and to build the economic, political, and
social institutions that will improve the quality of their lives.10
The original legislation lists five principal goals for foreign aid: (1) the alleviation of the worst
physical manifestations of poverty among the world’s poor majority; (2) the promotion of
conditions enabling developing countries to achieve self-sustaining economic growth and
equitable distribution of benefits; (3) the encouragement of development processes in which
individual civil and economic rights are respected and enhanced; (4) the integration of the
developing countries into an open and equitable international economic system; and (5) the
promotion of good governance through combating corruption and improving transparency and
accountability.11 Amending legislation over the years added dozens of new, though often
overlapping, aid objectives. For example, “the suppression of the illicit manufacturing of and
FY2010. See
8 For more on current GPRA requirements, see CRS Report R42379, Changes to the Government Performance and
Results Act (GPRA): Overview of the New Framework of Products and Processes, by Clinton T. Brass.
9 Use of Evidence and Evaluation in the FY2014 Budget, Memorandum to the Heads of Executive Departments and
Agencies, Jeffrey D. Zients, Acting Director, Office of Management and Budget, May 18, 2012.
10 Foreign Assistance Act of 1961 (P.L. 87-195), §101(a).
11 Ibid.
trafficking in narcotic and psychotropic drugs” was added in 1971,12 “to alleviate human suffering
caused by natural and manmade disasters” was added in 1975,13 and “to enhance the antiterrorism
skills of friendly countries by providing training and equipment” and “to strengthen the bilateral
ties of the United States with friendly governments by offering concrete [antiterrorism]
assistance”14 were added in 1983. In short, U.S. foreign aid is intended to be a tool for fighting
poverty, enhancing bilateral relationships, and/or protecting U.S. security and commercial
interests.
In this broad view, some instances of specific development assistance projects and programs are
widely viewed as successful. The largest aid program of the last century, the Marshall Plan (1948-
1952), for example, is acclaimed as a key factor in the post-World War II reconstruction of
European states that have gone on to become major strategic and trade partners of the United
States. In the late 1960s and 1970s, aid associated with the “green revolution” was credited with
greatly improving agricultural productivity and addressing hunger and malnutrition in parts of
Asia, and global health programs were credited with virtually eradicating smallpox. Korea,
Taiwan, and Botswana are often cited as aid success stories as a result of remarkable economic
progress following significant aid infusions. More recently, unquestionable progress in battling
public health crises, such as HIV/AIDS, across the globe can be largely attributed to massive
foreign assistance programs, both bilateral and multilateral. Even in these instances, however,
close analysis often reveals many caveats.
In other specific instances foreign aid programs and projects have been considered to be
conspicuously unsuccessful, or even harmful to intended beneficiaries. Critics of foreign
assistance cite decades of aid to corrupt governments in Africa, which enriched corrupt leaders
and did little to improve the lives of the poor.15 In Latin America, U.S. aid to anti-communist
rebels and regimes during the Cold War was associated with brutal violence and believed by
many to have damaged U.S. credibility as a champion of democracy. Numerous examples exist of
hospitals, schools, and other facilities that were built with donor funds and left to rot, unused in
developing countries that did not have the resources or will to maintain them. In some instances,
critics assert that foreign aid may do more harm than good, by reducing government
accountability, fueling corruption, damaging export competitiveness, creating dependence, and
undermining incentives for adequate taxation.16
The most notable successes and conspicuous failures of foreign aid give fodder to both aid
advocates and detractors, but in all likelihood represent just a small segment of assistance
activities. In most cases, clear evidence of the success or failure of U.S. assistance programs is
lacking, both at the program level and in aggregate. One reason for this is that aid provided for
development objectives is often conflated with aid provided for political and security purposes.
Another reason is that historically, most foreign assistance programs have not been evaluated for the
purpose of determining their impact, either at the time or retrospectively. Furthermore, evaluation
practices are not consistent enough to allow for the use of project level data as the basis for
12 FAA, as amended, §481(a)(1)(C).
13 FAA, as amended, §491(a).
14 FAA, as amended, §572 (1) and (2).
15 Several examples of this are discussed in, Economic Gangsters: Corruption, Violence and the Poverty of Nations, by
Raymond Fisman and Edward Miguel, Princeton University Press, 2008.
16 See Dambisa Moyo, Dead Aid: Why Aid is Not Working and How There Is a Better Way for Africa, Farrar, Straus
and Giroux, New York, 2009, p. 48.
broader, strategic evaluations. According to one 2009 review of monitoring and evaluation across
U.S. foreign assistance implementing agencies, evaluation of foreign assistance programs “is
uneven across agencies, rarely assesses impact, lacks sufficient rigor, and does not produce the
necessary analysis to inform strategic decision making.”17
Impact and Performance Evaluations
The Department of State, USAID, and other U.S. agencies implementing foreign assistance
programs have long evaluated the performance of their own personnel and contractors in meeting
discrete objectives. Depending on the nature of the project or program, staff and contractors
might monitor the miles of road built, number of police officers trained, or changes in the use of
fertilizers by farmers. These results can be compared to the initial program goals and expectations
to determine whether the project or contract has been performed successfully. This type of
oversight is called performance monitoring, and if the resulting data are analyzed in an effort to
explain how and why a program meets or fails to meet strategic objectives, this is called
performance evaluation. Performance monitoring and evaluation are widely viewed as essential
aspects of oversight, and performance evaluations represent the vast majority of foreign aid
evaluation to date. Financial audits by agency Inspectors General, which examine whether funds
are being used as intended, are also a common form of evaluation, particularly at the State
Department.
Performance evaluation and financial audits play an important part in project management but do
little to answer questions about foreign aid effectiveness. Addressing this question, some argue,
requires impact evaluations. Impact evaluations can take many forms, but their common element
is that they use a defined counterfactual, or control group, and baseline data to measure change
that can be attributed to an aid intervention.18 Impact evaluations look not at the output of an
activity, but rather at its impact on a development objective. For example, while a performance
evaluation of an education program may look at the number of textbooks provided and teachers
trained, an impact evaluation may determine how or if literacy or math skills had improved for
the target group as compared to a similar group that did not receive the textbooks or teacher
training. A performance evaluation of an HIV prevention project may report the number of public
awareness events held or condoms distributed, while an impact evaluation of the same program
would monitor changes in the HIV/AIDS infection rate of the targeted population. An impact
evaluation of a police training program would look at the program’s impact on civil order and
public safety rather than simply report how many officers were trained or the value of equipment
supplied. Randomized controlled trials, in which beneficiaries are randomly selected from a
prequalified group and compared before and after the program to those not selected, are widely
viewed as best practice for impact evaluation, but less rigorous methods are used as well.
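The comparison described above, change in a target group measured against change in a similar group that did not receive the intervention, can be illustrated with a small worked calculation. The sketch below is only an illustration: the literacy scores are hypothetical, the function name is invented for this example, and the simple difference-in-differences calculation shown is one common way to express such a comparison, not a method attributed to any agency discussed in this report.

```python
from statistics import mean

# Hypothetical literacy test scores (0-100) for illustration only;
# these are not data from any actual evaluation.
treated_before = [42, 38, 45, 40]   # target group, baseline
treated_after = [55, 50, 58, 53]    # target group, after the program
control_before = [41, 39, 44, 40]   # comparison group, baseline
control_after = [46, 43, 49, 46]    # comparison group, same later date


def difference_in_differences(t_before, t_after, c_before, c_after):
    """Average change in the target group minus average change in the comparison group."""
    treated_change = mean(t_after) - mean(t_before)
    control_change = mean(c_after) - mean(c_before)
    return treated_change - control_change


impact = difference_in_differences(
    treated_before, treated_after, control_before, control_after
)
print(f"Estimated impact: {impact:.2f} points")  # prints "Estimated impact: 7.75 points"
```

In this hypothetical case the target group improved by 12.75 points, but the comparison group improved by 5 points without any assistance, so only 7.75 points of the change would be attributed to the program. A performance evaluation reporting textbooks delivered or teachers trained would not capture this distinction.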
Impact evaluations can be key to determining whether a foreign assistance program “works.”
However, impact evaluations are generally far more complex and resource-intensive than
17 Beyond Success Stories: Monitoring and Evaluation For Foreign Assistance Results, Evaluator Views of Current
Practice and Recommendations for Change, by Richard Blue, Cynthia Clapp-Wincek and Holly Benner, May 2009, p.
18 For a thorough, yet non-technical, discussion of the use of impact/attribution evaluation, see “An introduction to the
use of randomized control trials to evaluate development interventions,” by Howard White, International Initiative for
Impact Evaluation, Working Paper 9, February 2011.
performance evaluations. Agencies implementing foreign assistance must balance the potential
knowledge to be gained from impact evaluation with the additional resources necessary to carry
out such evaluations. As a result, while the potential learning benefits of impact evaluation have
long been recognized by aid officials, the use of rigorous impact evaluation has been, and
continues to be, very limited. More typically, agencies aim for evaluation practices that are, as
one expert has put it, “cost-effectively rigorous,” and, at minimum, “independent, transparent,
and consistent, thus persuasive.”19
History of U.S. Foreign Assistance Evaluation
The practice of foreign assistance evaluation has changed over time to reflect evolving, or some
might say cyclical, attitudes about the purpose and relative importance of evaluation.20 This is
evident both in the United States and internationally. Aid evaluation practices and policies have
variously focused on different evaluation objectives, including meeting program management
needs, institutional learning, accountability for resources, informing policymakers, and building
local oversight and project design capacity.
The history of U.S. foreign assistance evaluation begins with USAID, which implemented the
vast majority of U.S. foreign assistance prior to the last decade. In its early years, USAID was
primarily involved in large capital and infrastructure projects, for which evaluations focused on
financial and economic rates of return were appropriate. However, the agency soon shifted focus
towards smaller and more diverse projects to address basic human needs, and found that the rate
of return evaluation model was no longer sufficient.21 The agency established its first Office of
Evaluation in 1968, and used a Logical Framework (LogFrame) model as its primary system for
monitoring and evaluation. The LogFrame approach, subsequently adopted by many international
development agencies, employed a matrix to identify project goals, purposes, results, and
activities, with corresponding indicators, verification methods, and important assumptions.
Baseline data were to be used for each indicator, and results were reported at quarterly points
during the life of a project. However, these data were not analyzed to look for competing
explanations of the results or unintended consequences of activities.
While the LogFrame approach established USAID as a thought leader with respect to evaluation
policy, in practice, evaluations varied significantly from project to project. A 1970 evaluation
handbook included a diagram of the “ideal” program evaluation design, which resembles a
randomized controlled trial, but notes that “there are a great many reasons why it may not be
possible to reach the ideal.”22 Reviews of foreign assistance evaluation over decades revealed
shortcomings. For one, the system had become decentralized over time, suitable for meeting the
information needs of project managers in the field but not for contributing to broader learning or
policy making. A 1982 report by the General Accounting Office (now the Government Accountability
Office, GAO) found that “AID staff does not apply lessons learned in the development of new
19 Clemens, Michael. “Impact Evaluation in Aid: What For? How Rigorous?” Presentation at the Overseas
Development Institute, July 3, 2012, video recording available at
20 Trends in Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 4.
21 The USAID Evaluation System: Past Performance and Future Direction, Bureau for Program and Policy
Coordination, USAID, September 1990, p. 9.
22 Evaluation Handbook, Office of Program Evaluation, USAID, November 1970, p. 40.
projects,” and that “lessons learned are neither systematically nor comprehensively identified or
recorded by those who are directly involved.”23 In response to the GAO report’s recommendation
that USAID build an “information analysis capability,” the agency created the Center for
Development Information and Evaluation (CDIE) in 1983, with a mandate to “foster the use of
development information in support of AID’s assistance efforts.”24 CDIE carried out meta-evaluations
to reveal broader trends in aid impact, provided information and training on
evaluation best practices to mission staff, and made a wide range of evaluation reports accessible
to implementers in the field. Aid officials suggest that CDIE’s evaluation work played a
significant role in shaping USAID strategies and priorities in many sectors over decades.
An internal USAID review in 1988 found that CDIE had greatly increased the use of aid
evaluation information by implementers, but also identified a need to improve the quality
and timeliness of evaluation reports.26 While the evaluation policy at the time still called for
rigorous, statistical methods of evaluation, it was found that this approach was never
actually widely used at USAID because the required skills, time, and expense made
implementation difficult.27 As one internal review noted, “statistical rigor in evaluation
methods was deemphasized in favor of ‘reasonably’ valid evidence about project
performance.”28 Guidance to missions encouraged the use of low-cost and timely
qualitative evaluation methodologies, including the use of key informant interviews,
focus group discussions, community meetings, and informal surveys.29
In the early 1990s, accountability for funds became a primary focus of aid evaluation.
After a 1990 GAO review concluded that USAID evaluation practices made it difficult or
impossible to account for use of aid funds,30 attention turned to tracking where aid money
was going, not measuring what it was
23 Experience – A Potential Tool for Improving U.S. Assistance Abroad, U.S. Government Accountability Office,
GAO-ID-82-36, June 15, 1982, p. i (summary).
24 The History of CDIE, CDIEHIST.017/SESmith;JREriksson/10-17-94, p.4.; available through the Development
Experience Clearinghouse on the USAID website.
25 The Community-Based Family Planning Services Family Planning Health and Hygiene Project, prepared by Bruce
Carlson, MSPH, and Malcolm Potts, M.D. under the auspices of The American Public Health Association, USAID,
1979, pp. 5, 7.
26 Ibid.
27 The A.I.D. Evaluation System: Past Performance and Future Directions, Bureau for Program and Policy
Coordination, Agency for International Development, September 1990, p. 10.
28 Ibid., p. 11.
29 Ibid., p. 11.
30 Accountability and Control Over Foreign Assistance, GAO/T-NSIAD-90-25, March 29, 1990, pp. 6, 11. The review
Testing Family Planning Project Design in Thailand, 1979
Many evaluations are designed to answer specific
questions about project design. One example is the
Family Planning Health and Hygiene Project, a 1979
independent evaluation of USAID support for the
government of Thailand’s family planning policy.
Implemented by the American Public Health Association,
the evaluation used a baseline survey and experimental
design to test the hypothesis that contraception services
would be more cost effective and acceptable to
communities if combined with basic health services
rather than implemented in isolation. Obtaining the
appropriate information to inform resource allocation
was a primary objective of the evaluation. According to
the report, “the evaluation was implemented with
sufficient precision and adherence to experimental
requirements to provide information on which to make
management decisions about the best use of resources.”
Evaluators found that the hypothesis was not supported
by the evidence. Adding basic health services doubled the
cost of programs but was not associated with increased
contraceptive use. As a result, the evaluators
recommended that future decisions about family planning
and basic health services programs be considered
without any assumption that a linkage between the two
would increase the acceptance of contraception use.25
accomplishing. At the same time, USAID was facing increasing budgetary pressure and
increasing congressional and public concern about what was being achieved through foreign
assistance.31 In response, USAID carried out an Evaluation Initiative from 1990 to 1992, greatly
expanding the staff and budget of CDIE and making significant investments in rigorous
evaluation designs and innovative methods to evaluate sector-wide results.32 However, by the
mid-1990s the priorities changed once again. A 1993 agency reorganization led to the 1994
elimination of an Office of Evaluation within CDIE, a reduction of overall CDIE staff,33 and a
new emphasis on “rapid appraisal techniques,” which guidance documents describe as a
compromise between slow, costly, and credible formal evaluation methods and cheap, quick,
informal methods (focus group, etc.) that may be less reliable.34
In 1995, USAID replaced the requirement to conduct mid-term and final evaluations of all
projects with a policy calling for evaluation only when necessary to address a specific
management question.35 The rationale was that the required evaluations had become pro forma, as
GAO reviews had suggested, and that fewer, more comprehensive evaluations would be a better
use of time and resources. As a result, the number of completed evaluations dropped from 425 in
1993 to an estimated 138 in 1999,36 but the depth and scope of new evaluations reportedly did not
change.37 One study suggests that inconsistent guidance on evaluation in these years allowed
many already overburdened mission staff to ignore agency-wide requirements, but noted that the
Global Health, Africa, and Europe & Eurasia bureaus, which had their own evaluation
procedures, continued to carry out quality evaluation work.38
Foreign assistance levels grew rapidly starting in 2003 to support military activities in
Afghanistan and Iraq, as well as the President’s Emergency Plan for AIDS Relief (PEPFAR) and
the creation in 2004 of the Millennium Challenge Corporation (MCC). Accountability to
Congress became a major evaluation priority. In 2005, inspired by remarks made by House
Foreign Operations Appropriations Subcommittee Chairman Jim Kolbe regarding the importance
of being able to clearly demonstrate results of aid expenditures, USAID Administrator Andrew
Natsios sought to revitalize evaluation within the agency. He sent a cable to all mission directors
calling for the inclusion of evaluation plans, and higher quality evaluations, in all program
found that military assistance managed by State and the Department of Defense was also inadequately monitored and
accounted for.
31 The History of CDIE, p. 6; The A.I.D. Evaluation System, p. 11.
32 Ibid., pp. 6-7.
33 Ibid., p. 8.
34 The Role of Evaluation in USAID, Performance Monitoring and Evaluation TIPS, USAID CDIE, 1997, Number 11,
p. 3.
35 Beyond Success Stories, p.7; Evaluation of Recent USAID Evaluation Experience, Cynthia Clapp-Wincek and
Richard Blue, Working Paper No. 320, U.S. Agency for International Development, Center for Development
Information and Evaluation, June 2001, p. 31.
36 Evaluation of Recent USAID Evaluation Experience, p. 5. The report authors note that while some of the declining
numbers can be attributed to missions not submitting their evaluations to the Development Experience Clearinghouse,
as policy required, making the specific numbers unreliable, the trend of decline is unmistakable.
37 Evaluation of Recent USAID Evaluation Experiences, p. 12.
38 The Evaluation of USAID’s Evaluation Function: Recommendations for Reinvigorating the Evaluation Culture
Within the Agency, Janice M. Weber, Bureau for Program and Policy Coordination, USAID, September 2004, pp. 5, 10.
designs; designated monitoring and evaluation officers at each post; and set aside funding for
evaluations and incentives for employees who do evaluations; among other things.39
In 2006, in further pursuit of accountability, as
well as a desire to rationalize the bilateral
assistance efforts of multiple U.S. agencies,
Secretary of State Condoleezza Rice created
the Office of the Director of Foreign
Assistance (F Bureau) at the State
Department. In addition to consolidating many
USAID and State policy and planning
functions for foreign assistance, the F Bureau
established an extensive set of standard
performance indicators “to measure both what
is being accomplished with U.S. Government
foreign assistance funds and the collective
impact of foreign and host-government efforts
to advance country development.”42 Prior to
this initiative, the State Department, which
traditionally had managed a much smaller aid
portfolio than USAID, is said to have made a
de facto decision not to evaluate its assistance
programs on a systematic basis.43 As a result,
the data collected through the “F process,”
which remains in place today, allow for a
marked improvement in aid transparency,
demonstrating comprehensively where and for what purpose aid funds are allocated by State and
USAID as of FY2006.44 However, the demands of F process reporting were believed by some to
have interfered with more results-oriented evaluation work at USAID, and a 2008 assessment of
State’s evaluation capacity found that several bureaus, including those that manage State’s
security assistance programs, still had little or no evaluation capacity.45
39 Actions Required to Implement the Initiative to Revitalize Evaluation in the Agency, UNCLAS STATE 127594, July
8, 2005.
40 For an overview of this evaluation, as well as links to related studies, see
41 Roetman, Eric. A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies,
International Initiative for Impact Evaluations, Working Paper 11, March 2011, p. 5.
42 See It was originally expected by many that the F Bureau would
eventually track all foreign assistance provided by U.S. agencies, not just State and USAID. As of 2012, some MCC
data has been added to the Bureau’s public database (, but there does not appear to be
momentum toward any expansion of F Bureau authority.
43 Beyond Success Stories, p. 14. The State Department traditionally has used a variety of resources for monitoring its
foreign assistance programs, including Mission and Bureau Strategic Plans, annual performance and accountability
reports, and Office of Inspector General and Government Accountability Office reports, but had no systematic
evaluation process (Department of State Program Evaluation Plan, FY2007-2012 Department of State and USAID
Strategic Plan, Bureau of Resource Management, May 2007, Appendix II).
44 The data is publicly available at
45 Beyond Success Stories, p. 8.
Primary School Deworming in Kenya
One well-known example of an impact evaluation that
yielded useful information looked at a World Bank-supported
project in Kenya that treated children for
intestinal worms, a prevalent affliction that results in
listlessness, diarrhea, abdominal pain, and anemia. The
stated development objective was to increase the
number of children completing their primary education.
In collaboration with the local health ministry, NGO
implementers treated 30,000 children in 75 schools with
a drug that cost $3.27 annually per child, using baseline
data and a random phase-in approach that allowed for a
controlled comparison. The evaluation found that the deworming
resulted in a 25% reduction in absenteeism, or
10-15 more days of school attendance per child per year.
This case is also an example of the value of consistent
methodology and the use of sector- or region-wide
evaluation that looks at results beyond the project level.
Similar evaluation methods were used for other
interventions (providing free uniforms, textbooks, and/or
meals) with the same goal and in the same region,
allowing evaluators to do a comparative analysis and
determine that the deworming intervention was the
most effective of these interventions in increasing school
The structural reforms of the F Bureau came at a time of heightened congressional scrutiny of
foreign aid. In 2004, Congress established the Helping to Enhance the Livelihood of People
(HELP) Around the Globe Commission, through a provision in P.L. 108-199, to independently
review foreign assistance policy decisions, delivery challenges, methodology, and measurement
of results. After nearly two years of work, the HELP Commission released its report in late 2007.
On the subject of evaluation, the report noted that “everyone to whom members of the
Commission spoke about monitoring and evaluation expressed concern about the inadequacy of
the existing process” and concluded that “unless our government better evaluates projects based
on the outcomes they achieve, it will not improve the effectiveness of taxpayer dollars.”46 The
commission recommended creation of a unified foreign assistance policy, budgeting, and
evaluation system within State, quite similar to the F process, which was established before the
report was released. Other HELP Commission recommendations included ensuring that
evaluation strategies use control groups and randomization as much as possible; considering new
evaluation methods, such as the use of professional associations or accreditation agencies; and
building, in collaboration with other donors, the capacities of recipient governments to provide
reliable baseline data.47
At the same time the F Bureau was established, and the HELP Commission was active, the
international donor community began to prioritize aid effectiveness, sparking renewed interest in
rigorous impact evaluation (see the “A Global Perspective on Aid Evaluation” text box below).
Some aid professionals viewed the F process as an opportunity to build a cross-agency aid
evaluation practice focused on impact, and were disappointed that the common indicators used by
the F Bureau, while an improvement with respect to comparability, measured outputs rather than
impact. Furthermore, the use of more rigorous evaluation methodologies was not a focus of the
reform. These issues were revisited by the Obama Administration when it embarked in 2009 on a
Quadrennial Diplomacy and Development Review (QDDR) to examine how State and USAID
could be better prepared for current and future challenges. As a result of that review, the
Administration committed itself in December 2010 to several principles of foreign assistance
effectiveness, including “focusing on outcomes and impact rather than inputs and outputs, and
ensuring that the best available evidence informs program design and execution.”48 The QDDR
became the basis of many recent and ongoing changes at State and USAID, including the creation
of a new Office of Learning, Evaluation and Research at USAID and a new USAID evaluation
policy, which took effect in January 2011. State followed suit and adopted an evaluation policy
similar to that of USAID in February 2012. These policies are discussed later in this report.
The Millennium Challenge Corporation is a relative newcomer to foreign assistance, and has a
very limited evaluation history. Nevertheless, since its establishment in 2004, MCC has been
regarded by many as a leader in aid evaluation, largely as a result of its demanding evaluation
policy. MCC provides funding and technical assistance to support five-year development plans,
called “compacts,” created and submitted by partner countries. Since its inception, MCC policy
has required that every project in a compact be evaluated by independent evaluators, using pre-intervention
baseline data. MCC has also put a stronger emphasis on impact evaluation than State
and USAID; of the 25 MCC impact evaluation plans (not completed evaluations) made publicly
46 Beyond Foreign Assistance: The HELP Commission Report on Foreign Assistance Reform, The United States
Commission on Helping to Enhance the Livelihood of People (HELP) Around the Globe Commission, December 7,
2007, p. 15.
47 HELP Report, p. 99.
48 QDDR, p. 110.
available, 11 employ a rigorous randomized control trial methodology rarely used by other aid
agencies.49 MCC to date has released five evaluations, all related to specific farmer training
activities, and has not completed any final compact evaluations. A GAO report on the first two
completed MCC compacts suggests that significant changes were made to the original evaluation
plans, raising questions about whether the agency’s practices will reflect its policy over the long
term.50
MCC’s First Impact Evaluations
MCC released its first set of independent impact evaluations on October 23, 2012.51 While the evaluations all look at
farmer training activities, and reflect a small portion of MCC compacts in the respective countries (Armenia, Ghana,
El Salvador, Honduras, and Nicaragua), they were much anticipated in the development community as harbingers of
the success or failure of MCC’s evidence-based approach to evaluation. The evaluation results were mixed. MCC
reports meeting or exceeding output and outcome targets for most of the evaluated activities, but not seeing
measurable changes in household incomes, which was the intended impact. The reports also describe some problems
with evaluation design and implementation. Many development experts praised MCC’s transparency about both the
successes and shortcomings of its programs, and apparent commitment to continuous improvement.52 The evaluation
reports were published in full on MCC’s website, along with MCC analysis of lessons learned (e.g., phased
implementation doesn’t work well on a tight schedule, as delays undermine the entire evaluation model) and
questions raised (e.g., should the assumption that increased farm income leads to increased household income be
reconsidered?). According to at least one development professional, this first set of evaluations is a “game changer”
that has set a new standard for development agencies.53
Evaluation Challenges
The current evaluation emphasis on measuring impact and broader learning about what works is
not new; as discussed above, it was the basis of USAID evaluation policy in the 1970s and at
various times since. Nevertheless, a 2009 meta-evaluation of U.S. foreign aid programs indicated
that rigorous impact evaluation—the kind that could determine with credibility whether a specific
aid intervention or broader sector strategy worked to produce a specific development outcome—
was rarely attempted. Of the 296 evaluations reviewed, only 9% reported on a comparison group
and only one used an experimental design involving randomized assignment, the method most
likely to produce accurate data.54 A 2005 review of USAID evaluations (focused on democracy
and governance programs) found that “as a group, they lacked information that is critical to
demonstrating the results of USAID projects, let alone whether the projects were the real cause of
whatever change the evaluation reported.”55 This gap between evaluation goals and actual
49 See
50 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Targets,
GAO-11-728, pp. 32-38.
51 MCC’s statement on the release, which summarizes the findings, is available at
52 Statements of various leaders in the development community with respect to the MCC evaluations are available at
53 See comments of William Savedoff from the Center for Global Development at
54 Trends in International Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 46.
55 Trends in International Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 13.
The report was prepared for USAID by Molly Hageboeck of Management Systems International.
practices has been documented repeatedly over the history of U.S. foreign assistance; so too have
the challenges that make it difficult for implementers to achieve ideal evaluation practices in the
field. Some of these challenges are discussed below.
Mixed Objectives. The U.S. foreign
assistance program has dozens of official
objectives written into statute, and many aid
programs are designed to meet multiple
objectives. Often there are both strategic
objectives and development objectives
attached to an aid intervention, which may or
may not be acknowledged in budget and
planning documents. For example, assistance
to Uzbekistan may be requested and
appropriated for specific agriculture sector
activities, but may be motivated primarily by a
desire to secure U.S. overflight privileges for
military aircraft bringing troops and supplies
to Afghanistan. An evaluation of the
agricultural impact may be of no use to
policymakers who are more interested in the
strategic goal, nor to aid professionals who are
unlikely to view any lessons learned in these
circumstances as applicable to agricultural
development projects in a less politically
affected environment. Another example is the
Food for Peace program, which provides U.S.
agricultural commodities to countries facing
food insecurity. One objective of the program
is to feed hungry people, but long-standing
requirements that most of the food be
provided by U.S. agribusiness and be shipped
by U.S.-flagged vessels make clear that
supporting the U.S. agriculture and shipping
industries is a program objective as well, and a
potentially conflicting one. Studies have shown that the buy and ship America provisions, as they
are known, may lessen the hunger-alleviation impact of food aid by up to 40%.57
Despite the political and diplomatic considerations that arguably underlie the majority of foreign
aid, strategic evaluations that examine those objectives are rare (or at least not publicly available).
This may be understandable, as such evaluations would often be politically and diplomatically
sensitive. Nevertheless, evaluation that focuses only on the development or humanitarian impact
56 All information in this text box is based on USAID/OTI’s Integrated Governance Response Program in Colombia, A
Final Evaluation, produced for USAID by Caroline Hartzell, Robert Lamb, Phillip McLean and Johanna Mendelson
Forman, April 2011. Direct quotes, in order of appearance, are from pages 20 and 13.
57 The Developmental Effectiveness of Untied Aid, OECD, p.1, available at
OTI Consolidation in Colombia
A 2011 evaluation of USAID’s Office of Transition
Initiatives (OTI) Integrated Governance Response
Program (IGRP) in Colombia demonstrates the difficulty
in quantifying the success of certain types of foreign aid.
The IGRP was intended to strengthen the government of
Colombia’s credibility and legitimacy in communities
once controlled by rebels, a process known as
“consolidation.” When the Colombian military reestablished
control over a community, OTI provided
funds and technical assistance to support rapid-response
community-based projects, such as school rehabilitation,
and small income-generation programs, such as providing
agricultural inputs, designed to increase citizen
confidence in, and cooperation with, the government.
The loosely defined objectives and ex-post approach to
evaluation, however, made it difficult to determine the
program’s effectiveness. As the evaluation report notes,
without a defined endpoint for the consolidation process
or concrete indicators for what constitutes success, the
evaluation is “necessarily impressionistic in nature.”
While a more rigorous evaluation methodology would be
possible with better planning (for example, using a pre-intervention
survey as a baseline to measure changing
attitudes), it may not be practical. Rapid response was a
key element of the OTI approach, which focused on
citizens seeing an immediate and beneficial impact of
government control, and delay for the sake of rigorous
evaluation design could have undermined that strategy.
Evaluators used literature reviews, interviews, and site
visits to find that the program was a success because it
“nurtured a mindset” among both Colombians and
Americans working on consolidation that is valuable in
achieving policy objectives in conflict zones.56
of a particular program or project, when broader strategic objectives are drivers of the aid, may
largely miss the point.
Funding and Personnel Constraints. The more rigorous and extensive an evaluation, the
costlier it tends to be, both in funds and staff time. Impact evaluations are particularly costly and
require specially trained implementers. Absent a directive from agency leadership, aid
implementers are unlikely to make resources available for evaluation at the expense of other
program components. As one internal USAID review explained, “since USAID’s development
professionals have limited staff, limited budget, and copious priorities, unfortunately, due to lack
of training on the crucial role of evaluation in the development process, most have chosen to
eliminate evaluation from their programs.”58 Competitive contracting plays a role as well. At a
time when most program implementation is contracted out, and cost is a key factor in winning
contract bids, some argue that there is little incentive to invest in the up-front costs, such as
baseline surveys, of a well designed evaluation plan in the absence of an enforced requirement.59
As a result, ad hoc evaluations of limited scope and learning value—as one report describes it, the
“do the best you can in three weeks” approach—often prevail by default.60 “It is rare,” according
to one report, “that the resources provided for an evaluation are sufficient to develop and apply
more rigorous research methods that would produce valid empirical evidence regarding outcomes
and attributable impact.”61 Sometimes the limited resource is personnel, rather than funding.
Reviews of assistance evaluation repeatedly cite lack of trained evaluation personnel as a
Emphasis on Accountability of Funds. Aid evaluations in recent years have primarily focused
on accountability of funds because that is what stakeholders, including Congress, generally ask
about. Concerned about corruption and waste, bound by allocation limits, and required by law to
report on various aspects of aid administration, implementing agencies have developed
monitoring, evaluation, and data collection practices that are geared toward tracking where funds
go and what they have purchased rather than the impact of funds on development or strategic
objectives. For example, the F Bureau’s Foreign Assistance Framework, launched in 2006, was
created largely to address the information demands of stakeholders, who wanted more data on
how aid funds are being spent. It worked, to the extent that it is now easier to find information on
how much aid is being spent in a given year on counterterrorism activities in Kenya, for example,
or on agricultural growth programs in Guatemala.62 But little if any of the resulting data addresses
the impact of aid programs. If stakeholders had instead expressed sustained interest in aid impact,
the so-called “F process” might have taken a different form.
Methodological Challenges. In the complex environment in which many aid projects are carried
out, it can be challenging to employ high quality evaluation methods. U.S. agency policies allow
for a variety of evaluation methods (see Appendix A), acknowledging that the most rigorous
methods are not always practical. Sometimes it is impossible to identify a comparable control
group for an impact evaluation, or unethical to exclude people from a humanitarian intervention
58 An Evaluation of USAID’s Evaluation Function, p. 5.
59 Beyond Success Stories, p. 16.
60 Ibid.
61 Ibid.
62 Foreign aid data from FY2006-FY2012 estimates, sorted by recipient country, year, agency (only State, USAID and
MCC), appropriations account, and objective is readily available through the “Foreign Assistance Dashboard” at
for the purpose of comparison. Sometimes the goals are intangible and cannot be accurately
documented through metrics. For example, it may be much harder to measure the impact of
programs such as the Middle East Partnership Initiative, designed to strengthen relationships, than
to measure more concrete objectives, such as reducing malaria prevalence. This may be one
reason why reviews have found that global health assistance has a stronger evaluation history
than other aid sectors;63 disease prevalence and mortality rates lend themselves to quantification
better than military personnel attitudes towards human rights or the strength of civil society.
Rigorous methodology can also limit program flexibility, as making program changes midcourse,
in response to changed circumstances or early results, can compromise the evaluation
design. Even MCC, with its emphasis on rigorous evaluation, has chosen to use less rigorous
qualitative methods for certain projects that do not, in the agency’s opinion, lend themselves to
quantitative evaluation.64
Even when metrics and baselines are well established, it can still be very difficult to attribute
impact to a specific U.S. aid intervention when such programs are often carried out in the context
of a broader trade, investment, political, and multi-donor environment.65 Also, some aid
professionals see broader drawbacks to rigorous impact evaluation methods. Some assert that the
use of randomized control groups, which generally require the use of independent evaluators,
limits the participation of affected individuals and communities in project design. They argue that
community participation in project planning and evaluation, which can lead to greater buy-in and
local capacity building, is more valuable in the development context than high-quality evaluation
findings.66 Others counter that more participatory methodologies are often weakened by bias, and
that it is unwise and even unethical to replicate programs, which may profoundly affect
participants, without having properly evaluated them.67
Compressed Timelines. While development assistance, in particular, is recognized as a long-term
endeavor, aid strategies can be trumped by political pressures, which can influence
evaluation. In 2001, a USAID survey report stated that “the pattern found was that evaluation
work responds to the more immediate pressures of the day.”68 Policymakers facing relatively
short budget and election cycles do not always allow adequate time for programs to demonstrate
their potential impact. Such pressures have only increased over the last decade, particularly in the
politically charged environments of Iraq, Afghanistan, and Pakistan. As a Senate Foreign
Relations Committee report on aid to Afghanistan explains, “the U.S. Government has strived for
quick results to demonstrate to Afghans and Americans alike that we are making progress. Indeed,
the constant demand for immediate results prevented the implementation of programs that could
have met long-term goals and would now be bearing fruit.”69
63 Beyond Success Stories, p. 9.
64 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Targets,
GAO-11-728, p. 33.
65 The QDDR states that “we know that in many cases the outcome-level results are not solely attributable to U.S.
government investments and activities; we will focus on outcome-level progress in locations and subsectors where the
U.S. government is concentrating support.” (QDDR 2010, p. 104).
66 A Can of Worms, p. 8; Beyond Success Stories, p. 17.
67 Improving Lives Through Impact Evaluation, p. 15.
68 Evaluation of Recent USAID Evaluation Experiences, p. 26.
69 S.Prt. 112-21, Evaluating U.S. Foreign Assistance to Afghanistan, June 8, 2011, p. 14.
The type of evaluation necessary to determine whether aid has real impact is both hard to do and
of limited use in a short-term context. Timelines are particularly restrictive for MCC, which
originally intended to complete evaluations during the compact implementation period. This goal,
which reflects broad support for limited timeframes on foreign assistance, was found not to be
feasible during implementation of MCC’s first compacts in Cape Verde and Honduras.70 Baseline
data and evaluation models can be rendered worthless if program timelines change. For example,
an MCC evaluation of a farmer training program in Armenia found that the planned impact
evaluation model—a phased roll-out—was compromised by a delay in implementing one
component of the program and the five-year compact timeline.71
70 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Targets,
GAO-11-728, p. 33.
71 Measuring Results of the Armenia Farmer Training Investment, October 23, 2012, p.4, available at
Sector Evaluation Example: Trade
Capacity Building
Many analysts have suggested that cross-country
evaluations of aid for a specific sector may be more
useful for shaping policy than the more common
individual project evaluations. One example of this
approach is an evaluation commissioned by USAID to
look at the impact of 256 U.S. trade capacity building
(TCB) assistance projects in 78 countries from 2002 to
2006. The United States obligated about $5 billion during
this period for TCB activities, through several federal
agencies, including assistance to help developing
countries strengthen their public institutions and policies
related to trade, as well as programs to make private
industries more knowledgeable about and competitive in
global markets. The evaluation was designed after the
fact, making a randomized controlled trial unfeasible, and
had to account for variations in reporting across
projects. Much of the report highlights anecdotal
examples of issues that could not be analyzed
systematically as a result of inconsistent data collection
methodologies across projects. However, using
regression analysis, evaluators found a relationship
suggesting that each additional $1 invested in U.S. aid
(from all agencies) for TCB is associated with a $53
increase in the value of recipient country exports two
years later. For TCB aid specifically managed by USAID,
the relationship was $1 invested for $42 in increased
exports. No similar association was found between TCB
assistance and recipient country imports or foreign
direct investment. While this evaluation’s methodology
was not sufficient to demonstrate actual aid impact or
causation, its findings may be useful to policymakers in
both demonstrating a correlation between TCB aid and
export growth, as well as forming the basis of a
Country Ownership and Donor
Coordination. The United States and other
aid donor countries have made pledges in recent years to both coordinate their efforts and
increase recipient country control, or “ownership,” over the planning of aid projects and the
management of aid funds. The QDDR also promotes these objectives.73 Country ownership is
believed by many to increase the odds that positive results will be sustained over time both by
ensuring aid projects are consistent with recipient priorities and by helping to build the budget
and project management capacity of recipient country governments and non-governmental
organizations (NGOs) that administer the assistance. Donor coordination of assistance efforts is
supposed to promote efficiency, ease administrative burdens on aid recipients, and avoid
duplication, among other things. USAID, as part of its ongoing procurement reform process, aims
to channel 30% of aid directly to governments and local organizations in developing countries by
2015. However, greater country ownership, and the pooled funds that may result from donor
coordination, generally mean diminished donor control, and a lesser ability to evaluate how U.S.
funds contributed to a particular outcome. Accountability concerns often greatly overshadow the
learning aspects of evaluation in such a context, as Congress has expressed concern about the
heightened potential for corruption and mismanagement when funds flow directly to recipient
country institutions.
Security. Over the past decade, a significant percentage of foreign aid has been allocated to
countries where security concerns have presented major obstacles to implementing, monitoring
and evaluating foreign aid. A 2012 evaluation of a USAID agricultural development program in
rural Pakistan, for example, states “the operating environment for development projects has been
especially testing in recent years in the presence of an insurgency and frequent targeted killings
and kidnappings.”74 Development staff in Afghanistan and Iraq have not always been able to
safely visit project sites to verify that a structure has been built or supplies delivered, much less
be out on the streets conducting the types of surveys that certain evaluations would normally call
for. A 2011 USAID Inspector General report noted that more than half of performance audits in
Iraq indicated security concerns. In the most insecure environments, monitoring and evaluation of
aid programs have often fallen by the wayside. Even in less hostile environments, security
concerns can undermine evaluation quality. For example, a 2011 evaluation of Office of
Transition Initiatives governance activities in Colombia noted that “security considerations
limited to some degree the evaluation team’s freedom to interview community members in
project sites at will. This fact made it difficult to be certain that field research did not suffer from
a form of sampling bias.”75 While security challenges may weigh against the use of aid in certain
regions, the most insecure places are sometimes where the U.S. foreign policy interests are
greatest, and policymakers must consider whether the risk of being unable to evaluate even the
performance of an aid intervention is worth taking for other reasons.
72 From Aid to Trade: Delivering Results. A Cross-Country Evaluation of USAID Trade Capacity Building, prepared for
USAID by Molly Hageboeck of Management Systems International, November 24, 2010; Executive Summary.
73 Leading Through Civilian Power, U.S. Department of State, Quadrennial Diplomacy and Development Review,
2010, p. 95.
74 United States Assistance to Balochistan Border Areas: Evaluation Report, Prepared by Management Systems
International for USAID, January 16, 2012, p. vi.
75 USAID/OTI’s Integrated Governance Response Program in Colombia, Final Evaluation, prepared by Caroline
Hartzell et al., April 2011, p. 7.
discussion about the comparative advantages of various
U.S. agencies in managing TCB aid.72
Agency and Personal Incentives. Given discretion in the use and conduct of evaluations,
observers have noted the inclination of foreign assistance officials to avoid formal evaluation for
fear of drawing attention to the shortcomings of the programs on which they work. While agency
staff are clearly interested in learning about program results, many are reportedly defensive about
evaluation, concerned that evaluations identifying poor program results may have personal career
implications, such as loss of control over a project, damage to professional reputation, budget
cuts, or other potential career repercussions.76 As explained by one USAID direct-hire in response
to a survey, “if you don’t ask [about results], you don’t fail, and your budget isn’t cut.”77 That
same study revealed that staff felt more pressure to produce success stories than to produce
balanced and rigorous evaluations, and that “professional staff do not see any Agency-wide
incentive to advance learning through evaluations.”78 Few observers consider risk taking and
accepting failure as a necessary component of learning to be hallmarks of USAID or State
Department culture. MCC’s institutional attitude toward adverse results may be tested in the
coming year, as its first evaluations are made public.
Applying Evaluation Findings to Policy
A consistent theme in past reviews of foreign aid evaluation practices is that even when quality
evaluation takes place, the resulting information and analysis are often not considered and applied
beyond the immediate project management team. Evaluations are rarely designed or used to
inform policy. Lack of faith in the quality of the evaluation, irregular dissemination practices, and
resistance to criticism may all contribute to this problem, as does lack of time on the part of aid
implementers and policymakers alike to read and digest evaluation reports. A survey of U.S. aid
agencies found that “bureaucratic incentives do not support rigorous evaluation or use of
findings,” “evaluation reports are often too long or technical to be accessible to policymakers and
agency leaders with limited time,” and learning that takes place, if any, is “largely confined to the
immediate operational unit that commissioned the evaluation.”79 The shift in recent decades
towards the use of contractors and implementing partners for most project implementation, and
most project evaluation, may also impact the learning process. As one report notes, “partner
organizations are learning from the experience, but USAID is not,” and most evaluation work
does not circulate beyond the partner.80
The lack of a “learning culture,” as some describe it, has been a perennial criticism that agencies
appear to have been largely unsuccessful in addressing in the past, though the prominent “lessons
learned” sections in the first batch of MCC evaluations may set a new standard. Some assert that
outside pressure, such as a legislative mandate, may be necessary. Congress expressed some
interest in this issue with the Initiating Foreign Assistance Reform Act of 2009 (H.R. 2139 in the
111th Congress), which called for “a process for applying the lessons learned and results from
evaluation activities, including the use and results of impact evaluation research, into future
budgeting, planning, programming, design and implementation of such United States foreign
assistance programs.” No such requirements were enacted in the 111th Congress, but the May
76 Evaluation of Recent USAID Evaluation Experiences, p. 22.
77 Ibid., p. 24.
78 Ibid., pp. 26-27.
79 Beyond Success Stories, p. iv.
80 Evaluation of Recent USAID Evaluation Experiences, p. 27.
2012 memorandum from OMB, calling on all agencies to use evaluation data in their FY2014
budget submissions, may have similar impact.81
The learning aspect of evaluation relies heavily on agency culture, which may be shaped more by
leadership than policy. The effective application of evaluation information depends also on the
details of implementation, such as evaluation questions being based on the information needs of
policymakers and program managers, and information being presented in a format and to a scale
that is useful. Policymakers, for example, may be much better able to make actionable use of a
meta-evaluation of microfinance programs, presented in a short report highlighting key findings,
than a whole database of detailed analysis of single projects, the results of which may or may not
be more broadly applicable. Experts have pointed out that individual project evaluations, even
when well done, do not roll up nicely into a document showing what works and what does not.
They contend that for maximum learning, an effort must be made at the cross-agency or even
whole-of-government level to develop evaluation meta-data that is responsive not only to the
needs of a project manager interested in the impact of a particular activity, but also to agency
leadership and policymakers who want to know, more broadly, what foreign assistance is most
effective. This view has been reflected in legislation introduced in recent years, including the
Foreign Assistance Revitalization and Accountability Act of 2009 (S. 1524 in the 111th Congress),
which called for the creation of a Council on Research and Evaluation of Foreign Policy to do
cross-agency evaluation of aid programs.
As important as evaluation can be to improving aid effectiveness, not every aid project has broad
learning potential. Knowing which potential evaluations could have the greatest policy
implications may be key to maximizing evaluation resources. Many USAID projects, for
example, are designed as small-scale demonstrations, with no intention that they be scaled up or
replicated elsewhere. In other situations, an approach may have already been well proven. In such
instances, a basic performance evaluation for accountability may be appropriate, but rigorous
evaluation may be a poor use of resources. A 2012 USAID “Decision Tree for Selecting the
Evaluation Design” asks staff to first consider whether an evaluation is needed, and decline to
evaluate if the timing is not right, if there are no unanswered questions for the evaluation to
address, or if there is no demand from stakeholders.82
Current Agency Evaluation Policies
The primary U.S. government agencies managing foreign assistance each have their own distinct
evaluation policies, but these policies have come into closer alignment in the last two years. The
Quadrennial Diplomacy and Development Review (QDDR) report of December 2010 stated the
intent that USAID would reclaim its leadership role with respect to evaluation and learning, and
referenced a new USAID evaluation policy in the works to reflect the growing demand for results
data and attempt to address some persistent evaluation challenges. That policy took effect January
2011. The State Department followed suit in February 2012 with a new evaluation policy that is
similar in many respects to the USAID policy, and MCC updated its policy in May 2012.
81 This memo is discussed in the text box on page 2. See Use of Evidence and Evaluation in the FY2014 Budget,
Memorandum to the Heads of Executive Departments and Agencies, Jeffrey D. Zients, Acting Director, Office of
Management and Budget, May 18, 2012.
82 Decision Tree for Selecting the Evaluation Design, USAID, June 2012, p. 1, available on USAID’s Development
Experience Clearinghouse website.
Appendix A compares key provisions of the current evaluation policies of USAID, State, and MCC.
The new State and USAID policies share much in common, balancing the costs and expected
gains from evaluation. For example, both require performance evaluations of all larger-than-average
projects and experimental/pilot projects, but not all projects. Both also include a target
allocation of funds for program evaluation: 3% for USAID and 3%-5% for State. The policies
share an emphasis on accessibility of information, with provisions to promote consistent and
timely dissemination of evaluation reports. In their introductory language, both policies
emphasize the learning benefits of evaluation, in addition to accountability. The USAID policy is
notably more detailed than State’s on many of the issues. The USAID policy establishes required
features for evaluation reports, and specifies that evaluation questions be identified in the design
phase of projects, issues which the State policy does not address. USAID states that most
evaluations will be conducted by third party contractors or grantees, to promote independence,
while State’s policy does not explicitly mention use of independent evaluators. State’s evaluation
reporting requirements also focus on internal dissemination, while USAID requires public
availability. According to State officials, however, many of these issues are fleshed out in
subsequent internal guidance documents and the State and USAID policies, in practice, differ
only on the use of impact evaluation. USAID’s policy calls for impact evaluation whenever
feasible, while the State policy sets a clear expectation that impact evaluation will be rare.83
MCC’s evaluation policy shares many elements of the State and USAID policies, but goes farther
in many respects. MCC requires independent evaluations of all compact projects, using indicators
and baselines established prior to project implementation. It may be, however, that first-hand
experience with the challenges of evaluation is bringing MCC policy and practice closer to that of
USAID over time. MCC’s 2012 policy revision adopts definitions from USAID’s 2011 evaluation
policy and includes a new section on institutional learning. The update also appears to move
closer to the USAID model with respect to impact evaluation, calling for impact evaluations
“when their costs are warranted,” whereas the previous iteration referred to independent impact
evaluations as an “integral part” of MCC’s focus on results.84 The MCC policy still appears to
have the strongest enforcement mechanism among the three agency policies, conditioning the
release of quarterly disbursements on substantial compliance with the policy. USAID’s policy, in
contrast, calls only for occasional compliance audits, and State’s policy does not address
compliance at all.
While some experts have called for greater uniformity of evaluation practices across agencies to
allow for comparative analysis, others view the differences in State, USAID, and MCC evaluation
policies as reflecting the different experience, scope of work, and priorities of the agencies.
USAID, with the largest and most diverse assistance portfolio among the agencies, and numerous
small projects, may require a more flexible approach to evaluation than MCC, which is narrowly
focused on economic growth and recipient government ownership. At State, foreign assistance is
just one part of a broader portfolio (including diplomatic activities), potentially impacting what
type and scope of evaluation is useful or possible.
83 Author’s communication with State officials via e-mail, October 10, 2012.
84 Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012, p.18; Policy for
Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 12, 2009, p. 17.
These current evaluation policies represent a step towards improving knowledge of foreign
assistance measures of effectiveness at the program or project level, and increasing transparency
of the evaluation process. They do not, however, attempt to establish a systemic approach to aid
evaluation that would make country-wide, sector-wide, or cross-agency evaluation of aid more
feasible. They look similar to earlier initiatives to improve aid evaluation. Many aspects of the
new USAID policy, for example, are strikingly similar to the required actions called for in the
2005 cable to USAID missions (e.g., evaluation planning as part of all program designs,
designated evaluation officers at each post, and set-aside evaluation funds). It is too early to know
whether this new initiative will have more real or lasting impact than its predecessors. The State
Department policy has only recently taken effect. MCC just released its first five project
evaluation reports in October 2012,85 and has yet to produce a compact evaluation. USAID, a
year into implementation of its policy, reports that insufficient time has passed to document any
changes in evaluation quality, as no evaluations have gone from start to finish under the new
requirements. However, the quantity of USAID evaluations has increased notably, from 89 in
2010 to 295 in 2011,86 and the agency aims to complete 250 “high quality” evaluations by
January 2013.
85 See
86 USAID Evaluation Policy: Year One, First Annual Report and Plan for 2012 and 2013, p. 2.
A Global Perspective on Aid Evaluation
U.S. foreign assistance evaluation efforts have evolved in the context of a global movement by public and private aid
donors to improve aid effectiveness, with improved evaluation practices as one of many strategies. Representatives of
aid donor countries meet regularly under the auspices of the OECD Development Assistance Committee (DAC) to
discuss evaluation practices, among other things, as a means of implementing the aid effectiveness agenda laid out in
the 2005 Paris Declaration on Aid Effectiveness and the 2008 Accra Agenda for Action. A 2010 OECD/DAC survey
and report on evaluation in the development agencies of major donor countries highlighted several issues that are
common to U.S.-specific aid evaluation.87 The report found a heavy reliance on measuring outputs, but also a trend
toward measuring aid impact and larger strategic questions of development effectiveness. It identified new emphasis
on dissemination of evaluation findings, and found that while bilateral aid agencies on average allocated 0.1% of their
development assistance budget to evaluation, lack of human resources—people qualified to do rigorous impact
evaluations, evaluations of direct budget support, or requiring specific language skills, in particular—presented a bigger
obstacle to evaluation goals than did financial constraints.
Non-governmental organizations have focused on evaluation in recent years, as well. In 2004, an Evaluation Gap
Working Group was convened by the Center for Global Development with support from the Bill & Melinda Gates
Foundation and the William and Flora Hewlett Foundation. The Working Group focused on why rigorous impact
evaluations of development assistance were so rare. The resulting report, “When Will We Ever Learn?,” is a key
resource for this report. The group made two recommendations: (1) that donors invest more in their own evaluation
capacity, and (2) that an independent institution be created to evaluate aid.88 The offshoot of the latter
recommendation is the International Initiative for Impact Evaluation (3ie), established in 2009, with a mission to use
impact evaluations, specifically, to generate high quality evidence for use in shaping effective development policies. 3ie
both funds evaluations and produces extensive materials on evaluation methods, implementation practices, and
application to policy, as a means to improve evaluators’ technical capacity. USAID and MCC are official partners of
3ie, as are many other official aid agencies, private foundations, and non-profit organizations such as the Hewlett and
Gates foundations and Save the Children.
Issues for Congress
While recent momentum on foreign aid evaluation reform has originated within the
Administration, Congress may have significant influence on this process. Not only can Congress
mandate or promote a certain approach to evaluation directly through legislation, as has been
proposed, it can modulate Administration policies by controlling the appropriations necessary to
implement the policies. Congress may also influence how, or if, the information resulting from
evaluations will impact foreign assistance policy priorities. These issues are discussed in greater
detail below.
Reform Authorization Legislation. There is at least one proposal in the 112th Congress that
focuses specifically on foreign aid evaluation. The Foreign Aid Transparency and Accountability
Act of 2012 (H.R. 3159; S. 3310) seeks to evaluate the performance of U.S. foreign assistance
programs and improve program effectiveness by requiring the President to establish guidelines on
measurable goals, performance metrics, and monitoring and evaluation plans for foreign
assistance programs that can be applied on a uniform basis across implementing agencies, both
U.S. and multilateral. The legislation also calls for the creation of a website, within two years of
enactment, that would make detailed, program-level information on foreign assistance, including
country strategies, budget documents, budget justifications, actual expenditures, and program
reports and evaluations available to the public. The bill’s requirements are similar in many
87 Evaluation in Development Agencies, Better Aid, OECD Publishing, 2010, available at
88 When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working Group,
Center for Global Development, May 2006.
respects to the F Process, but would extend the requirements across the various federal and
multilateral agencies that administer aid programs. The benefit of such broad uniformity,
arguably, is that it could enable policymakers, the public, and other stakeholders to better
compare the activities of various agencies and get a more comprehensive picture of total U.S.
foreign assistance. A potential drawback is the effort and expense required to impose such
uniformity on agencies with different objectives, management structures, and information
technology systems. The legislation is focused on transparency and accountability rather than
effectiveness, and does not promote the use of impact evaluation. If performance evaluation
continues to comprise the vast majority of aid evaluations, such a cross-agency requirement may
provide comparable information on aid management from agency to agency, but is not likely to
facilitate comparative analysis of what aid works best.
Appropriations for Enhanced Evaluation. Increasing the number and quality of foreign aid
evaluations, while potentially cost effective in the long run, requires an investment of resources.
For the most part, evaluation costs are integrated into program accounts at the various
implementing agency budgets and are not scrutinized specifically by Congress. However,
USAID, in conjunction with its new policy, started in the FY2012 budget request to identify
resource needs for centralized evaluation and learning through a “Learning, Evaluation and
Research” (LER) line item. LER is one of the seven focus areas of the USAID Forward reform
agenda, and is intended both to enhance USAID’s ability to conduct rigorous evaluations and to
apply the knowledge gained through evaluation to improve future assistance strategies and
design. The Administration requested $19.7 million for this purpose, through the Development
Assistance appropriations account, for FY2012. Congress provided $12.26 million. For FY2013,
USAID requested $26.67 million, to expand the number of priority evaluations it can carry out,
improve staff training, and support evaluation collaborations with international partners. The
ultimate funding level established by Congress, together with any related legislative directives,
may play a role in determining the extent of the Administration’s efforts to strengthen evaluation.
Impact of Evidence Based Approach on Congressional Priorities. Congress has long exerted
control over foreign assistance not only through appropriated funds and restrictions, but also by
directing foreign assistance funds to certain sectors, countries, or even specific projects through
bill or report language. For example, the committee reports accompanying the FY2013 House and
Senate State-Foreign Operations appropriation proposals (H.Rept. 112-494; S.Rept. 112-172),
like most of their predecessors, provide specific funding levels for microfinance, basic education,
water and sanitation, women’s leadership training, people-to-people reconciliation programs in
the Middle East, and other sectors of particular interest to Members of Congress. Should credible
information about the relative effectiveness of these programs be made available as a result of
improved evaluation practices, Congress can weigh the importance of the data, among other
drivers, in establishing aid priorities. Some congressional directives on aid are less likely than
others to be affected by evaluation results. The availability of actionable evaluation data may not
result in a maximization of aid effectiveness, but may allow Congress to make more deliberate
trade-offs between effectiveness and other objectives.
Conclusion
The primary U.S. agencies charged with implementing foreign assistance have made significant
steps in the last two years to address ongoing deficiencies in evaluation practices that make it
difficult to judge whether foreign assistance is achieving its various objectives. There is
widespread agreement, reflected in new policies, on the need for consistent performance
evaluation of aid programs. The value of rigorous impact evaluation is broadly recognized as
well, though the agencies differ in their capabilities and aspirations in this respect. Past policies
and evaluation reform efforts, however, have been similarly focused but not sustained in the face
of persistent challenges, many of which remain today. Other reforms, such as the establishment of
centralized evaluation processes or the creation of an independent evaluation entity, have been
proposed in legislation yet not addressed in agency policies. Growing emphasis in Congress and
the Administration on results-based budgeting, as well as movement within the international aid
donor community toward more rigorous aid evaluation practices, may provide the context for
future change. The 113th Congress will have multiple opportunities to influence how U.S. foreign
assistance is evaluated through legislative proposals, appropriations, and oversight activities.
Appendix A. Select Aspects of Current USAID,
State Department, and MCC Evaluation Policies
USAID: January 2011
State: February 15, 2012
MCC: May 1, 2012

USAID: PPL/LER responsible for system implementation, while missions and functional bureaus
responsible for conducting evaluations. All Bureaus and operating units must designate an
evaluation point of contact.
State: F and RM Bureaus monitor and report on evaluations plans. Each Bureau should identify a
senior staffer to serve as evaluation point of
MCC: Primary lead is MCA (host country entity) M&E, with input from

USAID: Operating units must conduct at least one performance evaluation of each project that
equals or exceeds average project size. Projects involving an untested hypothesis or new
approach, and that are anticipated to expand in scale or scope, will undergo an impact
evaluation, if feasible. All evaluations will share certain basic features, including a full
description of methodology; standardized recording and maintenance of records from evaluation;
evaluation findings based on facts, evidence, and data, sex-disaggregated data; and an
explanation of the limitations of the data. Key evaluation questions will be identified during
the design phase of every project.
State: All programs/projects/activities greater than or equal to the median size (generally
using dollar value as the measure) for the Bureau must be evaluated at least once in their
lifetime or every five years, whichever is less. All pilot programs must be evaluated once every
five years. Each Bureau must evaluate 2 to 4 projects/programs/activities in FY2012-FY2013, with
this requirement extending to all posts in FY2013-FY2014 period.
MCC: All Compacts and Threshold Agreements include monitoring and evaluation plans, which
identify the evaluations to be conducted for each project, the key evaluation questions and
methodologies, and the data collection strategies that will be
Final evaluations are required for all projects in a Compact upon completion or termination;
mid-term evaluations are
Selected indicators must have baselines established prior to the start of the corresponding
activity.

USAID: Emphasis on quality evaluation methods and favoring random assignment/experimental
methods for impact evaluations when
State: Bureau’s discretion, based on context but the policy establishes an expectation that the
“great majority” of evaluations will be performance evaluations because “impact evaluations are
more time consuming, costly, and often difficult to successfully design for State programs,
projects and activities.”
MCC: Impact evaluations performed “when their costs are warranted by the expected
accountability and

USAID: Policy states that most evaluations will be conducted by third party contractors or
grantees managed by USAID, but evaluation teams may be composed primarily of USAID staff, led by
an outside expert, when it is determined that this will facilitate institutional learning.
State: Suggests that evaluators should be “free from and pressure and/or bureaucratic
interference,” but does not explicitly call for the use of outside evaluators.
MCC: Independent evaluators required for final evaluations of
Mid-term compact evaluations and final threshold program evaluations can be done independently
or by MCC/MCA staff.

USAID: Recommends an average 3% of program budgets be dedicated specifically to external
evaluation, distinct from monitoring. Resources for evaluation should be concentrated on large
projects and those that are innovative or pilot approaches.
State: Program managers “should identify resources of up to 3-5% for evaluation activities.”
MCC: Does not specify a portion of funds that should be used for

USAID: Public availability of evaluation reports and summaries, within 3 months of completion,
on the Development Experience Clearinghouse website.
State: Bureaus and posts must electronically transmit final evaluation reports as cables and
post reports on their OpenNet or ClassNet websites.
MCC: MCAs must post their approved Compact M&E plans on their website. MCC and MCAs must
“regularly” publish results information on their

USAID: PPL/LER will organize occasional external technical audits of operating unit compliance
with the policy.
State: No reference to compliance
MCC: Substantial compliance required for approval of quarterly
requested by recipient
Source: Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012; Department of
State Evaluation Policy, Bureau of Resource Management, February 23, 2012; Evaluation: Learning from Experience,
USAID Evaluation Policy, January 2011.
Notes: PPL/LER = USAID Office of Learning, Evaluation and Research; F Bureau = Office of Foreign Assistance
Resources; RM = State Department Bureau of Resource Management; MCA = the Millennium Challenge Account
implementing entity in each compact country; M&E = monitoring and evaluation. The information in the table
refers only to what is in the actual evaluation policy document of each agency, as cited above. Information
available outside of these documents, which may provide greater details about aspects of the policies, is not
reflected here.