Program Evaluation 101
What is program evaluation?
Program evaluation is the systematic assessment of the processes and/or outcomes of a
program with the intent of furthering its development and improvement. As such, it is a
collaborative process in which evaluators work closely with program staff to craft and
implement an evaluation design that is responsive to the needs of the program. For example,
during program implementation, evaluators can provide formative evaluation findings so that
program staff can make immediate, data-based decisions about program implementation and
delivery. In addition, evaluators can, towards the end of a program or upon its completion,
provide cumulative and summative evaluation findings, often required by funding agencies and
used to make decisions about program continuation or expansion.
Informal vs. Formal Evaluation
Evaluation is not a new concept. As a matter of fact, as human beings we are engaged in
evaluation activities all the time. Practitioners, managers, and policy makers make judgments
about students, clients, personnel, programs, and policies daily and these judgments lead to
choices and decisions. These judgments are based on informal, or unsystematic, evaluations.
Informal evaluations can result in either faulty or wise judgments. However, informal
evaluations are characterized by an absence of breadth and depth because they lack systematic
procedures and formally collected evidence. The judgments may be clouded by one’s
experience, instinct, generalization, and reasoning. In other words, when we conduct informal
evaluations, we are less cognizant of the limitations posed by our background.
In contrast, formal evaluation is developed to assist and extend natural human abilities to
observe, understand, and make judgments about policies, programs, and other objects of evaluation. Formal evaluation strives to be thorough and structured. Formal
evaluation seeks to help practitioners “cultivate critical intelligence” to make sense of ordinary
life events.
Distinction between Research and Evaluation
Evaluation and research share many similarities in that they both seek to answer inquiry
questions using systematic methodologies. However, while they may be similar in methodology, there are important distinctions between them. First, research and evaluation have different purposes. The primary
purpose of research is to add to knowledge in a field and to contribute to the growth of theory.
While the results of an evaluation study may contribute to knowledge development, its primary
purpose is to help those who hold a stake in whatever is being evaluated make a judgment or
decision. In short, research seeks conclusions and evaluation leads to judgments.
A second distinction lies in who sets the agenda. In research, the hypotheses to be
investigated are chosen by the researcher based on his/her knowledge about the discipline or
field. In evaluation, the questions to be answered are not those of the evaluator, but rather,
come from many sources, including those of significant stakeholders. An evaluator may suggest
questions but will always consult with stakeholders to determine the focus of the study.
Third, the two differ in the generalizability of results. Evaluation is specific to the context in
which the evaluation object rests while research seeks to generalize its findings across many
different settings.
Fourth, there are differences in the criteria or standards used to judge the adequacy of each type of inquiry. Two important criteria for judging the adequacy of research are internal validity (or causality) and external validity (or generalizability to other settings and other times). To judge an evaluation, however, accuracy (the extent to which the information obtained accurately reflects the program),
utility (the extent to which the results serve practical information needs or intended users),
feasibility (the extent to which the evaluation is realistic, prudent, diplomatic, and frugal), and
propriety (the extent to which the evaluation is done legally and ethically, protecting the rights
of those involved) are key standards.
Formative vs. Summative Evaluation
There are two terms that evaluators use to distinguish between the types of judgments,
decisions, or choices that evaluations can serve. A formative evaluation is conducted internally
by staff who are either working in the program or are embedded in the organization. Its
purpose is to gather feedback on aspects of the program that are undergoing review and
possible revision. Questions such as “What is working well and what is not?”, “What needs fixing?”, and “Is there a need for midcourse corrections?” are asked. The evaluation is intended
to provide information for program improvement. In contrast, a summative evaluation is
concerned with providing information to serve decisions or assist in making judgments about a
program’s overall worth or merit in relation to important criteria. Decisions about
replacements, major overhauls, awards, or other accountability decisions often are the end
results of summative evaluations.
The audiences for formative and summative evaluations are also very different. In formative
evaluation, the audience is generally the people delivering the program or those close to it,
such as those responsible for developing the new schedule, delivering the training program, or
managing the new center. Summative evaluation audiences may include potential consumers
(students, teachers, employees, managers, or officials in agencies that could adopt the
program), funding sources, and supervisors and other officials, as well as program personnel.
The audiences for summative evaluations are often policy makers or administrators, but can be any audience with a stake in the decision.
Both formative and summative evaluations are essential because decisions are needed during
the development stages of a program to improve and strengthen it, and again, when it has
stabilized, to judge its final worth or determine its future. However, well-established programs
can also benefit from formative evaluations, and some new programs are so problematic that summative decisions may be made to discontinue them.
Why is there a need for program evaluations?
Program evaluation can:
Understand, verify, or increase the impact of products or services on customers or clients.
Improve delivery mechanisms to be more efficient and less costly.
Verify that the program is doing what it is supposed to do.
Facilitate management’s goals and objectives.
Produce data or verify results that can be used for public relations and promoting services in the community.
Produce valid comparisons between programs to decide which should be retained.
Fully examine and describe effective programs for replication elsewhere.
Mark, Henry, and Julnes (1999) have articulated four different purposes for evaluation:
assessment of merit and worth, oversight and compliance, program and organizational
improvement, and knowledge development. Ultimately, program evaluation is useful in
helping stakeholders make value judgments and decisions about a program, project, process or
product.
In a school district, there are various areas in which program evaluation can serve a purpose. For example:
Program needs assessments: to establish program goals and objectives.
Individual needs assessments: to provide insights about the instructional needs of
individual learners.
Resource allotment: to provide guidance in setting priorities for budgeting.
Process or strategies for providing services to learners: to provide insights about how
best to organize a school to facilitate learning in curriculum design, classroom processes,
materials of instruction, monitoring of pupil progress, learner motivation, teacher effectiveness, learning environment, staff development, decision making, community
involvement, and board policy formation.
Outcomes of instruction: to provide insights about the extent to which students are
achieving the goals and objectives set for them.
Roles of Internal vs. External Evaluators
The adjectives internal and external distinguish between evaluations conducted by program
employees and those conducted by outsiders. Internal evaluators are more likely to know
more about a program, its history, its staff, its clients, and its struggles than any outsider.
Internal evaluators also know more about the organization and its culture and styles of
decision-making. They are present to remind others of results now and in the future and can
communicate technical results more frequently and clearly. However, internal evaluators are
also more subject to internal bureaucratic restrictions and pressures.
In contrast, external evaluators can bring greater credibility and perceived objectivity, and may bring more breadth and depth of technical expertise. External evaluators are also more likely
to have knowledge of how other similar organizations and programs work and offer broad
perspectives. Internal and external evaluators can often collaborate, each contributing evaluations of a different nature. For example, internal evaluators are well-positioned to conduct
formative evaluations while external evaluators can provide the objectivity needed in
conducting summative evaluations.
What types of program evaluation are there?
There is a full array of issues program evaluators can address, and these issues are characterized
as different types of program evaluation, including needs assessment, cost analysis, goals-based
evaluation, process-based evaluation, and outcomes-based evaluation.
Needs Assessment
Needs assessment is used to acquire an accurate, thorough picture of the strengths and
weaknesses of a program. The information and data gathered from needs assessments can
help decision makers determine priority goals, develop a plan, and allocate funds and
resources. Overall, in needs assessment, program evaluators are concerned with (1) establishing whether a problem or need exists and describing that problem, and (2) recommending ways to reduce the problem, i.e., assessing the potential effectiveness of various interventions.
Cost Analysis
Evaluators are sometimes called upon to bring precise information on costs to the attention of
program developers, deliverers, and administrators. There are four types of cost analyses that
evaluators can do: cost-benefit, cost-effectiveness, cost-utility, and cost-feasibility analyses.
Cost-benefit analysis involves comparing costs and benefits and expressing both in monetary
terms. However, it can sometimes be very difficult to translate all benefits into dollar terms.
Program evaluators often refer to the literature for equivalent cost estimates for items that cannot readily be quantified in dollar amounts.
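To make the cost-benefit comparison concrete, the short sketch below computes total cost, total benefit, net benefit, and benefit-cost ratio for a hypothetical program. The function name and all dollar figures are illustrative assumptions, not data from any actual evaluation.

```python
# A minimal cost-benefit sketch; all dollar figures are hypothetical.

def cost_benefit_summary(costs, benefits):
    """Summarize costs and benefits expressed in the same monetary units."""
    total_cost = sum(costs.values())
    total_benefit = sum(benefits.values())
    net_benefit = total_benefit - total_cost
    bc_ratio = total_benefit / total_cost if total_cost else float("inf")
    return total_cost, total_benefit, net_benefit, bc_ratio

# Hypothetical after-school tutoring program.
costs = {"staff": 40000, "materials": 5000, "facilities": 10000}
benefits = {"reduced remediation costs": 30000, "estimated earnings gains": 45000}

total_cost, total_benefit, net, ratio = cost_benefit_summary(costs, benefits)
print(f"Total cost: ${total_cost:,}   Total benefit: ${total_benefit:,}")
print(f"Net benefit: ${net:,}   Benefit-cost ratio: {ratio:.2f}")
```

A benefit-cost ratio above 1.0 indicates that the monetized benefits exceed the costs; the harder work in practice lies in deciding which benefits can reasonably be translated into dollar terms at all.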
Goals-Based Evaluation
Often programs are established to meet one or more specific goals. Goals-based evaluations
allow program evaluators to evaluate the extent to which programs are meeting these
predetermined goals or objectives.
Process-Based Evaluation
Process-based evaluations are geared toward fully understanding how a program works: how does it produce the results that it does? Studies of this type typically describe how the program is delivered.
Outcomes-Based Evaluation
Outcomes-based evaluations are concerned with describing, exploring, or determining changes
(program outcomes) that occur in program recipients, secondary audiences (e.g., families, coworkers), or communities as a result of a program. The outcomes evaluated can be immediate or short-term results, long-term results, or long-term impacts of the program implementation.
How can program evaluations be done?
Planning a Program Evaluation
Before starting a program evaluation, several aspects need to be taken into consideration:
What are you going to evaluate?
What is the purpose of conducting a program evaluation, i.e., what do you want to be
able to decide as a result of the evaluation?
Who are the audiences for the information from the evaluation (e.g., funding agency, management/leadership, board, staff)?
Who are the stakeholders of the evaluation?
What questions will the evaluation seek to answer?
What kinds of information are needed to answer the question?
From what sources should the information be collected?
How can that information be collected in a reasonable fashion (e.g., questionnaires, interviews, examining documentation, observations, focus groups)?
When is the information needed?
What resources (e.g., time, money, people) are available to collect the information?
After considering these questions, it is useful to develop an evaluation plan; a minimal sketch of one appears after the list below. An evaluation plan should include:
Evaluation questions
Information/data required to answer the evaluation questions
Research design (quantitative vs. qualitative types of design) used
Data sources
Method for collecting data
Sampling method
Data-gathering procedures
Schedule for gathering data
Data analysis procedures
Data interpretation procedures
Reporting procedures (audience, content, format, schedule)
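To show how these components might fit together in practice, the sketch below lays out a hypothetical evaluation plan as a simple data structure. Every specific question, source, and date in it is an invented placeholder, not part of any actual plan.

```python
# A hypothetical evaluation plan mirroring the components listed above.
# All specifics (questions, sources, dates) are invented placeholders.
evaluation_plan = {
    "evaluation_questions": ["To what extent did participants' reading scores improve?"],
    "information_required": "pre- and post-program reading assessment scores",
    "research_design": "quantitative (one-group pretest-posttest)",
    "data_sources": ["district assessment records", "teacher surveys"],
    "collection_method": "records review and online survey",
    "sampling_method": "all participating students (census)",
    "data_gathering_procedures": "export scores after each assessment window",
    "schedule": {"baseline": "September", "follow-up": "June"},
    "analysis_procedures": "descriptive statistics and pre/post comparison",
    "interpretation_procedures": "review findings with program staff and stakeholders",
    "reporting": {"audience": "district leadership", "format": "written report", "schedule": "July"},
}

# Print the plan one component at a time.
for component, detail in evaluation_plan.items():
    print(f"{component}: {detail}")
```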
A sample abbreviated worksheet for an evaluation plan can be found in the resource section.
Sometimes, when the program evaluation involves an evaluation team and more extensive
work, a detailed evaluation plan needs to be used. A more detailed sample evaluation plan
template can also be found in the resource section.
Methods for Collecting Information
The following list provides an overview of the major methods used for collecting data during evaluations:
Documents: Nonofficial papers such as minutes, notes, and plans. They reveal actions, thinking, and perceptions uninfluenced by the study.
Records: Official documents such as census data, attendance records, and salaries. Generally more valid and reliable than documents.
Observation: Observations of program context and activities, participant behaviors, and environments. Can be structured or unstructured, and useful in some way in almost every evaluation.
Site Visits: A subset of observation, often used by regulatory agencies.
Surveys: Reports of attitudes, opinions, behavior, and life circumstances. Can be administered in person or by mail.
Telephone Interviews: Purposes are similar to those of a survey, but questions can be more open-ended and must be shorter. Interviewers can develop rapport and use verbal prompts.
Electronic Interviews or Surveys: Questions delivered and answered using computer technology. Items may be constructed as open or closed.
Interviews: Qualitative interviews are useful for eliciting values, perspectives, experiences, and more detailed responses.
Focus Groups: Useful when group interaction can encourage and enhance responses.
Tests: Used to examine knowledge and skills; primarily used in education and training.
Alternative Assessments: Examine knowledge and skills in a direct way; a viable alternative to paper-and-pencil measures.
Methods for Analyzing and Interpreting Information
After gathering data, there is a need to analyze and interpret the information/data gathered.
The purpose of data analysis is to reduce and synthesize information, to “make sense” of it
and to allow inferences about populations.
There are two types of data: quantitative and qualitative. Quantitative data analysis involves using descriptive or inferential statistics to answer evaluation questions. Some inferential analyses involve applying complex statistical models and using statistical software such as SPSS or SAS. Qualitative data analysis, on the other hand, involves searching for and identifying patterns and themes in the data. After working hypotheses are formed, verification and confirmation checks are performed to establish the accuracy of the conclusions.
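As a minimal illustration of the quantitative side, the sketch below computes descriptive statistics and a simple inferential test (an independent-samples t-test) on made-up outcome scores for a program group and a comparison group. It uses Python with NumPy and SciPy rather than SPSS or SAS, and all of the data are hypothetical.

```python
# Minimal quantitative-analysis sketch; all scores are hypothetical.
import numpy as np
from scipy import stats

program_group = np.array([78, 85, 90, 72, 88, 95, 81, 79, 92, 84])
comparison_group = np.array([70, 75, 80, 68, 74, 82, 77, 71, 79, 73])

# Descriptive statistics: summarize each group's scores.
for name, scores in [("Program", program_group), ("Comparison", comparison_group)]:
    print(f"{name}: mean={scores.mean():.1f}, sd={scores.std(ddof=1):.1f}, n={len(scores)}")

# Inferential statistics: Welch's independent-samples t-test.
t_stat, p_value = stats.ttest_ind(program_group, comparison_group, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

Whatever tool is used, the statistical result is only the starting point for the interpretation steps described below.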
Data analyses focus on organizing and reducing information and making statistical inferences.
Interpretation, on the other hand, attaches meaning to organized information and draws
conclusions. The following are methods and guidelines for interpreting findings:
Determining whether objectives have been achieved;
Determining whether laws, democratic ideals, regulations, or ethical principles have
been violated;
Determining whether assessed needs have been reduced;
Determining the value of accomplishments;
Asking critical reference groups to review the data and to provide their judgments of
successes and failures, strengths and weaknesses;
Comparing results with those reported by similar entities or endeavors;
Comparing assessed performance levels on critical variables to expectations of
performance or standards;
Interpreting results in light of evaluation procedures that generated them;
Conducting stakeholder meetings to gather multiple perspectives and convergence of
opinions;
Seeking confirmation and consistency with other sources of information;
Dealing with contradictory and conflicting evidence; not forcing consensus when none
exists;
Not confusing statistical significance with practical significance;
Considering and citing limitations of the analyses.
Resources
Relevant Surveys
School Needs Assessment Survey
http://www.dpi.state.wi.us/sig/improvement/process.html
This link includes a few useful surveys including Characteristics of Successful School
Surveys, School Climate Surveys for Students and Staff, Self-Reflection Tool for Teachers
and Administrators and Characteristics of Successful Districts.
Professional Development Outcomes Survey
http://www.programevaluation.org/outcomesurv.htm