Evaluating Evaluations
by Chris Cumo
The semester’s end has a routine of its own, centered on final exams and grade tabulations. But no part of the routine carries more weight than teaching evaluations. At Georgia State University, bad ratings will cost adjuncts their jobs, said Mary Beth Gasman, assistant professor of Educational Policy Studies. Administrators expect both full- and part-time faculty to average four or better (on a scale of five) on the question that asks students to rate the overall quality of instruction, said Doug Davis, assistant professor of Educational Policy Studies at Georgia State and a member of the Board of Consulting Editors for The Journal of Personnel Evaluation in Education.
Davis concedes that some instructors, feeling pressure to generate high numbers, try to turn evaluations into a popularity contest.
“Are evaluations a popularity contest?” asks Rick Kroc, director of the University of Arizona’s Office of Assessment and Enrollment Research. “If by popularity you mean an engaging style, then my answer is yes.”
He recalls a new assistant professor of chemistry who was superbly organized and had a mastery of content yet received miserable ratings. A visit to her classes revealed the problem. Her shyness led her to speak in an inaudible monotone. Kroc suggested she huddle with people in the drama department, where she learned to project her voice and to vary the cadence of her speech, becoming a bit of a theatrical performer. Her evaluations rose meteorically.
Kroc sees a lesson in her success. Teaching is a performance that intertwines style and substance, and it is this performance that students rate. In this sense, evaluations are a valid measure of instructional quality: in Kroc’s view, adjuncts and professors who aren’t dynamic speakers don’t merit high marks, because one must be able to captivate students before one can hope to teach them anything.
This doesn’t mean evaluations are perfect. Peter Facione, Dean of Arts and Sciences at Santa Clara University and founder of Insight Assessment, a California firm that designs a range of tests, including course evaluations, believes they fail to measure the two most important components of instruction: whether students perceive that they have accomplished the course goals (a component that presumes they know the goals) and how students believe the instructor might improve the course in light of those goals.
Equally serious, universities often use the same evaluation form for all courses. Facione thinks they would do better to tailor evaluations to the discipline: students in an introductory calculus course and those in a graduate seminar in anthropology have different expectations of their instructors, expectations that the current one-size-fits-all form cannot capture.
Evaluations have other limits. They can’t measure an instructor’s expertise, for students aren’t qualified to make this judgment, Kroc admits. Students in introductory courses typically rate an instructor only “moderately knowledgeable,” whereas graduate students will rate the same instructor “highly knowledgeable,” said Gene Glass, professor of education at Arizona State University. Even straightforward questions can be problematic.
The question of whether an adjunct is accessible to students outside class has no ready answer, because only 10 percent of students will consult the instructor outside class, notes Edward Nuhfer, director of the Office of Teaching Effectiveness at the University of Colorado, Denver. Only that fraction can answer the question with any certainty. The other 90 percent don’t know and so tend to select the middle of the scale as a way of saying “I don’t know” or “I don’t care.” The result is that a conscientious instructor who holds office hours and gives students a home phone number nonetheless ends up with a mediocre rating on this question.
As a result, evaluations measure how much students like an instructor, believes Suzanne Crawford, who teaches part-time at three community colleges in California. She admits that, as a graduate student, she rated a professor highly because he had praised her work. In retrospect she concedes that he was lazy: he arrived late to class, left early, required little of students, and gave high grades.
“I think student evaluations measure their [students’] degree of like or dislike of a teacher,” she said. “I think they also reflect the students’ contentment with their grades. And I think evaluations measure little more than this.”
Despite these drawbacks, Kroc doesn’t think universities should abandon evaluations.
“I believe in student ratings,” he said. “Students can judge quality.”
Even if he is right, the problem lies, as AAUP associate secretary Richard Moser has noted, in the tendency of department chairs to rely exclusively on evaluations in deciding whether to rehire an adjunct. That approach gives students too much power, making them like the spectators at the Colosseum who decided whether a gladiator lived or died. Evaluations should instead be one part of a mix of evidence that department chairs weigh in deciding whom to retain. Such an approach would tilt the decision away from students and back toward department chairs, where it belongs.