Every semester, college students are given the chance to evaluate their professors. Their evaluations, like ratings of workers in other fields, show persistent gender gaps. The underlying biases are not easily defeated, but research by management scholars Lauren Rivera and András Tilcsik finds that there is a startlingly simple way to reduce inequality in evaluation systems: change the top rating from ten to six.

Rivera and Tilcsik’s findings draw on two sets of data. When one large university professional school changed its top rating from ten to six, it set up a quasi-natural experiment, allowing the researchers to draw on 105,034 student ratings of 369 different instructors from before and after the change. Additionally, to establish how much of the gender inequality in evaluations came from bias as opposed to gendered differences in teaching effectiveness, they administered a survey showing students identical course transcripts but randomly varied the gender attributed to the instructor and the number of choices in the rating system.

The results were striking. When the real-life university evaluations used a ten-point scale, women teaching in the most male-dominated fields were significantly less likely than men to receive the highest rating, and their average ratings were half a point lower than men's. On a six-point scale, these “differences largely disappeared,” the researchers write.

Rivera and Tilcsik note that this is partly because more options allow for more subtle distinctions, but they also argue that the shift goes beyond that. The “perfect 10” has a deeper cultural resonance and is associated with qualities like brilliance—qualities that are more often attributed to men.

The survey supports this argument. The gender gap of about two-thirds of a point on the ten-point scale almost disappeared on the six-point scale, and a shift also emerged in the qualitative data. When participants responded to the transcripts, they were significantly more likely to use words like “brilliant,” “genius,” and “perfect” when they believed the lecture had been delivered by a man. Finally, when asked directly whether they agreed that the instructor was brilliant, participants were significantly more likely to strongly agree if they believed the instructor to be a man.

Taken together, the two data sources show that a move from a ten-point scale to a six-point one can reduce the gender gap in performance evaluations even as underlying biases, as revealed by the qualitative descriptions, remain. And because the survey experiment randomly assigned instructor gender while holding the course material constant, it isolates bias itself, rather than differences in teaching effectiveness, as a driver of the gaps.

Numerical evaluations are often used to validate the existence of a pure meritocracy, in which people are judged by the quality of their work rather than their identities. However, Rivera and Tilcsik write, “Evaluative tools are not neutral instruments: their precise design—even factors as seemingly small as the number of categories available in a performance rating system—can have major effects on how female and male workers are evaluated.”

Resources

JSTOR is a digital library for scholars, researchers, and students. JSTOR Daily readers can access the original research behind our articles for free on JSTOR.

Lauren A. Rivera and András Tilcsik, “Scaling Down Inequality: Rating Scales, Gender Bias, and the Architecture of Evaluation,” American Sociological Review, Vol. 84, No. 2 (April 2019), pp. 248–274. American Sociological Association.