Guest Editorial

JPAE at 25: Looking back and moving forward on teaching evaluations

ABSTRACT

In many if not most colleges and universities in the United States, raw scores from Student Evaluations of Teaching (SETs) are the primary tool of teaching assessment, and teaching evaluations often have real consequences for promotion and tenure. In 2005, JPAE published an article on teaching evaluations that added to what was then a somewhat thin literature indicating that SETs are systematically biased against female faculty, and probably against older and minority faculty. Since that time, this literature has grown substantially, and the evidence that SETs are invalid and systematically biased is now too strong to ignore. Over its first 25 years, JPAE has been a force for good in public affairs education. As JPAE moves into its next 25 years, it should take a principled and evidence-based stand against the use of raw SETs as an important indicator of teaching quality, and should encourage high-quality articles studying other methods of assessing teaching so that we can learn what approaches are better.

Introduction

Since its inception, the Journal of Public Affairs Education – affectionately known as JPAE – has been a promoter of excellence in public affairs (PA) education.[1] Its very existence validated taking teaching seriously and gave serious teachers a double-blind refereed outlet for research on teaching, thus justifying work on teaching within the structure and incentives of the academy. Through its articles, it has focused attention on a variety of teaching methods, including active learning and service learning, as well as online learning, with sustained attention and even experimentation. It has also, from very early on, turned attention to the teaching of ethics and integrity in PA programs, with the first number of volume 4 including two articles and a special issue on teaching ethics in the Master of Public Administration (JPAE, 1998), and with many articles on this topic since. It has addressed important social issues, calling attention to issues of minorities in the academy, PA students who will return from PA education in the US to oppressive regimes, and cultural competency. In short, JPAE, NASPAA’s flagship journal, has been a voice for good in the PA fields.[2]

However, there is one issue that JPAE has largely ignored but now should address: the systematic bias in Student Evaluations of Teaching (SETs). Both from a legal perspective (Lawrence, 2018; Mitchell, 2018) and from an ethical perspective, mounting evidence (reviewed below) that SETs are systematically biased against females and probably other groups invalidates their primary use.

Some evidence that SETs are systematically biased[3]

The following review is by no means a full evaluation of what is becoming a large literature on this topic. But it illustrates that the evidence that SETs are biased and invalid is longstanding and robust, and crosses a variety of research approaches.

An early article on the issue of biases in teaching evaluations was published by Elaine Martin in 1984. Martin (1984) provides a good overview of related literature preceding her study and notes that, even in “laboratory research,” “sexism biases evaluations of the work of men and women” (p. 484). In her own study, she finds that sex bias was “most prevalent when students evaluate female social science instructors” (Martin, 1984, p. 489). As stated by Lisa Martin in 2016, “More than 30 years ago, [E.] Martin (1984) wrote that the ‘message to women faculty seems clear: if your institution bases personnel decisions on student evaluations, make sure your colleagues are aware of the possibility of sex bias’ (Martin, 1984, p. 492)” (L. Martin, 2016, p. 317).

Ten years later, in work directly related to teaching in PA fields, Laura Langbein, of American University’s School of Public Affairs, published research indicating that teaching evaluations in her school were systematically biased against female instructors, and especially against female instructors who were tough graders (Langbein, 1994). This accords with much earlier work by Kaschak (1978), indicating that “male students were far more likely to give lower ratings to those female faculty perceived to be hard graders” (Martin, 1984, p. 484). Perhaps even more troubling, however, Langbein’s work goes beyond a concern with systematic bias to a broader concern with validity overall. She finds that, “of the variables examined, course characteristics have the smallest impact on student ratings, student characteristics have a mid-range impact, and the faculty characteristics of gender and experience have clearly the largest impact” (1994, p. 551; emphasis added). Pondering all of her results, she muses: “It is, in fact, unclear exactly what the student ratings really measure” (p. 551). This concern is supported in other work, for the amalgam of factors found to affect teaching evaluations indicates that even non-gender-biased SET numbers would often reflect several factors that have nothing directly to do with instructor skill, including the size of the class, the time of day, and whether the class is required or elective (see reviews, including in Baldwin & Blattner, 2003).

What appears to be the first article in JPAE discussing teaching evaluations was provided by Leslie Whittington, who won NASPAA’s Excellence in Teaching Award in 2000.[4] In her article, Whittington points out, as have many others, that “Students’ evaluations of their teachers are frequently the sole method that academic institutions use to determine the quality of the individual faculty members” (2001, p. 5). She also argues that there is evidence that SETs are reliable and that scholarship therefore demands we use them. In her own piece, Langbein (1994) likewise argues that there is evidence of SET reliability, but states that we have much less evidence of validity.

What appears to be the first article in JPAE on issues of discrimination in SETs was published in 2005 (Campbell, Steiner, and Gerdes). Overall, the authors find that

The results provide some useful information about how better to connect with students but also indicate that SETs are systematically biased against female teachers, older teachers, and perhaps minority teachers [as is not uncommon, sample size for minority professors was too small for statistical significance but estimated coefficients indicate bias]. These findings call into question de facto higher education policy making SETs our most important measure of teaching quality. (2005, p. 211)

The quality of this article was judged high in the NASPAA community: In 2006, it won the “NASPAA Outstanding Article Award, 2005.” In this study, on the bright side, how much students judged they had learned was a very important predictor of teaching evaluation, with a 2-point increase in learning worth 1.2 points on a 10-point scale; unfortunately, the second-most important factor was sex, with a female instructor earning 0.8 points less on a 10-point scale than an otherwise statistically identical male.[5] The results indicate that teachers can improve their SETs by giving review sessions and extra credit, but “the decrease in SETs caused by gender swamps the estimated effect of giving review sessions (−0.285) and giving extra credit (−0.148), combined” (p. 227, emphasis added).
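To make these magnitudes concrete, the short Python sketch below simply compares the coefficients quoted above. The coefficient values come from the article, but the additive comparison is only an illustration, not a reproduction of the authors’ actual model specification.

# Coefficient magnitudes quoted from Campbell, Steiner, and Gerdes (2005),
# all on a 10-point SET scale. This comparison is an illustrative sketch,
# not the authors' model.
gender_effect = -0.8      # female instructor vs. otherwise identical male
review_sessions = -0.285  # quoted coefficient for review sessions
extra_credit = -0.148     # quoted coefficient for extra credit

combined = abs(review_sessions) + abs(extra_credit)
print(f"combined review/extra-credit magnitude: {combined:.3f}")            # 0.433
print(f"gender-effect magnitude:                {abs(gender_effect):.3f}")  # 0.800
print(f"gender effect exceeds the combination:  {abs(gender_effect) > combined}")  # True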

Since that time, JPAE has had few articles on teaching evaluations and no published follow-up on whether these biases still exist or may have eroded over time. For example, published in JPAE seven years later, Otani, Kim, and Cho (2012) reference Campbell et al. (2005) but, focusing on “how to use SET more effectively and efficiently” in PA education, ignore the potential effects of race/ethnicity, age, and gender in their analysis. Therefore, though the article is otherwise of high quality, we must assume that its statistical results exhibit omitted variable bias. I was pleased to see that the call for papers for the special symposium issue of JPAE sponsored by Academic Women in Public Administration (AWPA) explicitly mentioned “gender bias in student evaluations” (AWPA April 2018, n.p.), but since abstracts are due September 1 and notification of acceptance will not be until November 15, it is unclear whether this will lead to additional articles in JPAE on this topic.
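Because omitted variable bias carries the weight of that claim, a minimal simulation may help readers see the mechanism. Everything below is hypothetical: the variable names, parameters, and data are invented for illustration, not drawn from Otani, Kim, and Cho. It shows how omitting gender, when gender both affects SET scores and is correlated with an included predictor, distorts the included predictor’s estimated coefficient.

import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process: being female lowers SET scores by
# 0.8 points, and gender is correlated with an included predictor (here,
# whether the instructor holds review sessions, purely for illustration).
female = rng.binomial(1, 0.5, n)
review = rng.binomial(1, 0.4 + 0.3 * female)  # correlated with gender
set_score = 7.0 - 0.8 * female + 0.3 * review + rng.normal(0, 1, n)

def ols(X, y):
    # Least-squares coefficients, with an intercept column prepended.
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

with_gender = ols(np.column_stack([female, review]), set_score)
without_gender = ols(review.reshape(-1, 1), set_score)

print(f"review coefficient, gender included: {with_gender[2]:+.3f}")    # near the true +0.30
print(f"review coefficient, gender omitted:  {without_gender[1]:+.3f}") # substantially understated

Under these made-up parameters, the review-session coefficient absorbs part of the omitted gender effect and is badly understated, which is exactly the worry about analyses that leave instructor demographics out of SET models.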

In the social sciences broadly, however, we have seen an increasing number of articles on this topic, with various types of analysis showing systematic biases, especially against women, and also indicating disturbing trends for other non-white-male groups. For example, writing more than 30 years after E. Martin, L. Martin (2016) finds evidence of sex bias in evaluations of large Political Science classes.

E. Martin (1984), Langbein (1994), Campbell, Steiner, and Gerdes (2005), and L. Martin (2016) are all cross-sectional and statistically inferential, making strong causal arguments less persuasive. But various experimental and quasi-experimental methods since provide greater evidence of causation. As reviewed by L. Martin (2016),

Arbuckle and Williams (2003) undertook a fascinating experiment in which students viewed a stick figure that delivered a short lecture. All participants observed the same stick figure and the same lecture but the figures were given labels of old or young and male or female. Participants significantly rated the figure labeled as a young male as the most expressive, which illustrates that students’ expectations influence their perception of an instructor independent of the material or how it is delivered. (p. 314)

Recently, MacNell, Driscoll, and Hunt (2015) used the reality of distance in online education to advantage:

it is possible to disguise an instructor’s gender identity online. In our experiment, assistant instructors in an online class each operated under two different gender identities. Students rated the male identity significantly higher than the female identity, regardless of the instructor’s actual gender. (p. 291)

Perhaps even more strikingly, this occurred even on factors that would appear to be fairly objective (L. Martin, 2016):

For example, when the actual male and female instructors posted grades after two days as a male, this was considered by students to be a 4.35 out of 5 level of promptness, but when the same two instructors posted grades at the same time as a female, it was considered to be a 3.55 out of 5 level of promptness. (p. 330, emphasis in original)

Mitchell and Martin (2018) also found that “a male instructor administering an identical online course as a female instructor receives higher ordinal scores in teaching evaluations, even when questions are not instructor-specific” (p. 648).

L. Martin also reviews work by Miller and Chamberlin (2000) that indicates that students “perceive male instructors as having higher or superior credentials” (2016, p. 314). In keeping with this finding, I once received a student note in my faculty mailbox that said “Dear Mrs. Campbell, Dr. David Pijawka suggested that I contact you because you are an expert on….” Apparently, even the fact that I was recommended as an expert by someone who was himself viewed as an expert did not overcome the idea that I was a wife rather than a professor. El-Alayli, Hansen-Brown, and Ceynar (2018) find that certain students “request more special favors from female professors” and exhibit “negative emotional and behavioral reactions to having those requests denied. This work highlights the extra burdens felt by female professors” (p. 136).

In addition to concern regarding bias, research since Langbein (1994) has supported the idea that teaching evaluations are not valid. For example, recent work by Anne Boring indicates that “Men are perceived by both male and female students as being more knowledgeable and having stronger class leadership skills… despite the fact that students appear to learn as much from women as from men” (2016, p. 27). Lawrence (2018) finds the accumulation of evidence regarding the validity of SETs so compelling that he simply entitles his article “Student Evaluations of Teaching are Not Valid.”

Conclusions

I am not by any means the first person to argue that we should stop relying on raw SETs as our primary indicator of teaching quality – nor is this the first time I have argued this. But we are at a time when the barrage of evidence and the chorus of calls mean that we may be approaching a social tipping point. Having an important champion such as JPAE could reduce the influence of this severely flawed method on academic PA careers. Based on the increase in evidence, some schools are beginning to move away from SETs (Flaherty, 2018). JPAE could help encourage this throughout PA programs. It is not enough for SETs to be reliable; they must also be valid (Langbein, 1994) – but increasing evidence shows that they are not. To reframe one of Whittington’s important points: “to disregard the large supporting body that documents that fact is simply not scholarly” (2001, p. 5).

NASPAA and its flagship journal, JPAE, have been influences for good in PA education. Encouraging PA programs to drop this severely flawed, inequitable, unethical, and possibly illegal primary method of evaluating teaching could help the discipline. And JPAE itself can encourage the production of research that helps us learn what we should be doing instead.

Right now, we know that SETs are flawed, but we don’t necessarily know what is better; schools that are dropping SETs are trying something else (Flaherty, 2018), but will these methods exhibit the same biases? As Laura Langbein ended her article in 1994:

It is probably a good time … to supplement the SETs with other, related tools that share similar strengths but have different weaknesses. It is not reasonable to expect that a single methodology will measure teaching quality with reliability and validity…. Together, however, the use of multiple measures makes it possible to attain a reasonable degree of both reliability and validity. (Langbein, 1994, p. 552)
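The logic of that passage can be made concrete with a toy simulation; the error structure and magnitudes below are invented purely for illustration, not estimated from any real data. Two imperfect measures of teaching quality, each with a different weakness, track true quality better when averaged than either does alone.

import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical setup: 'quality' is true teaching quality. The SET adds
# both random noise and a bias term unrelated to quality (e.g., instructor
# demographics); the second measure (say, peer review) is noisier but does
# not share that bias.
quality = rng.normal(0, 1, n)
bias = rng.normal(0, 1, n)
set_measure = quality + bias + rng.normal(0, 1, n)
peer_review = quality + rng.normal(0, 1.5, n)
combined = (set_measure + peer_review) / 2

for name, measure in [("SET alone", set_measure),
                      ("peer review alone", peer_review),
                      ("average of both", combined)]:
    r = np.corrcoef(quality, measure)[0, 1]
    print(f"{name:>18}: correlation with true quality = {r:.2f}")

Under these made-up parameters, each measure alone correlates with true quality at roughly 0.55 to 0.58, while their average reaches roughly 0.70 – a small illustration of Langbein’s point that measures with different weaknesses can be combined to improve both reliability and validity.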

It has been almost 25 years since Langbein reached this conclusion. Perhaps it was not the time then, but it is certainly time now. JPAE, champion the charge to find valid and unbiased measures of teaching quality!

Additional information

Notes on contributors

Heather E. Campbell

Heather E. Campbell is the chair of the Department of Politics and Government at Claremont Graduate University. She received her BA from the University of California at San Diego and her MPhil and PhD from Carnegie Mellon University. She served as editor-in-chief of the Journal of Public Affairs Education from 2009 to 2010.

Notes

1. Before 1998 JPAE was called the Journal of Public Administration Education.

2. NASPAA, formerly the National Association of Schools of Public Affairs and Administration, is now the Network of Schools of Public Policy, Affairs, and Administration.

3. This section owes thanks to Mitchell’s 2018 article in Slate for identifying a number of interesting articles on this topic.

4. This award was later renamed after her and is now called the NASPAA Leslie A. Whittington Excellence in Teaching Award. I note that when Whittington won the Excellence in Teaching Award in 2000 she was only the second female out of 8 recipients; in the history of the award under either name, 16 males and 8 females have won this important national award (http://www.naspaa.org/principals/awards/past.asp#Leslie).

5. This result was tied with a 2-point increase in whether the instructor was judged to use student feedback, which resulted in a 0.8-point increase in the SET, cet. par. (p. 226).

