
Can a Better Understanding of How Teacher Evaluation Works Help Improve Its Design? 

By David D. Liebowitz

How can we support all teachers so that they continue to learn and improve throughout their careers? How can we ensure that all students receive high-quality instruction every day in every classroom? These were some of the motivating questions that prompted forty-four US states to implement reforms to their teacher evaluation practices in the early 2010s. These new policies sought to improve student outcomes by providing developmental supports to grow teachers’ skills and by imposing accountability pressures to increase their effort levels. While these joint aims are firmly part of the design of most present-day teacher evaluation policies, researchers and policy makers have too infrequently reflected on the interactions between these goals. 
In an article written for the Harvard Educational Review entitled “Teacher Evaluation for Growth and Accountability: Under What Conditions Does It Improve Student Outcomes?”, I propose a framework of six conditions that determine the success of evaluation policies designed to promote teacher development and simultaneously impose higher stakes on the evaluation process. Whether joint-aim teacher evaluation policies succeed hinges on the following:

  1. Validity and reliability of evaluation ratings
  2. Accountability and incentive effects of evaluation on teachers’ skills
  3. Feedback and coaching effects of evaluation on teachers’ skills
  4. Effects of higher-stakes evaluation on the overall supply and demand of teachers
  5. Effects of higher-stakes evaluation on the skills of teachers in the labor market
  6. Interactions between the growth and accountability dimensions of evaluation

I reviewed evidence on these conditions from the fields of education, economics, social psychology, and organizational management. I found that while more intensive teacher evaluation practices may hold unrealized potential (and some perils), incentive- and sanction-based evaluation policies are in potential conflict with the theory and evidence justifying evaluation strategies to develop teachers’ skills. The growth and accountability aims of teacher evaluation may work at cross-purposes and, at least for some teachers, cannot be effectively balanced.
While political attention is currently directed towards other dimensions of teacher policy, given the historical evidence I review in my article, it seems likely that the pendulum will swing back around at some point to a renewed call for more intensive teacher evaluation practices. When conversations return to this issue, my review of the evidence suggests that policy makers should consider an alternative to current practice: design teacher evaluation policies that clearly delineate the application of accountability pressures and growth supports across different groups of teachers. 

About the Author

David D. Liebowitz is an assistant professor at the University of Oregon, where he studies education policy and school leadership. Before his tenure at Oregon, Liebowitz was a middle school English teacher and principal in Colorado and Massachusetts. He is the author of “Teacher Evaluation for Growth and Accountability” in the Winter 2022 issue of Harvard Educational Review.