Ummm… I think UFT President Randi Weingarten means “emergency” rather than “urgency,” but oh well.
Regardless of the analogical faux pas, Weingarten is absolutely right: implementation IS key. In all of the current rhetoric surrounding teacher evaluation, value-added, and the pitfalls and opportunities therein, the biggest obstacle is not whether we can come up with good ideas, but whether we can actually make those good ideas happen.
As a teacher in LAUSD for the past 8 years, I’ve seen absolutely no shortage of good ideas, yet nearly all of the new programs and initiatives that have come down through the system have faltered, been ignored, or been abandoned completely – not because they didn’t make sense (on paper, they did), but because the hard job of implementing a system well was never done, and a poorly implemented system is a useless (or worse, harmful) one.
Case in point: the current teacher evaluation system, affectionately known as the STULL, has not failed because it was a bad idea. It failed because principals were so overwhelmed with putting out daily fires on their campuses that the time, energy, and effort required to thoroughly implement a STULL-style evaluation was not thought to move the needle of student achievement enough to justify it. Because of this reality, the STULL implementation became a checklist to simply get through.
In school systems, it’s important to recognize that implementation strategies and realities cannot merely be considered after policy is developed. Simply put, the policy itself MUST take into consideration, and be driven by, the realities of how implementing such a system might work.
For example, in 2012, all the educational conversation is about revising the teacher evaluation system, and most school districts have come to some consensus that “multiple measures” – usually taken to mean some combination of observation, student test scores, student surveys, and perhaps another measure – are the way of the future. Unsurprisingly, the biggest argument is how to weight the different components: should test scores count for 40% of an evaluation? For 10%? Might there be some sliding scale?
Yet what is often missing from these policy discussions is that our ability to implement these systems at scale should shape how we decide to weight the components in the first place. We cannot afford to make these decisions in a vacuum based on what “ought to be.” We cannot, for example, decide that because value-added scores are twice as reliable as observation data, their percentage ought to count for twice as much as observation (which no one is currently proposing – it’s just an example). We must analyze what resources we have as a school system to actually implement these measures well and then (and only then) make decisions about the nuts and bolts of the policy.
Yes, there is an urgency to education policy right now, but that does not give us license to ignore the ways we’ve failed at implementation in the past. If we do, we are bound to repeat the same mistakes. Trust me, no one wants any new evaluation system to wind up as STULL 2.0.