Friday, September 10, 2010

Economic Policy Institute Briefing Paper

The following is verbatim from the Executive Summary of one of the most important policy briefs about education in recent years. At a time when the Dept. of Education is pushing to tie teacher evaluation and compensation to student test scores, this Economic Policy Institute Briefing Paper (which is available in pdf) pulls together the extensive relevant research that demonstrates the dangers of pursuing such a path. Please continue reading as I explore this important document, released at 12:01 AM today, August 29.

First, let me clarify several things. 

This is a very long diary. That is because I am trying to cover reasonably thoroughly the contents of an extremely important document. My purpose in doing so is to convince people of the document's importance. Thus I will be perfectly happy should you decide you do not need to read any further in what I have written below. You can follow the link for the brief (which is provided here), download the pdf, and begin reading. The executive summary is only four pages. The brief itself, without the critical apparatus of footnotes and sources, runs another 17. So if you want, one more time follow this link.


First, this document has been in the works for several months, and was NOT hurriedly put together as a response to the recent series by the Los Angeles Times which used value-added assessment to label teachers in the Los Angeles Unified School District. Second, the ten scholars whose names are on the document are some of the most eminent in educational circles, including among them former Presidents of the American Educational Research Association and the National Council on Measurement in Education, two of the three professional organizations most involved with psychological measurement, of which school-related testing is a subset. One of the scholars, Robert Linn, has not only presided over both of those organizations, he has also served as chair of the National Research Council's Board on Testing and Assessment. The group also includes the immediate past president of the National Academy of Education, Lorrie Shepard, Dean of the School of Education at Colorado. A brief and applicable curriculum vitae of each of the ten authors can be found at the end of the document, and briefer descriptions at the beginning, where each author is listed, along with the following statement:
Authors, each of whom is responsible for this brief as a whole, are listed alphabetically.
An email address is provided for further contact.

The ten authors, alphabetically, are as follows:
Eva L. Baker
Paul E. Barton
Linda Darling-Hammond
Edward Haertel
Helen F. Ladd
Robert E. Linn
Diane Ravitch
Richard Rothstein
Richard J. Shavelson
Lorrie A. Shepard

Let me be blunt. I do not know how anyone who knows the work of these scholars and who reads this brief can accept the idea of placing any stakes as to firing or awarding of merit pay based on the current status of Value-Added Assessment methodologies. The document is thorough. It reviews all the relevant studies, including one not yet in print. Those include studies by Mathematica for the US Department of Education; by Rand; by the Educational Testing Service; done for the National Center for Education Statistics of the Institute of Education Sciences of the U. S. Dept. of Education; issued by the Board of Testing and Assessment of the Division of Behavioral and Social Sciences and Education of the National Academy of Sciences; and so on. There are citations from books and from peer-reviewed journals.

I am not a scholar. I am a high school social studies teacher. During now abandoned doctoral studies in educational policy I got interested in value-added assessment and devoured what studies there were in the educational literature. I also talked extensively with the technical person for one organization that offered a value-added methodology, who cautioned me that the approach was not stable enough for it to be used as the basis for decisions with any kind of meaningful stakes. That was about a decade ago. What I have read since, and what I have absorbed from this study, convince me that the situation is not significantly better now.

But you do not have to take my word for it. Let me offer a few key examples from the study. Those who follow me on Daily Kos already have seen in the study by Mathematica the high rate of error in determining superior and inferior teachers beyond the broad middle. In this diary, written on August 27, I noted that the error rate with 2 years of data was 36%, with 3 years 26%, and even with 10 years of data still 12%. 

But that is just the tip of the iceberg of the technical problems with using such an approach. 

Without recapitulating the entire brief, let me offer a couple of other key points.

1. Results for individual teachers are not stable:
One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year.


2. One key question is whether one is really accounting for teacher effects and excluding other influences in the results one gets from value-added assessment. Jesse Rothstein reported something interesting, about which I quote from the Executive Summary:
A study designed to test this question used VAM methods to assign effects to teachers after controlling for other factors, but applied the model backwards to see if credible results were obtained. Surprisingly, it found that students’ fifth grade teachers were good predictors of their fourth grade test scores. Inasmuch as a student’s later fifth grade teacher cannot possibly have influenced that student’s fourth grade performance, this curious result can only mean that VAM results are based on factors other than teachers’ actual effectiveness.


3. The brief addresses the argument that the private sector evaluates professional employees using parallel quantitative measures. The authors of the brief point out that rarely are such quantitative measures the sole or even the primary factor, noting that management experts warn against using such measures for making salary or bonus decisions. They remind us that some of the distortion on Wall Street was the result of emphasizing short term gains that could be easily measured. They also touch on medicine:
In both the United States and Great Britain, governments have attempted to rank cardiac surgeons by their patients’ survival rates, only to find that they had created incentives for surgeons to turn away the sickest patients.


4. Students are not randomly assigned to teachers. While some control for school effects is possible, scholars are reluctant to place any weight on comparisons for teachers in different schools even within the same system. And even within a school, teachers may have varying numbers of students who are learning English or have learning disabilities or are homeless or who move multiple times, each of which is a factor that can affect learning.

5. Sample sizes are often too small. Even if the class makeup stays stable during the year, and all the students show up regularly, the N=30 of a large elementary class is too small a sample to provide a result that can allow strong inferences to be drawn. Often the makeup of the class changes during the year. If you exclude students who were not there all year, or whose absences exceed some designated level, the N decreases, providing a result of even less reliability. 

6. Some argue that statewide data banks can address the question of student mobility. But if you derive results on a year or two years of data where the student has moved, how much of the improvement can properly be assigned to any one teacher? Even in elementary school, do we account for pull-out instruction, or possible tutoring (that could in some cases be counterproductive) as a possible influence on the test results upon which we base our analysis?

7. Even with value-added analysis, to date scholars have not been able to isolate the impact of outside learning experiences, home and school supports, and differences in student characteristics and starting points when trying to measure their growth. 

8. A proper system of value-added assessment would have vertically scaled tests. Most states do not currently have such tests; neither New York nor California does, for example. That is, the tests in one grade are not necessarily congruent with those of the next along a continuum from year to year. We are not testing the same thing each year. As testing expert Dan Koretz of Harvard is quoted as noting,
"because of the need for vertically scaled tests, value-added systems may be even more incomplete than some status or cohort-to-cohort systems"
Here it is worth noting that cohort-to-cohort means comparing this year's fourth graders to last year's, which is how Adequate Yearly Progress under No Child Left Behind has been calculated.

9. If measuring end of year to end of year, even if there are vertically scaled tests, there is still the well-documented issue of summer learning loss, which falls disproportionately upon those of lesser economic means, which also means it falls disproportionately upon those of color, who are more heavily represented at the lower end of the economic scale. If we do not control for summer learning loss, our results are skewed. Allow me to quote a relevant portion of the study:
researchers have found that three-fourths of schools identified as being in the bottom 20% of all schools, based on the scores of students during the school year, would not be so identified if differences in learning outside of school were taken into account. Similar conclusions apply to the bottom 5% of all schools.
The authors also cite a study that shows "two-thirds of the difference between the ninth grade test scores of high and low socioeconomic status students can be traced to summer learning differences over the elementary years."
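
To make points 1 and 5 a little more concrete, here is a minimal sketch of my own; it is not taken from the brief. It assumes a toy model in which each teacher has a fixed "true" effect and each student's measured gain is that effect plus random noise. The spread values (teacher_sd, student_sd), the number of teachers, and the function names are all hypothetical choices made only for illustration. The sketch then checks how well one year's class averages predict the next year's for the very same teachers.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_teachers=1000, class_size=30, teacher_sd=0.1,
             student_sd=1.0, seed=0):
    """Toy model (all parameters are illustrative assumptions).

    Returns (r, stayed): the year-to-year correlation of class-average
    gains, and the share of year-1 top-20% teachers who are again in
    the top 20% in year 2.
    """
    rng = random.Random(seed)
    # Each teacher's true effect is fixed across both years.
    true_effect = [rng.gauss(0, teacher_sd) for _ in range(n_teachers)]

    def one_year():
        # Class average = true effect + noise averaged over the class.
        return [mean([rng.gauss(t, student_sd) for _ in range(class_size)])
                for t in true_effect]

    year1, year2 = one_year(), one_year()
    r = pearson(year1, year2)

    cutoff = n_teachers // 5

    def top(scores):
        order = sorted(range(n_teachers), key=lambda i: scores[i],
                       reverse=True)
        return set(order[:cutoff])

    stayed = len(top(year1) & top(year2)) / cutoff
    return r, stayed


if __name__ == "__main__":
    r, stayed = simulate()
    print(f"year-to-year correlation: {r:.2f} (variance explained: {r * r:.1%})")
    print(f"top-20% teachers still in top 20% next year: {stayed:.0%}")
```

In this toy model the teachers never change at all, yet the ratings move around from year to year simply because 30 students is a small sample. Raising class_size (say, to 300) makes the ratings far more stable, which is the intuition behind point 5. Real value-added models are much more elaborate than this, so treat it as an illustration of the sampling problem only.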

There is more, but this should give a real sense of how much there is in this paper, how thoroughly the authors examine relevant material to demonstrate that value-added assessment, the supposed magic bullet to allow us to tie student learning back to the effectiveness of teachers, cannot properly fulfill the task some wish to give to it.

The authors acknowledge that value-added approaches are superior to some of the alternative methods of using test scores to evaluate teachers. These are:

status test-score comparisons - compare average scores of students of one teacher to those of another

change measures - compare the average test results of a single teacher from one year to the next - remember, these are different students

growth measures - a comparison of the scores of the students of the teacher this year to the scores of those same students the previous year when they had different teachers.

Each of these approaches has serious problems with it. One can read the detailed explanation on p. 9. Value-added assessments may be an improvement, but
the claim that they can “level the playing field” and provide reliable, valid, and fair comparisons of individual teachers is overstated. Even when student demographic characteristics are taken into account, the value-added measures are too unstable (i.e., vary widely) across time, across the classes that teachers teach, and across tests that are used to evaluate instruction, to be used for the high-stakes purposes of evaluating teachers.


"WHERE THE HELL IS DAVID SANCHEZ "

This is the fight of our professional careers. Are You In or Out?

What's taking so long? "Hell has a special level for those who sit by idly during times of great crisis."
Robert Kennedy

The Art of SETTING LIMITS, It's not as easy as it looks.

Setting limits is one of the most powerful tools that professionals have to promote positive behavior change for their clients, students, residents, patients, etc. Knowing there are limits on their behavior helps the individuals in your charge to feel safe. It also helps them learn to make appropriate choices.


There are many ways to go about setting limits, but staff members who use these techniques must keep three things in mind:
1. Setting a limit is not the same as issuing an ultimatum.
Limits aren’t threats: "If you don’t attend group, your weekend privileges will be suspended."
Limits offer choices with consequences: "If you attend group and follow the other steps in your plan, you’ll be able to attend all of the special activities this weekend. If you don’t attend group, then you’ll have to stay behind. It’s your decision."

2. The purpose of limits is to teach, not to punish.
Through limits, people begin to understand that their actions, positive or negative, result in predictable consequences. By giving such choices and consequences, staff members provide a structure for good decision making.

3. Setting limits is more about listening than talking.
Taking the time to really listen to those in your charge will help you better understand their thoughts and feelings. By listening, you will learn more about what’s important to them, and that will help you set more meaningful limits.

SYSTEMATIC USE OF CHILD LABOR


CHILD DOMESTIC HELP
by Amanda Kloer

Published February 21, 2010 @ 09:00AM PT
category: Child Labor
Wanted: Domestic worker. Must be willing to cook, clean, work with garbage, and do all other chores as assigned. No contract available, payment based on employer's mood or current financial situation. No days off. Violence, rape, and sexual harassment may be part of the job.

Would you take that job? No way. But for thousands of child domestic workers in Indonesia, this ad doesn't just describe their job, it describes their life.

A recent CARE International survey of over 200 child domestic workers in Indonesia found that 90% of them didn't have a contract with their employer, and thus no way to legally guarantee them a fair wage (or any wage at all) for their work. 65% of them had never had a day off in their whole employment, and 12% had experienced violence. Child domestic workers remain one of the most vulnerable populations to human trafficking and exploitation. And while work and life may look a little grim for the kids who answered CARE's survey, it's likely that the most abused and exploited domestic workers didn't even have the opportunity to take the survey.

In part, child domestic workers have it so much harder than adults because the people who hire children are more likely looking for someone easy to exploit. Think about it -- if you wanted to hire a domestic worker, wouldn't you choose an adult with a stronger body and more life experience to lift and haul and cook than a kid? If you could get them both for the same price, of course you would. But what if the kid was cheaper, free even, because you knew she wouldn't try to leave if you stopped paying her, or even if you threatened her with death?



Congress Aims to Improve Laws for Runaway, Prostituted Kids

by Amanda Kloer

categories: Child Prostitution, Pimping

Published February 20, 2010 @ 09:00AM PT

The prospects for healthcare reform may be chillier than DC weather, but Democrats in the House and Senate are turning their attention to another warmer but still significant national issue: the increasing number of runaway and throwaway youth who are being forced into prostitution. In response to the growing concerns that desperate, runaway teens will be forced into prostitution in a sluggish economy, Congress is pushing several bills to improve how runaway kids are tracked by the police, fund crucial social services, and prevent teens from being caught in sex trafficking. Here's the gist of what the new legislation is trying to accomplish:

Shelter: Lack of shelter is one of the biggest vulnerabilities of runaway and homeless youth. Pimps will often use an offer of shelter as an entree to a relationship with a child or a straight up trade for sex. In the past couple years, at least 10 states have made legislative efforts to increase the number of shelters, extend shelter options, and change state reporting requirements so that youth shelters have enough time to win trust and provide services before they need to report the runaways to the police. Much of the new federal legislation would make similar increases in the availability and flexibility of shelter options.

Police Reporting: Right now, police are supposed to enter all missing persons into the National Crime Information Center (NCIC) database within two hours of receiving the case. In reality, that reporting doesn't always get done, making it almost impossible for law enforcement to search for missing kids across districts. This hole is a big problem in finding child prostitution victims and their pimps, since pimps will often transport girls from state to state. The new bill would strengthen reporting requirements, as well as facilitate communication between the National Center for Missing and Exploited Children and the National Runaway Switchboard.

We Must Never Forget These Soldiers, Sailors and Airmen and Women

Nor the Fool Politicians that used so many American GIs' lives as fodder for the fight over an English noun - "Communism"