STANDARDS SETTING PROCEDURE DOES _NOT_ DIRECTLY COMPARE AGAINST BENCHMARKS!

Paul Englesberg notes that the technical manual does not mention judges comparing questions against any benchmarks, only setting a difficulty level for problems already "established" to be appropriate to the grade level. It is my understanding from Jeff Estes, who was on one of these committees, that they never questioned the inclusion of any of the problems; they only put a "bookmark" where they thought the pass point should be. Thus if a question appears with independent probability or indirect measurement or proportionality, the "standards setting committee" had nothing to do with the inclusion of the problem in the first place. The technical document does not state who put these questions in, or who did the checking. This is probably the reason why we have a 4th grade test that is full of EALRs from the 7th and 10th grade levels, even though it is claimed the standards setting committee has checked it against the benchmarks. The example test does not tell where this pass point is, but George Cunningham of U Louisville says that such methods are highly unreliable.

The WA test is derived from work in the New Standards project, and this process is documented exactly in new articles from any number of other states. From what I've seen of New Standards and other states' example problems, they also suffer from the same problem of not corresponding to any written grade-level standards (frequency histograms, complex woodworking, ratio, probability, etc.).

The whole ed reform is based on "outcome based education", a system so thoroughly discredited that although "performance-based" education was simply defined as OBE in initial 1209 handouts, OBE no longer appears anywhere in CSL or OSPI literature. OBE is what is fundamentally flawed.

All on the list need to do research on the foundations and assumptions of OBE and why it is now so discredited that no one calls it by its original name, or Mastery Based Education as it was known before then. "Standards-Based" Education is only the newest name for OBE.

Even if you fix the process, we still have the problem that constructed response problems based on skills which are not taught have a disproportionate impact on minorities. It still costs 10 times as much to score, and it has a measurable level of inherent scoring unreliability compared to multiple-choice questions, where a computer can score 100% correctly and no one can argue as to what constitutes a correct answer (remember both examples published by OSPI for math had INCORRECT answers, and both were non-compliant with published benchmarks).

The fact is that we've simply cobbled together a slightly modified version of what the NCEE's Marc Tucker has been peddling to other states with almost no evidence of positive results, and many examples of huge disasters (CA CLAS and KY KIRIS stand out; WASL will be remembered as the next big disaster).

Date sent: Tue, 15 Sep 1998 16:49:23 -0700
To: arthurhu@halcyon.com, owner-wa-math-sci@mickey.esd113.wednet.edu, wa-legislation@inspire.ospi.wednet.edu, wa-esslrngs@whitecap.psesd.wednet.edu
From: Paul Englesberg
Subject: Re: Criticism of test vs benchmarks

> In further response to points raised by Arthur Hu, I'd suggest that people
> interested in the assessment vs.
> benchmarks issue refer to the CSL documents
> on Standard Setting Procedure, "Standard Setting Procedure for Grade 4:
> Listening, Reading, Writing and Mathematics" at:
> http://csl.wednet.edu/Web%20page/3%20Assessment%20System/subdocuments/Technical%20Manual/A-InformManual.html
>
> Here's a brief excerpt:
> "The purpose of the standard-setting process then was to establish the level of
> performance expected of Grade 4 students who are judged as meeting the
> standard in listening, reading, writing, and mathematics. The emphasis for
> the judges in the standard setting process was on what students should know
> and be able to do near the end of Grade 4...."
> "Next, in small groups, the judges examined the items in the ordered booklet
> one at a time, starting with the first (easiest) item in the booklet, and
> moving to the second easiest item, and so on, until all items (and their
> scoring rubrics) were examined. As judges examined each item, they were
> asked to consider:
>
> What is each item measuring?
> What makes each item more difficult than the items that precede it?
>
> Judges proceeded through the ordered item booklets and specially trained
> table leaders encouraged them to observe the increase in the complexity of
> the items and note the increase in knowledge, skills, and abilities required
> to answer the items...."
>
> This document discusses the setting of "bookmarks" for the standards by
> judges, but doesn't seem to directly discuss reference to the 4th grade
> benchmarks. I would certainly expect that the standard-setters would be
> constantly referring to the EALR benchmarks. Can anyone elaborate?
> While Arthur Hu seems to assume that no one is paying attention to
> standards in a rational way and the whole system/reform effort is
> wrong-headed, it seems to me that existing discrepancies come from some
> weaknesses in the system that need to be, and can be, addressed.
> Surely a new approach with a new assessment system will have some
> problems - both major oversights and minor glitches. But opposing the
> "whole ed reform" because it rests on "goofed up" concepts leads Mr. Hu
> to use these weaknesses to condemn everything that is being done rather
> than strive for improvements. I'd like to hear what others have to say
> about these issues.
>
>
> At 03:16 PM 9/15/98 -0700, Arthur Hu wrote:
> >Indeed, you are correct that I have a problem on two levels: one, the test
> >is not consistent with its own benchmarks as to what is to be
> >taught at which grade, and two, the whole ed reform "all can
> >succeed" assumption is goofed up from the very concept.
>
> Paul Englesberg
> Woodring College of Education
> Western Washington University
> Bellingham, WA 98225
> ph: (360) 650-7527
> email: pengle@wce.wwu.edu
>