Standards Based Reform: The Kentucky Experience

From: "Arthur Hu"
Date sent: Wed, 22 Apr 1998 11:29:02 +0000
Subject: George Cunningham - Standards Based Reform Based on Defective As
Send reply to: arthurhu@halcyon.com

Kentucky reforms are a disaster after billions of dollars and 7 years. Why is Washington willing to wait 7 years before asking the question "Will This Work, and What is the Evidence?" Why am I the only citizen out of 5 million who is willing to stand up and say the emperor has no clothes? Does anybody actually believe that a certificate of mastery will mean that all will be at the highest international standards, all will be ready for college, and that we can end tracking and remediation? That's the game plan Washington is signed on to; it's the same as Marc Tucker's original 1992 plan.

Posted at: http://www.leconsulting.com/arthurhu/98/05/aera.htm by Arthur Hu
Email George Cunningham at gkcunn01@ulkyvm.louisville.edu
See Arthur Hu's Critical Guide to Education Deform / Reform

Standards-based education reform: The Kentucky experience

George K. Cunningham
Dept. of Ed. and Counseling Psych.
University of Louisville
Louisville, KY 40292

"There is always an easy solution to every human problem - neat, plausible, wrong."
H. L. Mencken

The use of educational standards has become the favored form of educational reform in the United States. This reform strategy is justified by the assertion that the declining educational achievement of American students is the result of the failure to set sufficiently high standards.

Types of standards

There are two kinds of standards: (1) content standards, and (2) performance standards. Content standards tell us what students are supposed to learn. This is usually accomplished by creating a list of competencies, objectives, or goals.
It is not difficult to compile such lists, but it can be almost impossible to reach consensus about what should be included. The differences of opinion about what should be included can be as basic as taste, as abstract as ideology, or may involve fundamental philosophical differences about why certain subjects are taught in school. Those who formulate content standards often find themselves entangled in endless debates about what should be included. What usually emerges is a document born in compromise that satisfies no one and angers almost everybody.

Once content standards are written and implemented, it is not easy to determine whether the prescribed content is actually being taught or whether the associated accountability instruments assess the intended objectives. Part of the problem stems from specificity. Most standards are stated so vaguely that it is impossible to determine exactly what should be included. If the authors of standards go to the opposite extreme and begin specifying in detail exactly what students should know at each grade level, they will find themselves trapped. Suddenly they are expected to write down everything that students are supposed to know, and the standards become impossibly long and detailed. Besides, vagueness is the best way to appease warring interest groups that are demanding that the content reflect their particular interests.

Performance standards tell us how well the content must be learned. Establishing performance standards presents almost insurmountable technical problems. There is a need to pay attention to the measurement issues of reliability and validity in the assessments used to measure student performance, but the biggest problems occur in the process of determining the cut-points between acceptable and unacceptable performance. Performance standards can be either norm-referenced or absolute. Norm-referenced standards are based on relative student performance.
A standard requiring that all students perform at grade level in reading is norm-referenced, as are the cut-points used by states to specify the score on the PRAXIS exam a teacher candidate must obtain in order to be certified. Anytime student performance is defined in terms of the average, norm-referenced standard setting is being used. This form of assessment is maligned by those who believe that it dooms some students to fail (someone has to be at the bottom) and because it emphasizes differences. At the other extreme, norm-referenced standards are also criticized because relative standards appear inimical to the concept of excellence.

Absolute standards are intended to be independent of what is known about the performance of typical students. Unfortunately, there is no way to evaluate student performance in the absence of knowledge about a typical student's performance. Although most state educational standards purport to employ absolute standards, they really don't. It is simply not possible to set a standard in the absence of knowledge about how the typical student performs. When this is tried, the results are often disastrous.

A third grader's performance on a multiplication test, from a norm-referenced perspective, would be considered satisfactory if it exceeded a set percentile. Using an absolute standard, good performance would be attributed to a student who exceeded a pre-established number of correct responses to items. The difference between the two is more apparent than real. Anytime an educator begins to set a standard of performance for third graders in the area of multiplication, a knowledge of how typical third graders perform in this area is essential. There is no way that someone who is unfamiliar with how third graders perform in math could set appropriate standards.
Deciding what level of performance should be expected of high school students in subjects such as English literature, chemistry, and history is almost impossible without reference to typical student performance.

Uses for standards

There are two possible reasons for implementing standards: (1) to ensure that all students reach a minimum level of performance, or (2) to actually increase student achievement. If there is a concern that unqualified students are being promoted from grade to grade or allowed to graduate from high school, the imposition of standards is one way to prevent this from happening. Of course, raising the standards required for promotion or graduation will inevitably increase the number of students who fail. Paradoxically, many systems that impose higher standards do so while simultaneously expressing the need to increase student graduation rates and eliminate retention. The only way to avoid this paradox is to assume that individual differences do not exist and demand that all students reach these goals. This is usually accomplished by holding teachers accountable for bringing all students up to this level. This is why it is so common for states and school districts to demand that all students function at a designated high level and, at the same time, that graduation rates increase. Such policies can have the opposite of their intended effect. Despite higher official standards, real standards can be lowered to ensure high grades and graduation rates and so create the impression of high student performance.

Merely raising standards will not ensure higher performance. You can't enhance student performance by merely demanding more. The standards set by the teacher and the strategies used by the students to master instructional goals are much more likely to affect learning than standards set by some distant, anonymous committee.
At present the use of standards as a primary tool for reforming schools is based on a faulty set of assumptions. It is assumed that all students are blessed with the same capacity to learn and that all that is needed is for teachers to demand that they reach these clearly stated goals. This philosophy is articulated in a quote from an article supporting the use of certificates based on standards that appeared in the April 1998 issue of Phi Delta Kappan (Lewis, 1998):

"... the certificate would be based on standards of achievement in core subjects that are benchmarked to what countries with the highest performance on international comparisons expect of their 16 year olds. This would mean an end to tracking and remediation: it would mean that everyone would be qualified to go to college, they argue. The use of the certificate would end the practice of setting different expectations for different groups of students. The common mantra 'All Kids can learn' would finally become policy, and what they would be expected to learn would for the first time be explicit."

This is a wonderfully idealistic goal, but it is divorced from reality. The idea that there are goals so perfected that they can, in effect, eliminate all educational problems represents a dangerous break from reality.

The Kentucky experience

The educational reform program established in Kentucky in 1990 is considered to be the most comprehensive education reform plan of any state (McDonnell, 1997). The law that mandated the reform is called the Kentucky Education Reform Act (KERA), and it was originally implemented as a response to a decision rendered by Judge Ray Corns of the Franklin Circuit Court in Rose v. Council for Better Education (1989). Judge Corns ruled that the Kentucky General Assembly had failed to provide an efficient system of common schools as required by the state Constitution.
Although the judge's ruling required only changes in school finance, the governor at the time, Wallace Wilkinson, along with legislative leaders, used this opportunity to make radical changes in the state's educational structure. Kentucky had languished near the bottom of all states in some categories of educational performance, and concern about the state's relative performance was used to justify the need for radical educational reform. It is important to point out that after seven and a half years of KERA, Kentucky has not improved its performance in any of the categories cited as the justification for this educational reform.

KERA has many facets, which include a more equitable distribution of funds for school districts, the ungraded primary, school-based decision-making, preschool programs, a reorganized department of education, extended school services, and several others. The most important impact KERA has had on instruction came through the Kentucky Instructional Results Information System (KIRIS), the KERA accountability system. KIRIS is a standards-based system predicated on the belief that student achievement can be enhanced by raising the expectations we have for the average performance of schools. It is assumed that rewarding the staff at high performing schools and punishing those at low performing schools will result in a higher level of student achievement.

The KIRIS assessment is administered to 4th and 5th grade students in elementary school, 8th graders in middle school, and 11th and 12th graders in high school. It assesses reading, mathematics, science, social studies, arts and humanities, and practical living and vocational skills with essay questions. Writing portfolios are used to assess skill in written expression. A small proportion of a school's score is based on graduation rates and retention. Other assessment techniques that were once part of KIRIS are performance tasks, math portfolios, and multiple-choice items.
The Kentucky Department of Education (KDE) is now proposing that some of these methods be retried. Each school is assigned an accountability score based on average student performance within the school, and this index can range from zero to 133. All schools are to reach a score of 100 by the year 2012. Schools are assigned accountability score goals, and teachers in schools that reach their goal are awarded cash bonuses while those in schools that perform poorly on KIRIS are placed on probation and risk being fired.

The usual sequence when a state adopts academic standards is to start with content standards in order to specify what students are supposed to be learning. Some states take it no further, but in some cases, states specify at what level students are to master the content specified. The next step is to use the content and performance standards as the basis for an accountability system. In Kentucky, the process was reversed. The tests were developed first and only later was attention given to the content to be covered.

When the KERA legislation was implemented, the legislature specified six learner goals and mandated the creation of standards, which would specify what students should learn and the content included on the test. At the same time, legislators were in a hurry to begin the implementation of the testing program. Testing began before any of the standards were written. Since KERA's inception, five separate sets of standards have been published. These standards, in the order of their release, are the Learner Outcomes, the Transformations, the Academic Expectations, the Content Guidelines, and the Core Content for Assessment. Each of these was intended to be the final word on what students were supposed to learn and the basis for the KIRIS assessment. These standards differ among themselves in terms of content and philosophy, and all provide minimal guidance for teachers endeavoring to prepare students for the KIRIS assessment.
Throughout the implementation of KERA, the high standards implicit in the assessment have remained fixed in the test itself and were never manifested in the published standards, which seem to be always chasing the assessment.

At the present time, the system seems to be teetering on the verge of collapse. The heart of the assessment system was supposed to be performance tasks, but these were abandoned when it was discovered that there was no way to equate the items administered from year to year and their use yielded incomprehensible results. The Kentucky Finance Cabinet initiated an audit in an attempt to recover millions of dollars from Advanced Systems for Measurement in Education (ASME) for the cost of developing and scoring the performance tasks that could not be used. The results of the audit were inconclusive because there were no written records documenting how the money had been spent. Changes in the scope of the contract and the resulting adjustments in costs and payments took place on the basis of verbal agreements. It was also discovered that ASME had made a programming error that resulted in inaccurate scores for elementary and middle schools. Under pressure from the legislature, the Commissioner of Education canceled the contract with ASME.

The Kentucky General Assembly met during February and March of 1998 and faced enormous public pressure to radically change or eliminate KIRIS. The Senate passed, by a vote of 35 to 1, a bill (SB 243) that would have made substantial changes in KIRIS. The House passed a bill (HB 627) that would have kept the system essentially the same. This conflict required a compromise, which was worked out in a conference committee with representatives from both houses. The resulting compromise was SB 53. It is difficult to predict what impact this bill will have on the KERA accountability system. The bill states that the system is to be changed, but it does not specify how this will be achieved.
Its authors can claim to have accomplished the one goal that was so strongly demanded by the electorate, the elimination of KIRIS. The name of the test was changed to the Commonwealth Accountability Test System (CATS). I suppose this is somehow in honor of the Kentucky Wildcats winning the NCAA men's basketball tournament. Perhaps it is believed that critics would be reluctant to criticize anything with that name. The legislation does make a commitment to change the test, but it doesn't specify exactly how.

The Senate bill would have thrown out all previous results, and there would have been no rewards or sanctions until the year 2000. The compromise bill gives rewards to all schools that are not in decline, or about seventy-five percent of them. The determination of which schools are eligible for rewards is to be based on the scores obtained in 1997 and 1998. This year and in the future, it is the schools that will be rewarded rather than individual teachers. The sanctions have been suspended and it is not clear whether they will be reinstated. Schools that perform poorly will be subject to audits and assistance, but not sanctions.

During the legislative debates, the most contentious issue was whether the tests themselves were reliable and valid. KIRIS supporters argued that the test itself was of high quality, while its opponents asserted that it was almost worthless. There was general agreement in both camps that the accountability system, that is, the way the scores were used to hold schools accountable, was deeply flawed. The new legislation mandates a new test utilizing both essay and multiple-choice items, but since this is what has been used in the past, it is not clear how this new test will be different. The legislation also mandates a new accountability system, but does not provide any specifics about what it will look like. All that is implied is that the new system will continue to hold schools accountable for having their students reach high standards.
The specious claim that essay tests were performance assessments has been dropped. Although they have proved to be the least reliable of any of the previously used measures, the new legislation permits the use of writing and math portfolios. The Department of Education will design the revised CATS accountability system. Several oversight committees will monitor its work. There will be a permanent legislative subcommittee, a 15-member curriculum advisory committee appointed by the governor, and a national technical panel of testing experts.

Kentucky has had seven long years to make this system work and has spent billions on it. What is troubling is the dearth of evidence for its success. The critical question at this point is whether there was something inherently wrong with the Kentucky accountability system or whether it is the assumptions made about the use of standards to raise student achievement that are flawed. The Kentucky system and the similar systems being imposed across the country are based on the belief that all students can reach the same high level of performance and that this can be accomplished with the proper delineation of standards. The only way this can happen is if the standards are set very low. If not, educators must brace themselves for a high failure rate.

In Kentucky the standards and expectations were high for the performance of students within schools, but not for the students themselves. The system is set up to demand that the average performance for every high school in the state be at the level of a typical graduate school student. The seeming impossibility of the task was dismissed by saying that schools had 20 years to accomplish this. When it became increasingly clear that the system was not working and everyone was headed for failure, the legislature acceded and agreed to change the system. It is not clear that legislators have abandoned their flawed assumptions about the expected level of performance for all students.
What they are willing to manipulate is the process for achieving the goal. Again there is the promise of a magical set of standards that will solve all educational problems.

References

Dawkins, R. (1998). Science, delusion and the appetite for wonder. Skeptical Inquirer, 22(2), 28-33, 58.

Lewis, A. (1998). School-to-work certificates of mastery and standards. Phi Delta Kappan, 72(8), 563-564.

McDonnell, L. M. (1997). The politics of state testing: Implementing new student assessments (CSE Tech. Rep. No. 424). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing.

Rose v. Council for Better Education, Inc., 790 S.W.2d 186 (Ky. 1989).