Hu’s 4th Grade WASL Math Study
Oct 2003: It’s Still Busted
OSPI: Please Fix it!
http://www.arthurhu.com/2002/10/mathstud.htm
Arthurhu@attbi.com
Why Revisit the WASL 4?
Original critic of 1997 WASL announcements
Hu found as many as half of problems inconsistent with state benchmarks
Prof. Don Orlich found WASL 4 beyond a half-dozen well-established measures of normal developmental levels.
Criticism of 4th grade sample problems led to legislation for a 2001 study
OSPI promised to correct problems
As parent, got to inspect 2002 4th grade math test.
2002 problems
Nearly half of problems still either not compliant with rules / not appropriate
Actual items appear much easier than original 1997 sample book. No algebra, marble probability, proportionality, or ratio problems.
Not compliant with specifications – definition of mean
Specifications still don’t match EALR benchmarks – Factor trees
Mean, median, mode moved to G4 from G7
Tic Tac Toe game is outrageous IQ task not related to EALR
How Hard?
Only 52% passed – the AVERAGE student was “below standard”
Only 28% of African-Americans passed (2:1 ratio). NONE passed at some inner-city schools.
ITBS doesn’t flunk anybody – no pass score. Whites at 55-60 vs Black 40-48 percentile.
On most problems, nearly half or more gave an incorrect answer. Half of kids actually know formula for mean.
No problems are straightforward demonstration of basic skills and knowledge. A kid who aced 1971 4th grade math could still flunk this thing.
Chain of Errors
20% variance in grade acceptable
Wrong answer: skyscraper poem and flagpole
Wrong problem: 4th grade ratio
Wrong specification: factor tree
EALRs wrong / changed
Format of test is NOT improvement over multiple choice
Use of test for “high stakes” is wrong
Median Student > EALRs!
50% of 10th graders can factor 2nd degree polynomial (pre-college)
62% 4th graders know correct formula for mean (G7 EALR)
Only 10-15% consistently miss G4 WASL problems.
Would have 90-95% pass rate if content was restricted to basic skills & knowledge
If things were correct:
EALRs – “Essential Academic Learnings” list skills of what “every student should know and be able to do”™
Test specifications based on EALRs bound what can and cannot be put on test questions
WASL test questions must be compliant with both test specifications and EALRs
Can’t change specs to match test, or EALRs to match spec, or change test between years and expect test score gains to be valid. But that’s exactly what OSPI has been doing.
General WASL problems
Malpractice to use cut-score for high stakes such as diploma: ETS and technical document
Score inflation appears to be designed into tests like TAAS, KIRIS
Gains not reflected in SAT, ACT, CTBS
Cart-before-horse problem: WASL problems (1996) were designed before EALRs were finalized (1997)
EALRs themselves are changed mid-course in 2001.
Changes invalidate year-year comparisons.
No Quality Control
2000 study proves process produces defective EALRs, specs, test, and test errors
Cut score committee is asked “how high” but was never told what % passed, or to check if problem was grade appropriate in the first place.
2002 shows some improvement, but overall nothing really fixed since 2000 study
Need to hire someone who can check every problem for correctness and every specification
Software has “tech support” line to take complaints and fix problems. OSPI always responds “What problem?”
Just how bad is a “work in progress” before we can call it a piece of junk??
WASL more difficult than College / SAT!
Not on SAT: Median, mode, probability, spinner, constructing charts, similar congruent, largest figures
WASL4/7 often more difficult than corresponding college SAT problems.
AAA Commission: Community colleges say WASL is MORE difficult than math and English placement tests
10th easier than 7th, easier than 4th?
10th grade: Jacinda is going to use similar triangles. (Giveaway method)
7th: We’re going to build a “similar slide”. (Hint)
4th: How would you measure a flagpole using a fire hydrant and ruler? (No clue)
8th grade textbook has similar triangles solution
OSPI (Seattle PI) answer counts bricks on a wall: wrong answer!
Proportionality / indirect measurement is G7
Why ask 4th graders to construct something not even taught in all 8th grade textbooks?
WASL has IQ test bell curve distribution
WASL Bias ~= IQ SD
White mean 387.60 SD = 30.87
Black mean 366.19 SD = 30.03
Difference = 21.41 / 30.03 = 0.71, or about 70% of a standard deviation, consistent with high WA NAEP scores for blacks.
Claims of closing the gap in points meaningless when white/black pass ratio remains at 2:1 to 4:1 depending on grade across years.
Bias is in the gap, not names or pictures or culture.
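The gap arithmetic above can be checked with a short sketch. The means and SDs are the ones quoted on this slide; the equal-weight pooled SD is an added assumption (group sizes are not given here), shown only to suggest that the result stays near 0.7 SD whichever SD is used as the yardstick.

```python
import math

def gap_in_sd(mean_a, mean_b, sd):
    """Express the difference between two group means in units of a standard deviation."""
    return (mean_a - mean_b) / sd

# Figures quoted on the slide.
WHITE_MEAN, WHITE_SD = 387.60, 30.87
BLACK_MEAN, BLACK_SD = 366.19, 30.03

# The slide divides the point gap by the black SD.
gap_black_sd = gap_in_sd(WHITE_MEAN, BLACK_MEAN, BLACK_SD)

# Assumption: equal-weight pooled SD, because group sizes are not in the source.
pooled_sd = math.sqrt((WHITE_SD ** 2 + BLACK_SD ** 2) / 2)
gap_pooled = gap_in_sd(WHITE_MEAN, BLACK_MEAN, pooled_sd)

print(f"gap using black SD:  {gap_black_sd:.2f} SD")
print(f"gap using pooled SD: {gap_pooled:.2f} SD")
```

Reporting the gap this way, rather than in raw scale points, is what allows it to be compared across tests that use different scales.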
Illusion of Closing the Gap
Sure, inflated scores make it look like everybody is moving up
But the pass ratio between whites / Asians and other minorities has declined only slightly, to just under 3 or 4 to 1.
Harder test = IQ test
Open response more difficult than multiple choice.
Tests with very high failure rates measure very high ability, not the average ability implied by what “every student can do”.
Constructing one of multiple right answers with no instruction is more difficult than applying a method that was taught.
Writing is more difficult for immigrants and for inner-city speakers of non-traditional English.
WASL Up ITBS Down
Thurgood Marshall (Seattle) goes from 0% to 45% passing in G4 math, 60% in reading
But ITBS scores show decline!
WASL Gains Inflated Compared to ITBS
High Stakes is Educational Malpractice
From 1999 WASL 4th grade technical report:
Scores from one test given on a single occasion should never be used to make important decisions about students' placement, the type of instruction they receive, or retention in a given grade level in school. It is important to corroborate individual scores on WASL tests with classroom-based and other local evidence of student learning (e.g., scores from district testing programs). When making decisions about individuals, multiple sources of information should be used and multiple individuals who are familiar with the student's progress and achievement (including parents, teachers, school counselors, school psychologists, specialist teachers, and possibly even the students themselves) should be brought together to make such decisions collaboratively.
ETS (Educational Testing Service) says that test score cutoffs should not be used for college admissions purposes; test scores should be only one factor among many.
Universal High School vs High Standards?
High schools admit all by age groups regardless of ability.
Diploma in diverse population should reflect participation, not ability. Most state standards are set at the level of the highest students, not the below-average.
Not all students are capable of algebra, trig or writing editorials, especially recent immigrants or the very poor.
Setting a high performance level would return us to the era before the 1950s, when high school was exclusively college or career prep and most people worked instead.
When Best isn’t Enough
Mercer Island (best in state) does not meet 80% passing at any grade level.
Only 75% of Running Start students pass WASL vs 80% goal
If only 20-30% pass all WASL requirements, that corresponds to the top 20-30% of students who are admitted to 4-year universities.
When all students “meet standard” will they ALL go to U Washington?
Most 10th grade WASL math and writing content requires pre-college track coursework.
What Every Student Should Know and Be Able to Do ™
“Set at a very reasonable 10th grade level” Patrick Patrick
“People don’t believe that a diploma is a guarantee of readiness for college” Partnership for Learning survey
“One High World Class Standard” for all jobs and further education – Marc Tucker NCEE
Standard = 4 Yr Univ
Europe has range of college to vocational college graduation requirements.
One CIM can’t be good for “all jobs and all colleges”.
SAT scores and vocational certificates count, already in place.
There is no failing SAT score. Score for WSU is not good enough for MIT. Safeco doesn’t need SQL server certificate.
High school diploma should _not_ require 3 years of algebra / integrated math / science
College-based exit requirements will doom vocational students – nearly all fail the MA MCAS.
Arithmetic Mean
Specs – cannot ask for definition of mean (sum divided by number of items). Mode was defined, but not mean.
EALR – mean was moved from G7 to G4 in
EALR – Computation is G7, but is allowed in G4 spec, was on non-released 2002 problem.
Mean Moved From 7 to 4
Calculate Mean: Spec vs EALRs
PS04 (Central Tendency)
b) Short-Answer items may ask students to find [=compute?] mean, median, or mode for a given situation.
Conflict: EALRs “Calculate Mean” is G7.
Specs reflect Essential Learnings, not other way around, like SOME states that put the test before the horse.
Thou Shalt Not Test Definition of Terms
From Grade 4 Item Specifications:
6. Items will not test vocabulary definitions.
Terms related to central tendency:
Mean: an average obtained by dividing the sum of the data items by the number of data items.
About 1/3 got this wrong. Even traditional texts do NOT assume memorization of this formula by 4th grade.
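For reference, the definition of mean quoted from the item specifications can be written out as a minimal sketch. The sample data set is made up for illustration and is not taken from any WASL item.

```python
def mean(data_items):
    """Mean as defined in the specs: sum of the data items divided by the number of data items."""
    return sum(data_items) / len(data_items)

# Hypothetical data set, not from any test item.
sample = [3, 5, 7, 9]
print(mean(sample))
```

The formula takes one line to state, but a student still has to have it memorized when the item does not supply it.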
Median and Mode and SAT
WASL gives definition of mode (easy – most common value) but not mean? Why?
College SAT does not test for median or mode
Absolutely non-essential at elementary level.
Most adults never use mode outside of statistics texts; it is a waste of WASL problems.
Tic Tac Toe Strategy
What’s being tested here is knowledge of tic-tac-toe strategy, not D10 coordinate placement.
Strategy is not math. Checkers?
#1: Number line
Prof Orlich, myself and nearly half of kids didn’t know that just a skooch over dead center means a 48 instead of 45. Even a ruler has fraction markings. College SAT has a problem like this requiring guessing between dots.
#2 Freezing Point = 32
Fahrenheit freezing point, which is what you really need to know here, isn’t even in the EALRs.
Some states DO put this in their standards, but we don’t.
The IQ-test part is figuring out that it’s cold and picking the low number, or just already knowing the answer.
G7 Marbles Easier than G4
7th Grade asks probability of one bag
4th grade is HARDER! (Reaching Higher sample) asks to compare 3 bags. Ratio, probability as a ratio, and comparing fractions with uncommon denominators are all in the 7th grade EALRs.
G10 Spinner Easier than G4
On the G10 spinner you can tell it’s 1/4 or 1/6
G4 spinner, many could not tell if 2 reds was more than one blue
Spinners moved from G7 to G4 in spec.
Can’t even measure G4 probability without protractor.
Spinner – Proportional Thinking
Proportions and ratios are in the grade 7 EALRs and specifications.
Solution is 1:2:1 ratio
2nd pie sizes not clearly marked
Nearly half got this wrong
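To show the proportional reasoning the spinner item expects, here is a small sketch that turns a 1:2:1 ratio of sector sizes into probabilities. The function name is hypothetical, and the slide does not say which color matches which part of the ratio, so the sectors are left unlabeled.

```python
from fractions import Fraction

def ratio_to_probabilities(parts):
    """Convert a ratio of sector sizes (e.g. 1:2:1) into the probability of landing on each sector."""
    total = sum(parts)
    return [Fraction(part, total) for part in parts]

# The 1:2:1 ratio is from the slide; sector order here is arbitrary.
probabilities = ratio_to_probabilities([1, 2, 1])
print([str(p) for p in probabilities])
```

Getting from the ratio to these fractions requires dividing each part by the total, which is the grade 7 ratio and proportion skill this slide refers to.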
Factor Tree is G4, not G7
Construct Similar Figure
1997 G7: construct symmetric, congruent, and similar figures
1997 G4: understand concepts of symmetry, congruence, and similarity
2001: G4 and G7 understand concepts of symmetry, congruence, and similarity
This pretty clearly violates 1997 rules
42% scored ZERO on this unfair task.
Thou Shalt Not Test For:
Solve for unknown ax + b = c
Ratio, Proportion
Percent
Fractions uncommon denominator
Prime numbers, common factors
Complex Patterns
Decimal math except money +
3D visualization
Compute probability
Strategy games
Construct a method
More difficult than G7, G10 or SAT test
NCTM Standards “Fuzzy Math”
Problem is “Standards Based” Math. Search for “Mathematically Correct”
EALRs are mostly reasonable content-based “old math”
Most WASL problems are unbounded NCTM “problem solving” which are not found in any 4th grade textbook.
Answer methods to 1997 sample book found in 8th, 10th and college level math courses
New NCTM curricula like Mathland and Investigations teach even LESS of the fact / method based EALR skills!
WHAT’S TO BE DONE
Declare 1997-2002 WASL results and tests invalid.
Hire at least 1 capable person who can check that WASL, specs and EALRs are consistent.
Legislature must repeal ESHB 1209
Drop CIM and proficiency levels
Return all or some scored tests
Report group gap in standard deviations or ratio, not points.
Report percentile scores; check whether score inflation is revealed by stable ITBS scores
Rewrite WASL to test for fuzzy range of levels from entry level worker to 4 year university requirements.
Restore “real” math, reject “fuzzy” math emphasis
Include simple questions so all can “succeed” instead of all failing.
Extra Credit: Spot the 4th grade EALR violations
From www.passthewasl.com Flashcards
Solve for n: 2 x n + 8 = 16
If the ratio of boys to girls is 2:1 how many boys are there?
What is the probability of getting chocolate ice cream as a percent or ratio?
Answer:
The card gives a standard 9th grade Algebra 1 solution for n. Nobody teaches this in any 4th grade textbook, though some teachers might assign this as NCTM style homework!
Ratio and percent are clearly labeled as 7th grade in the EALRs.
Probability measured as a percent OR as a ratio is clearly labeled only in the grade 7 EALRs and grade 7 test specifications.
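For readers who want to see the Algebra 1 method the flashcard relies on, the steps for 2 x n + 8 = 16 can be sketched as below. The helper name is hypothetical; it simply subtracts the constant from both sides and then divides by the coefficient.

```python
def solve_linear(a, b, c):
    """Solve a*n + b = c for n: subtract b from both sides, then divide both sides by a."""
    if a == 0:
        raise ValueError("coefficient a must be nonzero")
    return (c - b) / a

# Flashcard problem: 2 x n + 8 = 16
n = solve_linear(2, 8, 16)
print(n)
```

Both steps are inverse operations on an unknown, which is the method the answer above describes as 9th grade material.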