Would you walk across a bridge that was designed to break?
Of course you wouldn’t.
But what if someone told you the bridge had been fixed?
Would you trust it – especially if people were still falling off of it all the time?
And today, after countless revisions and new editions, standardized tests still do exactly the same thing.
Yet we’re exhorted to keep using them.
A BRIEF HISTORY LESSON
Modern testing comes out of U.S. Army IQ tests developed during World War I.
In 1917, a group of psychologists led by Robert M. Yerkes, president of the American Psychological Association (APA), created the Army Alpha and Beta tests. These were specifically designed to measure the intelligence of recruits and help the military distinguish those of “superior mental ability” from those who were “mentally inferior.”
These assessments were based on explicitly eugenicist foundations – the idea that certain races were distinctly superior to others.
His colleague Lewis Terman made the goal clear in his book, The Measurement of Intelligence, writing that these “experimental” tests will show “enormously significant racial differences in general intelligence, differences which cannot be wiped out by any scheme of mental culture.”
In 1923, another psychologist, Carl Brigham, took these ideas further in his seminal work A Study of American Intelligence. In it, he used data gathered from these IQ tests to argue the following:
“The decline of American intelligence will be more rapid than the decline of the intelligence of European national groups, owing to the presence here of the negro. These are the plain, if somewhat ugly, facts that our study shows. The deterioration of American intelligence is not inevitable, however, if public action can be aroused to prevent it.”
Thus, Yerkes, Terman and Brigham’s pseudoscientific tests were used to justify Jim Crow laws, segregation, and even lynchings. Anything for “racial purity.”
People took this research very seriously. States passed forced sterilization laws for people with “defective” traits, preventing between 60,000 and 70,000 people from “polluting” America’s ruling class.
In Buck v. Bell, the 1927 Supreme Court ruling that upheld such laws and has never been explicitly overturned, Justice Oliver Wendell Holmes wrote, “It is better for all the world, if instead of waiting to execute degenerate offspring for crime, or to let them starve for their imbecility, society can prevent those who are manifestly unfit from continuing their kind…. Three generations of imbeciles are enough.”
Eventually Brigham took his experience with Army IQ tests to create a new assessment for the College Board – the Scholastic Aptitude Test – now known as the Scholastic Assessment Test or SAT. It was first given to high school students in 1926 as a gatekeeper. Just as the Army intelligence tests were designed to distinguish the superior from the inferior, the SAT was designed to predict which students would do well in college and which would not. It was meant to show which students should be given the chance at a higher education and which should be left behind.
And unsurprisingly it has always privileged – and continues to privilege – white students over children of color, just as nearly every standardized test still does.
HAS IT CHANGED?
None of this can be challenged. These are historical facts. They are simply what happened, justified in the words of the people who perpetrated it.
This was all a long time ago, defenders of testing say. Much has changed between now and then.
But has it? Really?
We certainly don’t use the editions of the tests written by the original eugenicists, but the practices used to create them and the results of these assessments are extremely similar.
In 1964, a Department of Education report found that the average black high school senior scored below 87% of white seniors (in the 13th percentile) on standardized assessments. Fifty years later, the National Assessment of Educational Progress (NAEP) found that black seniors had narrowed the gap until they were merely behind 81% of white seniors (scoring in the 19th percentile).
Is that really the kind of progress you want to champion?
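To put that progress in plain numbers, here is a minimal sketch in Python. The function and variable names are my own, made up for illustration. It assumes that “scored below 87% of white seniors” means the average score sits at the 13th percentile of the white distribution, and it treats “fifty years” as five even decades.

```python
def percentile_from_share_ahead(percent_of_white_seniors_ahead: int) -> int:
    """Percentile implied when a given percent of the comparison group scores higher."""
    return 100 - percent_of_white_seniors_ahead


# Figures quoted above: behind 87% in 1964, behind 81% fifty years later.
percentile_1964 = percentile_from_share_ahead(87)   # 13
percentile_later = percentile_from_share_ahead(81)  # 19

# Assumption: "fifty years" is treated as exactly five decades.
points_gained = percentile_later - percentile_1964  # 6
points_per_decade = points_gained / 5               # 1.2

print(f"1964: {percentile_1964}th percentile")
print(f"Fifty years later: {percentile_later}th percentile")
print(f"Gain: {points_gained} percentile points, about {points_per_decade} per decade")
```

Six percentile points in half a century works out to a little more than one point per decade.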
The reason for the disparity has nothing to do with how much students of color (and poor students, whose scores are similar) actually learn, or with their worth as human beings.
Discrimination is purposefully built into the standardization process, according to W. James Popham, PhD, Professor Emeritus at the University of California at Los Angeles and former test maker. He explained in an interview with Frontline:
“Traditionally constructed standardized achievements, the kinds that we’ve used in this country for a long while, are intended chiefly to discriminate among students … to say that someone was in the 83rd percentile and someone is at 43rd percentile. And the reason you do that is so you can make judgments among these kids. But in order to do so, you have to make sure that the test has in fact a spread of scores. One of the ways to have that test create a spread of scores is to limit items in the test to socioeconomic variables, because socioeconomic status is a nicely spread out distribution, and that distribution does in fact spread kids’ scores out on a test.”
The scores have to fall into categories – Below Basic, Basic, Proficient and Advanced, for instance. If too many students cluster in the middle, the results are invalid. We need the scores spread out – even if we must resort to non-educational factors to get there.
Family income is not something the tests ignore, says Popham. It is an essential component specifically tested for in question construction. In fact, he claims that between 15% and 80% of the questions (depending on the subject area) on norm-referenced exams are linked to socioeconomic status (SES).
Thus minority groups with higher rates of poverty are selected against. Not because of any explicit racist ideology – but to get the pretty bell curve standardized assessments require.
“Too often, test designers rely on questions which assume background knowledge more often held by White, middle-class students. It’s not just that the designers have unconscious racial bias; the standardized testing industry depends on these kinds of biased questions in order to create a wide range of scores.”
For example, Choi recalled a 10th grade student in his class asking him about a standardized test question. “With a puzzled look, she pointed to the prompt asking students to write about the qualities of someone who would deserve a ‘key to the city.’ Many of my students, nearly all of whom qualified for free and reduced lunch, were not familiar with the idea of a ‘key to the city.’”
So when they get such a question wrong, it isn’t necessarily because they don’t know the concept being tested, but because they don’t understand what is being asked in the first place.
Test makers could work to eliminate such instances, but that would reduce the spread of scores. It would destroy the bell curve – and thus defeat the goal of the test, which ultimately is not assessing learning but sorting and ranking students.
Jay Rosner, a national admissions test expert, explained how this bias is built into the process for each revision of assessments like the SAT:
“Compare two 1998 SAT verbal [section] sentence-completion items with similar themes: The item correctly answered by more blacks than whites was discarded by the Educational Testing Service, whereas the item that has a higher disparate impact against blacks became part of the actual SAT. On one of the items, which was of medium difficulty, 62% of whites and 38% of African-Americans answered correctly, resulting in a large impact of 24%…On this second item, 8% more African-Americans than whites answered correctly…”
In other words, the criterion for choosing a question for future tests is whether it replicates the outcomes of previous exams – specifically tests where students of color score lower than white children. And this is still the criterion test makers use to determine which questions to use on future editions of nearly every assessment in wide use in the US.
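Rosner’s numbers can be laid out in a few lines of Python. This is only a rough sketch under stated assumptions: the `Item` class, the `impact` function and the `replicates_previous_gap` rule are hypothetical names of mine, and the rule is a deliberately crude stand-in for the pattern described above, not the Educational Testing Service’s actual procedure. The quote gives only the gap for the second item, not the two underlying percentages, so the code records that gap directly.

```python
from dataclasses import dataclass


@dataclass
class Item:
    label: str
    # Percent of white test takers answering correctly minus percent of
    # African-American test takers answering correctly, in percentage points.
    impact_points: int


def impact(white_pct_correct: int, black_pct_correct: int) -> int:
    """Gap in percent answering correctly, in percentage points."""
    return white_pct_correct - black_pct_correct


# First item from the quote: 62% of whites and 38% of African-Americans correct.
first_item = Item("medium-difficulty item", impact(62, 38))  # 24 points

# Second item: 8% more African-Americans than whites answered correctly.
# Only the gap is quoted, so it is entered as -8 without inventing the two rates.
second_item = Item("second item", -8)


def replicates_previous_gap(item: Item) -> bool:
    """Hypothetical simplification: keep items whose gap favors white test takers."""
    return item.impact_points > 0


kept = [item.label for item in (first_item, second_item) if replicates_previous_gap(item)]

print(first_item.impact_points)
print(kept)
```

Under this toy rule, the item with the 24-point gap survives and the item more African-Americans answered correctly is dropped, which matches what Rosner says happened to the two questions.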
Public schools have no control over these factors. That’s why schools serving poor and minority students invariably have lower test scores. Popham concludes there will always be a testing gap because that’s the way the system is designed.
He says it’s “A game without winners.” Or more likely a game where the poor and minorities cannot win.
And that’s how the system was designed.
Whether it’s the 1920s or the 2020s.
Standardized tests came from racist assumptions about human intelligence.
We no longer profess the eugenicist ideas of racial purity embedded in our assessments as self-evident or scientific. But they’re there nonetheless.
Today, the very concept of intelligence being quantifiable remains in question.
In “The Mismeasure of Man,” evolutionary biologist Stephen Jay Gould challenged many of these ideas – in particular those of Terman. Rather than see intelligence as a genetic trait, Gould envisioned it more abstractly and thus shattered the idea that any mere number could capture human value.
But this is a relatively new idea. Accepting it requires us to turn the page on a rather dark passage of our national history.
We must move on from the original test makers’ narrow-minded, racist ideas about intelligence and human worth.
And to do that we must leave standardized testing far, far behind.
We can’t just give a faulty bridge a new coat of paint. We must demolish it and rebuild an entirely new structure to carry us into the future.