Testing: Questioning the Standards

For the nearly 2.5 million people who take standardized tests each year, scribbling in answers in number-two pencil on a computerized answer sheet plays a significant role in shaping their futures. The results help determine whether they can continue their studies or enter a profession such as law, in which many state bar examinations are standardized. No matter how one feels about the merits of tests, it is clear they are a tool used to determine opportunities. Many test advocates believe they help to break down old social barriers. Standardized tests, they say, move society toward a more meritocratic--as opposed to aristocratic--structure. An inherent and sometimes unrecognized danger of a meritocracy, however, lies in the question of who should decide which merits tests should measure and why those traits are the most important to society.

Apparently, most educators have chosen to measure intelligence while ignoring qualities such as creativity, ingenuity, perseverance, compassion and patience. The standardized test, in its present form, obviously cannot and does not attempt to measure these qualities, but it is still unclear whether, in emphasizing intelligence and ignoring other positive human characteristics, the test manufacturers and their advocates are doing society a grave injustice; many seem to believe they are.

The College Board, an organization of more than 2500 colleges, schools, school systems and education associations, first gave the Scholastic Aptitude Test (SAT) in 1926. Previously, the Board, which was established in 1900 to standardize the admissions examinations given by Harvard, Princeton, Yale and a few other selective colleges, administered written essay examinations. The SAT and the essays coexisted until World War II, when the Board indefinitely discontinued the latter. Educational Testing Service (ETS) officials will recount and admit, to their chagrin, the stories related to the first IQ tests, which were designed for the purpose of ethnic exclusion. One run on Ellis Island in 1912 concluded that 83 per cent of Jews, 90 per cent of Hungarians, 79 per cent of Italians, and 87 per cent of Russians were "feebleminded." Other IQ tests showed women to be significantly less intelligent than men, but--magically--after sports-related and other male-skewed questions were deleted, the intelligence gap disappeared. Today, ETS may laugh at these memories, but many parents, educators and consumer activists believe that quite a few more subtle biases still exist.

ETS, the company that writes the six-part SAT composed of verbal and math sections and a test of standard written English, holds a predominant position in the testing market. ETS controls well over half the entire testing market, which includes such tests as the Law School Admission Test (LSAT), the Graduate Record Examination (GRE), the Medical College Admission Test (MCAT), and various professional tests such as bar examinations. Because of its leading role in the testing industry, ETS, which takes in nearly $90 million a year, has been the center of much of the controversy and criticism associated with testing.

Last November, ETS officials and others heard speaker after speaker criticize standardized testing at the National Conference on Testing in Washington. Groups such as the Parent Teacher Association (PTA), the National Association for the Advancement of Colored People (NAACP), and the National Education Association (NEA) all have complained of serious and potentially dangerous shortcomings of tests. Since then the siege on testing marked by the conference has become a prominent nationwide issue.

In New York last year, the State Legislature, through the efforts of the New York Public Interest Research Group (NYPIRG), dealt a serious blow to testing in that state by passing a "truth in testing" law. Under this law, ETS and other test manufacturers must return corrected answer sheets to students shortly after the test date. ETS says this law will make the administration of tests much more expensive because it will continually have to write new tests. This could make giving tests in New York financially unfeasible, officials say.

NYPIRG, however, considers the passage of this law a significant step toward a needed regulation of the testing industry. "We need a careful re-examination of whether or not tests serve a useful purpose," Steve Solomon, testing project director for NYPIRG, says.

Among the most widely publicized criticisms of testing is a report from a Ralph Nader study group, which criticized ETS for not being open enough to public scrutiny and faulted universities and other institutions for placing too much emphasis on tests in admissions and selection. "Tests have very little to do with one's ability to succeed," Tim Massade, a member of the study group, says. "They merely measure how well one does on a multiple choice test," he adds.

In a massive factory-like building on 400 acres of wooded land in Princeton, N.J., ETS officials create, grade, and evaluate their tests. They hold workshops to train a freelance pool of people who, along with ETS employees, devise test questions. The process is time-consuming and meticulous. The prepared questions undergo a series of reviews for simplification and clarification, in which ETS employees try to spot and eliminate all the potential cultural or sexual biases they can find. By the time a question reaches final test form, it has been inspected at least 30 times, a process that takes 18 months. It costs approximately $100,000 to produce an entire SAT or GRE.

But many of the screening methods on which ETS depends to ensure fairness in its exams are themselves the subject of the most heated criticism. For example, every question given on an SAT must be "pretested," a process ETS considers essential. ETS checks whether the median score of students who answered the question correctly is higher than that of students who got the question wrong. This ensures that the tests will always measure the same things in the same way. Critics claim that by doing this ETS operates under the assumption that what it measures are the right, or most important, qualities of human beings. If the tests are in fact biased in some way, the pretest apparently serves only to perpetuate these and other potentially undiscovered flaws. ETS is creating and perpetuating an arbitrary system to classify human beings, critics claim.
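
The pretest comparison described above can be sketched in a few lines of Python. This is only an illustration of the median comparison as the article describes it, not ETS's actual procedure; the function name and the sample scores are hypothetical and invented for the example.

```python
from statistics import median

def question_discriminates(total_scores, answered_correctly):
    """Return True if students who answered the question correctly have a
    higher median total score than students who answered it incorrectly."""
    right = [s for s, ok in zip(total_scores, answered_correctly) if ok]
    wrong = [s for s, ok in zip(total_scores, answered_correctly) if not ok]
    if not right or not wrong:
        return False  # no comparison is possible when one group is empty
    return median(right) > median(wrong)

# Hypothetical pretest data (not real ETS figures): each student's total
# score, and whether that student got the pretested question right.
scores = [420, 480, 510, 560, 610, 650]
correct = [False, False, True, False, True, True]

print(question_discriminates(scores, correct))
```

A question passes this check only when it agrees with the rest of the test, which is the circularity the critics object to: the check can confirm consistency, but it cannot show that the test as a whole measures the right thing.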

Much of the criticism of ETS is not without statistical basis. For example, ETS studies have shown a direct correlation between a person's family income and his SAT scores. As average family income increases, test scores rise proportionately. Many, such as Alan Nairn, head of the Nader study group, claim this is direct evidence of an economic bias in the tests. The Nairn-Nader study also says ETS statistics show SATs are not a very accurate predictor of a student's first year grades. "Ninety per cent of the time, tests predict a student's first year grades no more accurately than a roll of dice," Massade says. Although Mary Churchill, associate director of the Information Division of ETS, agrees that high school grades are statistically a better indicator of a student's performance, she believes that Nairn misinterpreted ETS findings and that SATs are a better predictor than the Nader group says.

Amidst the controversy and criticism, ETS officials, confused about why their industry has recently taken so much abuse, try to remain calm. They admit many flaws in testing exist, but insist that most of the serious shortcomings lie in the way people use the tests. In college and graduate school admissions, they say, tests are useful only when used in conjunction with high school transcripts and other materials. Many institutions find it convenient to place more emphasis on the tests than ETS intended, they say. "Tests are the most valid measure that anyone's devised," Churchill says.

Some Harvard admissions officers agree tests must be considered in perspective. "We never use tests as a single factor. Alone they are not very accurate, but in the kinds of decisions we are making they can be very helpful," L. Fred Jewett '57, dean of admissions, says. "I can't think of a single instance when a person has been admitted because of high scores," Molly T. Geraghty, assistant dean of admissions at the Law School, adds. "However, it may well be the thing that tips the scales," she says.

Harvard admissions officers deny they determine cutoff scores--scores below which they would take no applicants--and emphasize that tests are only one tool in the admissions process. "In some schools you can find there is an unfair cutoff, but we have no formulas. We look at the whole application," Geraghty says.

"Tests are important to the degree they confirm and support other evidence or refute it. We try to use them in the context that they came," Jewett says.