Testing: Questioning the Standards
For the nearly 2.5 million people who take standardized tests each year, scribbling in answers in number-two pencil on a computerized answer sheet plays a significant role in shaping their futures. It can allow them to continue their studies or to enter a profession--such as law--in which many of the state bar examinations are standardized. No matter how one feels about the merits of tests, it is clear they are a tool used to determine opportunities. Many test advocates believe they help to break down old social barriers. Standardized tests, they say, move society toward a more meritocratic--as opposed to aristocratic--structure. An inherent and sometimes unrecognized danger of a meritocracy, however, lies in deciding who should choose which merits tests should measure and why those traits are the most important to society.
Apparently, most educators have chosen to measure intelligence while ignoring qualities such as creativity, ingenuity, perseverance, compassion and patience. The standardized test, in its present form, obviously cannot and does not attempt to measure these qualities, but it is still unclear whether, in emphasizing intelligence and ignoring other positive human characteristics, the test manufacturers and their advocates are doing society a grave injustice; many seem to believe they are.
The College Board, an organization of more than 2500 colleges, schools, school systems and education associations, first gave the Scholastic Aptitude Test (SAT) in 1926. Previously, the Board, which was established in 1900 to standardize the admissions examinations given by Harvard, Princeton, Yale and a few other selective colleges, administered written essay examinations. The SAT and the essays coexisted until World War II, when the Board indefinitely discontinued the latter. Educational Testing Service (ETS) officials will recount and admit, to their chagrin, the stories related to the first IQ tests, which were designed for the purpose of ethnic exclusion. One test run on Ellis Island in 1912 concluded that 83 per cent of Jews, 90 per cent of Hungarians, 79 per cent of Italians, and 87 per cent of Russians were "feebleminded." Other IQ tests showed women to be significantly less intelligent than men, but--magically--after sports-related and other male-skewed questions were deleted, the intelligence gap disappeared. Today, ETS may laugh at these memories, but many parents, educators and consumer activists believe that quite a few more subtle biases still exist.
ETS, the company that writes the six-part SAT composed of verbal and math sections and a test of standard written English, holds a predominant position in the testing market. ETS controls well over half the entire testing market, which includes such tests as the Law School Admission Test (LSAT), the Graduate Record Examination (GRE), the Medical College Admission Test (MCAT), and various professional tests such as bar examinations. Because of its leading role in the testing industry, ETS, which takes in nearly $90 million a year, has been the center of much of the controversy and criticism associated with testing.
Last November, ETS officials and others heard speaker after speaker criticize standardized testing at the National Conference on Testing in Washington. Groups such as the Parent Teacher Association (PTA), the National Association for the Advancement of Colored People (NAACP), and the National Education Association (NEA) all have complained of serious and potentially dangerous shortcomings of tests. Since then the siege on testing marked by the conference has become a prominent nationwide issue.
In New York last year, the State Legislature, through the efforts of the New York Public Interest Research Group (NYPIRG), dealt a serious blow to testing in that state by passing a "truth in testing" law. Under this law, ETS and other test manufacturers must return corrected answer sheets to students shortly after the test date. ETS says this law will make the administration of tests much more expensive because the company will continually have to write new tests. This could make giving tests in New York financially unfeasible, officials say.
NYPIRG, however, considers the passage of this law a significant step toward a needed regulation of the testing industry. "We need a careful re-examination of whether or not tests serve a useful purpose," Steve Solomon, testing project director for NYPIRG, says.
Among the most widely publicized criticisms of testing is a report from a Ralph Nader study group, which criticized ETS for not being open enough to public scrutiny, and universities and other institutions for placing too much emphasis on tests in admissions and selection. "Tests have very little to do with one's ability to succeed," Tim Massade, a member of the study group, says. "They merely measure how well one does on a multiple choice test," he adds.
In a massive factory-like building on 400 acres of wooded land in Princeton, N.J., ETS officials create, grade, and evaluate their tests. They hold workshops to train a freelance pool of people who, along with ETS employees, devise test questions. The process is time-consuming and meticulous. The prepared questions undergo a series of reviews for simplification and clarification, in which ETS employees try to spot and eliminate all the potential cultural or sexual biases they can find. By the time a question reaches final test form, it has been inspected at least 30 times, a process that takes 18 months. It costs approximately $100,000 to produce an entire SAT or GRE.
But many of the screening methods on which ETS depends to ensure fairness in its exams are themselves the subject of the most heated criticism. For example, every question given on an SAT must be "pretested," a process ETS considers essential. ETS tries to see if the median score of students who answered the question correctly is higher than that of students who got the question wrong. This ensures that the tests will always measure the same things in the same way. Critics claim that by doing this ETS operates under the assumption that what it measures are the right, or most important, qualities of human beings. If the tests are in fact biased in some way, the pretest apparently serves only to perpetuate these and other potentially undiscovered flaws. ETS is creating and perpetuating an arbitrary system to classify human beings, critics claim.
Much of the criticism of ETS is not without statistical basis. For example, ETS studies have shown a direct correlation between a person's family income and his SAT scores. As average family income increases, test scores rise proportionately. Many, such as Alan Nairn, head of the Nader study group, claim this is direct evidence of an economic bias in the tests. The Nairn-Nader study says ETS statistics show SATs are not a very accurate predictor of a student's first-year grades. "Ninety per cent of the time, tests predict a student's first year grades no more accurately than a roll of dice," Massade says. Although Mary Churchill, associate director of the Information Division of ETS, agrees that high school grades are statistically a better indicator of a student's performance, she believes that Nairn misinterpreted ETS findings and that SATs are a better predictor than the Nader group says.
Amidst the controversy and criticism, ETS officials, confused about why their industry has recently taken so much abuse, try to remain calm. They admit many flaws in testing exist, but insist that most of the serious shortcomings lie in people's use of the test. In college and graduate school admissions, they say, tests are only useful when used in conjunction with high school transcripts and other materials. Many institutions find it convenient to place more emphasis on the test than ETS had intended, they say. "Tests are the most valid measure that anyone's devised," Churchill says.
Some Harvard admissions officers agree tests must be considered in perspective. "We never use tests as a single factor. Alone they are not very accurate, but in the kinds of decisions we are making they can be very helpful," L. Fred Jewett '57, dean of admissions, says. "I can't think of a single instance when a person has been admitted because of high scores," Molly T. Geraghty, assistant dean of admissions at the Law School, adds. "However, it may well be the thing that tips the scales," she says.
Harvard admissions officers deny they determine cutoff scores--scores below which they would take no applicants--and emphasize that tests are only one tool in the admissions process. "In some schools you can find there is an unfair cutoff, but we have no formulas. We look at the whole application," Geraghty says.
"Tests are important to the degree they confirm and support other evidence or refute it. We try to use them in the context that they came," Jewett says.
Both Jewett and Geraghty believe test scores are most useful when they are either very high or very low or show a large discrepancy with high school performance. Jewett says his office tries to take into consideration such factors as how much exposure a person has had to the multiple choice type of testing on the SAT, his cultural and social background, the number of times he has taken the test, and other possible mitigating circumstances. Through the application, evaluations and interview, the admissions office is usually able to learn of these influencing circumstances, he says. "One indispensable part of our process is putting a lot of thought and energy into the highest level, and tests can be a helpful approach to fairness," Geraghty says.
Nonetheless, the question of people being denied opportunities by unfair aspects of tests remains. Some believe that standardized tests are geared toward white middle-class society, putting poor people and nonwhites at a distinct disadvantage. The NAACP has consistently attacked test makers on this point. ETS officials claim, however, and Jewett and Geraghty seem to agree to some extent, that without tests, many talented people would never have been recognized by admissions offices. "Abolishing tests would only hurt the people we are trying to help," Geraghty says.
Other critics claim that some students going to private schools such as Andover and Exeter are trained many years in advance on how to take the tests. Milton Academy, for example, holds SAT preparation sessions. Many other schools, such as the Detroit Country Day School in Michigan, have started to coach their students and others in "exam tricks." Coaching centers such as Test Prep Services, the John Sexton Test Preparation Center, and the Stanley Kaplan Educational Center, all with branches in Boston, help people who can pay the fee to improve their scores. Until recently, ETS refused to admit that coaching could improve scores on its tests, but with the recent release of a Federal Trade Commission report stating that coaching does indeed help, ETS is starting to reveal exam hints in its booklets. Critics claim that middle-class students who can take advantage of test prep centers, or who go to private schools, have a distinct advantage, while poor people have no such help available.
None seem to disagree that a thorough look at the testing industry would be a good idea, and many positive changes would be welcome. Few, however, have many constructive suggestions to make. Many of the faults of testing are glaringly clear, but few who want to eliminate testing suggest any alternative to it. "We must recognize and understand that some of the kind of criticism is aimed at the abuse of the use of tests rather than at the tests themselves," Jewett says. "I worry that some of the criticisms tend to throw out testing without providing an alternative. Tests do help present a total picture."
Probably the greatest danger of tests is that students tend to take their test scores as an indication of their intelligence or, worse yet, of their worth either as students or as human beings. "ETS doesn't do enough to tell students that just because they got a poor SAT doesn't mean they are not talented," Solomon says.
"People take tests entirely too seriously. They read much more into it than it was ever intended to show," Geraghty says, adding, "They should not be overly impressed by anything that has numbers." "One should never get seduced into the notion that tests are in any way scientific," she says.
Still, the stigma of a low SAT score can stay with a person all his life. In many ways, SAT scores act as a status symbol, something to show off. But their value is questionable. The debate--call it war--over testing is apparently just heating up and will probably continue for many years. It is unclear what will result from the controversy, but most likely the public will begin to scrutinize tests and their results more closely than ever before.