Last week, just after I submitted my blog, I heard on NPR about a book arguing similar points by Cathy O’Neil, titled Weapons of Math Destruction. O’Neil is a data scientist, i.e., someone who works in the trendy new field called Big Data. Unlike so many who are gushing over the promise of Big Data, she tells a cautionary tale that parallels my own concern with the misuse of statistical reasoning.
O’Neil’s book defines “weapons of math destruction” as mathematical algorithms with three characteristics. They are 1) opaque or even completely invisible to most people, 2) unfair and significantly damaging to people’s lives, and 3) scalable; in other words, they can apply to many thousands or millions of people. Tying in with my blog last week, weapons of math destruction are typically derived from statistical studies that employ faulty or lazy reasoning. O’Neil also emphasizes in numerous examples that weapons of math destruction often create pernicious feedback loops as people adapt their behavior to them.
One example I used last week is also discussed by O’Neil: the ranking of colleges and universities. She echoes my point that ranking statistics are often misleading. Yet once a ranking is created and taken seriously, administrators in higher education strive to raise their school’s position, often at the expense of other facets of educational quality that the standard ranking algorithms do not capture. Whoever creates the dominant ranking schema essentially dictates national educational priorities.
The current obsession with ranking schools and teachers creates similar problems, O’Neil writes. The basic idea of test-based evaluation is that good teachers are those who most significantly increase their students’ standardized test scores over the course of a school year. Unfortunately, unscrupulous teachers often find ways to help their students cheat, or simply change wrong answers to right ones. If they are not caught, the ones hurt the most in this system are honest teachers, since they cannot show improvement over the inflated scores their students bring from previous teachers who cheated. The honest and capable teachers are sometimes fired for “poor performance” and the cheaters are promoted.
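The arithmetic behind this can be made concrete with a toy calculation. The sketch below assumes the simplest possible value-added score (this year's class average minus last year's). The function name and every number are hypothetical, chosen only for illustration; the scoring models actually used to evaluate teachers are more elaborate.

```python
# Toy value-added score: this year's class average minus last year's.
# All names and scores are hypothetical, for illustration only.

def value_added(prior_avg, current_avg):
    """Gain credited to the current teacher, in test-score points."""
    return current_avg - prior_avg

true_ability_last_year = 60   # what the class actually knew a year ago
inflation_from_cheating = 15  # points added by the previous teacher's cheating
true_ability_this_year = 70   # what the class knows after an honest, capable year

# The score on record from last year includes the inflation.
reported_prior = true_ability_last_year + inflation_from_cheating

# The gain the honest teacher really produced.
actual_gain = value_added(true_ability_last_year, true_ability_this_year)
# The gain the system records, measured against the inflated baseline.
recorded_gain = value_added(reported_prior, true_ability_this_year)

print("actual gain:", actual_gain)      # 10
print("recorded gain:", recorded_gain)  # -5
```

In this made-up case, a teacher who raised real achievement by 10 points is recorded as having lowered scores by 5.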
Even if there is no cheating, inevitably the “best” teachers are those who waste no effort on anything but teaching to the test. Whatever is neglected by the testing process, even if important for learning and development of critical thinking, becomes superfluous.
Having lived and taught in China for seven years as dean of an international college and then professor of business management, I learned a lot about the flaws in the Chinese educational system, which has been “teaching to the test” for decades longer than the idea has been popular in the U.S.
Cheating and plagiarism are utterly rampant. Many Chinese teachers overlook cheating because they know if their students do poorly, they will be considered inferior teachers. I found that nearly all Chinese professors were useless as exam proctors. They would ignore the students and even leave the room during tests to take phone calls, leaving students to jump up and cluster around the top students to share answers.
Chinese students even find highly creative ways to cheat on standardized tests administered by American testing companies, such as the SAT, GRE, GMAT and TOEFL. Testing scandals have, in some cases, led testing companies to void test results for the entire country and force all students to retake the test. Some of the methods of cheating have included switching identities to let an expert take the test for you, using concealed phones or microphones to receive test answers, and getting test answers from people taking the same test earlier in a different time zone. Many Chinese students do not cheat, but so many do that I, personally, would not take test results too seriously.
The Chinese national college entrance exam is notorious for cheating. Unlike in the U.S. admissions process, in China the test score is the sole criterion for university entrance and for admission to specific majors. A student’s entire future may rest on this one test result.
I found that despite widespread cheating, Chinese parents often see testing as the most credible way to select students. This is because they do not trust any selection system that involves subjective choice by authorities, since they assume such methods will be too easily corrupted. For example, while I was dean of an international college, one of my tasks was to determine which students should be selected to spend their senior year at our New York campuses. Together with the other American faculty, I decided the best method (in addition to GPA) was for three American professors to meet with each student for 10-15 minutes and have a discussion in English to gauge the student’s language ability. A number of Chinese parents protested. They wanted a written test with a fixed score to determine who would pass. To them, any room for subjective judgment smacked of potential favoritism. Although surprised by that attitude, we stuck to our method.
It might be hard to understand how testing can be considered fair within a system with widespread cheating, but I think it is a case of “the devil you know” being preferable to an opaque system in which the decision criteria are not as clear-cut.
O’Neil notes many instances of pernicious weapons of math destruction, not just in the educational arena but also in a host of other fields. Criminal justice, credit scoring, internet advertising, evaluating potential terrorist threats, job performance evaluation, evaluating job applicants, and, one of my favorite examples, financial investment models are all subject to the potential harm of weapons of math destruction. In all of these areas, the models are often too poorly understood by their victims for them to fight back. O’Neil describes how many weapons of math destruction tend to discriminate against the poor while making the rich richer. Many are also unjustly arbitrary. Her objective is to bring them to light so that bad algorithms can be challenged and revised.
Given O’Neil’s time working in a leading hedge fund, it is understandable that her examples of the failures of weapons of math destruction in finance are among the most vivid. These weapons of math destruction are crucial to the behavior of financial markets, since a large proportion of trades, now often the majority, are executed automatically by mathematical algorithms. This is known as programmed trading.
The first major stock market crash caused by programmed trading occurred on Black Monday in 1987, when the Dow dropped more than 500 points in a single day. This was largely because a number of similar trading programs were all trying to sell the same stocks at the same time. With so many sellers and few buyers, prices plummeted.
Since then, programmed trading has proliferated enormously. Most of the programs used are proprietary, and therefore secret. The net effect of their interaction in unusual circumstances is impossible to predict, but could be drastic. They execute trades so rapidly, and with such enormous volume, that computer-driven trading can cause massive price swings faster than human traders can react. I will consider the key importance of time in financial trading in next week’s blog.
Originally posted on World Policy Institute blog September 16, 2016 – Weapons of Math Destruction.