Phillips applied the “Page 99 Test” to his new book, Scouting and Scoring: How We Know What We Know about Baseball, and reported the following:
When I began writing Scouting and Scoring, I envisioned the book as a challenge to the “great man” accounts of the rise of data analytics. That is, the stories of a handful of mathematical savants, armed with data, replacing experts in fields from flipping burgers to conducting surgery. Instead I wanted to show the immense work, often from dozens of unnamed or unknown individuals, that was required for data to exist at all. Data don’t simply appear, waiting to be analyzed. They must be carefully created, collected, curated, and cleaned before they’re treated as reliable and useful.
Visit Christopher J. Phillips's website.
In baseball, the main figure in the history of analytics was Bill James, and on page 99, I address why focusing exclusively on him or other analysts is misleading.
Though James was the face of this effort, and most accounts of the rise of baseball analytics…make him its protagonist, doing so obscures an important link. One reason an increasing number of people were asking questions about the level of detail in baseball data after midcentury was the success of the various mathematical sciences of modeling and prediction. In the development of these sciences James was the exception, the graduate school dropout and factory night watchman who became the father of baseball analytics. Nearly everyone else involved in this effort had scientific or technical training.
The new methods of data analysis did not spring fully formed from anyone’s head. They required fundamental shifts in methods of collecting and organizing data. These shifts were driven by those with technical training, particularly people in the emerging field of computer science. By taking a wider-lens view, we can focus on the dozens of people who took their technical training across many fields—accounting, computing, biology, chemistry, statistics—and applied them to baseball data.
On its surface, the rise of data analytics in baseball and beyond seems like a story of people being replaced by data and algorithms, but that’s a self-serving narrative promoted by the data crunchers themselves. Human labor and expertise are essential to making numbers stable and credible. Expertise wasn’t replaced by data; it was a precursor for knowing what should count as data and what questions could be answered by data. By focusing less on the wizardry of analytics and more on the mundane skills and effort required for data to exist at all, I wanted to reorient how we think about the rise of data science in the late modern world.
--Marshal Zeringue