The Sabermetric Revolution
Assessing the Growth of Analytics in Baseball
Benjamin Baumer and Andrew Zimbalist
Feb 2014 | 240 pages | Cloth $26.50
Business | Statistics
Table of Contents
Chapter 1. Revisiting Moneyball
Chapter 2. The Growth and Application of Baseball Analytics Today
Chapter 3. Overview of Sabermetric Thought I: Offense
Chapter 4. Overview of Sabermetric Thought II: Defense, WAR, and Strategy
Chapter 5. The Moneyball Diaspora
Chapter 6. Analytics and the Business of Baseball
Chapter 7. Estimating the Impact of Sabermetrics
The Expected Run Matrix
Modeling the Effectiveness of Sabermetric Statistics
Modeling the Shifting Inefficiencies in MLB Labor Markets
Excerpt [uncorrected, not for citation]
Michael Lewis wrote Moneyball because he fell in love with a story. The story is about how intelligent innovation (the creative use of statistical analysis) in the face of market inefficiency (the failure of all other teams to use available information productively) can overcome the unfairness of baseball economics (rich teams can buy all the best players) to enable a poor team to slay the giants. Lewis is an engaging storyteller and, along the way, introduces us to intriguing characters who carry forward the rags-to-riches plot. By the end, the story of the 2002 Oakland A's and their general manager, Billy Beane, is so well told that we believe its portrayal of baseball history, economics, and competitive success. The result is a new Horatio Alger tale that reinforces a beloved American myth and, all the better, applies to our national pastime.
The appeal of Lewis's Moneyball was sufficiently strong that Hollywood wanted a piece of the action. With a compelling script, smart direction, and the handsome Brad Pitt as Beane, Moneyball became part of mass culture and its perceived validity—and its legend—only grew.
This book will attempt to set the record straight on Moneyball and the role of "analytics" in baseball. Whether one believes Lewis's account or not, it had a significant impact on baseball management. Following the book's publication in 2003, team after team began to create their own statistical analytics or sabermetric subdepartments within baseball operations. Today, over three-quarters of major league teams have individuals dedicated to performing these functions. Many teams have multiple staffers creatively parsing numbers.
In a world where the average baseball team payroll exceeds $100 million and the average team generates $250 million in revenue each year, the hiring of one, two, or three sabermetricians, at salaries ranging from $30,000 to $125,000, can practically be an afterthought. (Sabermetrician is Bill James's term for an individual who does statistical analysis of baseball performance; the name comes from the acronym of the Society for American Baseball Research, SABR.) In particular, once the expectation of prospective insight and gain is in place and other teams join the movement, a team that does not hire a sabermetrician could be accused of malpractice. In baseball, much like the rest of the world, executives and managers are subject to loss aversion. Many of their actions are motivated not by which decision or investment offers the highest potential return, but by which decision will best insulate them from criticism for neglecting to follow the conventional wisdom. So, to some degree, the sabermetric wildfire in baseball is a product of group behavior or conformism.
Meanwhile, the proliferation of data on baseball performance and its extensive accessibility, as well as the emergence of myriad statistical services and practitioner websites, have imbued sabermetrics with the quality of a fad. The fact that it is a fad, much like rotisserie baseball leagues, fantasy football leagues, and video games, does not mean that it doesn't contain some underlying validity and value. One of our tasks in this book will be to decipher what parts of baseball analytics are faddish and what parts are meritorious.
Some of the new metrics, such as the one that purports to assess fielding ability accurately (UZR), are black boxes: their authors hold the methods to be proprietary and will not reveal how the metrics are calculated. The problem is that this makes a metric's value much more difficult to evaluate. Of course, fads, like myths, are more easily perpetuated when it is not possible to shed light on their inner workings.
Here are some questions that need to be answered. What is the state of knowledge and insight that emanates from sabermetric research? How has it influenced the competitive success of teams? Does the incorporation of sabermetric insight into player evaluation and on-the-field strategy help to overcome the financial disadvantage of small market teams and, thereby, promote competitive balance in the game? Lewis's account in Moneyball exudes optimism on all counts.
Beyond the rags-to-riches theme, Lewis's story echoes another well-worn refrain in modern culture—the perception that quantification is scientific. Given that our world is increasingly dominated by the TV, the computer, the tablet, and the smartphone—all forms of electronic communication and dependent on binary signaling—it is perhaps understandable that society genuflects before numbers and statistics. Yet the fetish of quantification well predates modern electronic communications.
Consider, for instance, the school of industrial management that was spawned by Frederick Winslow Taylor over a hundred years ago. Taylor argued that it was possible to improve worker productivity through a process that scientifically evaluated each job. This evaluation entailed, among other components, the measurement of each worker's physical movements in the production process and use of a stopwatch to assess the optimal length of time it should take to perform each movement. On this basis, an optimal output expectation could be set for each worker and the worker's pay could be linked, via a piece rate system, to the worker's output. The Taylorist system was known as "scientific management" and was promulgated widely during the first decades of the twentieth century. The purported benefits of scientific management, however, proved to be spurious and the school was supplanted by another—one that emphasized the human relations of production. Thus, obsession with quantification at the expense of human relations met with failure.
Baseball, much more than other team sports, lends itself to measurement. The game unfolds in a restricted number of discrete plays and outcomes. When an inning begins, there are no outs and no one is on base. After one batter, there is either one out, or no outs and a runner on first, second, or third base, or no outs and a run has scored. In fact, at any point in time during a game, there are twenty-four possible discrete situations. There are eight possible combinations of base runners: (1) no one on base; (2) a runner on first; (3) a runner on second; (4) a runner on third; (5) runners on first and second; (6) runners on first and third; (7) runners on second and third; (8) runners on first, second, and third. For each of these combinations of base runners, there can be zero, one, or two outs. Eight runner alignments and three different out situations make twenty-four discrete situations. (It is on this grid of possible situations that the run expectancy matrix, to be discussed in later chapters, is based.)
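The count above can be checked with a short enumeration. The following Python sketch is only an illustration of the arithmetic (eight runner alignments times three out counts); the names BASES, runner_alignments, and base_out_states are hypothetical and do not come from the book.

```python
from itertools import product

# Illustrative names only; not taken from the book.
BASES = ("first", "second", "third")


def runner_alignments():
    """Return the 8 base-runner combinations as tuples of occupied bases."""
    alignments = []
    # Each of the three bases is independently empty (False) or occupied (True).
    for occupied in product((False, True), repeat=len(BASES)):
        alignments.append(tuple(b for b, on in zip(BASES, occupied) if on))
    return alignments


def base_out_states():
    """Return every (outs, runners) pair: 3 out counts x 8 alignments."""
    return [(outs, runners)
            for outs in range(3)  # zero, one, or two outs
            for runners in runner_alignments()]


if __name__ == "__main__":
    print(len(runner_alignments()))  # 8
    print(len(base_out_states()))    # 24
```

Each (outs, runners) pair corresponds to one cell of the grid on which the run expectancy matrix is based; the sketch only lists the cells and assigns no run values to them.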
Compare that to basketball. There is a virtually infinite number of positions on the floor where the five offensive players can be standing (or moving across). Any of the five players can be handling the ball.
Or, compare it to football. Each team has four downs to go ten yards. The offensive series can begin at any yard line (or half- or quarter-yard line) on the field. The eleven offensive players can align themselves in a myriad of possible formations; likewise the defense. After one play, it can be second and ten yards to go, or second and nine and a half, or second and three, or second and twelve, and so on.
Moreover, baseball performance is much less interdependent than it is in other team sports. A batter gets a hit, or a pitcher records a strikeout, largely on his own. He does not need a teammate to throw a precise pass or make a decisive block. If a batter in baseball gets on base 40 percent of the time and hits 30 home runs, he is going to be one of the leading batters in the game. If a quarterback completes 55 percent of his passes, though, to assess his prowess we also need to know something about his offensive line and his receivers.
So, while the measurement of a player's performance is possible in all sports, its potential for more complete and accurate description is greater in baseball. It is, therefore, not surprising that since its early days, baseball has produced a quantitative record. Although one might not know it from either the book or the movie Moneyball, the keeping of complex records and the analytical processing of these records reaches back at least several decades prior to the machinations of Billy Beane and the Oakland A's at the beginning of the twenty-first century.
Our book proceeds as follows. To clarify some matters of artistic license presented as fact, Chapter 1 discusses the book and the movie Moneyball, what they get right, what they get wrong and various sins of omission. Chapter 2 traces the growing presence of statistical analysis in baseball front offices. Chapters 3 and 4 introduce and survey the current state of sabermetric knowledge for offense and defense, respectively. Chapter 5 sketches the Moneyball diaspora, that is, the growing application of statistical analysis to understand performance and strategy in other sports, principally basketball and football. Chapter 6 illustrates the use of statistical analysis to penetrate the business of baseball, particularly its effects on competitive balance. Chapter 7 assesses sabermetrics' success, or lack thereof, in improving team performance.
Finally, it is useful to clarify some vocabulary before proceeding. Sabermetrics means the use of statistical methods to analyze player performance and game strategy. Baseball analytics also means the use of statistical methods to assess player performance and game strategy, but it further involves the use of statistical methods to evaluate team and league business decisions. The term analytics as applied to sports has also come to include the interpretation of digital video images, often with associated quantity metrics. We use moneyball (with the lowercase m) to mean the application of sabermetrics with the goal of identifying player skills and players that the market undervalues.