The growing diffusion of analytics in baseball. Team Strategies, Personalized Training, Data Journalism. How one of the most popular sports in the US is changing.
“Baseball is 90% analytics, the other half is physical” (Yogi Berra)
This is perhaps how the former baseball champion Yogi Berra would put it today, swapping “mental” for “analytics” in his famous quote. In recent years, just as in football, baseball has experienced an explosion of algorithms and big data.
The data analytics paradigm is becoming pervasive in baseball. It shapes clubs’ strategies and, consequently, the manager’s choices about team setup and individual instructions to players. For instance, the measured data include “exit speed” (the speed of the ball as it comes off the bat), the frequency of home runs hit by a batter, and “spin rate” (how fast the pitched ball is spinning). These parameters are very difficult to analyze accurately without dedicated tools. Data scientists apply complex algorithms to extract insights that help decide how to position players on the field, what kind of movements they should make and what training they need.
It is interesting to note that these metrics often do not agree with the statistical benchmarks traditionally used in player evaluation. One important example is OPS (on-base plus slugging), the sum of a hitter’s on-base percentage and slugging percentage, which summarizes overall offensive effectiveness. Major League Baseball has been using OPS to rate players (see below) for years, but cases like that of Curtis Granderson have started to shake faith in such legacy measures.
Category | Classification | OPS Range |
A | Great | .9000 and Higher |
B | Very Good | .8334 to .8999 |
C | Above Average | .7667 to .8333 |
D | Average | .7000 to .7666 |
E | Below Average | .6334 to .6999 |
F | Poor | .5667 to .6333 |
G | Very Poor | .5666 and Lower |
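To make the scale concrete, here is a minimal Python sketch of how OPS could be computed from a batting line and mapped onto the categories above. It uses the standard definitions (on-base percentage plus slugging percentage). The `BattingLine` class and the function names are hypothetical, introduced only for illustration, and the sketch assumes that a value of exactly .8333 falls in category C.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for the counting stats needed to compute OPS."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def on_base_percentage(b: BattingLine) -> float:
    # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    denominator = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    if denominator == 0:
        return 0.0
    return (b.hits + b.walks + b.hit_by_pitch) / denominator


def slugging_percentage(b: BattingLine) -> float:
    # SLG = total bases / AB, where a single counts 1, a double 2, and so on
    if b.at_bats == 0:
        return 0.0
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats


def ops(b: BattingLine) -> float:
    return on_base_percentage(b) + slugging_percentage(b)


# Lower bound of each category, taken from the table above.
# Assumption: .8333 belongs to category C, so B starts at .8334.
OPS_FLOORS = [
    (0.9000, "A", "Great"),
    (0.8334, "B", "Very Good"),
    (0.7667, "C", "Above Average"),
    (0.7000, "D", "Average"),
    (0.6334, "E", "Below Average"),
    (0.5667, "F", "Poor"),
]


def classify_ops(value: float) -> tuple[str, str]:
    # Round to four decimals, the precision used in the table
    rounded = round(value, 4)
    for floor, category, label in OPS_FLOORS:
        if rounded >= floor:
            return category, label
    return "G", "Very Poor"
```

Calling `classify_ops(ops(line))` on a `BattingLine` returns the category letter and its label. A sketch like this also shows why the scale can mislead over a short stretch: with only a month of at-bats, a handful of balls that happen to be caught instead of dropping for hits can move a player down a full category.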
Although Granderson’s performance looked poor when evaluated with OPS after the first month, his club still decided to bet on him. In fact, data analytics showed that Granderson was playing quite well and that his mediocre numbers were merely a matter of bad luck. The results of the following months showed that the club was on the right track.
Statcast System
In Major League Baseball, the Statcast platform is the reference for anyone looking for player performance statistics. The heart of Statcast is its player tracking system, created by MLB Advanced Media in collaboration with Amazon Web Services. Statcast aggregates an impressive amount of data and makes it available online, where it can be viewed through tables and charts. In practice, it is possible to study in real time the position of the players and the way it affects the dynamics of the game. Instant replay has radically transformed the television viewing experience of many sports, but Statcast takes a further step forward. Baseball enthusiasts can immediately examine each player’s stats (for example, a shortstop’s reaction time and the direction in which he moved) and compare them with the available historical data.
These technologies are also used in sports other than baseball. This is the case with NBA player tracking and the tracking systems used by major NFL teams, as well as by large soccer clubs in Europe. Worldwide, several important suppliers of match analysis solutions have emerged. Catapult Sports (with its subsidiary GPSports) already has several clients among football, baseball and rugby clubs. Furthermore, GPSports produces one of the lightest wearable GPS devices in the world, able to record speed, changes of direction, heart rate and other parameters. STATSports is another important player in this market. It supplies the Viper Pod device to football teams such as Liverpool, Manchester United and Arsenal, but also to NBA teams such as the Washington Wizards and the Chicago Bulls. Another interesting solution is Adidas miCoach Elite, whose distinctive feature is the ability to aggregate data from all of a team’s players, making a remarkable contribution to the development of game strategies.
Sports journalism changes too
The strong push of analytics into the world of sports also has a direct impact on sports journalism. A new type of computer-assisted reporting (CAR) is emerging, based on advanced search techniques, the use of resources and data available on the Internet, data grabbing and data scraping. The main difference between this new scenario and the past is clearly linked to big data: available data are no longer scarce, closed and limited to a few sources. In this sense, the emergence of new information brokers is especially interesting. This is the case for Sportradar, the world’s leading provider of data and statistics for sports applications. Sportradar works in partnership with the world’s main sports federations and acts as a hub for thousands of journalists, thanks to a model based on a digital platform that can be integrated with all media through APIs and widgets.
In short, the growing weight of analytics is deeply changing the sporting experience for those who practice sport, those who follow it and those who report on it. It is a scenario we want to look into. This is why we decided to support the first Italian football hackathon, which took place on October 14th and 15th in Trento. The initiative was promoted by the Federazione Italiana Giuoco Calcio (FIGC) and the University of Trento.
For two days, digital innovators, developers and designers met to study new ways of using big data to analyze performance in football matches. A further stream of work focused instead on enhancing the relationship between FIGC and its members. It was a great challenge for anyone who is interested in technological innovation and loves sport.