Week 3: Basic Bayesian Inference: Evaluating Hitter Performance

Ian M -

This week I continued my exploration of Bayesian statistics in baseball, focusing on evaluating hitter performance and, in particular, on how batters perceive pitching and adjust to information while hitting. The first part of my work focused on posterior probability, specifically as it relates to individual batters and their responses to the pitching they observe.

In Bayesian statistics, a posterior probability is our updated belief about something, here a batter’s ability, after new data has been taken into account. Rather than looking only at a batter’s batting average, Bayesian methods let us account for how the batter perceives and reacts to the information they gather while facing pitchers. Every time a batter steps up to the plate, they gather information, such as pitch types, pitch locations, and pitcher tendencies, which they use to adjust their hitting strategy. By analyzing these adjustments, we can model the batter’s response to different types of pitches over time.

This process works by starting from prior knowledge of the batter’s skill level (perhaps their performance in previous seasons or against similar pitchers) and updating that belief with each new observation. For example, if a batter starts the season struggling against fastballs but adjusts their swing mechanics after a few games, Bayesian inference helps us update our understanding of their hitting ability. We can calculate the posterior probability of the batter’s true ability to hit fastballs, accounting for both their previous performance and the new data from the most recent games. Over time, this gives us a more accurate picture of how a batter is adjusting and whether they are improving or regressing against different types of pitchers.
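To make the updating step concrete, here is a minimal sketch using a Beta prior on a batter’s hit probability against fastballs. This is only one common way to set up the update, not necessarily the model used in the sources I have been reading. The class name `BetaBelief` and every number below (the prior and the hit counts) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief about a batter's hit probability (hypothetical helper)."""
    alpha: float
    beta: float

    def update(self, hits: int, at_bats: int) -> "BetaBelief":
        # Conjugate update: hits are added to alpha, outs are added to beta.
        return BetaBelief(self.alpha + hits, self.beta + (at_bats - hits))

    @property
    def mean(self) -> float:
        # Posterior mean of the hit probability.
        return self.alpha / (self.alpha + self.beta)


# Assumed prior: about a .270 hitter against fastballs, weighted like 100 at-bats.
prior = BetaBelief(alpha=27.0, beta=73.0)

# Made-up data: a slow start (3 for 20), then a better stretch after an adjustment (9 for 20).
after_slow_start = prior.update(hits=3, at_bats=20)
after_adjustment = after_slow_start.update(hits=9, at_bats=20)

for label, belief in [("prior", prior),
                      ("after slow start", after_slow_start),
                      ("after adjustment", after_adjustment)]:
    print(f"{label}: estimated ability = {belief.mean:.3f}")
```

The estimate drops after the slow start and recovers after the adjustment, but each move is smaller than the raw averages over those 20 at-bats would suggest, because the prior still carries weight.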

Part A describes the process by which a batter interprets the information provided by how the pitcher delivers the ball to determine if and where to swing for the best results.
Part B is a breakdown of each aspect of a pitcher’s delivery and the pitch itself and their influence on how batters perform against those pitches.
Part C shows how batters estimate ball position by combining their prior belief with what they observe to form a posterior distribution.
Part D shows that relying on the posterior distribution, rather than on the prior or the observed data alone, gives the best chance of successful batting.
Image credit to Bayesball: Bayesian Integration in Professional Baseball Batters by Justin A. Brantley and Konrad P. Körding.
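
The idea in Parts C and D can be sketched with the simplest possible case, where both the prior belief and the observation are treated as normal distributions over where the ball will cross the plate. This is a textbook simplification that I am assuming for illustration, not the exact model from the Bayesball paper. The function name `combine_gaussian` and the numbers are hypothetical.

```python
def combine_gaussian(prior_mean: float, prior_sd: float,
                     obs_mean: float, obs_sd: float) -> tuple[float, float]:
    """Combine a normal prior and a normal observation into a normal posterior.

    Returns (posterior mean, posterior standard deviation).
    """
    # Precision is 1 / variance; the more certain source gets more weight.
    prior_precision = 1.0 / prior_sd ** 2
    obs_precision = 1.0 / obs_sd ** 2

    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * obs_mean)
    return post_mean, post_var ** 0.5


# Assumed values, in meters from the center of the strike zone:
# the batter expects the pitch near the middle (prior) but sees cues pointing higher (observation).
post_mean, post_sd = combine_gaussian(prior_mean=0.0, prior_sd=0.2,
                                      obs_mean=0.3, obs_sd=0.1)
print(f"posterior estimate: {post_mean:.2f} m, uncertainty: {post_sd:.3f} m")
```

The posterior estimate lands between the prior and the observation, closer to whichever is more certain, and its uncertainty is smaller than that of either source alone. That is the intuition behind why the posterior should do better than the prior or the observation on its own.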

The second part of my work this week centered on the streakiness of hitters and their odds of slumping. Baseball is known for its streaky nature, with players sometimes experiencing hot streaks followed by cold spells. Traditionally, these streaks are just seen as random fluctuations in a player’s performance. However, Bayesian methods allow us to directly quantify the likelihood of a slump occurring based on a player’s previous performance and the variability inherent in the game.

By using Bayesian models to analyze the fluctuations in a batter’s performance, we can estimate the probability of a hitter entering a slump, given their recent history. For instance, if a batter has been consistently performing above their career average, Bayesian inference lets us estimate how likely they are to regress toward the mean. Conversely, if a player is struggling, Bayesian models can help us calculate the likelihood of them recovering and returning to their usual performance levels. The power of this approach is that it gives a more nuanced understanding of these streaks, modeling not only the player’s talent but also the uncertainty and variability that are inherent in baseball.

Using a basic coin-flipping model, in which each at-bat is treated as an independent trial with a fixed hit probability, we can predict the lengths of slumps (O-fers) from a hitter’s expected batting average. The model also accepts observed data, which lets us compute tail probabilities for slumps of a given length. Credit to Jim Albert, Introduction to ShinyBaseball Package, Version 0.5.3.
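
As a rough illustration of the coin-flipping idea, the sketch below treats every at-bat as an independent trial with the same hit probability. It is my own minimal Python version, not the ShinyBaseball implementation, and the function names and example inputs are assumptions.

```python
def hitless_run_tail(k: int, p: float) -> float:
    """P(the next k at-bats are all hitless) for a hitter with hit probability p."""
    return (1.0 - p) ** k


def prob_ofer_at_least(k: int, n: int, p: float) -> float:
    """P(at least one run of k or more consecutive hitless at-bats within n at-bats).

    Assumes k >= 1 and independent at-bats with constant hit probability p.
    """
    q = 1.0 - p
    # state[j] = probability that no run of length k has happened yet
    # and the hitter currently has j hitless at-bats in a row (j < k).
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(n):
        new_state = [0.0] * k
        new_state[0] = p * sum(state)        # a hit resets the hitless run
        for j in range(k - 1):
            new_state[j + 1] = q * state[j]  # an out extends the run by one
        # An out from state[k - 1] completes a run of k, so that mass is dropped.
        state = new_state
    return 1.0 - sum(state)


# Assumed example: a .270 hitter over 500 at-bats.
p = 0.270
for k in (10, 15, 20):
    print(f"P(next {k} at-bats hitless) = {hitless_run_tail(k, p):.4f}")
    print(f"P(some O-fer of {k}+ in 500 at-bats) = {prob_ofer_at_least(k, 500, p):.4f}")
```

The first function gives the chance that a specific stretch is hitless, while the second gives the chance that a slump of that length shows up somewhere in a season-sized sample, which is much larger. Comparing an observed slump against the second number is one way to judge whether it is surprising under pure chance.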

Ultimately, this week’s exploration of Bayesian statistics has helped me develop a deeper understanding of how hitters adjust to pitching and how their performance can fluctuate over time. By modeling how batters update their beliefs based on pitch selection and how likely they are to experience slumps, we can provide more sophisticated predictions of their future performance. It has also reinforced the value of Bayesian methods in capturing the complexities of baseball and shown how they let us view player performance through a probabilistic lens, improving our ability to make informed assessments.

Comments:

    nakyung_y
    Hey Ian! Based on your photos, there really is more statistics involved in baseball than I thought. Are you finding any interesting patterns when comparing how batters adjust against familiar versus unfamiliar pitchers?
    emma_k
    Hi Ian, it's so cool to see how Bayesian statistics can capture such a wide range of factors that seem unquantifiable at first glance! I never expected that players' slumps and streaks could actually be predicted with data!
    avaya_a
    The idea of using Bayesian models to predict slumps is super interesting! It's super cool that even future performance can be predicted by this modeling.
    ian_m
    Thanks for the question, Nakyung. One trend that often appears in batter-versus-pitcher data is that most batters have some pitchers they perform very well against and others they struggle against. However, because the sample sizes in these matchups tend to be very small, this can simply be caused by randomness rather than an actual advantage or disadvantage, so it is hard to draw strong conclusions.
    ian_m
    Thanks for the comment, Emma. The ShinyBaseball R package includes several tools that can estimate a player's chances of streaks or slumps in different ways. These range from a simple coin-flipping method to a more advanced Markov switching model, in which each prior outcome influences the next outcome's parameters, that I hope to use more later.
    ian_m
    Thanks for the comment, Avaya. I also found it quite exciting to finally be able to create simple Bayesian predictions about player performance, and, next week, I actually intend to focus even more on predicting future performance for hitters.