Bat Speed = Contact?

A question popped into my mind recently: how does bat speed affect contact rate? We know that higher bat speed leads to better quality of contact, but the question here is about making contact in the first place. Intuitively, we’d think that more bat speed gives a hitter a little more time to make a swing decision and the ability to catch up to higher velocities, which then leads to more contact. On the other hand, maybe the gains are not as large as we think and the actual hitting ability of the batter matters more. Let’s dig into the data!

Here are the top hitters by average bat speed:

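| Player | Average Bat Speed (MPH) | Whiff % | Batting Average | Barrel % |
| --- | --- | --- | --- | --- |
| Oneil Cruz | 78.65 | 34.3% | 0.209 | 20.3% |
| Junior Caminero | 78.65 | 24.5% | 0.253 | 12% |
| Jo Adell | 77.62 | 28% | 0.229 | 16.7% |
| Nick Kurtz | 77.57 | 33.4% | 0.304 | 19.3% |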

What stands out is that the barrel percentages are pretty elite, as we’d expect, but the whiff rates are really bad and the batting averages are nothing to write home about (except Kurtz, who had a phenomenal year). What gives? Maybe this is small sample theater (i.e., a small sample does not tell the whole story), so let’s look at 215 qualified players.

| Variable | Correlation with Bat Speed |
| --- | --- |
| Whiffs per swing | 0.618 |
| Squared up per bat contact | -0.471 |
| Competitive Swing % | 0.411 |

None of these correlations is especially strong; they are moderate at best. Interestingly, as bat speed increases, whiffs per swing and competitive swing % also increase (positive trend), while squared up per bat contact decreases (negative trend).

Let’s look at the regression!

We get the following equation

Bat Speed ~ -140.6 + 22.5*Whiff_Per_Swing - 9.94*squared_up_per_bat_contact + 233.3*percent_swings_competitive

We get an R-squared of 0.497, indicating that only 49.7% of the variability in bat speed is explained by the model. All variables are significant, and we see the same trends as the correlations showed.
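To make the fitted equation concrete, here is a minimal R sketch that plugs one hitter into it. The coefficients are copied from the equation above; the hitter is hypothetical, and treating all three inputs as proportions between 0 and 1 is my assumption about how the variables were scaled.

```r
# Coefficients copied from the fitted equation in this post.
# Assumption: all three inputs are proportions between 0 and 1.
predict_bat_speed <- function(whiff_per_swing,
                              squared_up_per_contact,
                              competitive_swing_share) {
  -140.6 +
    22.5 * whiff_per_swing -
    9.94 * squared_up_per_contact +
    233.3 * competitive_swing_share
}

# Hypothetical hitter (illustrative inputs, not a real player):
# 25% whiffs per swing, 33% squared up per contact, 90% competitive swings.
example_bat_speed <- predict_bat_speed(0.25, 0.33, 0.90)
print(round(example_bat_speed, 1))  # about 71.7 MPH
```

Because the competitive swing coefficient is so large, small changes in that input move the prediction far more than the other two, which is worth keeping in mind when reading the signs of the whiff and squared-up terms.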

Well, it appears that there is more than meets the eye (literally) when it comes to bat speed and contact. Other variables, such as pitch selection, probably contribute to contact as well. So bat speed is a great indicator of quality of contact, but it is a moot point if a player is not making contact.

Note: As a professor who teaches data analytics, I decided to do this analysis in Excel since my students have to use it for the course. Although it is not great for large-scale work, you can explore data in Excel to get a start.

Dedication

This blog post is dedicated to my cousin Mike, who passed suddenly a couple months ago. He was the best food blogger in Orange County, and inspired me to do a blog. https://eatingmywaythroughoc.blogspot.com

Data

BaseballSavant

Who is Jack Kochanowicz?

Jack Kochanowicz burst onto the scene at the end of 2024 for the Angels, posting a 3.99 ERA, 104 ERA+, 1.19 WHIP, and 3.8% BB%. He came up as a high-groundball pitcher with the occasional strikeout. In 2025 he sports a 5.53 ERA, 74 ERA+, 1.59 WHIP, and 10.4% BB%, all well worse than what he did in 2024. On the surface we can see that his BB% went sky high, but what happened, and who is the real Jack Kochanowicz? Is the league catching up, or is he doing something different?

First, we will look at the changes from BaseballSavant year over year.

FB velo, average exit velocity, chase rate, and hard-hit % were about the same year over year. However, whiff % (+5 points), K% (+6), BB% (+7), and barrel % (+4) went up while GB% went down (-6). So on the surface it appears there is more of a focus on swings and misses than on groundballs, along with more walks and more barrels. Interestingly, his xERA and xBA were around the same. Expected metrics are generally a better read because they are built on quality of contact rather than on where the batted balls happened to land, so it appears that maybe the 2025 version is more of who he is and he was lucky in 2024.

Let’s look deeper at the data.

First let’s look at pitch usage.

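| Pitch | 2024 | 2025 | Change |
| --- | --- | --- | --- |
| 4-Seam Fastball | 6.1% | 18.4% | 12.3% |
| Changeup | 2.9% | 10.9% | 8.0% |
| Sinker | 72.6% | 47.8% | -24.8% |
| Slider | 14.1% | 16.3% | 2.2% |
| Sweeper | 4.3% | 6.5% | 2.2% |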

Kochanowicz has gone away from the sinker, with a higher reliance on the 4-seam fastball and changeup this season. His sinker has been hit harder this year, and his home run rate has skyrocketed even as he throws fewer sinkers.
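The Change column is just the 2025 share minus the 2024 share, in percentage points. A small base R sketch using the usage numbers from the table reproduces it and sorts the pitches from biggest drop to biggest gain; the column names are my own.

```r
# Usage shares (percent of all pitches) copied from the table in this post.
usage <- data.frame(
  pitch    = c("4-Seam Fastball", "Changeup", "Sinker", "Slider", "Sweeper"),
  pct_2024 = c(6.1, 2.9, 72.6, 14.1, 4.3),
  pct_2025 = c(18.4, 10.9, 47.8, 16.3, 6.5)
)

# Year-over-year change in percentage points.
usage$change <- round(usage$pct_2025 - usage$pct_2024, 1)

# Sort from the largest decrease to the largest increase.
usage <- usage[order(usage$change), ]
print(usage, row.names = FALSE)
```

The same three lines work for the first-pitch and 2-strike splits by swapping in those columns.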

This leads to the question: is he attacking hitters differently? Let’s look at the first-pitch distribution to see if the same trend shows up.

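| 1st Pitch | 2024 | 2025 | Change |
| --- | --- | --- | --- |
| 4-Seam Fastball | 3.0% | 13.5% | 10.5% |
| Changeup | 0.4% | 2.1% | 1.7% |
| Sinker | 75.2% | 53.9% | -21.3% |
| Slider | 17.1% | 25.9% | 8.8% |
| Sweeper | 4.3% | 4.6% | 0.3% |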

There is a lot more 4-seam fastball and slider usage on the first pitch and fewer sinkers. How about with 2 strikes?

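| 2 Strikes | 2024 | 2025 | Change |
| --- | --- | --- | --- |
| 4-Seam Fastball | 13.6% | 24.0% | 10.4% |
| Changeup | 8.2% | 17.6% | 9.5% |
| Sinker | 54.4% | 39.4% | -15.1% |
| Slider | 16.3% | 10.4% | -5.9% |
| Sweeper | 7.5% | 8.6% | 1.1% |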

Fewer sinkers and sliders, but more 4-seam fastballs and changeups.

It is clear that the approach throughout the entire plate appearance has drastically shifted from a lot of sinkers to a combo of fastball and changeup. In 2024 he actually threw only 25 changeups! The sinker and slider have shown differing results over the last two seasons.

Sinker

| Year | wOBA | xwOBA | Difference |
| --- | --- | --- | --- |
| 2025 | 0.360 | 0.349 | -0.011 |
| 2024 | 0.330 | 0.348 | 0.018 |

Slider

| Year | wOBA | xwOBA | Difference |
| --- | --- | --- | --- |
| 2025 | 0.483 | 0.445 | -0.038 |
| 2024 | 0.310 | 0.307 | -0.003 |

For the sinker, his 2024 xwOBA was higher than his wOBA, indicating a little luck, but that has reversed in 2025, where the wOBA is the higher of the two, indicating he has been a little unlucky. Overall the xwOBA is about the same year over year, so his sinker is about the same quality as last season.

For the slider, the wOBA and xwOBA were fairly close to each other within each season, but the jump in both from 2024 to 2025 has been significant. Looking beyond this, the spin rate is similar to 2024. Exit velocity on the slider is up 2 mph, whiff % is up 3.4 points, and launch angle is down 5 degrees. What gives?
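The two comparisons above (actual vs. expected within a season, and expected vs. expected across seasons) can be laid out in a few lines of base R. This is a sketch using the wOBA and xwOBA values from the sinker and slider tables; the column names and the "luck" labels are my own shorthand.

```r
# wOBA and xwOBA allowed, copied from the sinker and slider tables in this post.
# Rows are ordered 2024 then 2025 within each pitch.
pitches <- data.frame(
  pitch = c("Sinker", "Sinker", "Slider", "Slider"),
  year  = c(2024, 2025, 2024, 2025),
  woba  = c(0.330, 0.360, 0.310, 0.483),
  xwoba = c(0.348, 0.349, 0.307, 0.445)
)

# Within-season gap: xwOBA minus wOBA. For a pitcher, a positive gap means
# the actual results were better than the quality of contact suggested.
pitches$gap <- round(pitches$xwoba - pitches$woba, 3)
pitches$read <- ifelse(pitches$gap > 0, "a bit lucky", "a bit unlucky")

# Year-over-year change in xwOBA per pitch (2025 minus 2024).
xwoba_change <- tapply(pitches$xwoba, pitches$pitch,
                       function(x) round(x[2] - x[1], 3))

print(pitches, row.names = FALSE)
print(xwoba_change)
```

Seen this way, the sinker barely moves in expected terms while the slider's xwOBA rises by more than a hundred points, which is why the slider gets the closer look.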

Let’s look at location

From an initial look it appears more sliders are in the zone and middle-middle in 2025, but there are also more sliders thrown in 2025. Time to isolate just the swings.

Looks like a bit more in the zone and closer to middle-middle.

The Angels broadcast talked about the sinker being up this year, which contributes to the greater use of the changeup. Here is the plot of swings on the sinker.

To me, it looks like a similar distribution.

So who is Jack Kochanowicz? I think he is somewhere in the middle of 2024 and 2025 to date. Cheap answer, I know, but he drastically changed his approach to hitters while his expected metrics stayed about the same, which shows similar quality of contact from the hitters. Also, there is only a year and a half-ish of data on him. What is contributing to his higher ERA is his high walk rate. I’m definitely not a pitching coach, but based on the data he needs better command, more sinker usage, and either a fix to the slider location or less slider usage. He is a groundball pitcher who should use the Angels infield defense to his advantage and not worry as much about the strikeout.

It is fascinating to me how players can be so good for a time and then the numbers flip. I have always been interested in what they are doing differently and how they can get back to who they were. More analysis like this in the future!

Data

BaseballSavant

Putting a bow on the impact of the shift

I wrote over a year ago about the possible impact of banning the shift (Impact of Banning the Shift) and I revisited it for the 2024 season in May (Revisiting the impact of the shift). Now that the 2024 season has ended, let’s look at all the data. Again, we will look at just pulled baseballs on the ground (which the shift was used for) for the 2024 season.

| Batting Average (2024) | Batting Average (2022) | Difference |
| --- | --- | --- |
| 0.207 | 0.222 | -0.015 |

Interestingly, in May we found the batting average was 0.281. Granted, that was only a month and a half into the season, and it seems like the pitching caught up. In my original post about the shift, the estimated impact of the shift was -0.031. We see here with real data that the impact was actually smaller than estimated. The goal of banning it was more action, and it looks like more hits were not induced at the clip the league may have expected.

Let’s look at the hitters that were identified as being the most negatively impacted by the shift in the first post (if they are still playing in MLB)

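| Name | Batting Average (2024) | Batting Average (2022) | Difference |
| --- | --- | --- | --- |
| Joey Gallo | 0.080 | 0.071 | +0.009 |
| Byron Buxton | 0.375 | 0.146 | +0.229 |
| Eddie Rosario | 0.122 | 0.081 | +0.041 |
| Jose Ramirez | 0.165 | 0.340 | -0.175 |
| Joc Pederson | 0.276 | 0.250 | +0.026 |
| Salvador Perez | 0.200 | 0.208 | -0.008 |
| Ozzie Albies | 0.243 | 0.167 | +0.076 |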

Compared to the differences in May, most of these are minuscule; the largest positive change belongs to Byron Buxton and the largest negative change to Jose Ramirez. Ramirez shoots the ball all over the field, while Buxton’s speed and better health may have had an impact.

We do not see the positive effects MLB wanted from banning the shift. Maybe there are cases up the middle where there are more hits now but for pulling the baseball there is minimal effect.

When I did my initial post, it was interesting that the estimated impact was minimal, and then in May it looked like the opposite. This is a good illustration of regression to the mean and small-sample problems.

To me this was a very interesting result and I hope you enjoyed the ride!

Data

Baseball Savant

Quality Pitch % (QP%) and QP%+

There are many metrics to evaluate how good a pitcher is. From traditional metrics like ERA and WHIP, to sabermetric ones like FIP and WAR, to Statcast-era metrics like Whiff% and CSW%, there are so many out there. Recently, there has been a trend to treat a pitcher’s swing-and-miss ability as the main measure of how good he is, but really you should look at several. What I think is missing right now is a metric that looks at command and stuff together, so what I propose here is quality pitch % (QP%) and QP%+.

First we need to define what this metric encompasses. What really is a quality pitch? Well, it could be a pitch at the edge of the strike zone (the shadow zone), but also a swing-and-miss pitch. Even if it is not called a strike by the umpire, or it results in contact, hitting the edge of the strike zone is a good pitch. In addition, a pitch could be anywhere, and a hitter missing it shows it was a good pitch. Alas, we have a definition! The formula then becomes:

QP% = (# Shadow Pitches + # Swing and Miss Pitches)/(Total Pitches)

We will select the data with these criteria from BaseballSavant, looking at starting pitchers with at least 1000 pitches and relief pitchers with at least 750 pitches in the 2024 season.
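As a rough sketch, the QP% formula translates directly into R. Two things here are assumptions on my part rather than anything defined in this post: the counts are treated as non-overlapping (a whiff on a shadow-zone pitch is counted once, not twice), and QP%+ follows the usual "plus" convention of 100 times the pitcher's QP% divided by a league-average QP%. The pitch counts and the league average below are made up for illustration.

```r
# QP% = (shadow pitches + swing-and-miss pitches) / total pitches.
# Assumption: the two counts do not overlap.
quality_pitch_pct <- function(shadow_pitches, swing_and_miss_pitches, total_pitches) {
  stopifnot(total_pitches > 0)
  (shadow_pitches + swing_and_miss_pitches) / total_pitches
}

# Assumption: QP%+ is scaled so that 100 equals the league-average QP%.
qp_plus <- function(qp_pct, league_qp_pct) {
  round(100 * qp_pct / league_qp_pct)
}

# Hypothetical pitcher: 450 shadow pitches and 150 whiffs elsewhere on 1200 pitches.
example_qp <- quality_pitch_pct(450, 150, 1200)

# Hypothetical league-average QP% of 40%.
example_qp_plus <- qp_plus(example_qp, league_qp_pct = 0.40)

print(example_qp)       # 0.5
print(example_qp_plus)  # 125
```

If the counts pulled from a data source do overlap, the overlap would need to be subtracted once before dividing.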

Starting Pitchers

Let’s look at the top starting pitchers in QP%

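| Rank | Name | Quality Pitches | QP% | QP%+ |
| --- | --- | --- | --- | --- |
| 1 | Yamamoto, Yoshinobu | 671 | 53.8% | 126 |
| 2 | Snell, Blake | 836 | 52.7% | 123 |
| 3 | Crochet, Garrett | 1198 | 52.5% | 123 |
| 4 | Gallen, Zac | 1212 | 52.4% | 122 |
| 5 | Perez, Martin | 1065 | 51.7% | 121 |
|  | Ryan, Joe | 1088 | 51.7% | 121 |
| 7 | Crawford, Kutter | 1360 | 51.6% | 120 |
| 8 | Flaherty, Jack | 1262 | 51.5% | 120 |
|  | Castillo, Luis | 1472 | 51.5% | 120 |
|  | Ober, Bailey | 1252 | 51.5% | 120 |
| 11 | Anderson, Tyler | 1418 | 51.4% | 120 |
|  | Imanaga, Shota | 1232 | 51.4% | 120 |
|  | Woo, Bryan | 752 | 51.4% | 120 |
|  | Gibson, Kyle | 1341 | 51.4% | 120 |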

We see players that have great stuff like Snell, Yamamoto, Gallen, and Castillo, but also interesting names like Crawford, Perez, and Anderson. Crawford (4.19 ERA, 103 ERA+) and Perez (4.36 ERA, 96 ERA+, but better with the Padres) don’t jump off the page, but Anderson (3.60 ERA, 118 ERA+, and an All-Star) has turned in an exceptional year.

Relief Pitchers

Let’s look at the top relief pitchers in QP%

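| Rank | Name | Quality Pitches | QP% | QP%+ |
| --- | --- | --- | --- | --- |
| 1 | Nardi, Andrew | 509 | 56.2% | 146 |
| 2 | Hader, Josh | 581 | 54.3% | 141 |
| 3 | Clase, Emmanuel | 489 | 54.3% | 141 |
| 4 | Miller, Mason | 498 | 54.2% | 141 |
| 5 | Lee, Dylan | 424 | 53.5% | 139 |
| 6 | Iglesias, Raisel | 450 | 53.3% | 138 |
| 7 | Erceg, Lucas | 467 | 53.2% | 138 |
| 8 | Yates, Kirby | 498 | 53.0% | 138 |
| 9 | Estrada, Jeremiah | 494 | 52.5% | 136 |
| 10 | Cano, Yennier | 481 | 52.4% | 136 |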

We see similar trends to the starting pitchers, with Hader, Clase, Miller, and Iglesias among those with fantastic stuff, but very surprisingly Nardi (5.07 ERA, 89 ERA+) is at the top.

So what does all this result in? Well, we can evaluate a pitcher’s ‘pitchability’ by using QP% but it doesn’t tell the whole story, just like the other metrics. However, I think we should look at this because if a pitcher locates, they should have more success. You can’t teach command, but you can work on spin rate, break, etc. in the pitching lab.

Data

Baseball Savant

(R)Shiny AAA Pitch Data

I first made an RShiny app in graduate school when I discovered it in a course I was taking. Simple, yet powerful applications. Of course, I used baseball data for fun outside of my assignments. The goal was to look at pitch data and break it down by location. I wanted to bring that back to make something with AAA data!

Recently, MiLB has been tracking pitch data using Statcast. I used 2024 data from the Salt Lake Bees (Angels AAA affiliate) to make an RShiny app of pitch data. I don’t believe it is complete data, since not all stadiums have Statcast, but I used what there was. I isolated all swings, not including foul balls. My goal was to allow a user to filter by hits, in-play, whiffs, pitcher, and date. The output would be pitch location, separated by pitch type, and associated events. In addition, I added a stacked bar chart for frequency of pitch types by events. Let’s see how it turned out!

I didn’t do anything super fancy, but I think it is pretty neat. It is not BaseballSavant level but something you can do with open source data and software.

You can check it out here: http://cwatkins1123.shinyapps.io/Bees_Pitcher_App

I won’t gatekeep the code, so here it is for those interested:

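# Packages the app relies on: shiny (UI/server), dplyr (filter, mutate, %>%), ggplot2 (plots)
library(shiny)
library(dplyr)
library(ggplot2)

# Load the pitch-level data; swings with no recorded event are labeled as whiffs
bees <- read.csv("bees_pitch_data_24_data.csv")
bees$game_date <- as.Date(bees$game_date)
bees <- bees %>%
        mutate(events = ifelse(events == '', 'whiff', events))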


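# Strike zone top and bottom: averages from the data, ignoring missing values
tzone <- round(mean(bees$sz_top, na.rm = TRUE),2)
bzone <- round(mean(bees$sz_bot, na.rm = TRUE),2)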
inKzone <- -.95
outKzone <- 0.95

kZone <- data.frame(
  x = c(inKzone, inKzone, outKzone, outKzone, inKzone)
  , y = c(bzone, tzone, tzone, bzone, bzone)
)


ui <- fluidPage(
  titlePanel("Bees Pitchers 2024", windowTitle = "Bees Pitchers 2024"),
  
  sidebarLayout(
    sidebarPanel(radioButtons("resultInput", "Result", choices = c("All", "Hits","In-Play","Whiffs"), selected = "All"),
                              uiOutput("playernameInput"),
                 sliderInput("dateInput",
                             "Dates:",
                             min = min(bees$game_date),
                             max = max(bees$game_date),
                             value = c(min(bees$game_date),max(bees$game_date)),
                             timeFormat="%m-%d-%Y")
    ),
    mainPanel(plotOutput("coolplot", width = "750px", height = "750px"),
              br(),
              plotOutput("coolplot2"),
              br(),
              textOutput("nrow"),
              br(),
              textOutput("credit"),
              br(),
              textOutput("signature"),
              br(),
              br())
    
  )
)
server <- function(input, output){
  output$playernameInput <- renderUI({
    selectInput("playernameInput", "Pitcher", 
                choices = sort(unique(bees$player_name)),
                selected = "Crouse, Hans")
  })
  filtered <- reactive({
    if(is.null(input$resultInput)) {return(NULL)}
    else if(input$resultInput == "Hits"){
      bees %>% 
        filter(player_name == input$playernameInput,
               events %in% c('single', 'double', 'triple', 'home_run'),
               game_date >= input$dateInput[1],
               game_date <= input$dateInput[2])
    }
    else if(input$resultInput == "In-Play"){
      bees %>% 
        filter(player_name == input$playernameInput,
               description == "hit_into_play",
               game_date >= input$dateInput[1],
               game_date <= input$dateInput[2])
    }
    else if(input$resultInput == "Whiffs"){
      bees %>% 
        filter(player_name == input$playernameInput,
               description %in% c("swinging_strike", "swinging_strike_blocked"),
               game_date >= input$dateInput[1],
               game_date <= input$dateInput[2])
    }
    else{
      bees %>% 
        filter(player_name == input$playernameInput,
               game_date >= input$dateInput[1],
               game_date <= input$dateInput[2])
    }
  })
  
  output$coolplot <- renderPlot({
    if(is.null(input$playernameInput)) {return(NULL)}
    ggplot(filtered(), aes(x = plate_x, y = plate_z)) + geom_point(aes(col = events)) +
      scale_y_continuous(limits = c(0,5)) +
      scale_x_continuous(limits = c(-2.2, 2.2)) + coord_equal() +
      geom_path(aes(x, y), data = kZone, lwd = 1, col = "red", alpha = .5) +
      labs(x = "x", y = "z", title = "Pitch Location") +
      theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 20),
            legend.title = element_text(face = "bold"))+facet_wrap(~pitch_name, ncol =2)
  }, height = 750, width = 750)
  
  output$coolplot2 <- renderPlot({
    if(is.null(input$playernameInput)) {return(NULL)}
    ggplot(filtered(),aes(fill = events, x = pitch_name))+
      # after_stat() replaces the deprecated ..count.. notation (ggplot2 3.3.0 and later)
      geom_bar(aes(y = after_stat(count / sum(count))))+
      labs(x = "Pitch Type", y = "Frequency")
  })


  output$nrow <- renderText({
    if(is.null(input$playernameInput)) {return(NULL)}
    nn <-nrow(filtered())
    paste("Based on your criteria, there were", nn, "pitches found.")
  })
  
  output$credit<- renderText({
    paste("Data pulled from BaseballSavant")
  })
  
  output$signature <- renderText({
    paste("By Chris Watkins, Ph.D.")
  })
  
}
shinyApp(ui = ui, server = server)
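The four branches inside filtered() share the same pitcher and date conditions and differ only in the result filter, so the logic can be tried outside of Shiny as one plain function. Below is a minimal base R sketch of that idea. The function name filter_pitches and the toy data frame are hypothetical; the column names and event/description values are the ones the app code uses.

```r
# Sketch of the app's filtering step as a standalone base R function.
# filter_pitches() and the toy data are hypothetical; column names match the app.
filter_pitches <- function(pitches, pitcher, result = "All",
                           date_range = range(pitches$game_date)) {
  # Conditions shared by every branch: pitcher and date window
  keep <- pitches$player_name == pitcher &
    pitches$game_date >= date_range[1] &
    pitches$game_date <= date_range[2]

  # Result-specific condition
  if (result == "Hits") {
    keep <- keep & pitches$events %in% c("single", "double", "triple", "home_run")
  } else if (result == "In-Play") {
    keep <- keep & pitches$description == "hit_into_play"
  } else if (result == "Whiffs") {
    keep <- keep & pitches$description %in% c("swinging_strike", "swinging_strike_blocked")
  }
  pitches[keep, , drop = FALSE]
}

# Toy data standing in for the Bees file (made-up rows)
toy <- data.frame(
  player_name = c("Pitcher, A", "Pitcher, A", "Pitcher, A", "Pitcher, B"),
  game_date   = as.Date(c("2024-04-01", "2024-04-02", "2024-05-01", "2024-04-01")),
  events      = c("single", "whiff", "field_out", "home_run"),
  description = c("hit_into_play", "swinging_strike", "hit_into_play", "hit_into_play"),
  stringsAsFactors = FALSE
)

hits          <- filter_pitches(toy, "Pitcher, A", "Hits")
whiffs        <- filter_pitches(toy, "Pitcher, A", "Whiffs")
april_in_play <- filter_pitches(toy, "Pitcher, A", "In-Play",
                                as.Date(c("2024-04-01", "2024-04-30")))
print(nrow(hits))           # 1
print(nrow(whiffs))         # 1
print(nrow(april_in_play))  # 1
```

Inside the app, the same function could be called from the reactive with the three inputs, which would remove the repeated filter() blocks.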

Data

Baseball Savant

What happened to HRendon?

Anthony Rendon was an All-Star, a Silver Slugger, and 3rd in MVP voting in 2019 with the Nationals. His injury issues with the Angels have been well documented, and in 2024 he is on pace to play in the most games (over 58) he has played in a season with the Angels. What has interested me, though, is his lack of power from 2020-2024 after his increase in power from 2016-2019. Surely there is something in the numbers to pick out, right? Let’s see.

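| Year | Age | HR | SLG |
| --- | --- | --- | --- |
| 2019 | 29 | 34 | 0.598 |
| 2020* | 30 | 9 | 0.497 |
| 2021 | 31 | 6 | 0.382 |
| 2022 | 32 | 5 | 0.380 |
| 2023 | 33 | 2 | 0.318 |
| 2024 | 34 | 0 | 0.280 |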
Anthony Rendon
*COVID shortened year

We see a significant decrease in homers (this trend can be seen in doubles as well), leading to a decrease in slugging. These should be prime years for Rendon, and we could certainly blame injuries, but let’s compare him to Mike Trout, who has had his share of injuries and is only 2 years younger than Rendon.

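| Year | Age | HR | SLG |
| --- | --- | --- | --- |
| 2019 | 27 | 45 | 0.645 |
| 2020* | 28 | 17 | 0.603 |
| 2021 | 29 | 8 | 0.624 |
| 2022 | 30 | 40 | 0.630 |
| 2023 | 31 | 18 | 0.490 |
| 2024 | 32 | 10 | 0.541 |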
Mike Trout
*COVID shortened year

It’s worth noting in 2023 that Trout struggled hitting the fastball (that’s a story for another day), leading to more strikeouts and lower slugging. Regardless, we can see that when Trout was healthy, he was still Mike Trout (for the most part). What gives with Rendon? Well, for that we need to look at what he is swinging at and the contact metrics.

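| Year | Avg Exit Velo (MPH) | Avg Launch Angle | Barrel % | Chase % | Whiff % |
| --- | --- | --- | --- | --- | --- |
| 2019 | 90.4 | 19.5 | 12% | 20.6% | 12.9% |
| 2020* | 90.1 | 19.5 | 6.3% | 16.7% | 14.7% |
| 2021 | 89.1 | 22.3 | 5.6% | 21.9% | 17.2% |
| 2022 | 89.6 | 18.7 | 8.3% | 18.6% | 19.6% |
| 2023 | 90.1 | 16.2 | 4.8% | 16.9% | 17.9% |
| 2024 | 88.1 | 14.1 | 2.5% | 17.5% | 12.5% |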
Anthony Rendon
*COVID shortened year

In terms of the pitches he swings at, Rendon is still elite in chase % and whiff %. So he is still swinging at good pitches and making contact. His average exit velocity has been similar, although 2024 is the lowest, but we see a significant decrease in barrel % and a varying launch angle. To me (not a hitting coach), it seems like he does not have a consistent swing because the launch angle has been up and down the last several years. This is the first year Statcast is tracking bat speed, and Rendon’s average bat speed is 68.4 MPH, which is not good at all. Interestingly, his 34.5% squared-up rate is elite.

Rendon’s eye at the plate is still elite, which gives promise for the future. Obviously he needs to stay on the field, but he also needs a more consistent swing, more bat speed, and an increase in launch angle to barrel the ball more consistently. His injuries could be contributing to this as well.

I, and all Angels fans, are hoping he figures it out for 2025.

Data

Baseball Savant

Baseball Reference

Can we perfectly predict exit velocity?

New Hawk-Eye technology is allowing us to measure bat speed for hitters, and the data has now become available through Baseball Savant. Exit velocity reflects the change in velocity from pitch to batted ball, so it should depend on pitch speed, launch angle (how good the contact is), and bat speed. For those physics fanatics, it is related to momentum (p = mv). So, do we have everything we need to predict exit velocity perfectly? Let’s dive in!

I selected just the August data on contact (no foul balls) from Baseball Savant, which gives us enough data for a model (13,301 observations).

First, let’s look at correlation between exit velocity and the predictive variables

| Variable 1 | Variable 2 | Correlation |
| --- | --- | --- |
| Exit Velocity | Pitch Speed | 0.108 |
| Exit Velocity | Launch Angle | 0.149 |
| Exit Velocity | Bat Speed | 0.468 |

We see a positive correlation for each, meaning there is a positive trend between exit velocity and the other variables. These are weak correlations, with bat speed being the strongest of the three.

Next, we will model exit velocity (Exit_Velo ~ Pitch_Speed + Launch_Angle + Bat_Speed)

Note: release_speed = pitch_speed

With a linear regression model, we get a model that explains 23.6% of the variability in the data, with bat speed having the largest effect size. All variables are significant, but with a large data set this is expected. Let’s call it how it is: this model is not good, which is very interesting. Would more data fix this? Maybe, but the sample size is already large. What other variables could be impacting exit velocity? Maybe weather? From a physics perspective, we have the majority of what we need (except air friction from weather), so this is a surprising result, to me at least.

Being the physics lover that I am, let’s build a change-in-momentum statistic (Exit Velocity minus Pitch Speed).
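For anyone who wants to try this kind of fit, the formula Exit_Velo ~ Pitch_Speed + Launch_Angle + Bat_Speed can be estimated with lm() in R, and the share of variability explained is the R-squared from summary(). The sketch below runs on simulated data, since the August Statcast pull is not included here; every number in the simulation (means, spreads, coefficients, noise) is made up, so the output says nothing about real hitters.

```r
# Simulated stand-in for the batted-ball data (all values are made up).
set.seed(42)
n <- 500
contact <- data.frame(
  pitch_speed  = rnorm(n, mean = 89, sd = 6),
  launch_angle = rnorm(n, mean = 12, sd = 25),
  bat_speed    = rnorm(n, mean = 70, sd = 6)
)

# Synthetic exit velocity: arbitrary coefficients plus a lot of noise,
# so the fit will be far from perfect, as in the post.
contact$exit_velo <- 10 +
  0.2 * contact$pitch_speed +
  0.05 * contact$launch_angle +
  0.9 * contact$bat_speed +
  rnorm(n, sd = 12)

# Ordinary least squares fit of the model described in the post.
fit <- lm(exit_velo ~ pitch_speed + launch_angle + bat_speed, data = contact)

# R-squared: the share of variability in exit velocity explained by the model.
r_squared <- summary(fit)$r.squared
print(round(coef(fit), 3))
print(round(r_squared, 3))
```

With the real data, read.csv() on the Baseball Savant export would replace the simulated data frame, using release_speed for pitch speed as noted above.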

| Rank | player_name | bat_speed | momentum_change |
| --- | --- | --- | --- |
| 1 | Cabbage, Trey | 78.91372 | 20.2 |
| 2 | Narvaez, Carlos | 75.11731 | 17.1 |
| 3 | Baker, Luken | 74.89069 | 16.2 |
| 4 | Crawford, Brandon | 79.51338 | 15.6 |
| 5 | Monasterio, Andruw | 70.5919433 | 12.85 |
| 6 | Haase, Eric | 71.3540129 | 10.1714286 |
| 7 | Sweeney, Trey | 76.348464 | 10.1 |
| 8 | Riley, Austin | 75.0177657 | 8.12727273 |
| 9 | Cameron, Daz | 70.8280327 | 8.01818182 |
| 10 | Gonzalez, Romy | 73.5086556 | 7.71111111 |

And compare to highest bat speed.

| Rank | player_name | bat_speed |
| --- | --- | --- |
| 1 | Stanton, Giancarlo | 81.1253164 |
| 2 | Walker, Jordan | 80.2783138 |
| 3 | Crawford, Brandon | 79.51338 |
| 4 | Cabbage, Trey | 78.91372 |
| 5 | Wallner, Matt | 78.3187904 |
| 6 | Wisdom, Patrick | 77.8608838 |
| 7 | Leon, Pedro | 77.70978 |
| 8 | Judge, Aaron | 77.4944854 |
| 9 | Schwarber, Kyle | 77.3780231 |
| 10 | Adell, Jo | 77.1606417 |

Looking at these lists, only two of the top 10 in change in momentum are also in the top 10 for bat speed: Trey Cabbage and Brandon Crawford (only 20%!!).
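The overlap between the two leaderboards is easy to check with intersect() in base R. The names below are copied from the two top-10 lists in this post; nothing else is assumed.

```r
# Top 10 by change in momentum (exit velocity minus pitch speed), from this post.
top_momentum <- c(
  "Cabbage, Trey", "Narvaez, Carlos", "Baker, Luken", "Crawford, Brandon",
  "Monasterio, Andruw", "Haase, Eric", "Sweeney, Trey", "Riley, Austin",
  "Cameron, Daz", "Gonzalez, Romy"
)

# Top 10 by average bat speed, from this post.
top_bat_speed <- c(
  "Stanton, Giancarlo", "Walker, Jordan", "Crawford, Brandon", "Cabbage, Trey",
  "Wallner, Matt", "Wisdom, Patrick", "Leon, Pedro", "Judge, Aaron",
  "Schwarber, Kyle", "Adell, Jo"
)

# Players who appear on both lists, and their share of the momentum top 10.
overlap <- intersect(top_momentum, top_bat_speed)
overlap_share <- length(overlap) / length(top_momentum)

print(overlap)        # "Cabbage, Trey" "Crawford, Brandon"
print(overlap_share)  # 0.2
```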

The old adage is that a pitcher that throws fast “provides the power”, which may be the reason for these results. Or, in general pitchers throwing faster means there is less of a difference in momentum here. There is more to learn with bat speed being available, and as more metrics become available it is exciting. Also, shout out to Jo Adell for being top 10 in bat speed in this data set!

Data

Baseball Savant

Revisiting the impact of the shift

I wrote over a year ago about the possible impact of banning the shift (Impact of Banning the Shift) so I wanted to revisit it for the 2024 season so far. This gives the hitters and pitchers a season (2023) to adjust to the change. Again, we will look at just pulled baseballs on the ground (which the shift was used for) for the 2024 season up to this point.

| Batting Average (2024) | Batting Average (2022) | Difference |
| --- | --- | --- |
| 0.281 | 0.222 | +0.059 |

We are definitely seeing an increase in batting average so far on pulled baseballs on the ground, resulting in about 6 more hits per 100 baseballs pulled on the ground. Recall that the impact in the previous post was -0.031, or about 3 fewer hits per 100 with the shift. A caveat here is that we are only about a month and a half into the season, so it could change by the end (which I will revisit!).
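The "hits per 100" framing is just the batting average difference multiplied by 100. A short R sketch with the numbers from this post (variable names are my own):

```r
# Batting averages on pulled ground balls, from the table in this post.
ba_2022 <- 0.222   # last season with the shift
ba_2024 <- 0.281   # 2024 season to date

# Difference in batting average and the same thing per 100 pulled ground balls.
difference <- ba_2024 - ba_2022
extra_hits_per_100 <- 100 * difference

# The earlier estimate of the shift's impact, expressed the same way.
shift_estimate_per_100 <- 100 * -0.031

print(round(difference, 3))          # 0.059
print(round(extra_hits_per_100, 1))  # 5.9
print(shift_estimate_per_100)        # -3.1
```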

Let’s look at the hitters that were identified as being the most negatively impacted by the shift in the previous post (if they are still playing in MLB)

| Name | Batting Average (2024) | Batting Average (2022) | Difference |
| --- | --- | --- | --- |
| Joey Gallo | 0.429 | 0.071 | +0.358 |
| Byron Buxton | 0.552 | 0.146 | +0.406 |
| Eddie Rosario | 0.214 | 0.081 | +0.133 |
| Jose Ramirez | 0.235 | 0.340 | -0.105 |
| Joc Pederson | 0.233 | 0.250 | -0.017 |
| Salvador Perez | 0.467 | 0.208 | +0.259 |
| Ozzie Albies | 0.375 | 0.167 | +0.208 |

Nearly everyone on the list who was negatively impacted by the shift is showing more success in 2024 on pulled baseballs on the ground. Jose Ramirez sprays the ball all over the field, and Joc Pederson’s difference is not as large as the others, so something else may be going on there.

It will be interesting to see how this shakes out at the end of the season, but players are definitely seeing large improvements without the shift.

Data

Baseball Savant

Making the Case for RTB (Runners Total Bases) and RTB%

Many (if not all) of those in sabermetrics think the RBI is dead and that it should not be considered when evaluating a player. This is because RBIs are dependent on players around you getting on base and into scoring position, and thus are not truly an individual statistic. Opponents of that thinking say there is something to be said about a player being ‘clutch’ and coming through in those situations, because more runs lead to more wins. My thinking is that there is a case in the middle, which is looking into how a player moves runners on base. To score runs you need to get on base, then move the runners until they eventually score, and there are many more ways to move runners than just hits. This is where Runners Total Bases (RTB) comes in.

So what is RTB? Well, it is the number of bases that runners on base are moved by a player.

For example, let’s say Shohei Ohtani is on first base with Mike Trout at the plate. Trout singles and Ohtani moves to 3rd base. Then, Trout’s RTB for that at bat is 2.

Simple right? Naysayers would say that RTB alone is the same as the RBI because it is dependent on opportunities so let’s fix that with RTB%.

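RTB% = (Runners Total Bases)/(Opportunities), where an opportunity is a runner on base during the player’s plate appearance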

For example, let’s assume Trout had 4 RTB and 2 opportunities in one game (i.e., 2 runners were on base in total during his PAs); then RTB% = 4/2 = 2. This means that, on average, Trout moved each runner 2 bases.
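The Trout example can be written as a tiny R function. This is only a sketch of the ratio as described in the example (RTB divided by opportunities); the function name and the choice to return NA when there are no opportunities are my own.

```r
# RTB% as described in the example: runner bases moved per opportunity.
# Returning NA with zero opportunities is my own choice, not part of the post.
rtb_pct <- function(rtb, opportunities) {
  if (opportunities == 0) {
    return(NA_real_)
  }
  rtb / opportunities
}

# Ohtani on first, Trout singles, Ohtani goes to third: 2 RTB on 1 opportunity.
single_play <- rtb_pct(rtb = 2, opportunities = 1)

# The game example from the post: 4 RTB on 2 opportunities.
trout_game <- rtb_pct(rtb = 4, opportunities = 2)

print(single_play)  # 2
print(trout_game)   # 2
```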

From here we can get metrics like RTB%+, add weights for certain types of hits that contribute to RTB, among other things. I would remove walks, as well as plays that result in errors, since those are on the defense.

Is this metric perfect? Absolutely not, and it should be tested, but I think it is the start to improving on RBI to get a metric that evaluates a player fairly. The idea on my mind for how this helps is optimization: get players with high OBP and high RTB% and you will score runs. Ideally, you put high RTB% players behind high OBP players in the lineup.

You probably noticed that I don’t have data showing an example of RTB. The data is available via Baseball Savant, but it would take a bit of time to code and apply the logic. With that in mind, and having a full-time job, I wanted to put the idea out there (good or not) and apply it later. Maybe in a future post I can take a small sample of data to show this in action.

Isolated OBP

The other day I was thinking about the metric ISO (Isolated Power), which is SLG minus AVG, and wondered what a metric of OBP minus AVG would tell us. Of course someone already thought of this, namely Michael Salfino of The Athletic. It always seems like someone smarter thinks of the good ideas first, but let’s take a deeper dive into the 2023 data.

So what does this Isolated OBP (ISO OBP) actually tell us? Well, it gives us a measure of how much a player depends on hits to get on base. The lower the ISO OBP, the more dependent the player is on hits to get on base. Hits have a lot of randomness to them, so they are not as reliable as a hitter’s eye at the plate for getting on base. A player could be in a slump at the plate but still get on base due to their ability to draw a walk. Ideally a leadoff hitter would have a higher ISO OBP.

Let’s dig into the data!

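| Rank | Name | BA | OBP | ISO | ISO OBP |
| --- | --- | --- | --- | --- | --- |
| 1 | Kyle Schwarber | 0.191 | 0.335 | 0.274 | 0.144 |
| 2 | Ryan Noda | 0.239 | 0.381 | 0.181 | 0.142 |
| 3 | Juan Soto | 0.261 | 0.401 | 0.228 | 0.140 |
| 4 | Matt Carpenter | 0.174 | 0.314 | 0.129 | 0.140 |
| 5 | Max Muncy | 0.205 | 0.333 | 0.280 | 0.128 |
| 6 | Jose Caballero | 0.227 | 0.355 | 0.095 | 0.128 |
| 7 | Jack Suwinski | 0.206 | 0.333 | 0.236 | 0.127 |
| 8 | Aaron Judge | 0.265 | 0.392 | 0.362 | 0.127 |
| 9 | Joey Gallo | 0.173 | 0.300 | 0.257 | 0.127 |
| 10 | Andrew McCutchen | 0.250 | 0.373 | 0.138 | 0.123 |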
Data was taken from Baseball Reference as of 8/30/23 of hitters with at least 200 PAs
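Since ISO OBP is simply OBP minus AVG, the last column of the table can be reproduced in a couple of lines of R. The sketch below uses four rows from the table; the column names are my own.

```r
# BA, OBP, and ISO for four hitters, copied from the table in this post.
hitters <- data.frame(
  name = c("Kyle Schwarber", "Juan Soto", "Aaron Judge", "Jose Caballero"),
  ba   = c(0.191, 0.261, 0.265, 0.227),
  obp  = c(0.335, 0.401, 0.392, 0.355),
  iso  = c(0.274, 0.228, 0.362, 0.095)
)

# ISO OBP = OBP - AVG: on-base ability that does not come from hits.
hitters$iso_obp <- round(hitters$obp - hitters$ba, 3)

# Sort from highest to lowest ISO OBP.
hitters <- hitters[order(-hitters$iso_obp), ]
print(hitters, row.names = FALSE)
```

Putting ISO next to ISO OBP in the same data frame makes it easy to spot hitters like Caballero, who get on base without hits but bring little power.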

Looking at the top 10 in ISO OBP, we see players we’d expect, like Juan Soto, Aaron Judge, Kyle Schwarber, and Max Muncy, with superior plate discipline. We also see the three-true-outcome (home run, walk, strikeout) players in Kyle Schwarber and Joey Gallo. Of these players, 6 out of 10 have led off this season.

If we take ISO (Isolated Power) into consideration, we can see that there are players with very weak ISO but high ISO OBP. In an ideal world you’d like to maximize both, and you get players like Schwarber, Soto, Muncy, and Judge. I believe that these are your ideal leadoff hitters because they get on base via the walk, and when they hit the ball it is likely that extra bases are involved. There is an argument to be made that batting them second (like Judge, for example) is best because they can drive in more runs. To that I’d argue that if you don’t have a hitter in front of them with a high ISO OBP, then you will not have as many runners on base ahead of them over the course of the season. The ability to drive runners in is a different story for a different blog post.

The Moneyball adage is getting on base leads to wins. Higher ISO OBP shows a hitter has a good eye and will get on base without depending on a hit. Ideally you maximize both ISO and ISO OBP for a leadoff hitter but a high ISO OBP should lead to more times on base in the long run.