Why are Angels pitchers struggling with two strikes?

Introduction

Generally having two strikes is favorable for the pitcher. Hitters are hitting 0.171 in 2023 with two strikes but the Angels are giving up a BA of 0.187. For an 0-2 count it is tougher with hitters hitting 0.151 but the Angels are giving up a league worst 0.209 BA. The question is simple: Why? I dig into the data to see if there is anything we can find from it.

Methodology

Data was taken from Baseball Savant of all two-strike counts that led to an outcome (hit/out) for Angels pitchers through June 5th, 2023.

Results

Location

Here is a reference to the zones I will be referencing:

From Baseball Savant

First I looked at where the pitchers are throwing the baseball in two strike counts.

zonecount
14155
13117
5111
876
474
674
1270
1164
962
761
257
355
153

The top two zones are low and away/outside but shockingly third and fourth on this list are middle/middle and middle/down. Pitching in the heart of the plate is obviously not good, and a pitch being down goes right into the swing path of a hitter in this launch angle revolution.

Here is how things look on an 0-2 count:

zonecount
1453
1322
420
615
715
1215
513
912
210
310
810
1110
17

The first two are the exact same as two strike counts so it seems the philosophy is to try to get a hitter to swing and miss something down low. Though middle/middle (zone 5) isn’t third here, 4 & 6 are middle/in or out.

Unsurprisingly, with two strikes the hits are mostly when the location is zone 5 but on 0-2 it is zone 14. Seemingly the ball is not getting down enough in zone 14.

Pitch Type

Now let’s look at pitch type in two strike counts.

pitch_typecount
FF269
SL250
CH239
ST100
CU55
FC45
FS35
SI32
KC4

So in a two strike count they are throwing a four-seam fastball the most followed by the slider.

Let’s look at 0-2

pitch_typecount
FF52
SL50
CH47
ST24
CU13
FS11
FC8
SI6
KC1

Similar story here, with four seam fastballs being thrown the most. The majority of hits on 0-2 are sliders followed by the four-seam fastball.

Pitch Type + Location

We’ve looked at pitch type and location separately but now let’s slice the data by both (looking at the top 10), starting with two strikes.

pitch_typezonecount
CH1455
SL1442
FF1141
SL1340
CH1336
FF535
ST1434
FF1232
CH831
FF231

There is a clear trend here of throwing a changeup or slider down in the zone an a fastball up/in or outside. However, 35 pitches were middle/middle four seam fastballs!

Let’s look at 0-2

pitch_typezonecount
CH1418
SL1414
ST1412
SL1310
FF129
FF38
FF47
FF117
CH86
CH136

Similar story here but no middle/middle fastball. Interestingly, the slider and changeup low/outside or inside resulted in the most hits on 0-2

Visualizing Two-Strike Hits

Lastly, let’s take a look at the location of two strike hits.

Of course the middle of the plate is showing up the most, but a large majority are in the strike zone, and if they aren’t they are close to the strike zone.

Conclusion

So what does this all mean? Unsurprisingly, middle/middle fastballs and sliders are punished but there is a clear pitching philosophy to get hitters out down in the zone rather than up in the zone. This leads to a higher possibility of bloop hits with higher launch angles. A confounding variable here is pitch sequencing, which could be an issue as well but beyond the scope of this post. I am no pitching coach but I think the Angels need to try to get hitters out more up with the fastball and also get the ball lower or more lateral movement out of the strike zone.

It will come down to execution, and I hope they can turn this around.

Data

Impact of Banning the Shift

For the 2023 season MLB will be banning the infield shift. This ban is supposed to increase action on the field, which in this case more hits is the goal. This post looks into what a potential impact would be and which players will be most affected.

Methodology

The idea here is very simple, but we need to be careful with interpretation as you’ll see. This came from something I saw from Sisu, who coined impact. What we do is take the average of the variables of interest (in this case batting average), take the average after removing a certain category of a variable (in this case when the infield alignment is in a shift), and find the difference.

What I don’t like about this technique is that many people will think “impact” means “causation” when it actually is an association analysis. The reason why is that we are removing rows of data, which eliminates interactions between variables that may be there and are actually causing changes in output. For example, we are not taking something like quality of contact into account. It could be that the quality of contact has a larger effect on success. I think there is still value in the simplicity of this analysis but CORRELATION DOES NOT MEAN CAUSATION. If you want more causal analysis, I would go with a machine learning model or hypothesis testing. Rant over, now let’s get into it!

Data

What is the shift supposed to take away? Baseballs that are pulled and on the ground. With this in mind, the data taken for this analysis are all pulled ground balls for the 2022 season.

Results

Overall

Note that the following batting averages correspond to just pulled baseballs on the ground in 2022.

Batting Average (All)Batting Average (No Shift)Impact of Shift
0.1910.222-0.031

So what does this mean? Well on average, the impact of the shift is -0.031 in batting average. This is about 3 less hits per 100 baseballs pulled on the ground. This really doesn’t sound like much in my opinion, but again this is an association and not causal analysis so maybe there is a larger impact than shown here.

Player Level

Now let’s look at which players were the most affected, in terms of impact, by the shift. For this I looked at only players that were shifted on 20+ times in the 2022 season and were also not shifted against (at least once) which resulted in 154 players.

Players Most Negatively Affected By the Shift
NameBatting Average (All)Batting Average (No Shift)Impact of Shift
Joey Gallo0.0710.500-0.429
Joey Votto0.1270.500-0.373
Byron Buxton0.1460.500-0.354
Eddie Rosario0.0810.429-0.348
Mike Moustakes0.0690.400-0.331
Jose Ramirez0.3400.650-0.310
Joc Pederson0.2500.500-0.250
Brad Miller0.1600.400-0.240
Salvador Perez0.2080.400-0.192
Ozzie Albies0.1670.353-0.186

Not surprising that Gallo is at the top but hitters like Votto and Ramirez might be a little surprising. Something to note is that the sample size was less than 10 for the top 5 players for pulled ground balls when not shifted on, meaning that it was rare they weren’t shifted on, which is a confounding variable in this case. Still interesting nonetheless and these players should benefit from the shift being banned. Now let’s look into players that did well beating the shift.

Players Most Positively Affected By the Shift

NameBatting Average (All)Batting Average (No Shift)Impact of Shift
JJ Bleday0.1200.12
Max Kepler0.100.1
Mitch Haniger0.1940.10.094
Michael Perez0.09100.091
Michael Massey0.08600.086
Matt Olson0.1350.0590.076
Mitch Garver0.20.1250.075
Jake Fraley0.150.0910.059
Rougned Odor0.05500.055
Brett Phillips0.04200.042

As you can see, the top positive effects are much smaller than the top negative effects. This means that they managed to beat the shift but really not much difference. Again, sample size of non-shifted ABs is in here as well. Olson is very interesting to me because in Oakland he pulled the ball very often. An extra fun fact is that Shohei Ohtani was 11th, just missing the top 10 cutoff, meaning he was not affected as much.

Conclusion

The shift being banned should have a high association for more pulled ground balls for certain hitters becoming hits but overall we might not see as much impact as we thought. Time will tell though! For what it is worth, I liked the shift because it is just using data to make more informed decisions to take away hits but many would disagree and that is okay too. I have always been interested in a shift based on the pitcher, rather than a hitter, so I think I will look at that next time.

Data

Breakout of Taylor WARd

For years Taylor Ward had been up and down from the major league level and could not gain consistency. Ward then turned himself from an average major leaguer last season (turning it on in the second half) into an above average major leaguer. Before the 2022 season his career bWAR was -0.4 and he is already at 2.3 bWAR this season. Absolutely incredible jump and, if healthy, will be an All Star. Let’s dive into some stats!

Statistics

YearBarrel %Avg Exit VeloLaunch AnglewOBAOPS+BB %Chase %Whiff %
202110.3%89.6 MPH16.4°0.3331088.4%25.1%25.1%
202218.4%89.5 MPH13.2°0.50021217.6%16.6%21.8%
Taylor Ward Statistics

Overall, his number have drastically improved, as seen by the wOBA and OPS+. He isn’t hitting the ball much harder, but his lower launch angle is resulting in better contact. In addition, he improved his chase % by nearly 9%. Clearly his approach at the plate is better and a mechanical adjustment was made. 

So his overall metrics have gotten better but now I want to dive deeper into his at bats.

At Bats

First, let’s look at his swing and misses.

It is early in the season, so we don’t have as much data as last season but in 2021 there were a lot of swing and miss low while this season similar story but more around the plate. 

Let’s look at the counts he has swung in. 

Count Analysis

Ward 2021 Swings

CountFrequencyPercentage
0-07618.5%
2-26014.6%
0-15112.4%
1-14210.2%
1-0368.8%
0-2348.3%
1-2327.8%
3-2286.8%
2-1256.1%
3-1133.2%
2-0122.9%
3-020.5%
Ward Swings 2021

Ward 2022 Swings

CountFrequencyPercentage
0-03416.2%
1-13416.2%
2-22813.3%
1-02210.5%
3-2199.0%
0-1167.6%
1-2167.6%
0-2157.1%
2-1146.7%
2-083.8%
3-131.4%
3-010.5%
Ward Swings 2022

Ward is continuing to swing a lot at the first pitch in 2022 and there isn’t really a huge discernable difference from 2021. He has been quoted that the count doesn’t matter and his approach changed in an article from The Athletic: Ward Article 

“The count … it doesn’t matter,” Ward said. “Just waiting for the ball to show up in the spot. That’s all that matters. No count. No lineup. It doesn’t matter. I’m just in the box, doing my thing. Don’t think about it at all.”

From Sam Blum, The Athletic

Now I will look at his hits in different counts.

Ward 2021 Hits

CountFrequencyPercentage
0-01528.8%
0-1611.5%
1-1611.5%
1-059.6%
1-259.6%
0-247.7%
2-247.7%
3-247.7%
2-023.8%
3-111.9%
Ward 2021 Hits

Ward 2022 Hits

CountFrequencyPercentage
1-1922.5%
3-2615.0%
0-1512.5%
2-2512.5%
0-0410.0%
1-237.5%
0-225.0%
1-025.0%
2-025.0%
3-125.0%
Ward 2022 Hits

In 2021 the majority of his hits were on the first pitch while in 2022 they are spread out, seemingly from his new approach. 

Lastly, I wanted to look at the type of contact he is making and where the ball is going. 

Contact Type Analysis

Ward Contact 2021

Contact TypeFrequencyPercentage
Ground Ball6038.5%
Fly Ball4629.5%
Line Drive4327.6%
Pop Up74.5%
Ward Contact 2021

Ward Contact 2022

Contact TypeFrequencyPercentage
Ground Ball3035.7%
Fly Ball2732.1%
Line Drive2327.4%
Pop Up44.8%
Ward Contact 2022

Ward Directional Statistics

YearPull %Center %Opposite Field %
202139.1%29.5%31.4%
202245.4%30.9%23.7%
Ward Directional Statistics

Ward is putting the baseball in the air a tiny bit more and his ground ball percentage is a little less so far this season. We do see a major shift to pulling the baseball more and hitting it less to the opposite field, which is resulting in his higher power numbers.

Conclusion

Taylor Ward has improved drastically in all offensive categories. His approach at the plate has had a huge impact on his output this season. Analytics only tells you what is happening and what needs to be fixed, but it is up to the player to do it. We are seeing someone successfully changing his approach and becoming an all-star caliber player. Hopefully he can stay healthy and carry these numbers through the rest of the season. 

Data

Whiff-le Ball (Part 2: Naïve Model)

We saw in the previous post that there were quite a few differences when a hitter whiffs versus when they don’t. The data was sliced and diced in many ways like the count, state, and pitch. In this post I will be getting into some data science discussing a Naïve Model, which is basically just slicing and dicing the data and taking averages. For those data scientists out there it’s a very simple decision tree approach. The inspiration for this comes from a talk I saw at last year’s SABR Analytics conference where a Markov Chain approach was done. This inspired me to think that based on the variables we have we could find a probability that an event will happen. I used this approach to answer a question on a data questionnaire as part of the process for a baseball analytics position in baseball operations for the New York Mets. Unfortunately, did not get the position but was told the approach beat the other models submitted by a large margin. Got to the final round but a las, they went a different direction. I cannot say what that question was because of an NDA but will say that it was not the same as predicting whiffs, so I am not breaking any rules here.

Let’s get started!

Methodology

I will be using the exact same data set I used in the previous blog. However, I built one more variable called ‘platoon’ which is a value of True when there is a platoon advantage (i.e. Hitter and pitcher opposite handedness) and False otherwise.

The goal of this is to predict the probability of a whiff. Let’s get into a little data science. To do this I first split the data into a training and testing data set. Where 80% of the original data is in the training set and 20% is in the testing set. Alternatively, you could do 5-fold cross-validation [generally would do this but didn’t make as much sense here and going for simplicity]. The reason we split the data like this is because when a model is run for prediction it is on unseen data. Something to note is that the data is imbalanced, but not that bad. There is ~32% whiffs and the other ~68% is contact. Since we are predicting a probability rather than a true classification problem (Predicting a whiff or not. Either it happened or didn’t.) this is ok. One may argue it is fine for classification as well but that is an argument for another time. If you didn’t follow this, it is okay! Don’t necessarily need to know this. Time to slice and dice the training data set.

The first portion of this is the pitcher side where I will build a data set grouping by the pitcher, pitch name, count, baseball state, and platoon to get every combination. The second portion of this is the hitter side where it will be the same as previously mentioned but grouping by the hitter instead of the pitcher. Here is what they look like:

Then I grouped by pitch name, count, baseball state, and platoon. The reason for this is if a pitcher, hitter or both aren’t in the data sets built above this will break the model. Here is what it looks like.

The last thing that I did was group by just count in case a pitch name, count, baseball state, and platoon combination was in the training set but not the test set.

Model

The model structure itself is simple at its core but could be confusing. If you find it too confusing feel free to jump ahead to results. Basically, we take the pitcher and hitter grouped data sets based on the pitcher and hitter respectively and find the corresponding probability and average them. If either the pitcher or batter don’t exist, we just use the one that does exist and data without pitcher or hitter. However, if both don’t exist, we will just use the grouped data without the pitcher and hitter. If the doesn’t exist, we take the data set just grouped by count. Confused? Let’s take an example.  

Say we have this scenario

PitcherHitterPitch NameCountStatePlatoon
650644518735Cutter2-21 111True

The batter data set doesn’t exist but the pitcher one does so we take that as one data point.

Then we take the data set without the pitcher or batter.

Then our final prediction is a (0.5+0.201)/2 = 0.3505 probability of a whiff.

Okay even I admit that is a lot. If you’ve made it this far, awesome! Let’s get into some results.

Results

I evaluated how well the model did with three metrics: MAE (Mean Absolute Error), accuracy (Over 50% probability is a predicted whiff), and AUC (Area Under Curve). For MAE, closer to a value of 0 is better and it measures how far on average the predictions were off by. Accuracy is the percentage of correct predictions assuming over a 50% probability is a predicted whiff. Finally, AUC (Area Under the Curve) measures how well a model fits the data with closer to 1 being better. I compared across several different model combinations and chose the best one.

Here are the results on predicting whiffs on the test set:

MAEAccuracyAUC
0.34672.4%0.58

But what about class imbalance!? Yes there is some class imbalance, but this is why I calculate AUC.

Conclusion

I showed a simple at its core (yet confusing) decision tree model to predict a whiff, but is it good? The answer is yes, but why? Well, if we randomly simulated data from 0 to 1 and calculated metrics MAE would be 0.5 and accuracy 50%. So, this model performed about 44.5% better than chance in MAE and 22.4% better than chance in accuracy, and 16% better in AUC. There could very well be a better combination of variables out there that does better but what I find the most interesting is that sometimes the simple things are surprisingly good.

Data

Whiff-le Ball (Part 1: Analytics)

Everyone loves the whiff (also known as swinging strikes). Coaches, front offices, scouts, agents. Pitchers that don’t have good numbers on the surface but a high whiff % continue to get opportunities in MLB. Why? Well whiffs are an indicator of success for pitchers, especially relievers. If the bases are loaded with no outs, we’d want a strikeout rather than contact because a run can be scored on contact, even if it leads to an out. This will be a multipart series, starting with this one on the analytics and will go into modeling in subsequent parts. The goal is to find variables that influence whiffs and predict whiffs.

Finding the probability of a swing and miss can be done in a couple ways. We can be given that a hitter has swung or not. We will assume that the hitter has swung. For those probability lovers out there, assume event A is a swing and event B is a whiff. P(A and B) = P(A)P(B|A) from conditional probability. We will be focusing on P(B|A), the conditional probability that a hitter whiffed given they swung. Can argue back and forth about our assumption of a swing but the goal of this is not to determine what influences a swing, just that they whiff so I think this is the most logical way of going about it. Word of caution, there is some statistics here (exciting!)  but do not fear I will break it down in simple terms!

Methodology

For this analysis I took data of every swing in 2021 from Baseball Savant. This was quite tedious because the limit for exporting data from Baseball Savant is 40,000 rows and the number of swing events was 334,881. So I meticulously took a couple weeks of data at a time resulting in 12 data sets and combined them together to get the full data set. I created a variable that is True when it was a whiff (swinging strike or swinging strike blocked) and False otherwise. Just as a glimpse into the issue data scientists must deal with, there were pitch types and pitch names that were empty like ” ”, which is different than an empty type which is NA/Null, so I had to fix that. I also made a variable for the ball-strike count and baseball state (i.e. 1 100 is 1 out runner on 1st). With this, the analysis can begin.

Results

We will slice and dice the data to look at whiff % in different aspects of the game. A simple group by for those data analyst out there. These are very simple but glean nice insights.

Let’s start simple with just the probability of a whiff (given a swing of course).

Whiff %
24%

If a hitter swings, there is a 24% chance of a whiff.

We will now start to group by certain variables. First, by pitch name (without unknown pitch types). 

Pitch NameWhiff %
Split-Finger36.2%
Slider33.9%
Knuckle Curve32.1%
Curveball30.9%
Changeup28.5%
Cutter22.0%
4-Seam Fastball19.4%
Knuckleball14.7%
Sinker13.9%
Eephus11.5%
Fastball7.4%
Screwball0.0%

Breaking it down by pitch we see a split-finger induces the most whiffs followed by a slider and knuckle curve. A couple things to note are that there was only 1 screwball recorded and fastball corresponds to 2-seam I believe. I asked about it, but haven’t heard back from Baseball Savant. However, the 2-seam fastball doesn’t show up anymore so I assume it is now just called fastball but maybe wrong on that.

Next, we will group by ball-strike count.

CountWhiff %
0-227.2%
0-126.5%
1-225.6%
0-025.1%
1-024.5%
1-124.5%
2-122.1%
2-222.1%
2-021.7%
3-118.6%
3-218.0%
3-016.0%

To probably no one’s surprise, when being ahead of the count a pitcher will induce a whiff more often. What I find interesting is that 0-0 counts are 4th on this list which means that many times a pitcher won’t just throw a fastball down the middle for the first pitch. An indicator of a changing game.

Baseball state is up next. For those who are unfamiliar, there are 24 states a baseball game can be in during an inning. These are the combinations of outs and strikes. The first number will indicate the number of outs, and the the following 3 values will indicate runners on first, second, and third with a 1 indicating the base is occupied. For example, 1 010 would indicate 1 out and a runner on second base.

StateWhiff %
2 10026.8%
2 00126.1%
2 10125.9%
2 01025.8%
2 11125.7%
2 11025.7%
2 00025.6%
0 00025.6%
1 11025.5%
1 10025.4%
2 01125.4%
1 00124.7%
1 11124.6%
1 10123.8%
1 00023.7%
0 10023.7%
0 11023.1%
1 01123.0%
1 01022.5%
0 00122.4%
0 11122.3%
0 01022.2%
0 01121.8%
0 10121.3%

The top 5 states where whiffs were the highest were with two outs and runners on base. This isn’t entirely surprising as hitters are more aggressive to get runs in and pitchers are trying to avoid contact.

Finally, let’s explore both ball-strike count and baseball state together. I will spare you looking through 288 scenarios and just give you the top 10.

StateCountWhiff %
0 1103-044.4%
0 1003-040.0%
1 0103-033.3%
2 1000-232.8%
0 1000-232.3%
0 0000-232.1%
0 0002-032.1%
2 1000-131.5%
1 0012-031.3%
0 0000-030.7%

Looking at the top 10 we see two trends. One is that on a 3-0 count and runners on hitters seem to be aggressive and whiff. Two, on an 0-2 count hitters must protect and pitchers don’t necessarily have to throw a strike.

We can go down a rabbit hole of grouping by more variables and get more and more specific. This will be the foundation for part 2 of this analysis which is sort of a naïve decision tree type model, but I will not spoil the surprise right now.

Let’s look at another important aspect, pitch location by pitch type.

For this plot, the lighter the blue the more pitches in that area. This is from the catcher’s perspective. We can see clear trends that changeups and curveballs down as well as 4-seam fastballs up induce whiffs.

What else could influence a whiff? Let’s look at accelerations in the x direction (left/right, ax), and in the z direction (up/down) az. 

From the boxplot we see there is a different in medians for both plots. Statistically we can compute a t-test to see if there is a statistical difference in means. (For other R fans out there, I will be using t.test()). If we get a p-value less than 0.05 we have a statistically significant difference. If you are not well versed in statistics don’t worry, in simple terms it means the difference is actually a difference and not likely due to chance. 

Welch Two Sample t-test
data:  
swings[swings$swing_miss_tf == TRUE, "ax"] and swings[swings$swing_miss_tf == FALSE, "ax"]
t = 24.839, df = 143574, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.9366943 1.0971813
sample estimates:
mean of x mean of y 
-1.306053 -2.322991 
Welch Two Sample t-test


data:  
swings[swings$swing_miss_tf == TRUE, "az"] and swings[swings$swing_miss_tf == FALSE, "az"]
t = -57.324, df = 130076, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:
-2.166112 -2.022884
sample estimates:
mean of x mean of y

-24.67372 -22.57922 

There is a statistical significance in both cases. In statistics language we can say that we reject the null hypothesis that there is no difference in means of ax and az based on if a hitter whiffs in favor of the alternative that there is a difference. Based on the confidence interval, we see that ax is higher when a hitter whiffs (positive is running right from the catchers perspective) and az is lower when a hitter whiffs (negative is dropping down). In simple terms, there is a difference that isn’t due to just chance. 

Let’s repeat this for the spin rate of a 4-seam fastball and slider.

We see some sort of difference but not as clear here. What we need is more statistics!

Welch Two Sample t-test

data: 
swings[swings$swing_miss_tf == TRUE & swings$pitch_name == "4-Seam Fastball", "release_spin_rate"] and swings[swings$swing_miss_tf == FALSE & swings$pitch_name == "4-Seam Fastball", "release_spin_rate"]
t = 19.642, df = 35360, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 23.38357 28.56781
sample estimates:
mean of x mean of y 
 2297.987  2272.012 
Welch Two Sample t-test

data: 
swings[swings$swing_miss_tf == TRUE & swings$pitch_name == "Slider", "release_spin_rate"] and swings[swings$swing_miss_tf == FALSE & swings$pitch_name == "Slider", "release_spin_rate"]
t = 9.94, df = 43948, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
17.44415 26.01333
sample estimates:
mean of x mean of y 
 2429.844  2408.115 

Turns out there is a statistically significant difference in both cases. For the statistically inclined, we reject the null hypothesis that the mean spin rates are the same for 4-seam fastballs and sliders based on if a hitter whiffs in favor of the alternative that there is a difference. Based on the confidence intervals for both pitches, the spin rate is higher when a hitter whiffs. This makes sense, as higher spin rates have been shown to induce whiffs. 

Conclusion

We see there are plenty of differences based on if a hitter whiffs. In this post we explored pitch type, pitch count, baseball state, baseball state + count, pitch location, accelerations, and spin rate.  There certainly are more, and if I had unlimited time I could show more but also that would be an extremely long blog post. So I encourage you to explore more if you are interested! Now that we know there are differences, we can build models to predict a whiff. Next time I will be looking at a naïve model, which is inspired by Markov Chains (the probability of getting to the next state only depends on the current state) but is basically a decision tree. 

Data

The Enigma of the 2020 Season: Batters

The 2020 season was one of the strangest MLB seasons in history. A 60-game season starting in late July, no fans, social distancing measure for players, and no high fives due to COVID-19. A big question going into the 2020 off-seasons was how can we analyze player performance with such limited data in an unprecedented season? Many high-profile players struggled in 2020 while some unknowns made big strides. With the 2021 season completed, we can look back to see which performances were an anomaly in 2020 and which were signs of a continuation in 2021. In this post I analyze 2020 batter performance. 

Metrics

With so much data available in baseball there are a bevy of metrics that one could choose to analyze players. I picked four: wOBA, fWAR (Fangraphs WAR), Swinging Strike %, and Hard Hit % to evaluate overall performance, value, contact ability, and quality of contact. Yes, I could have chosen different metrics and there is always a debate on what is the best, but I decided on these. For fWAR in 2020 I multiplied it by 2.7 to scale it to a full 162 (since 162/60 = 2.7). Of course, this is an estimation but the best we have for that season.  

Methods

I took data via Fangraphs of qualified hitters in 2019, 2020, and 2021. For wOBA, Swinging Strike %, and Hard Hit % I looked at the percent change between 2019-2020, 2020-2021, and 2019-2021 while for fWAR just the differences for those time periods (since fWAR can be negative). To assess the impact of 2020, we must look at 2019, 2020, and 2021. Sharp improvements or declines in metrics in 2020 could have been due to the nature of the season, thus I explored 2019, 2020, and 2021 (had to be qualified in 2019, 2020, and 2021) which resulted in 54 players. Due to injuries, among other things, this caused the decline in the number of players in the data set. Could I have made the criteria less strict for the data set? Absolutely, but then we have smaller sample sizes (in this case plate appearances) for players. Always pros and cons but for this analysis I took qualified players over those three seasons. 

Results

Below I will show tables with differences between 2019-2020, 2020-2021, and 2019-2021 seasons.

wOBA

29.6% of players increased in wOBA in 2020 and decrease in 2021.

40.7% of players decreased in wOBA in 2020 and increase in 2021.

11.1% of players increased in wOBA in 2020 and increase in 2021.

16.7% of players decreased in wOBA in 2020 and decrease in 2021. 

Swinging Strike %

29.6% of players increased in Swinging Strike % in 2020 and decrease in 2021

14.8% of players decreased in Swinging Strike % in 2020 and increase in 2021.

9.3% of players increased in Swinging Strike % in 2020 and increase in 2021.

5.6% of players decreased in Swinging Strike % in 2020 and decrease in 2021. 

Hard Hit %

14.8% of players increased in Hard Hit % in 2020 and decrease in 2021

51.9% of players decreased in Hard Hit % in 2020 and increase in 2021.

11.1% of players increased in Hard Hit % in 2020 and increase in 2021.

5.6% of players decreased in Hard Hit % in 2020 and decrease in 2021. 

fWAR

37.0% of players increased in fWAR in 2020 and decrease in 2021

40.7% of players decreased in fWAR in 2020 and increase in 2021.

11.1% of players increased in fWAR in 2020 and increase in 2021.

11.1% of players decreased in fWAR in 2020 and decrease in 2021. 

We can see that most players came back to Earth after a large increase or decrease in their performance metrics. The least impacted was swinging strike %, which is likely due to a player’s approach and eye for the strike zone not changing drastically year to year. This increase or decrease in 2020 was due to the shortened season and players regressed or bounced back to their normal performance after that. The concept of regressing towards the mean is notably in “The Book” by Tom Tango where analysis showed that after a hot streak or cold streak a player will come back to their average performance and that we needed even more than a season’s worth of data to truly assess a player. However, there were rare cases like Vladimir Guerrero Jr., Brandon Crawford, and Joey Votto that continued to improve in 2021. Looking at 2019 to 2021 paints a better picture of how these players are trending. We see players like Jose Ramirez, Juan Soto, Freddie Freeman, among others, who are trending upwards as we’d expect after watching them through the 2019 season.

Conclusion

2020 was weird for baseball, and all sports for that matter. Keep in mind this data is biased in 2020 since there was a small sample size of plate appearances per player, which probably contributed to the results of this analysis, but it is what we have. We saw that most players regressed or bounced back to their normal performance and their trajectories were more in line in 2021 with what we’d expect. Were there players that made strides in 2020 and continued in 2021? Yes, as there are always outliers, but the majority did not continue their trajectory. In general, only the elite like a Mike Trout can improve on their great seasons while most come back to their average. So, let’s go back to the question I posed at the beginning: how can we analyze player performance with such limited data in an unprecedented season? The answer is we can’t say much just using 2020 data and should include data from 2018-2019 in addition to 2020 at a minimum to have made an evaluation after the 2020 season. There probably is an optimized weighting of 2020 metrics that could be useful for analysis, but I did not explore that. Since we now have 2021 data, it is easier to evaluate a player since it was a more normal season. Initially I thought this would be a two-part post where I go into the pitching side next, but I think we’d see similar patterns. However, if you’d like me to do it let me know! As a data scientist, more data is always better and player evaluation is the most difficult part of data analytics in baseball, which made 2020 very difficult to analyze.

Data

Clustering Starting Pitchers

My favorite machine learning algorithms are unsupervised clustering. I think there is elegance in finding patterns in data that we don’t know about. Clustering is grouping similar data together in a data set when the groupings are unknown. This is based on how far away data points are from other groups and how close they are to other data points in their group. In layman’s terms, you are like your group but different from other groups. As an example, let’s think about cats and dogs. There are many different types of dogs, but they are all similar because they are all dogs. Same for cats, many types but all cats. However, cats are different than dogs, thus are different animals. There are many more technical insights into clustering, but this is the idea. Once we get clusters, since groupings are unknown, we use a domain expert to determine what these groups represent. 

I may be wrong because I’ve never worked in a front office, but I believe clustering is underutilized in the baseball industry. It is such a powerful tool that can lead to many discoveries. From a team perspective, we can identify undervalued players that are like superstars in the game and at a cheaper price. From an agency perspective, we also identify undervalued players to target, as well as potential superstars at the high school, college, and minor league level. Of course, with any machine learning algorithm, it’s never a guarantee but it can glean insights that we may not be aware of. There are many questions as to which variables to use, from what time period our data comes from, and many other data questions that need to be taken into consideration but I will keep it simple. 

Methodology

For this analysis I took starting pitchers that had 100+ innings pitched in 2021, which resulted in 115 starting pitchers. Each variable used in clustering adds a dimension to the data. For example, if we have three variables then the data lives in three dimensions. Obviously for four or higher we cannot visualize the data but also the data gets further apart, thus not as good clusters. I decided on three metrics: WAR, Swinging Strike %, and Barrel %. It is up for debate if these are the best to use and there very well could be a better combination, but I decided on these because they exemplify a pitcher’s value, swing and miss ability, and how well they limit hard contact. 

The clustering technique I will be utilizing is called k-means. This algorithm clusters the data into k groups. The difficulty with this algorithm is that we must choose a k value to run it. So how do we choose k? Well sometimes it is chosen with domain knowledge if the number of groups is known already and other times we use a metric to determine the best k value. In this case I will be utilizing a mix of domain knowledge and a metric. The metric I will use is called a silhouette score, where a value closer to 1 indicates better clustering. A silhouette score considers how close data is to their own cluster (cohesion) and how far away they are from other clusters (separation). For k from 2-10 I will calculate the silhouette score, and I will repeat this 10 times and average them to avoid variation caused by randomness. When clustering, we always scale the data because the algorithm is distance based.

We have all the tools, let’s cluster!

Clustering

From the silhouette scores, the best option is two clusters with a value of 0.396. However, there are more than two types of pitchers in the league so I will go with three, which had a value of 0.32. Neither of these scores are great, as they are far from 1, but real data is never like in textbooks. Here are the averages of the metrics per cluster. 

ClusterWARSwinging Strike %Barrel %
13.670.1310.068
20.9890.1010.094
32.190.0990.077

Results

Since we have only three variables, we can visualize the data in a 3D plot and see the separation of clusters.

x: WAR, y: Swinging Strike %, z: Barrel %

Another neat way of visualizing the groups and analyzing them is a radar plot. It shows what variables are the largest signals for each group.

Cluster 1

Cluster 1 (in red) is the group of elite pitchers. They have a high WAR and Swinging Strike % while limiting hard contact with a low Barrel %. Here are the pitchers in Cluster 1.

NameWARSwinging Strike %Barrel %Cluster
Carlos Rodon4.90.150.0661
Corbin Burnes7.50.1660.0311
Max Scherzer5.40.1590.081
Walker Buehler5.50.1160.0681
Brandon Woodruff4.70.1290.0581
Trevor Rogers4.20.1410.051
Lance Lynn4.20.120.0571
Zack Wheeler7.30.1240.0461
Kevin Gausman4.80.1530.0711
Robbie Ray3.90.1550.0981
Freddy Peralta3.90.1440.0571
Marcus Stroman3.40.1160.0651
Logan Webb40.1240.0561
Pablo Lopez2.30.1180.0741
Lance McCullers Jr.3.30.1160.0531
Shohei Ohtani30.1290.0711
Sandy Alcantara4.20.1330.0611
Alek Manoah20.1260.0581
Gerrit Cole5.30.1450.0981
Joe Musgrove3.20.1270.0741
Charlie Morton4.60.1240.0491
Frankie Montas4.10.1370.0871
Shane McClanahan2.50.1480.1071
Kyle Gibson3.10.1040.0451
Lucas Giolito40.1530.0671
Clayton Kershaw3.40.1670.0691
Luis Garcia3.10.1320.071
Tyler Mahle3.80.1140.0651
Nathan Eovaldi5.60.1260.0631
Alex Wood2.50.1250.0531
Jordan Montgomery3.30.1370.0741
Dylan Cease4.40.1480.0991
Sean Manaea3.30.1230.081
Luis Castillo3.70.1310.0451
Sonny Gray2.40.1060.0471
Yu Darvish2.90.1210.0881
Jameson Taillon20.1220.0821
German Marquez3.40.1210.0531
Aaron Nola4.50.1280.0711
Kenta Maeda1.70.1360.0631
Logan Gilbert2.20.1250.0881
Eduardo Rodriguez3.80.1170.0681
Brady Singer20.1020.0561
JT Brubaker0.30.120.0881
Andrew Heaney1.20.1270.0931

We see the top starting pitchers like NL CY Young award winner Corbin Burnes, AL CY Young award winner Robbie Ray, Max Scherzer, Kevin Gausman, Shohei Othani, and others. Some up-and-coming talent like Alex Manoah, Logan Webb, and Logan Gilbert are also a part of this group. There are many interesting names in this group, but a couple of names stand out here as surprising are JT Brubaker and Andrew Heaney. These players are not known as elite yet are in this group. Does this mean they will be elite? Not necessarily, but the metrics indicate they could be. This is a big reason why the Yankees last season and Dodgers this season took a chance on Heaney. They have the tools to be elite but could be a steal for a team.

Cluster 2

Cluster 2 (in green) are the underachievers with low WAR and Swinging Strike % and allow a lot of hard contact with a high barrel %. Here are the pitchers in Cluster 2.

NameWARSwinging Strike %Barrel %Cluster
Trevor Bauer1.80.1260.1062
Framber Valdez1.90.1020.0582
Adrian Houser1.50.070.052
Ian Anderson1.90.1190.0952
Casey Mize1.30.0940.12
Rich Hill1.60.0980.0882
James Kaprielian1.30.110.0942
Marco Gonzales0.60.0910.1142
Joe Ross1.30.1110.092
Blake Snell2.10.1290.112
Zac Gallen1.50.0910.0792
Jake Odorizzi10.0940.0962
Tarik Skubal0.80.1110.1432
Yusei Kikuchi1.10.1250.112
Taijuan Walker1.20.0950.1022
Austin Gomber1.30.1130.0942
Dane Dunning1.80.10.082
Nick Pivetta2.10.1060.0822
Jon Gray2.30.110.0692
Jon Lester00.0870.0772
Vladimir Gutierrez0.60.0960.0842
Drew Smyly0.20.1180.1082
Martin Perez0.50.080.0942
Kris Bubic-0.10.0890.0992
Triston McKenzie1.10.1240.12
Adbert Alzolay0.40.1150.112
Dallas Keuchel0.70.0860.0912
Garrett Richards0.40.0940.0932
Erick Fedde1.10.0890.092
Brad Keller1.10.0910.1092
Jordan Lyles-0.20.1050.0962
Wil Crowe-0.30.1050.0952
Zach Davies0.10.090.0912
J.A. Happ0.50.0810.1162
Patrick Corbin0.20.1120.0922
Mitch Keller1.10.0820.0682
Jorge Lopez0.80.0820.0932

We see some names that aren’t surprising like Dane Dunning, Drew Smyly, Wil Crowe, and others. Also, there are pitchers who took a step back like Yusei Kikuchi, Jake Odorizzi, and Blake Snell. Several surprises in here but a couple that stand out are Martin Perez and Trevor Bauer. Martin Perez got a long-term deal with the Tigers but was in this group. Bauer’s WAR was affected not playing the entire season but his Barrel % was high, which contributed to him being in this group.

Cluster 3

Cluster 3 (in blue) are the average performers for the season. Not the best but not the worst WAR, Swinging Strike %, and Barrel %. Here are the pitchers in Cluster 3.

NameWARSwinging Strike %Barrel %Cluster
Julio Urias50.1120.0533
Eric Lauer1.80.1050.073
Max Fried3.80.1110.0633
Adam Wainwright3.80.0810.0623
Cal Quantrill1.80.0890.073
Chris Bassitt3.30.1010.0653
Anthony DeSclafani30.110.0813
Wade Miley2.90.1010.0613
Jose Berrios4.10.0990.0913
Chris Flexen30.0860.0633
Jose Urquidy1.80.1180.0933
John Means2.50.1190.1013
Michael Pineda1.30.1040.0913
Steven Matz2.80.0940.073
Aaron Civale0.80.0940.0823
Johnny Cueto1.50.0970.0663
Zack Greinke1.40.0910.0663
Zach Eflin2.20.1020.0683
Cole Irvin2.10.0890.0733
Kyle Freeland1.50.0850.0823
Hyun-Jin Ryu2.50.0970.0853
Antonio Senzatela3.50.0860.0593
Merrill Kelly2.40.0880.0633
Tyler Anderson2.10.1150.0853
Michael Wacha1.50.1150.0973
Zach Plesac1.10.1120.0943
Madison Bumgarner1.50.0960.0753
Kyle Hendricks1.30.0890.0843
Mike Minor2.30.1070.0933
Chris Paddack1.80.1120.0853
Ryan Yarbrough0.90.0960.0783
Mike Foltynewicz-0.80.080.0963
Matt Harvey1.90.080.0793

There are some players that make sense like Johnny Cueto, Zack Grienke, and Hyun-Jin Ryu. A few surprises in here like Matt Harvey and Julio Urias for different reason. We’d expect Harvey in Cluster 2 while Urias would have been expected in Cluster 1. 

Conclusion

We can clearly see differences in groups and identified players that are undervalued and even possibly overvalued. There could very well be a better combination of metrics to use and even could employ other clustering techniques. The beauty of machine learning is that there is always another way to do something that could be an improvement, and the fun part is finding it. Clustering is not perfect, and we saw with the silhouette scores that it definitely wasn’t perfect in this case, but clustering is still very useful to group players to see what insights there could be. 

Data

The 2021 Sho (Part 2: Pitcher)

For as good as Shohei Ohtani is at the plate, his ability on the mound is arguably his best tool. His devastating splitter strikes fear in hitters around the league. Oh yeah, he also can hit 100 MPH on his fastball. He gets 100 MPH exit velocity off the bat and can throw 100 MPH, truly amazing. With the Angels letting Shohei hit for himself in games that he pitched in 2021, this led to a roll of the dice depending on how deep into the game he pitched. It made sense since he is a better hitter than anyone else that they could put in the DH spot but if he did not pitch deep into the game the Angels could possibly run out of position players to hit in the pitcher’s spot. When everything worked it was fantastic, but when it didn’t it was difficult. The key for Shohei pitching deep into games was his command. When his strike-to-ball ratio was bad, he could not pitch deep into games. To explore this I will use a technique I saw years ago in a Sloan Conference Research paper: “Bullpen Strategies For Major League Baseball” (https://www.sloansportsconference.com/research-papers/bullpen-strategies-for-major-league-baseball) by Harrison Willie and John Salmon. They use the strike-to-ball ratio to figure out when to pull a pitcher, but I will just be using their plotting technique plus regression to assess control for Shohei. Thought the technique used was a great way to visualize how well a pitcher is doing, specifically with command. Now let’s analyze Shohei. 

Command

I took pitch level data from BaseballSavant for this analysis. Unfortunately, they do not have a variable tracking the number of strikes, balls, or total pitches for a pitcher so I just engineered the variables myself. I plotted the data where the number of strikes were on the y-axis and the number of balls on the x-axis. I built a linear regression (line solid black and formula below) to find Shohei’s average strike-to-ball ratio. In simple terms, games where Shohei is above this line are good and below are bad. Games along or close to that line was average for him.

Strikes = 4.07 + 1.56*Balls

The plot on the left shows Shohei’s strike-to-ball ratio on the game level while the right color codes the number of pitches Shohei threw. What I noticed right away on the game level was that most of the games he was above average occurred in the second half of the season. This is no surprise, as anyone who watched Shohei (including myself) noticed he struggled mightily with command to start the season. On the pitch level we can see that for games that were above average he generally got to a higher pitch count. Let’s compare the first and second halves of the season. 

Now we can clearly see the differences in the first and second halves of the season. Clearly in the second half he had a much better strike-to-ball ratio in more games. So, we have established that Shohei had better command as the season progressed, but what changed? Although pitching was always my favorite position, I am not qualified to be a pitching coach nor am I in sports science. Thus, I cannot comment on the pitching mechanics changes Shohei implemented over the course of the season. BaseballSavant does have release positions, velocities, and accelerations but did not to go this route. Maybe a future blog post. I decided to look at something I noticed watching him pitch, fastball (four-seam + two-seam) velocity and splitter velocity. 

Fastball

The first thing I did was look at Shohei’s average fastball velocity by inning in the first and second halves of the season.

Shohei had a slightly lower fastball velocity over the course of a game in the second half of the season but not a huge difference. He kept up a fairly consistent velocity, but in the first half of the season in the 7th and 8th he increased velocity while it declined in the second half. Could have been by design for better command but he was doing so much every day, so fatigue was probably a factor.  Not much difference though, but let’s look at his splitter.

Splitter

This is where things get really interesting. In the first half Shohei’s average splitter velocity started above 90 MPH and decreased as the game went on. The second half of the season he started at a lower velocity and increased it after the 5th inning, but not getting to 90 MPH on average. Was it for command? Could be, or by design the lower velocity helps make his fastball better with a larger differential (average fastball minus splitter velocity).

Fastball & Splitter Velocity Differential

In the first half of the season Shohei had a low differential in fastball and splitter velocity on average. Both halves showed increased differential later in the game. The more differential, the more effective at getting swing and misses he is. More swing and misses mean more strikes and finishing off hitters earlier in the at bat. Watching him in the first half, he didn’t really get a lot of swing and misses on his splitter but in the second half he got more. 

Conclusion

We can conclude that Shohei’s change in fastball and splitter velocity led to better command in the second half, right? Not quite, need more statistical analysis to make that claim. Correlation never means causation and there could be many other factors that influenced this like mechanics changes, pitching to contact, better command of the slider, use of the cutter, or other things. Whatever it may have been, Shohei showed better command in the second half and thought this was an interesting observation. Truly lucky to witness his historic season in 2021 and excited to see what he does in 2022!

Data

The 2021 Sho (Part 1: Batter)

To kick off this blog I had to do something with my favorite team, the Angels. My friends and family would say I am obsessed but I will say passionate is the better word. Missed only one home opener the last 15+ years and going to 20+ games every year while having to follow every moment either at the ballpark, on TV, radio, Twitter, or the MLB AtBat app. Just love baseball and supporting the Angels, even though it has been a struggle recently. The 2021 Angels season was all about witnessing the historic season by AL MVP winner Shohei Ohtani. He dominated on both the mound and in the batter’s box. I have never witnessed one player do all that he did, and we all just expected greatness whenever he was on the field. For part 1 I focus on an aspect of Shohei the batter, and in part 2 the pitcher.

Shohei was fantastic in the batter’s box slugging 46 home runs, 152 wRC+, and 0.964 OPS. However, in the second half of the season he struggled, especially in August. In the second half his SLG went down over 200 points, but he actually increased his OBP with the high number of walks he received. This was a product of how pitchers were attacking him, a little impatience with no protection, and pulling the baseball more. Going from Mike Trout to Phil Gosselin (Goose!) hitting behind you is quite drastic. Pitchers were attacking Shohei very differently in the first and second half, which is what I wanted to explore. In addition, the distribution of pitches for swing and misses versus hits, and pull percentage were different as well. Let’s see how different it was.

NOTE: To visualize pitch locations, I will be using a heatmap that will show the magnitude of the number of pitches in each location. In simple terms, the lighter the blue the more pitches were in that area. The strike zone is in red, and the view is from the catcher’s perspective. 

First Half

Pull %Center %Opposite %
43.1%31.9%25%
Shohei Ohtani first half

In the first half of the season the distribution of pitch locations was fairly uniform around the strike zone with most pitches being middle down in the zone. This is probably biased with being ahead of the count. Shohei’s swinging strikes were concentrated middle up and down and outside. He dominated at about belt level across the plate with most of his damage on the inside part of the plate. The distribution of where he hit the baseball was pretty spread out with center + opposite field being higher than pull.

Second Half

Pull %Center %Opposite %
54.8%27.4%17.8%
*data via Fangraphs.com

In the second half pitchers attacked Shohei on the outside part of the plate with pitchers going way down and in more than the first half. His swing and misses were up and outside as well as down and in. The hits were middle up a little and down and outside, due to how he was pitched. Shohei became much more of a pull hitter in the second half as well and pitchers were exploiting this. 

Comparison

Here are some images that make it easy to compare the 1st and 2nd half for Shohei.

Conclusion

Ok so the distribution of pitches was quite different in the first and second half, so what? Well, we can clearly see that pitchers exploited the pull mentality by pitching him outside more and without protection they could try below the strike zone to get him to chase. For 2022 with Trout behind him (Or even Rendon), pitchers cannot do what they did in the second half and Shohei can be patient. He will need to get back to being less of a pull hitter, which probably stemmed from frustration not getting pitches to hit, but he is set up to perform more like his first half offensive numbers next season with protection behind him.

Data