
Introducing PLV Mistake Rate

Quantifying the pitches that pitchers regret.

Blake Snell has been flummoxing hitters and analysts alike his entire career, but he took it to another level in 2023. The strengths are obvious: he won the 2023 NL Cy Young with an incredible 2.25 ERA (best in MLB for starters) across 180 IP, with a 31.5% strikeout rate (98th %ile) and a matching 31.5 CSW% (95th %ile). He’s a strikeout artist (29.7% career K rate) who has shown that he can overpower hitters and send them to the dugout shaking their heads.

He’s also not without his warts: he had a 13.3% walk rate last year (3rd %ile) and has averaged >10% for his career. When you give away that many free passes, you are subject to the dangers of batted ball luck: hits drop, which brings the runners on base home, and your ERA goes up. It’s easy to see how this profile can come crashing back to Earth, even with the gaudy strikeout rate.

So how does he manage to walk so many hitters while also keeping his ERA in check? Is he simply lucky? A look at his BABIP (0.256; 91st %ile) and LOB% (86.7%; 98th %ile) certainly indicates that he has been. That’s not a fun (or rewarding) explanation, though, and it’s lazy analysis to simply write him off as lucky and move on. I propose a different explanation for how he limited the damage from his BB% (and how he controls some of his BABIP/LOB%): he makes fewer mistakes, which means he allows fewer hits, which allows his K% to bail him out more often.

 

What is a Mistake?

 

That’s a fairly obvious thing to say: making fewer mistakes is better for the pitcher. The tricky part is defining what a “mistake” actually is. I went through several different permutations of what a mistake could be. Between actual results and estimated results with our PLV model, there is a deep sandbox of metrics to check and thresholds to tweak when looking for a definition. Is it a pitch that has a low expected CSW%? Is it a pitch that’s likelier to result in a barrel or other type of dangerous contact? Some ideas simply didn’t yield useful results, while others were overly complicated. After searching around for a useful and straightforward definition, it dawned on me: we already have thresholds for Quality Pitches (PLV >5.5) and Bad Pitches (PLV <4.5), and we can lean further into those categories to help us define a “mistake.”

Specifically, let’s look at Bad Pitches: a Bad Pitch is a pitch that, based on its characteristics (movement, velocity, location, and count), is expected to return poor outcomes for the pitcher, resulting in a higher-than-average expected run value for the pitch. Bad Pitches come in two flavors: those thrown out of the zone for a ball/HBP, or those thrown in the zone that are likelier to result in a hit. I wouldn’t generally classify pitches that are balls as a “mistake,” since pitchers often use them to induce a hitter into a bad swing outside the zone, or they may have just missed their location. Those pitches happen, and we move on. The more interesting genre of Bad Pitch is the Bad Pitch in the zone. Those pitches are expected to allow more runs by yielding more contact and/or more bases on contact. Pitchers don’t want more contact, and they definitely don’t want more bases. This will be our definition of a Mistake: a Bad Pitch in the zone (aka a pitch in the zone with a PLV <4.5). From 2020-2023, 8.5% of all pitches have met these criteria, so roughly 1 in 12 pitches is classified as a Mistake Pitch. Given that we define Mistake Pitches as pitches in the zone, it’s also worth knowing that 18.8% of all pitches in the zone are classified as Mistake Pitches.
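To make the definition concrete, here is a minimal sketch of how a pitch could be flagged as a Mistake and how the two rates quoted above (Mistakes per pitch and Mistakes per zone pitch) could be tallied. The `Pitch` class, its field names, and the sample values are hypothetical and purely illustrative; only the 4.5 threshold and the in-zone requirement come from the definition above.

```python
from dataclasses import dataclass

BAD_PITCH_PLV = 4.5  # from the article: PLV below this marks a Bad Pitch


@dataclass
class Pitch:
    plv: float     # PLV score of the pitch (hypothetical field name)
    in_zone: bool  # True if the pitch was located in the strike zone


def is_mistake(pitch: Pitch) -> bool:
    """A Mistake is a Bad Pitch (PLV < 4.5) that is also in the zone."""
    return pitch.in_zone and pitch.plv < BAD_PITCH_PLV


def mistake_rates(pitches):
    """Return (Mistakes per pitch, Mistakes per zone pitch)."""
    total = len(pitches)
    zone = sum(p.in_zone for p in pitches)
    mistakes = sum(is_mistake(p) for p in pitches)
    mistake_rate = mistakes / total if total else 0.0
    mistake_per_zone = mistakes / zone if zone else 0.0
    return mistake_rate, mistake_per_zone


# Made-up sample: one in-zone Bad Pitch, one out-of-zone Bad Pitch,
# and two in-zone pitches above the Bad Pitch threshold.
sample = [Pitch(5.8, True), Pitch(4.2, True), Pitch(3.9, False), Pitch(5.0, True)]
rate, per_zone = mistake_rates(sample)
```

Note that the out-of-zone pitch with a 3.9 PLV is still a Bad Pitch, but it does not count as a Mistake because it missed the zone.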

I’m a visual learner, so I always check to make sure things look like how I expect/hope them to look before I dig in further. Here is a chart of where Mistake Pitches reside, and how often they occur there (darkest red is >50% Mistake Pitches, lightest is <5%. Areas outside the strike zone are, by definition, 0%).

 

 

This passes the eye test! The most frequent locations are in the heart of the plate, where the hitter can do the most damage, and the frequency fades as you move toward the edges of the strike zone.

Now that we have a definition for what a Mistake Pitch is, we can dig in.

 

Analysis

 

Given that a Mistake Pitch is a Bad Pitch, and those are estimated to allow more and harder contact, that should be backed up by the data. If not, it’s back to square one of defining the metric. Thankfully, the results line up with our expectations: Mistake Pitches not only earn fewer CSW, but they yield ~2x as many hits per pitch and result in more bases on contact.

Mistake Rate Validation

Now that the metric is initially validated, let’s get into some more rigorous analysis of it, and then discuss applications. I’m a fan of showing more than telling, so get ready for some charts.

 

Stability

 

Research has shown that it can take ~400 pitches for the value of a pitcher’s location to stabilize, and that is similarly true for Mistake Rate, which stabilizes (Cronbach’s Alpha >= 0.7) at ~360 pitches.

 

 

Using the 2023 MLB average of 16.6 pitches per IP, this means that we have a good idea of what a pitcher’s Mistake Rate talent is after about 22 IP, or 4-5 starts. This is great because it means that Mistake Rate can be used when analyzing pitchers with a very limited sample of innings, like relievers or prospects.
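The conversion is simple arithmetic, sketched here for anyone who wants to plug in a different stabilization point or league average. The 5 innings per start is my own rough assumption for illustration; it is not a figure from the analysis.

```python
PITCHES_PER_IP_2023 = 16.6   # 2023 MLB average quoted above
STABILIZATION_PITCHES = 360  # approximate stabilization point quoted above
INNINGS_PER_START = 5.0      # assumption for illustration only


def pitches_to_innings(pitches, pitches_per_ip=PITCHES_PER_IP_2023):
    """Convert a pitch count into an approximate innings total."""
    return pitches / pitches_per_ip


innings = pitches_to_innings(STABILIZATION_PITCHES)
starts = innings / INNINGS_PER_START
print(f"{innings:.1f} IP, about {starts:.1f} starts")  # 21.7 IP, about 4.3 starts
```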

 

Stickiness

 

A metric that stabilizes quickly tells us that the pitcher influences the metric (instead of it being influenced more by randomness/other variables). Another important test of whether a metric reflects a pitcher’s own skill is its year-to-year stickiness. If this is truly a pitcher’s skill, it should be fairly consistent from one year to the next, and Mistake Rate checks this box as well.

 

 

An r² of 0.65 is strong and tells us that a pitcher’s Mistake Rate can be expected to continue from one season to the next. Combined with the relatively quick stabilization, this means that we can have an idea of what a pitcher’s Mistake Rate will be like in the next year, even if they don’t have a full season of innings to draw from.
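As a rough illustration of what a year-to-year check like this involves, the sketch below computes the squared Pearson correlation between two lists of Mistake Rates, one per season, paired by pitcher. The function name and the sample numbers are made up, and details such as pitch minimums are not specified here, so treat this as a sketch and not the exact calculation behind the 0.65.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with 2+ values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the lists has no variance")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical Mistake Rates for the same five pitchers in back-to-back years.
year_one = [0.045, 0.070, 0.085, 0.095, 0.110]
year_two = [0.050, 0.065, 0.090, 0.088, 0.105]
stickiness = r_squared(year_one, year_two)
```

A value near 1 would mean a pitcher’s Mistake Rate in one season almost fully predicts the next; a value near 0 would mean it carries no signal.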

 

Application

 

Descriptiveness

 

Now we have confirmation that Mistake Rate stabilizes early and is sticky year-to-year. That gives me confidence that this is a consistent stat that appears to be a measure of a pitcher’s skill. That’s all well and good, but there’s another question we need to answer: how is this useful? It’s a very important question and one that I have learned to always keep in mind, because in the past I’ve spent too much time going down a rabbit hole of properly honing a metric, only to realize it’s merely a curiosity and doesn’t have any bearing on whether a player is good or not.

The biggest question I had for Mistake Rate was: how does it relate to batted balls (and hits, specifically)? Since Mistake Rate is a pitch-level stat, I looked at its relation to Batted Balls-per-Pitch (aka: is a pitch a field out or hit; labeled Batted Ball%) and Hits-per-Pitch (Hit%). I also included their relationship to Zone%, to make sure that there is a benefit brought by Mistake Rate above and beyond knowing that a pitch was in the zone or not. These charts are using all pitchers from 2020-2023 with at least 100 pitches thrown, weighted by the number of pitches thrown. We’ll start with Batted Ball%.

 

 

That’s an exciting improvement over Zone%! I’m quite happy with an r² of 0.36 for estimating a pitcher’s ability to limit or allow batted balls, especially since there isn’t much of a relationship with Zone%. The Hit% charts show similar results.

 

 

Both metrics are noisier than before, but that makes sense given how variable batted ball luck can be. Using only the characteristics of a pitch (with no information about who the batter is, what the defense behind the pitcher is like, or what park they’re in), we can have a loose idea of how many hits a pitcher will allow. In all, I’m quite pleased with Mistake Rate’s relationship to batted balls and hits, and will be using it as a shorthand for gauging how “hittable” a pitcher is.
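The comparisons in this section weight each pitcher by pitches thrown and drop anyone under 100 pitches. Here is a minimal sketch of how a pitch-weighted r² like that could be computed; the function names, the tuple layout, and the sample rows are hypothetical, and this is one reasonable way to do the weighting, not necessarily the exact method behind the charts.

```python
MIN_PITCHES = 100  # minimum sample used in the article's charts


def weighted_r_squared(xs, ys, weights):
    """Squared weighted Pearson correlation between xs and ys."""
    total_w = sum(weights)
    if total_w <= 0 or not (len(xs) == len(ys) == len(weights)):
        raise ValueError("need equal-length inputs with positive total weight")
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    syy = sum(w * (y - mean_y) ** 2 for w, y in zip(weights, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("one of the inputs has no variance")
    return (sxy * sxy) / (sxx * syy)


def mistake_vs_outcome_r2(rows, min_pitches=MIN_PITCHES):
    """rows: (pitches, mistake_rate, outcome_rate) tuples, one per pitcher.

    outcome_rate could be Batted Ball% or Hit%, both per pitch.
    """
    kept = [row for row in rows if row[0] >= min_pitches]
    weights = [row[0] for row in kept]
    xs = [row[1] for row in kept]
    ys = [row[2] for row in kept]
    return weighted_r_squared(xs, ys, weights)


# Made-up rows; the 50-pitch pitcher is filtered out before the fit.
rows = [(50, 0.90, 0.00), (200, 0.05, 0.10), (400, 0.08, 0.13), (300, 0.11, 0.16)]
r2 = mistake_vs_outcome_r2(rows)
```

Running the same function with Zone% in place of Mistake Rate would give the baseline comparison described above.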

 

Conclusion

 

We now have a metric designed to identify Mistakes thrown by pitchers. We’ve shown here that it checks all of the boxes I like to see from a novel metric: it’s relatively quick to stabilize, it’s sticky from year to year, and it helps us describe aspects of a pitcher’s performance that have a large impact on their observed results. Something else that pleases me is that this helps quantify some of the vernacular around baseball. It’s pretty easy to call a hanging curveball in a 2-strike count a “mistake,” but now we have a way to quantify whether it actually was a Mistake Pitch and how often a pitcher throws them, and we know that avoiding them is a repeatable skill (or lack thereof).

This brings me back to Blake Snell. He just refuses to throw Mistakes. Only 3.2% of his pitches were classified as Mistakes, while Eury Pérez was 2nd (among pitchers with 1,500+ pitches), at 4.3%. Some of that has to do with him not hitting the strike zone (36.5% Zone rate; the lowest in baseball), but even when we look at Mistakes per Zone Pitch, Snell still leads the league by a relatively wide margin (8.6% Mistake/Zone, vs 10.1% for Pérez in 2nd). Snell’s strategy is to never miss in the middle of the zone, which will yield a high BB% if you have even average control, but he knows that his stuff will allow him to generate strikeouts there. That, combined with his lack of Mistake Pitches, means that he won’t be punished as harshly as someone like Zach Davies (also sub-40% Zone rate, but with less impressive stuff and twice the rate of Mistakes).

I’ll close with the 2023 Mistake Rate leaderboard for all pitchers with at least 1,500 pitches. Enjoy!

 

Mistake Rate Leaderboard

4 responses to “Introducing PLV Mistake Rate”

  1. portyl says:

    Super interesting. Probably has some DFS implications to target high mistake rate pitchers. How do you explain Eflin? Was he lucky? Is his stuff that good that it’s still hard to hit in mistake zone?

    • Kyle Bland says:

      There’s definitely some relation between high Mistake Rates, and allowing balls in play, specifically ones that result in hits, so that type of pitcher is a good option to target for DFS or standard fantasy, especially if the lineup is aggressive.

      There are a ton of factors that go into PLV, which then feeds this. Regarding Eflin specifically: he throws a lot of strikes and generally locates in places that are helpful, but if he misses middle, his stuff isn’t good enough to cover for him. High-zone, low-Stuff pitchers, like him, will tend to score lower by this metric (in addition to guys who consistently don’t locate well in the zone, or hang a breaking ball), because they don’t have the margin for error of high stuff guys, and their high zone rate means they’ll be flirting with mistakes more than pitchers who run lower zone rates.

  2. Bill says:

    Very interesting article, and one that seems to raise more questions than it answers.

    The rank-order list does not correlate with generally accepted rankings of pitcher quality/effectiveness, begging the question of what factors might explain this. George Kirby’s position is especially surprising in light of his reputation as a control specialist.

    I wonder which additional factor(s) can be included to nudge the ranking closer to the ranking of actual outcomes. Possibilities that occur include velocity, pitch variety/repertoire, extension, and the various measures of pitch movement. It would also be interesting to look at the extent to which park factors and quality of the supporting defense each affect outcomes.

    • Kyle Bland says:

      Thanks Bill, glad you found it interesting!

      I’ll lead off by saying: this metric is not designed to be a holistic quality/effectiveness analog. It was designed to quantify an idea that is common in baseball parlance, but is nebulous (a “mistake” pitch). In doing so, it also illustrates how a particular pitching strategy (don’t miss middle, don’t worry about balls/walks, and let your stuff work at the edges for Ks/weak contact) can be effective.

      I will be the first to say that there are factors that affect overall results that are purposefully NOT included in this stat. The biggest: it’s only for in-zone pitches, which is only half of the pitches thrown by a pitcher, at most (unless you’re George Kirby). Not including half of pitches thrown is always going to give an incomplete image of a player.

      There are a number of factors included in PLV (which gives us the quality measurement for Mistake pitches), including some you’ve mentioned: velo, extension, movement, and difference in velo/movement from Fastball (a proxy for repertoire) are all included. There are a lot of interesting dynamics between these factors, including some that are counterintuitive. Ex: Count is an included factor in PLV that may be counterintuitive for this metric: in a 3-ball count, the relative risk of throwing a pitch in the strike zone is less (because you’re already likely to give up a base on balls, anyway). PLV controls for count in that way, so you can’t get as low of a PLV score with a meatball in those counts (you’re already *mostly* expected to give up a base on balls, so giving up a batted ball isn’t much worse, because there’s also a chance that batted ball is an out). If you’re a pitcher who avoids 3-ball counts (Kirby says hello) you will be “penalized” for it in this metric, because you shouldn’t be throwing in/near the middle of the zone in those pitcher’s counts you find yourself in.

      Kirby is a particularly interesting case for this metric. His Mistake/Zone rate is 19%, which is roughly league average, but he’ll sink down the Mistake Rate leaderboard because he simply throws a ton of zone pitches. He’s for sure a command artist, but he could also afford to throw fewer pitches in the zone. With his ability to locate for strikes at the edges of the zone, you would think he could be better about avoiding the heart of the zone, too. His breaking pitches, in particular, both run at least a 90th %ile Zone% (with the Curve at 99th!), and every single pitch he throws is located middle-middle more than average.

      Again, Kirby is an ELITE pitcher, but the things that make him elite are things that either aren’t included in this metric, or actively cause him to look worse by this metric. That’s ok, and that’s why we have different metrics (many of which will show how elite Kirby is).
