
Introducing PLV Mistake Rate

Quantifying the pitches that pitchers regret.

Blake Snell has been flummoxing hitters and analysts alike his entire career, but he took it to another level in 2023. The strengths are obvious: he won the 2023 NL Cy Young with an incredible 2.25 ERA (best in MLB among starters) across 180 IP, with a 31.5% strikeout rate (98th %ile) and a matching 31.5% CSW (95th %ile). He’s a strikeout artist (29.7% career K rate) who has shown that he can overpower hitters and send them to the dugout shaking their heads. He’s also not without his warts: he had a 13.3% walk rate last year (3rd %ile) and has averaged more than 10% for his career. When you give away that many free passes, you are subject to the dangers of batted ball luck: hits drop, the runners on base come home, and your ERA goes up. It’s easy to see how this profile can come crashing back to Earth, even with the gaudy strikeout rate.

So how does he manage to walk so many hitters while also keeping his ERA in check? Is he simply lucky? A look at his BABIP (0.256; 91st %ile) and LOB% (86.7%; 98th %ile) certainly suggests that he has been. That’s not a fun (or rewarding) explanation, though, and it’s lazy analysis to simply write him off as lucky and move on. I propose a different explanation for how he limited the damage from his BB% (and how he controls some of his BABIP/LOB%): he makes fewer mistakes, which means he allows fewer hits, which allows his K% to bail him out more often.

 

What is a Mistake?

 

That’s a fairly obvious thing to say: making fewer mistakes is better for the pitcher. The tricky part is defining what a “mistake” actually is. I went through several permutations of what a mistake could be. Between actual results and the estimated results from our PLV model, there is a deep sandbox of metrics to check and thresholds to tweak when looking for a definition. Is it a pitch that has a low expected CSW%? Is it a pitch that’s likelier to result in a barrel or another type of dangerous contact? Some ideas simply didn’t yield useful results, while others were overly complicated. After searching for a useful and straightforward definition, it dawned on me: we already have thresholds for Quality Pitches (PLV > 5.5) and Bad Pitches (PLV < 4.5), and we can lean further into those categories to help us define a “mistake.”

Specifically, let’s look at Bad Pitches. A Bad Pitch is a pitch that, based on its characteristics (movement, velocity, location, and count), is expected to return poor outcomes for the pitcher, resulting in a higher-than-average expected run value for the pitch. Bad Pitches come in two flavors: those thrown out of the zone for a ball/HBP, and those thrown in the zone that are likelier to result in a hit. I wouldn’t generally classify balls as “mistakes,” since pitchers often throw them to entice a hitter into a bad swing outside the zone, or they may have just missed their location. Those pitches happen, and we move on. The more interesting genre of Bad Pitch is the Bad Pitch in the zone. Those pitches are expected to allow more runs by yielding more contact and/or more bases on contact. Pitchers don’t want more contact, and they definitely don’t want more bases. This will be our criterion for a Mistake: a Bad Pitch in the zone (aka a pitch in the zone with a PLV < 4.5). From 2020-2023, 8.5% of all pitches met this criterion, so roughly 1 in 12 pitches is classified as a Mistake Pitch. Given that we define Mistake Pitches as pitches in the zone, it’s also worth knowing that 18.8% of all pitches in the zone are classified as Mistake Pitches.
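To make the definition concrete, here is a minimal Python sketch of how pitches could be flagged and the two rates computed. Only the 4.5 PLV threshold and the in-zone requirement come from the definition above; the `Pitch` class, its field names, and the sample values are made up for illustration and are not the actual PLV data model.

```python
from dataclasses import dataclass

BAD_PITCH_PLV = 4.5  # threshold from the article: PLV below this is a Bad Pitch


@dataclass
class Pitch:
    plv: float     # PLV grade for the pitch (hypothetical field name)
    in_zone: bool  # True if the pitch was in the strike zone (hypothetical field name)


def is_mistake(pitch: Pitch) -> bool:
    """A Mistake Pitch is a Bad Pitch (PLV < 4.5) thrown in the zone."""
    return pitch.in_zone and pitch.plv < BAD_PITCH_PLV


def mistake_rates(pitches: list[Pitch]) -> tuple[float, float]:
    """Return (Mistakes per pitch, Mistakes per zone pitch)."""
    total = len(pitches)
    zone = sum(p.in_zone for p in pitches)
    mistakes = sum(is_mistake(p) for p in pitches)
    per_pitch = mistakes / total if total else 0.0
    per_zone = mistakes / zone if zone else 0.0
    return per_pitch, per_zone


# Made-up sample, for illustration only
sample = [
    Pitch(plv=5.8, in_zone=True),
    Pitch(plv=4.1, in_zone=True),   # Mistake: Bad Pitch in the zone
    Pitch(plv=3.9, in_zone=False),  # Bad Pitch, but out of the zone, so not a Mistake
    Pitch(plv=5.0, in_zone=False),
]
per_pitch, per_zone = mistake_rates(sample)
```

The two return values correspond to the two league-wide figures quoted above: Mistakes as a share of all pitches, and Mistakes as a share of pitches in the zone.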

I’m a visual learner, so I always check that things look the way I expect/hope them to before I dig in further. Here is a chart of where Mistake Pitches reside and how often they occur there. Darkest red is >50% Mistake Pitches and lightest is <5%; areas outside the strike zone are, by definition, 0%.

 

 

This passes the eye test! The most frequent locations are in the heart of the plate, where the hitter can do the most damage, and the frequency fades as you move toward the edges of the strike zone.

Now that we have a definition for what a Mistake Pitch is, we can dig in.

 

Analysis

 

Given that a Mistake Pitch is a Bad Pitch, and those are estimated to allow more and harder contact, that should be backed up by the data. If not, it’s back to square one of defining the metric. Thankfully, the results line up with our expectations: Mistake Pitches not only earn fewer CSW, but they yield ~2x as many hits per pitch and result in more bases on contact.

Mistake Rate Validation

Now that the metric is initially validated, let’s get into some more rigorous analysis of it, and then discuss applications. I’m a fan of showing more than telling, so get ready for some charts.

 

Stability

 

Research has shown that it can take ~400 pitches for the value of a pitcher’s location to stabilize, and the same is roughly true for Mistake Rate, which stabilizes (Cronbach’s alpha ≥ 0.7) at ~360 pitches.
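For readers unfamiliar with Cronbach’s alpha, here is a rough Python sketch of the standard formula. The layout is an assumption on my part rather than the exact procedure behind the number above: each row is a pitcher, each column is a pitch slot, and each cell is 1 if that pitch was a Mistake and 0 otherwise.

```python
from statistics import pvariance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a subjects x items matrix.

    rows[i][j] is subject i's score on item j. In this context a subject
    would be a pitcher and an item a pitch slot (1 if Mistake, else 0);
    that layout is an assumption, not a documented method.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 subjects and 2 items")
    k = len(rows[0])
    # Variance of each item (column) across subjects
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    # Variance of each subject's total score
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up toy matrix: 4 pitchers x 3 pitch slots, perfectly consistent
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha = cronbach_alpha(consistent)
```

Under this setup, the stabilization point would be the smallest number of pitch slots at which alpha reaches 0.7.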

 

 

At the 2023 MLB average of 16.6 pitches per IP, that works out to about 22 IP, or 4-5 starts, before we have a good idea of a pitcher’s Mistake Rate talent. This is great because it means that Mistake Rate can be used when analyzing pitchers with a very limited sample of innings, like relievers or prospects.
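The innings estimate is simple arithmetic, sketched here in Python. The pitch counts come from the text; the innings-per-start figure is my own assumption for illustration.

```python
STABILIZATION_PITCHES = 360  # stabilization point quoted in the article
PITCHES_PER_IP = 16.6        # 2023 MLB average quoted in the article
ASSUMED_IP_PER_START = 5.0   # assumption for illustration, not from the article

innings_needed = STABILIZATION_PITCHES / PITCHES_PER_IP
starts_needed = innings_needed / ASSUMED_IP_PER_START

print(f"{innings_needed:.1f} IP, about {starts_needed:.1f} starts")
```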

 

Stickiness

 

A metric that stabilizes quickly suggests that the pitcher himself drives it, rather than randomness or other variables. Another important test of whether a metric is innate to a pitcher is its year-to-year stickiness. If this is truly a pitcher’s skill, it should be fairly consistent from one year to the next, and Mistake Rate checks this box as well.

 

 

An r² of 0.65 is strong and tells us that a pitcher’s Mistake Rate can be expected to carry over from one season to the next. Combined with the relatively quick stabilization, this means that we can have an idea of what a pitcher’s Mistake Rate will be like in the next year, even if they don’t have a full season of innings to draw from.
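As a sketch of what a year-to-year stickiness check looks like, the Python below squares the Pearson correlation between paired seasons. The rates are made up for illustration, and the function is a generic textbook calculation, not necessarily the exact procedure behind the 0.65 figure.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return sxy / sqrt(sxx * syy)


# Made-up Mistake Rates for the same five pitchers in back-to-back seasons
year_one = [0.05, 0.07, 0.08, 0.10, 0.12]
year_two = [0.06, 0.06, 0.09, 0.09, 0.11]

r_squared = pearson_r(year_one, year_two) ** 2
```

Each pair is one pitcher’s rate in consecutive seasons, so a higher r² means the earlier season tells us more about the later one.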

 

Application

 

Descriptiveness

 

Now we have confirmation that Mistake Rate stabilizes early and is sticky year-to-year. That gives me confidence that this is a consistent stat that appears to measure a pitcher’s skill. That’s all well and good, but there’s another question we need to answer: how is this useful? It’s a very important question, and one that I have learned to always keep in mind. In the past I’ve spent too much time going down a rabbit hole honing a metric, only to realize that it’s merely a curiosity with no bearing on whether a player is good or not.

The biggest question I had for Mistake Rate was: how does it relate to batted balls (and hits, specifically)? Since Mistake Rate is a pitch-level stat, I looked at its relationship to Batted Balls-per-Pitch (how often a pitch ends in a field out or a hit; labeled Batted Ball%) and Hits-per-Pitch (Hit%). I also included Zone%’s relationship to each, to make sure that Mistake Rate adds something above and beyond knowing whether a pitch was in the zone. These charts use all pitchers from 2020-2023 with at least 100 pitches thrown, weighted by the number of pitches thrown. We’ll start with Batted Ball%.
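For anyone who wants to try a similar comparison, here is a minimal Python sketch of a pitch-weighted r² with the 100-pitch minimum applied. It assumes that “weighted by the number of pitches thrown” means a weighted Pearson correlation, which is my reading rather than a confirmed detail, and every row of data is invented.

```python
MIN_PITCHES = 100  # minimum sample from the article


def weighted_r_squared(xs, ys, weights):
    """Squared weighted Pearson correlation between xs and ys."""
    w_total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / w_total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / w_total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    var_y = sum(w * (y - mean_y) ** 2 for w, y in zip(weights, ys))
    if var_x == 0 or var_y == 0:
        raise ValueError("a variable has zero weighted variance")
    return cov ** 2 / (var_x * var_y)


# Made-up rows: (pitcher, Mistake Rate, Batted Ball%, pitches thrown)
rows = [
    ("A", 0.05, 0.16, 2800),
    ("B", 0.08, 0.18, 1500),
    ("C", 0.11, 0.19, 400),
    ("D", 0.20, 0.10, 40),  # below the minimum, so dropped
]
eligible = [r for r in rows if r[3] >= MIN_PITCHES]

r2 = weighted_r_squared(
    [r[1] for r in eligible],
    [r[2] for r in eligible],
    [r[3] for r in eligible],
)
```

Swapping Zone% in for the Mistake Rate column gives the baseline to compare against.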

 

 

That’s an exciting improvement over Zone%! I’m quite happy with an r² of 0.36 for estimating a pitcher’s ability to limit or allow batted balls, especially since there isn’t much of a relationship with Zone%. The Hit% charts show similar results.

 

 

Both metrics are noisier here than they were for Batted Ball%, but that makes sense given how variable batted ball luck can be. Using only the characteristics of a pitch (with no information about who the batter is, what the defense behind the pitcher is like, or what park they’re in), we can have a loose idea of how many hits a pitcher will allow. In all, I’m quite pleased with Mistake Rate’s relationship to batted balls and hits, and will be using it as a shorthand for gauging how “hittable” a pitcher is.

 

Conclusion

 

We now have a metric designed to identify Mistakes thrown by pitchers. We’ve shown here that it checks all of the boxes I like to see from a novel metric: it’s relatively quick to stabilize, it’s sticky from year to year, and it helps us describe aspects of a pitcher’s performance that have a large impact on their observed results. Something else that pleases me is that this helps quantify some of the vernacular around baseball. It’s pretty easy to call a hanging curveball in a 2-strike count a “mistake,” but now we have a way to quantify whether it actually was a Mistake Pitch and how often a pitcher throws them, and we know that avoiding them is a repeatable skill (or lack thereof).

This brings me back to Blake Snell. He just refuses to throw Mistakes. Only 3.2% of his pitches are classified as Mistakes; Eury Pérez is second (among pitchers with 1,500+ pitches) at 4.3%. Some of that has to do with Snell not hitting the strike zone (36.5% Zone rate, the lowest in baseball), but even when we look at Mistakes per zone pitch, Snell still leads the league by a relatively wide margin (8.6% Mistake/Zone, vs. 10.1% for Pérez in second). Snell’s strategy is to never miss in the middle of the zone. That will yield a high BB% if you have even average control, but he knows that his stuff will still allow him to generate strikeouts. That, combined with his lack of Mistake Pitches, means that he won’t be punished as harshly as someone like Zach Davies (also a sub-40% Zone rate, but with less impressive stuff and twice the rate of Mistakes).

I’ll close with the 2023 Mistake Rate leaderboard for all pitchers with at least 1,500 pitches. Enjoy!

 

Mistake Rate Leaderboard
