One of last week’s blurb highlights, Josh Rojas (No. 47), has gotten the call to the majors. It’s been a good week for the Diamondbacks’ farm system, with Daulton Varsho (No. 10) also vaulting into the top 10. There is plenty of other movement in the rankings, especially towards the bottom of the list, where the projections are a virtual tie that is often broken by recent performance. The methodology of the system for projecting peak WAR can be found here, and more detailed projections with percentile outcomes beyond the top 100 can be found in the link here.
This week’s update will be a little different in that I’d like to discuss the historical accuracy of these projections. Some healthy skepticism is absolutely warranted when condensing the future performance of minor league players to one number, so these results should give a better idea of how to treat the projections. I used a four-year test sample, 2008-2011, as the most recent timespan in which the early careers of prospects have more or less played out. The table below shows the r² between xWAR and actual average 3-year peak WAR for every hitter projected, for just the top 100 projected hitters, and for just the top 50, respectively.
Year | Total r² | Top 100 r² | Top 50 r² |
2008 | .2432 | .2329 | .4057 |
2009 | .2085 | .1316 | .2134 |
2010 | .2326 | .2469 | .1658 |
2011 | .2616 | .1979 | .1726 |
The high correlation for top-50 prospects in 2008 is likely an outlier, but the model is fairly stable however the groups are sliced, with a slight lean towards higher accuracy when separating higher-level prospects from non-prospects (the full sample) than when ranking top prospects against each other. These numbers aren’t perfect, but they indicate that some value can be found in a field with high variance. And of course, a balanced approach that considers the scouting side will likely yield better results than the numbers alone.
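For readers who want to run the same kind of check on their own data, here is a minimal sketch of the calculation. It assumes r² is the squared Pearson correlation between projected and actual peak WAR, and that the top-N groups are chosen by projected value. The function names and the toy numbers are made up for illustration and are not taken from the actual model or its data.

```python
def r_squared(projected, actual):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(projected)
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


def top_n_r_squared(projected, actual, n):
    """r² restricted to the n players with the highest projections."""
    pairs = sorted(zip(projected, actual), key=lambda pair: pair[0], reverse=True)[:n]
    top_projected = [p for p, _ in pairs]
    top_actual = [a for _, a in pairs]
    return r_squared(top_projected, top_actual)


# Made-up toy sample: projected xWAR and actual 3-year peak WAR.
toy_projected = [3.1, 2.4, 1.8, 0.9, 0.5, 0.2]
toy_actual = [2.5, 3.0, 0.4, 1.1, 0.0, 0.3]

print("All hitters r²:", round(r_squared(toy_projected, toy_actual), 4))
print("Top 3 r²:", round(top_n_r_squared(toy_projected, toy_actual, 3), 4))
```

Narrowing to a top-N group shrinks the range of projections being compared, which is one reason the r² for a subset can bounce around more from year to year than the full-sample figure.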
An additional measure of fit to consider is the RMSE (Root Mean Square Error), which tells us the typical size of the residual error from the projections. The table below shows the results for the same years as above, alongside the standard deviation of actual outcomes in each sample. The standard deviation is the error you would get by projecting the sample mean for every player, so it serves as the minimum baseline the model must beat to provide value.
Year | Total RMSE | Total STDEV | Top 100 RMSE | Top 100 STDEV | Top 50 RMSE | Top 50 STDEV |
2008 | .6360 | .7309 | 1.5629 | 1.7415 | 1.7522 | 1.9963 |
2009 | .7458 | .8378 | 1.7505 | 1.8717 | 1.7635 | 1.9317 |
2010 | .7550 | .8593 | 1.7381 | 1.9520 | 2.2068 | 2.3770 |
2011 | .7337 | .8493 | 1.7972 | 1.9734 | 2.2320 | 2.4048 |
The RMSE beats the standard deviation in every category, meaning the model produces, at a minimum, less error than simply using the mean of each sample. These results show more distinctly that error increases when separating higher-level prospects from each other, and that players in the same tier should be treated as roughly the same.
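To put a number on how much the model beats the baseline, here is a short sketch that takes the RMSE and STDEV pairs from the table above and computes the share of baseline error removed. The helper names are my own, and the `stdev_baseline` function assumes a population standard deviation, which may not match how the table's STDEV column was computed.

```python
from math import sqrt


def rmse(projected, actual):
    """Root mean square error between projections and actual outcomes."""
    n = len(projected)
    return sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / n)


def stdev_baseline(actual):
    """Error from projecting the sample mean for everyone (population SD)."""
    n = len(actual)
    mean_a = sum(actual) / n
    return sqrt(sum((a - mean_a) ** 2 for a in actual) / n)


def error_reduction(model_rmse, baseline_stdev):
    """Fraction of the baseline error the model removes."""
    return 1 - model_rmse / baseline_stdev


# (RMSE, STDEV) pairs copied from the table above.
RESULTS = {
    2008: {"Total": (0.6360, 0.7309), "Top 100": (1.5629, 1.7415), "Top 50": (1.7522, 1.9963)},
    2009: {"Total": (0.7458, 0.8378), "Top 100": (1.7505, 1.8717), "Top 50": (1.7635, 1.9317)},
    2010: {"Total": (0.7550, 0.8593), "Top 100": (1.7381, 1.9520), "Top 50": (2.2068, 2.3770)},
    2011: {"Total": (0.7337, 0.8493), "Top 100": (1.7972, 1.9734), "Top 50": (2.2320, 2.4048)},
}

for year, groups in RESULTS.items():
    for group, (model_rmse, baseline) in groups.items():
        reduction = error_reduction(model_rmse, baseline)
        print(f"{year} {group}: {reduction:.1%} below baseline")
```

The `rmse` and `stdev_baseline` helpers are included only to show how the two columns could be produced from raw projections; the loop itself uses nothing but the published table values.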
Hopefully, this update sheds some light on the strengths and weaknesses of these projections and how best to use them. Next week we’ll be back to celebrating the unheralded Gabriel Morenos and Kevin Padlos of the minor league baseball world.
(Photo by Mark LoMoglio/Icon Sportswire)
What are the top 3 guys to get in an NL keeper league who will have an impact and an opportunity next season and could be stars soon? I see Waters and Pache, but they seem a year away. Carlson has a crowded STL OF, and Lux, Kieboom, and Grisham are gone. Who’s left with opportunity who can be a stud?
Kristian Robinson in the AZ OF, with Peralta gone and Marte moving back into the infield.
Carlson in St. Louis: they have a traffic jam in the OF. Even Gorman has huge power but doesn’t seem ready.
I’d still put Carlson/Waters at 1 and 2. They both just got sent up to AAA, so they shouldn’t be too far away, especially with the Braves having so many injuries in their OF. The Cardinals OF is crowded now, but Ozuna is going to be a free agent and Martinez could get moved over the offseason.
There are a few guys that I’d consider behind them. Pache, Varsho, Rodgers, and Hayes could all get longer looks in the next year, with Pache probably at the top of that group. There’s also Trammell and Chisholm, who carry more risk, though they also have more upside than some of the guys above. Looking beyond next season, Marco Luciano and Kristian Robinson are trending up. If I had to pick one for that third spot, I’d go with Luciano for the upside, but Pache should have better short-term returns.