Analyzing the Accuracy of the 2012 Projections | Part 2: Pitchers
The cheatsheets that you see on this site aggregate a lot of available data for your fantasy baseball drafts. One of the main features within the sheets is the array of projections to look at for each player. These are fairly scientific projections that take in a lot of factors to predict player performance. If you’re using these sheets within your drafts, you’re putting a lot of faith that the projections are providing an accurate picture of the draft pool. That being said, there are loads of projections to choose from and the list is growing every year. From that list of oddly-named projections, what should you choose? Or, more specifically, which one is the “best” for fantasy baseball purposes?
I posed this question last month and looked at how the projection systems predicted performance for 5×5 roto hitters in 2012. In that analysis, the Steamer projections beat out the competition. In addition, an approach of averaging four of the main free projections for a Combined projection did quite well. This was similar to the results from 2011, which suggests a trend of Steamer being a consistently strong predictor of fantasy success for hitters.
Now what about the pitchers?
Glad you asked. The time has come to analyze them too. So, first, let’s meet the contenders. There were nine different projection systems used in this analysis. Here is some background on each of them:
- Marcel: The most basic forecasting system around. It takes three years of player data, weights the most recent years most heavily and regresses the players towards a mean (age factor included).
- Steamer: Steamer also takes years of player performance but then regresses certain stats more heavily than others while using an aging factor as well. It also accounts for player role, park factors and league factors. Each component within their projections uses a different projection system. It also uses pitcher velocity as part of its system.
- ZiPS: Does a little of the weighted regression like Marcel but for four years and does a bit of the comparable player regression based on aging trends.
- CAIRO: It’s like Marcel but with more bells and whistles (stat-specific regression and position-specific regression, for instance).
- Fangraphs Fans: These are user-entered projections for a player that are entered into the Fangraphs site. They average out all of those user-entered projections for a look at how the public views a player as opposed to a computer algorithm. A minimum of 8 “fans” must have submitted a projection for a player in order for them to be eligible.
- RotoChamp: Also takes three years of player data and weights the most recent years heavier. Utilizes projected roles and uses FIP/xFIP as opposed to ERA.
- MORPS: I can’t find a full explanation of his system but he seems to utilize some of the simplicity of Marcel and complexity of ZiPS while adding a human element of accounting for role changes from one year to the next.
- PECOTA: A bit more complicated in that it finds comparable players to each projected player and bases the projections on the history of those comparable players.
- Combined (MSZC): Averaging the four main free projections (Marcel, Steamer, ZiPS and CAIRO) for a player into one projection.
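To make the Combined (MSZC) idea concrete, here is a minimal sketch of averaging one stat for one pitcher across the four free systems. The strikeout numbers are made up for illustration, and the simple unweighted mean is an assumption; the exact averaging method is not spelled out above.

```python
from statistics import mean

# Hypothetical projected strikeouts for a single pitcher (illustrative values only).
projections = {
    "Marcel": 180.0,
    "Steamer": 200.0,
    "ZiPS": 190.0,
    "CAIRO": 186.0,
}


def combine(values_by_system):
    """Return the simple unweighted average of one stat across systems."""
    return mean(values_by_system.values())


combined_k = combine(projections)
print(combined_k)
```

The same function would be applied stat by stat and player by player to build the full Combined projection.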
Now that you know the players, let’s dive into the game of analyzing the projections. First off, I needed to make sure I was only analyzing players that were shared in all nine projections, so I weeded out any who were not projected by every one of the systems. The next step was to make sure I was only looking at players that were relevant for fantasy baseball drafts last season, so I looked at 2012 preseason ADP and weeded out any players that were not being drafted in fantasy drafts last year. After all of this weeding, I washed my hands and saw that I was left with 155 pitchers to analyze. These were all players that were relevant on draft day last year.
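The two weeding steps can be sketched roughly as follows. The player names, ADP values and function names here are all hypothetical, and only three systems are shown to keep it short.

```python
# Hypothetical data: which pitchers each system projected, plus preseason ADP.
projected_by = {
    "Marcel": {"Pitcher A", "Pitcher B", "Pitcher C", "Pitcher D"},
    "Steamer": {"Pitcher A", "Pitcher B", "Pitcher C"},
    "ZiPS": {"Pitcher A", "Pitcher B", "Pitcher C", "Pitcher D"},
}
adp = {"Pitcher A": 12.0, "Pitcher B": 150.0, "Pitcher C": 310.0}


def shared_players(projected_by):
    """Keep only players that every system projected."""
    return set.intersection(*projected_by.values())


def drafted(players, adp, max_adp=None):
    """Keep players that have an ADP, optionally at or under a cutoff."""
    keep = {p for p in players if p in adp}
    if max_adp is not None:
        keep = {p for p in keep if adp[p] <= max_adp}
    return keep


pool = drafted(shared_players(projected_by), adp)
top_pool = drafted(shared_players(projected_by), adp, max_adp=225)
print(sorted(pool), sorted(top_pool))
```

The optional `max_adp` cutoff is included so the same helper could also narrow the pool to earlier draft picks.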
At this point, I started to compare actual results to projected results for the traditional rotisserie pitching categories. I calculated the correlation of projections to actual results for each system as well as the root mean square error (RMSE). The correlation coefficient helps to show us how well the projections ranked the players for each stat in comparison to the actual rankings. On the other hand, the RMSE value analyzes the misses and the extent to which the projections varied from the actual results. A lower RMSE indicates closer agreement between the projected and actual results.
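For reference, here is a minimal sketch of the two measures. It assumes a Pearson correlation (the type of correlation is not specified above), and the sample numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rmse(xs, ys):
    """Root mean square error: the typical size of a miss (lower is better)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical projected vs. actual strikeouts for three pitchers.
projected = [200.0, 150.0, 100.0]
actual = [210.0, 140.0, 100.0]
print(pearson(projected, actual), rmse(projected, actual))
```

A system can rank players well (high correlation) and still miss by a lot on the raw numbers (high RMSE), which is why both are worth looking at.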
In this analysis, I standardized all of the stats within their own universe, because on draft day we’re comparing players within a single projection system. If a system projects Justin Verlander to be 3 standard deviations above the mean in total strikeouts, it doesn’t matter whether the mean is 120 strikeouts or 150 strikeouts in that projection universe. This concept is how the WERTH values work on this site to show you projected roto value in each category.
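Here is a small sketch of what standardizing within a universe means. The strikeout totals are invented, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw stats to z-scores within one projection universe."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


# Two hypothetical systems that disagree on the overall strikeout level
# but agree on how far apart the same three pitchers are.
system_low = [120.0, 150.0, 180.0]
system_high = [150.0, 180.0, 210.0]

print(standardize(system_low))
print(standardize(system_high))
```

Both systems produce the same z-scores for the same three pitchers, so a difference in the overall level of the projections does not affect the comparison.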
Using that concept, I ran the analysis and came up with correlation coefficients and RMSE values for each projection system in each roto stat. While displaying the rankings at this point would be a nice snapshot of winners and losers, that wouldn’t account for the times when 1st, 2nd and 3rd place were a virtual tie or when last place was far, far behind the others. To account for that, I converted each system’s results to standardized z-scores to show how far above or below average each projection system was in each of the roto stats.
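The conversion from raw results to z-scores across systems might look something like this. The system names and metric values are hypothetical, and flipping the sign for RMSE so that a positive score always means "better than average" is an assumption about how the numbers are presented.

```python
from statistics import mean, pstdev


def system_zscores(metric_by_system, higher_is_better=True):
    """Express each system's result as a z-score relative to the other systems."""
    values = list(metric_by_system.values())
    mu, sigma = mean(values), pstdev(values)
    sign = 1.0 if higher_is_better else -1.0  # assumption: flip so positive = better
    return {name: sign * (v - mu) / sigma for name, v in metric_by_system.items()}


# Hypothetical correlation and RMSE results for three made-up systems in one stat.
correlations = {"System A": 0.60, "System B": 0.50, "System C": 0.40}
errors = {"System A": 0.80, "System B": 1.00, "System C": 1.20}

corr_z = system_zscores(correlations)
rmse_z = system_zscores(errors, higher_is_better=False)
print(corr_z)
print(rmse_z)
```

Unlike a plain 1st-to-9th ranking, these scores show when the top few systems are nearly tied and when the last one is far behind.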
You know the players and you know the game now, so let’s see the results. In addition to the five roto categories, I also did the same analysis for the projected total WERTH value (which adds up a player’s standardized scores in the five categories for their total roto value). Finally, I averaged those six z-scores for the final tally of winners and losers.
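As a rough illustration of the total WERTH idea, the sketch below adds up one pitcher's standardized scores across the five categories. The scores are invented, and it assumes ERA and WHIP have already been sign-flipped so that a positive number is good in every category.

```python
def total_werth(category_z):
    """Sum a player's standardized scores across the roto categories."""
    return sum(category_z.values())


# Hypothetical standardized scores for one pitcher.
# Assumption: ERA and WHIP are already flipped so that positive means better.
pitcher = {"W": 1.5, "K": 2.0, "S": -0.5, "ERA": 1.0, "WHIP": 1.0}

total = total_werth(pitcher)
print(total)
```

That projected total is then compared against the actual total in the same way as any single category.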
The first analysis looked at all 155 pitchers.
Correlation rankings (results converted to z-scores)
System | W | K | S | ERA | WHIP | WERTH | Avg
Steamer | 1.1 | 1.3 | 0.7 | 1.3 | 1.7 | 1.6 | 1.3
Combined | 0.3 | 0.7 | 0.7 | 1.2 | 1.0 | 0.7 | 0.8
PECOTA | 0.8 | 0.8 | 0.4 | -0.2 | 0.0 | 0.2 | 0.3
Fans | 0.9 | 0.5 | -0.1 | -0.3 | 0.4 | 0.6 | 0.3
RotoChamp | -0.4 | -0.7 | 0.5 | 1.1 | -0.7 | 0.3 | 0.0
ZiPS | -0.8 | -0.1 | -0.1 | 0.1 | 0.4 | 0.1 | -0.1
MORPS | 0.6 | 0.4 | 0.6 | -1.2 | -1.6 | -1.6 | -0.5
CAIRO | -1.8 | -1.3 | -0.2 | -1.0 | -0.6 | -0.7 | -0.9
Marcel | -0.7 | -1.6 | -2.5 | -0.9 | -0.6 | -1.2 | -1.3
RMSE rankings (results converted to z-scores)
System | W | K | S | ERA | WHIP | WERTH | Avg
Steamer | 1.2 | 1.4 | 0.7 | 1.3 | 1.7 | 1.7 | 1.3
Combined | 0.3 | 0.7 | 0.8 | 1.2 | 0.9 | 0.7 | 0.8
Fans | 0.9 | 0.5 | -0.1 | -0.3 | 0.4 | 0.6 | 0.3
PECOTA | 0.8 | 0.8 | 0.4 | -0.3 | -0.1 | 0.2 | 0.3
RotoChamp | -0.5 | -0.7 | 0.5 | 1.1 | -0.7 | 0.3 | 0.0
ZiPS | -0.8 | -0.1 | -0.1 | 0.1 | 0.4 | 0.1 | -0.1
MORPS | 0.6 | 0.3 | 0.6 | -1.2 | -1.6 | -1.6 | -0.5
CAIRO | -1.8 | -1.3 | -0.2 | -1.0 | -0.6 | -0.7 | -0.9
Marcel | -0.7 | -1.5 | -2.5 | -0.9 | -0.7 | -1.2 | -1.3
The results actually look quite similar to what we saw when I analyzed the hitters. Steamer is on top by a wide margin and the Combined projections are sitting up there as well. For the hitters, the bottom two of CAIRO and Marcel were also far below the rest. I expected the rankings to look slightly different since the methodology for projecting pitchers is different from the one for hitters, but the takeaway seems to be that a good system is good in any format. With that being said, Steamer continues to wipe the floor with the field.
I should note that Saves are likely the hardest category to project, since they are mostly based on opportunity, and opportunity is hard to project because it is controlled by a manager’s whim. If you strike Saves from the record, you can see that Steamer’s average z-score would go up even more (despite finishing well in Saves). I also ran this analysis separately with relievers taken out and the results were similar, with Steamer beating the field.
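The effect of striking Saves can be checked with quick arithmetic on Steamer's correlation row for all 155 pitchers (1.1, 1.3, 0.7, 1.3, 1.7, 1.6). This sketch simply re-averages the rounded z-scores shown in the table, so it is only an approximation; it also leaves the WERTH column untouched even though total WERTH itself includes Saves. The `average` helper is a hypothetical name.

```python
# Steamer's correlation z-scores for all 155 pitchers, as shown in the table.
steamer = {"W": 1.1, "K": 1.3, "S": 0.7, "ERA": 1.3, "WHIP": 1.7, "WERTH": 1.6}


def average(scores, exclude=()):
    """Average the z-scores, optionally leaving out some categories."""
    kept = [value for key, value in scores.items() if key not in exclude]
    return sum(kept) / len(kept)


with_saves = average(steamer)
without_saves = average(steamer, exclude=("S",))
print(round(with_saves, 1), round(without_saves, 1))
```

The average rises from about 1.3 to about 1.4 once Saves are left out, which matches the point made above.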
Looking at all 155 pitchers only tells us one story though. On draft day, you may not be too concerned with who had the best projection for Jeremy Affeldt and Antonio Bastardo. Most likely, you’re thinking more about the pitchers who might actually be starters on your fantasy team. So, I ran the test again and only included those with an ADP within the top 225 last year. This left me with 76 pitchers to analyze. In a 12-team league, that’s close to the number of pitchers that teams would have in their starting lineup.
Correlation
System | W | K | S | ERA | WHIP | WERTH | Avg
Steamer | 0.8 | 0.7 | 0.3 | 0.9 | 1.3 | 1.5 | 0.9
Combined | 0.5 | 0.9 | 0.4 | 0.9 | 1.1 | 0.6 | 0.7
Fans | 1.5 | 0.9 | 0.4 | -0.4 | 1.1 | 0.8 | 0.7
RotoChamp | -0.5 | -0.6 | 0.3 | 1.1 | -0.6 | 1.0 | 0.1
PECOTA | 0.0 | 0.1 | 0.2 | -0.6 | -1.4 | -0.2 | -0.3
CAIRO | -1.2 | 0.8 | 0.8 | -1.0 | -0.8 | -0.9 | -0.4
ZiPS | -1.7 | -0.9 | 0.4 | -0.2 | 0.0 | -0.9 | -0.6
MORPS | 0.6 | 0.1 | -0.2 | -1.7 | -1.0 | -1.4 | -0.6
Marcel | 0.0 | -2.0 | -2.6 | 1.0 | 0.2 | -0.6 | -0.7
RMSE
System | W | K | S | ERA | WHIP | WERTH | Avg
Fans | 1.1 | 0.9 | 0.2 | 0.4 | 1.0 | 0.7 | 0.7
Steamer | 0.9 | 0.6 | 0.3 | 0.5 | -0.4 | 1.5 | 0.6
Combined | 0.1 | 0.5 | 0.2 | 0.8 | 0.5 | 0.2 | 0.4
RotoChamp | -0.5 | -0.3 | 0.4 | 0.7 | 0.7 | 1.2 | 0.4
MORPS | 1.1 | 0.8 | -0.3 | -1.1 | 0.6 | -0.8 | 0.1
PECOTA | 0.3 | 0.2 | 0.4 | -1.2 | -2.0 | 0.3 | -0.3
CAIRO | -1.4 | 0.6 | 1.1 | -0.9 | -0.4 | -0.9 | -0.3
Marcel | 0.0 | -2.0 | -2.5 | 1.6 | 0.9 | -1.2 | -0.5
ZiPS | -1.6 | -1.2 | 0.2 | -0.7 | -1.0 | -0.9 | -0.9
Steamer is still towards the top of the lists, but the Fans from Fangraphs make a strong push towards the top with this group. This is somewhat surprising, but it makes sense since the fans have the best sense of playing time and roles for each of these pitchers. Scientific projection systems have a harder time accounting for such things. Regardless, Steamer does the best job of gauging overall roto value for a player, as evidenced by its strong showing in the WERTH category.
Also, you’ll notice that ZiPS suffers a big drop here, which is surprising. Marcel does a fantastic job with the rate-based stats of ERA and WHIP but suffers badly in all other areas. On the flipside, MORPS does a good job with the counting-based stats (which I somewhat expected since it seems to specifically account for a player’s projected role) but suffers with the rate-based stats.
Conclusion
This was an interesting battle between the projections and the winner isn’t quite as clear as it was for hitters. Steamer is still the champion, no doubt, but the strengths of the other systems are evident. Projecting pitchers for fantasy baseball is a difficult task with the rate-based stats being one animal and the counting-based stats being a totally different one. The counting-based stats rely heavily on getting Innings Pitched correctly identified so accounting for injury and projected role is important there.
All in all, it is somewhat encouraging to see the same projection systems doing well for both hitters and pitchers. That shows how well Steamer is built and also how valuable the Fan projections can be. For 2013, Fangraphs and Steamer teamed up and introduced a new projection that combines Steamer’s base projections with the Fans’ playing time projections. I’ll be interested to see how that stacks up next year.
Luke is better known as Mr. Cheatsheet despite his last name not being Cheatsheet. He makes spreadsheets, writes blog posts and his rankings were in the top 10 accuracy among FantasyPros experts in 2014, 2016 and 2017. When he's not doing fantasy baseball things, he can be found playing board games or rating beer.