Introduction to the Baseball Projections | 2014 Version

We all want to see into the future. We’d win every fantasy baseball league in the world if we could (and, you know, do other cool stuff too). Unfortunately for everyone, psychics won’t help you win your fantasy league, so we must rely on predictions instead. The common way to predict stats for an upcoming fantasy baseball season is to guess at player performance using our own judgment, but there’s also the scientific approach: building projections through data analysis of a player’s past stats. As the saying goes, the best predictor of future behavior is past behavior.

In the Mr. Cheatsheet world, I trust the scientific projections. However, there’s no single be-all-end-all projection system, so you may wonder what the differences are among the ever-growing number of projections out there.

(Disclaimer: The explanations of the projection systems below are extremely simplified. Each system has a lot more wrinkles than I get into in this post, so I encourage you to visit each of their sites if you are curious to learn more.)

Age Regression

Most of these projection systems rely heavily on some sort of aging factor. Stats from the past few years are averaged, with more weight on the most recent years, and those results are then regressed to a degree that depends on predicted aging trends for a player of that age.

Marcel – This is as simple as baseball projections get. Marcel was developed by Tom Tango and it takes the past three years of player data, puts more weight on the most recent years and then regresses the results depending on the player’s age. That’s it.
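For the curious, the recipe described above (weight recent years more heavily, regress, then adjust for age) can be sketched in a few lines of Python. This is only a rough illustration: the function name, the weights, the amount of regression and the aging step are all placeholder assumptions of mine, not the constants any real system uses.

```python
def marcel_style_rate(seasons, league_rate, age,
                      weights=(5, 4, 3), regression_pa=1200,
                      peak_age=29, age_step=0.003):
    """Project a per-plate-appearance rate from up to three seasons.

    seasons: list of (successes, plate_appearances), most recent first.
    Every default value here is an illustrative assumption.
    """
    # Weighted totals: the most recent season gets the largest weight.
    weighted_successes = sum(w * s for w, (s, _) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))

    # Regress toward the league rate by mixing in a fixed number of
    # league-average plate appearances.
    regressed = (weighted_successes + league_rate * regression_pa) / (
        weighted_pa + regression_pa
    )

    # Simple linear aging: a small boost below the assumed peak age,
    # a small penalty above it.
    age_factor = 1 + (peak_age - age) * age_step
    return regressed * age_factor


# Hypothetical hitter: (hits, plate appearances) for the last three years.
projected = marcel_style_rate([(150, 600), (140, 580), (160, 620)],
                              league_rate=0.25, age=27)
```

The fewer plate appearances a player has, the more the league-average term dominates, which is the whole point of regressing.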

CAIRO – This method, from Replacement Level Yankees Weblog, builds upon the basics of Marcel. It weights three years of stats, then regresses the results not only for age but also differently depending on the position the player plays. In addition, certain statistics are regressed differently than others.

Oliver – This system from Hardball Times is also similar to Marcel, using three years of weighted and then age-regressed stats. However, it was designed to better analyze minor league data in hopes of more accurately predicting younger players’ stats. The minor league data is weighted differently for certain ballparks and leagues.

Steamer – Setting itself apart from the previous systems, Steamer uses five years of player data and treats each stat differently for regression purposes. In other words, each component of its projections gets its own projection model. This results in what I consider to be the most accurate system overall.

Comparable Players

A wrinkle employed by one popular projection system is finding comparable historical players and then basing the age regression on the careers of those players. It’s still age regression, but done in a unique way.

ZiPS – Developed by Dan Szymborski, this system takes the past four years of stats for each player, weights the more recent years heavily and then uses those results to find comparable historical players, whose careers determine the aging trend to apply.
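To make the comparable-players idea concrete, here is a toy sketch in Python. It is not how ZiPS actually works under the hood; the same-age filter, the plain Euclidean distance, the field names and the sample numbers are all made up for illustration.

```python
import math


def nearest_comparables(target, history, k=2):
    """Return the k same-age historical seasons closest to the target.

    target: dict with 'age' and a 'stats' tuple.
    history: list of dicts with 'age', 'stats' and 'next_year_change'.
    """
    same_age = [h for h in history if h["age"] == target["age"]]
    # Closest first, measured by straight-line distance between stat tuples.
    same_age.sort(key=lambda h: math.dist(h["stats"], target["stats"]))
    return same_age[:k]


def comparable_aging_change(target, history, k=2):
    """Average how the k closest comparables changed the following year."""
    comps = nearest_comparables(target, history, k)
    if not comps:
        return 0.0  # no comparables found: apply no aging change
    return sum(c["next_year_change"] for c in comps) / len(comps)


# Made-up example data: stats are (batting average, strikeout rate).
history = [
    {"age": 30, "stats": (0.282, 0.175), "next_year_change": -0.010},
    {"age": 30, "stats": (0.275, 0.185), "next_year_change": -0.004},
    {"age": 30, "stats": (0.240, 0.090), "next_year_change": 0.012},
    {"age": 24, "stats": (0.280, 0.180), "next_year_change": 0.020},
]
target = {"age": 30, "stats": (0.280, 0.180)}
change = comparable_aging_change(target, history, k=2)
```

A real system would put the stats on a common scale before measuring distance, and would draw on far more than a handful of players.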

A Human Element

There’s often a bit of a human element in the methodology behind a projection system, but some systems involve more human intervention than others. This is usually related to determining playing time.

MORPS – On the surface, MORPS does many of the same things as Marcel and CAIRO. It weights four years of player data and then regresses based on player position and league (AL or NL). However, the creator of this system also looks at current depth charts to determine playing time projections. This is a bit more of an art than a science.

FanGraphs Fans – This actually isn’t a scientific system at all, except that we’re assuming the fans involved are somewhat scientific-minded since they visit FanGraphs. It’s a crowd-sourced projection in which visitors to the site project players and those results are averaged.

Steamer + Fans – Possibly an upgrade over Steamer, because it uses Steamer’s raw data but adjusts it to use the playing time projections from the FanGraphs Fans projections. The idea is that humans would be better at estimating playing time.
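One plausible way to picture that playing time swap is to keep a player’s per-plate-appearance rates and rescale the counting stats to a different plate appearance total. The sketch below assumes exactly that; I don’t know the precise adjustment FanGraphs makes, and the function name and numbers are hypothetical.

```python
def rescale_to_playing_time(counting_stats, base_pa, new_pa):
    """Scale counting stats from base_pa to new_pa plate appearances.

    Per-plate-appearance rates stay the same; only the totals change.
    """
    if base_pa <= 0:
        raise ValueError("base_pa must be positive")
    factor = new_pa / base_pa
    return {stat: value * factor for stat, value in counting_stats.items()}


# Hypothetical player: one system projects 500 PA, the crowd projects 600.
adjusted = rescale_to_playing_time({"HR": 20, "R": 80}, base_pa=500, new_pa=600)
```

Rate stats such as batting average would be left alone under this approach, since more playing time doesn’t change the rate itself.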

Mr. Cheatsheet’s Special Blend – I debuted this for the 2013 season on this site. In analyzing the accuracy of the projection systems, I tried to find a good way of averaging the available projections to produce a very accurate system for fantasy baseball purposes. I weighted the projection systems differently for each stat and calculated the averages based on those weights.
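A per-stat weighted average like the one described above could look something like the following Python sketch. The system names, weights and values are placeholders rather than the actual blend, and skipping systems that don’t project a given stat is my own assumption.

```python
def blend_projections(projections, stat_weights):
    """Blend several systems using a separate set of weights per stat.

    projections: {system: {stat: value}}
    stat_weights: {stat: {system: weight}}
    """
    blended = {}
    for stat, weights in stat_weights.items():
        # Only count systems that actually project this stat.
        available = {
            system: weight
            for system, weight in weights.items()
            if stat in projections.get(system, {})
        }
        total = sum(available.values())
        if total == 0:
            continue  # nothing to blend for this stat
        blended[stat] = sum(
            projections[system][stat] * weight
            for system, weight in available.items()
        ) / total
    return blended


# Placeholder systems and weights, purely for illustration.
projections = {
    "SystemA": {"HR": 20, "AVG": 0.300},
    "SystemB": {"HR": 30},
}
stat_weights = {
    "HR": {"SystemA": 1, "SystemB": 3},
    "AVG": {"SystemA": 2, "SystemB": 1},
}
blend = blend_projections(projections, stat_weights)
```

Dividing by the total of the weights actually used means a system with no projection for a stat simply drops out instead of dragging the average toward zero.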

Which System Is Best?

Check back on this site in the near future to see my analysis of last year’s projections and find out which systems performed best.

  • MP
    01/23/2014 at 8:42 PM

    If not too late, can you include these in your 2013 accuracy study?

    http://projectingx.com/baseball-player-projections/

    (I am not affiliated with him, but am curious how he has performed)

  • Luke
    01/23/2014 at 8:45 PM

    Hmm. Good timing as I was putting the finishing touches on that research. However, I don't see a way to download his 2013 projections. Am I missing it?

  • MP
    02/04/2014 at 10:35 PM

    Ah, I don't see a link either. On another note, do you have access to PECOTA for 2013?

  • Kevin
    03/01/2014 at 4:32 PM

    I'm interested in the Steamer + Fans data, but how do you go about combining them? I don't see a way on FanGraphs. Am I missing something?

  • Luke
    03/01/2014 at 4:36 PM

    That's something that Fangraphs will put out. Not sure exactly when they'll do that though.

  • mchokie
    03/23/2014 at 3:20 AM

    Love the sheets, but could you add a filter on player name for auctions on the Draft Central tab? It's so much easier than trying to search.