Data Analyst and Lover of Baseball and Beer
By Doug Duffy | 31/01/2016
If you’ve ever participated in a fantasy draft of any kind, you’re familiar with the concept of projections. Projections are (almost) as simple as they sound: what do you project a given player to accomplish based on his past accomplishments? Projections are not restricted to the realm of fantasy sports, however; teams use projections as well to assist them in player valuation. In this post I’ll explain how you can use projections for player valuation in your own fantasy baseball league, using a model based on either Standing Points Gained above replacement or Fantasy Points above replacement, depending on the scoring system of the league. In addition, I’ll be posting the R code used to run the models, as well as Draft Cheat Sheets containing relevant draft info from many of the sources we searched.
Special Request (AL-only 4×4 10 Team no R or K): 4x4ALonly10Team
Update (3/21/2015): The projection database and draft sheets have all been updated, and the R code used to calculate TOTspgAR and FPtsAR has been posted. Enjoy.
I’ve compiled a database of as many publicly available projection systems as I could find (ESPN, Yahoo, CBS, Steamer, Razzball, Zips, Clay Davenport’s and TG), and used this database to produce my own projection system, which I’ll refer to as the Duffy model (you’re right, I suck at naming things) [2-9]. I will have a later piece outlining how the various projection systems go about their business (tentatively titled: “99 Projections, But a Glitch in Each One?”). For now, I’ll say the Duffy model takes the average of all the models’ rate statistics (H-rate, BB-rate, etc.) for each player, excluding TG as noted below, and projects them over the average number of plate appearances (or innings pitched, for pitchers) predicted by the models. In addition, as the Zips model makes no attempt to project playing time in the major leagues, we modified it by applying its rate-based statistics over the average projected playing time from the other models, again excluding TG for the reasons explained below. This modified version is what I refer to as ModZips.
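The averaging step described above can be sketched in a few lines. This is only an illustration in Python, not the R code used for the attached sheets; the function name `blend_projection`, the `use_pa` flag and the three stat lines are all made up for the example. The flag marks a system (such as Zips) whose rates are used but whose playing time is ignored.

```python
from statistics import mean

def blend_projection(lines, stat):
    """Average the per-PA rate of `stat` across systems, then scale by average PA.

    Each line is one system's projection for the same player. A line with
    "use_pa": False contributes its rate but not its playing time
    (hypothetical flag, used here for a Zips-like system).
    """
    rates = [line[stat] / line["PA"] for line in lines]
    playing_time = [line["PA"] for line in lines if line.get("use_pa", True)]
    return mean(rates) * mean(playing_time)

# Made-up projections for a single hitter from three systems.
lines = [
    {"PA": 600, "H": 150},                   # hit rate 0.250
    {"PA": 500, "H": 130},                   # hit rate 0.260
    {"PA": 400, "H": 108, "use_pa": False},  # hit rate 0.270, PA ignored
]
projected_hits = blend_projection(lines, "H")  # average rate times 550 PA
```

Averaging rates rather than raw totals keeps a system with an unusually low playing-time estimate from dragging down a player's per-PA skill estimate.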
The Yahoo dataset used here is significantly smaller than the other datasets (approximately 300 players vs 900 players). The players it projects are the top players, i.e. the ones receiving the most plate appearances, which accounts for its skewed distribution shown below. This is a sampling bias caused by my refusal to manually scrape projections for 1000 hitters and pitchers, though the supremely thorough among you might call this laziness (if you know how to programmatically scrape Yahoo fantasy sports, please let me know; I’m pretty sure I’m just running into the Great Pay Wall of RotoWire). On another note, the TG model has a curiously flat density function while every other model is clearly bimodal, so I’m somewhat skeptical of it. For instance, TG projects Kolten Wong for a mere 303 PA, while every other model projects him for over 500 PA. The Duffy model closely matches the other relatively large datasets, Steamer and ModZips, in terms of projected plate appearances as well as the other projected statistics. The Duffy model projections that will be used here to generate Fantasy Baseball values are also attached in raw form. The projected plate appearance density functions for each model are plotted below in Figure 1, where the Duffy model is shown with increased line width (if you know how to display this increased line width in the legend using the ggplot2 R package, please let me know!).
Figure 1: Projected PA Population Density by Model
One of the most common fantasy baseball formats is known as rotisserie, where each team is scored in a number of hitting and pitching categories, often 5 each. The team that scores the highest (or lowest for ERA and WHIP) in each category receives the maximum Standing Points, equal to the number of teams in the league, while the team that finishes last in each category receives 1 Standing Point. Each team is then ranked based on the sum total of its Standing Points from each stat category. The question in player valuation then becomes, how valuable is a HR vs a SB? A SV vs a W? Or even ERA vs a W?
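As a quick illustration of the scoring rule just described, here is a minimal Python sketch for a single category. The function name and the three-team league are invented for the example, and ties are ignored for simplicity (real leagues typically split points between tied teams).

```python
def standing_points(team_totals, lower_is_better=False):
    """Award 1..N Standing Points in one category (N = number of teams).

    Ties are not handled here; this sketch assumes every total is distinct.
    """
    # Sort so the worst team comes first and therefore receives 1 point.
    worst_first = sorted(team_totals, key=team_totals.get, reverse=lower_is_better)
    return {team: points for points, team in enumerate(worst_first, start=1)}

# Made-up three-team league.
runs = {"A": 900, "B": 800, "C": 850}
era = {"A": 3.50, "B": 4.20, "C": 3.90}

run_points = standing_points(runs)                       # most runs earns 3
era_points = standing_points(era, lower_is_better=True)  # lowest ERA earns 3
```

A team's place in the standings is then the sum of these per-category points.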
In order to evaluate the comparative worth of one statistic versus another, we must see how each stat correlates with Standing Points, or mathematically speaking, determine the slope of the function Team Stat vs Standing Points Earned. As it was easiest to find data for previous league standings from my own fantasy league, which goes by the acronym VSNIC, such a plot is shown below for that league’s run scoring history in Figure 2. It is interesting to note what appears to be a decrease in the scoring environment from 2011 to 2014 (in 2009 and 2010 the league had fewer than 10 teams), perhaps further evidence that it’s a good time to be a pitcher.
Figure 2: VSNIC Runs vs Standing Points Earned
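For anyone fitting this slope from their own league history, a simple least-squares fit is enough. The sketch below is Python rather than the R used for the draft sheets, and the data points are invented (they are not VSNIC's history). Team runs go on the y-axis and Standing Points earned on the x-axis, so the slope reads as "runs per Standing Point".

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented example: Standing Points earned in runs and the team run totals.
points = [1, 2, 3, 4, 5]
runs = [725, 750, 775, 800, 825]

runs_per_point, runs_at_zero_points = fit_line(points, runs)
```

With real standings the points will scatter around the line, but the slope is still the quantity of interest.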
Similar functions can be described for each of the other statistics (R, HR, RBI, SB, AVG, W, SV, K, ERA, WHIP in standard 5×5 scoring). For the SPG model on the attached draft sheets, the slope values were taken from Razzball, and are shown below in Table 1. (Note: If performing this modeling yourself, it is always best to use your own league’s scoring history, if possible.)
Table 1: Razzball Standing Points Gained Slopes
Using the slope of the simple linear model shown above, it can be seen that on average an additional 26.9 runs scored will result in one standings point gained (in VSNIC’s format). A player such as Mike Trout, whom we project to score 110 runs in the season, is then worth 110 / 26.9 ≈ 4.1 standings points in runs alone. The same calculation, shown below, can be performed for each of a player’s counting statistics to determine the standings points gained for each counting statistic category.
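The counting-stat calculation is a single division. A minimal Python sketch using the Mike Trout numbers from the text (the function name is just for illustration):

```python
def counting_spg(projected_total, stat_per_point):
    """Standings points gained from a counting stat.

    stat_per_point is the regression slope, e.g. runs per standings point.
    """
    return projected_total / stat_per_point

# 110 projected runs at 26.9 runs per standings point (VSNIC's slope).
trout_runs_spg = counting_spg(110, 26.9)
```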
The calculation for rate statistics such as AVG and ERA is slightly more complicated, however, as each player’s contribution to the team’s rate is weighted by both his and his team’s number of attempts, either AB or IP. In order to determine a given player’s SPG for a rate statistic, we must determine how adding his stats to those of an average team that is one player short (or mathematically speaking, n-1 average players) moves that team away from the fantasy league average rate statistic. This concept is perhaps demonstrated best in the equations shown below for slugging percentage standings points gained, again taking Mike Trout and the settings from 10-team VSNIC as our example (VSNIC uses OBP and SLG, not AVG or HR), where bars over variables indicate the average of the given variable. Mike Trout’s projected 322 TB in 584 AB would shift a team composed of n-1 average players from the league average SLG of 0.436 to a SLG of 0.444, meaning he is responsible for approximately 1.4 standings points gained for SLG.
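Here is a rough Python sketch of the n-1 approach for a rate stat. Trout's 322 TB, 584 AB and the league SLG of 0.436 come from the text; the roster size (13 hitters), the average AB per hitter (550) and the SLG-per-point slope (0.0057) are placeholders I have assumed for illustration, so the output is not expected to reproduce the 0.444 and 1.4 figures exactly.

```python
def rate_spg(player_num, player_den, league_rate, avg_den, n_players,
             rate_per_point, lower_is_better=False):
    """Standings points gained in a rate stat (numerator / denominator).

    The player joins a team of n_players - 1 league-average players, each
    with avg_den attempts (AB or IP) at the league-average rate.
    """
    others_den = (n_players - 1) * avg_den
    others_num = league_rate * others_den
    team_rate = (others_num + player_num) / (others_den + player_den)
    delta = team_rate - league_rate
    if lower_is_better:  # e.g. ERA or WHIP, where a lower rate gains points
        delta = -delta
    return delta / rate_per_point

# Assumed values: 13 hitters, 550 AB each, 0.0057 SLG per standings point.
trout_slg_spg = rate_spg(322, 584, league_rate=0.436, avg_den=550,
                         n_players=13, rate_per_point=0.0057)

# A hitter who is exactly league average should gain (essentially) nothing.
average_slg_spg = rate_spg(0.436 * 550, 550, league_rate=0.436, avg_den=550,
                           n_players=13, rate_per_point=0.0057)
```

Note how the player's own AB total matters: the same SLG over fewer at-bats moves the team rate less.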
By summing over the standings points gained for each statistical category we can obtain a single metric of total standings points gained for each player. This is again illustrated below using Mike Trout as the example player, with VSNIC scoring.
Throughout this discussion we’ve used offensive statistics as the examples; however, similar computations are possible for each pitching statistic by again properly distinguishing between counting and rate statistics.
At this point we’ve created a single metric by which to value every player in MLB for whom we have predicted stats. This is a powerful tool, but not a tool by which you should draft your lineup, as this metric neglects to account for some things, notably position scarcity. Every position’s hitters were not created equal; it is much easier to find a “good” hitter at 1B or OF, as these are positions where teams can hide good hitters who struggle defensively. Accounting for this position scarcity requires us to determine replacement level for each position, in a way that is similar to Wins Above Replacement, or WAR. Think of replacement level as “that guy” you can pick up on the waiver wire the day after the draft; Matthew Berry once termed this “The Wandy Line”.
In fantasy baseball, the first player at the position who won’t get drafted, based upon the number of teams and the roster settings of the fantasy league, defines replacement level for each position. This can be slightly confusing to wrap your head around, so we’ll illustrate using standard 10-team ESPN roster settings (1x C, 1x 1B, 1x 2B, 1x 3B, 1x SS, 1x 2B/SS, 1x 1B/3B, 5x OF, 1x Util, 9x P, 3x Bench). With 10 teams each needing exactly one C, the result is simple: replacement level is the 11th C as ranked by Total Standings Points Gained. For 2B and SS, however, each team requires 1x 2B, 1x SS and 1x 2B/SS. Assuming an even split of the additional mixed roster spot between 2B and SS, (1 + 0.5) * 10 = 15 players are drafted at each position, so replacement level is the 16th 2B and the 16th SS.
Before we define replacement level for 1B, 3B, or OF we must determine how to distribute the bench slots between hitters and pitchers. For this analysis we determined replacement level by allotting 2 bench slots to hitters and 1 bench slot to pitchers, leaving us with 1x Util and 2x BN slots to distribute to 1B, 3B, and OF. Our model allots 1.5 of these 3 slots to OF, 0.3 to 3B and 1.2 to 1B (I’ll explain how I got these values in a bit). This gives us (5 + 1.5) * 10 = 65 OF drafted, or the 66th OF as replacement level. Splitting the corner infield spot evenly between 1B and 3B gives us (1 + 0.5 + 0.3) * 10 = 18 3B, or the 19th 3B as replacement level, and finally (1 + 0.5 + 1.2) * 10 = 27 1B, or the 28th 1B as replacement level. The levels of replacement that were used to compile the attached draft sheets are summarized in Table 2 below.
Table 2: Position Replacement Level
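The replacement ranks for hitters can be recomputed with a short Python sketch. The slot allocations are the ones given in the text for 10-team ESPN settings; the dictionary and function names are just for illustration.

```python
TEAMS = 10

# Roster slots per team credited to each position: dedicated slot(s), plus
# the share of mixed (2B/SS, 1B/3B) slots, plus the share of Util/bench slots.
slots_per_team = {
    "C": 1,
    "2B": 1 + 0.5,
    "SS": 1 + 0.5,
    "3B": 1 + 0.5 + 0.3,
    "1B": 1 + 0.5 + 1.2,
    "OF": 5 + 1.5,
}

def replacement_rank(slots, teams=TEAMS):
    """Rank of the first player at a position who is not drafted."""
    drafted = round(slots * teams)  # round() guards against float noise
    return drafted + 1

replacement_ranks = {pos: replacement_rank(s) for pos, s in slots_per_team.items()}
```

Changing `TEAMS` or the fractional allocations is the quickest way to adapt this to another league's roster settings.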
These replacement levels are by no means intended as definitive, as the distribution of utility, mixed position and bench spots is left to the modeler’s discretion. The rankings produced using these levels gave me the “looks right” vibe; however, tuning these levels is an iterative process.
The quantitative approach that was used in this iteration was to examine the replacement level statistics for each position as shown in Table 3 in the next section. Notice how we’ve tuned replacement level total standing points gained for each position, other than C and SS (which are exceptions in hitting ability due to the large defensive requirements), to be approximately the same level.
Now that we’ve defined replacement level for each position, we can calculate the standings points gained in each stat category that we might expect to be able to pick up off the waiver wire for a given position. We determined each position’s replacement level standing points gained for each category by taking a local average around replacement level, using the two players just above and the two just below replacement level. This is shown below for catchers’ replacement level runs standing points gained in a 10-team format. Values for each position’s replacement level standing points gained in each category of a 5×5 rotisserie 10-team format are shown below in Table 3 (a RepPos of NA is used to account for players lacking position information, where Repspg in each category was taken as the average of all positions).
Table 3: Replacement Level SPG by Statistic and Position (10-Team Roto Scoring)
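A minimal Python sketch of the local average follows. One assumption to flag: I include the replacement-ranked player himself along with the two players on either side, which may differ slightly from the window used for the attached sheets. The sample values are invented.

```python
def replacement_level(values_by_rank, replacement_rank, window=2):
    """Average a category's SPG over the players around replacement rank.

    values_by_rank holds one category's SPG for players at a position,
    ordered by total SPG (best first). replacement_rank is 1-indexed.
    """
    center = replacement_rank - 1                # convert to 0-indexed
    lo = max(0, center - window)
    hi = min(len(values_by_rank), center + window + 1)
    neighbours = values_by_rank[lo:hi]
    return sum(neighbours) / len(neighbours)

# Invented runs SPG for seven players at one position, best total SPG first.
runs_spg = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.2]
rep_runs_spg = replacement_level(runs_spg, replacement_rank=4)
```

Averaging over neighbours smooths out the quirks of whichever single player happens to land exactly on the replacement rank.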
Performing this calculation for each statistic and position allows us to compare each player’s standing points gained in each category to that of a replacement player, i.e. his value above replacement. We show this calculation taking Buster Posey’s projected 69 R, good for 2.79 Rspg, as our example to compare against a replacement catcher.
We now have a single metric measuring each player’s total standing points gained above his hypothetical replacement, and this is the quantity we should base our draft decisions around.
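Put together, the total above replacement is a sum of per-category differences. In the Python sketch below, Posey's 2.79 Rspg comes from the text; every other number, including the replacement catcher's values, is invented for illustration.

```python
def total_spg_above_replacement(player_spg, replacement_spg):
    """Sum, over categories, of player SPG minus replacement-level SPG."""
    return sum(player_spg[cat] - replacement_spg[cat] for cat in player_spg)

# Two categories only, to keep the example short.
posey_spg = {"R": 2.79, "HR": 1.00}           # HR value is made up
replacement_c_spg = {"R": 2.00, "HR": 0.50}   # both values are made up

posey_spgar = total_spg_above_replacement(posey_spg, replacement_c_spg)
```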
Points leagues, an alternative scoring system for fantasy baseball, present us with a simpler analysis to determine each player’s comparative worth. Rather than having to determine how valuable each type of statistic is, there is only one thing that counts: Points. In order to account for position scarcity, replacement level must be determined and accounted for, in the same fashion as above for rotisserie scoring. The roster settings and thus replacement levels used in our analysis are the same as used above, while the points scoring settings are shown below in Table 4.
Table 4: Standard Points League Scoring Setting
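For points leagues the per-player value is just a weighted sum of the projected stat line. The weights in this Python sketch are hypothetical placeholders, not the settings from Table 4, so substitute your own league's values.

```python
def fantasy_points(stat_line, weights):
    """Weighted sum of a projected stat line; unscored stats count as zero."""
    return sum(weights.get(stat, 0) * value for stat, value in stat_line.items())

# Hypothetical hitter scoring weights and an invented projection.
weights = {"R": 1, "TB": 1, "RBI": 1, "BB": 1, "K": -1, "SB": 1}
hitter = {"R": 100, "TB": 300, "RBI": 90, "BB": 80, "K": 120, "SB": 10}

hitter_points = fantasy_points(hitter, weights)
```

Subtracting the fantasy points of the replacement-level player at the same position then gives fantasy points above replacement.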
The Draft Sheets attached should be fairly self-explanatory, though we’ll explain what all the data means and where it came from. They are sorted by either total standing points gained above replacement (TOTspgAR) or total fantasy points above replacement (TOTFPtsAR), depending on the scoring type.
Take note that each draft sheet’s value above replacement transitions from positive to negative values as you cross the “draftable” threshold, i.e. “The Wandy Line” (the 251st player for 10-team leagues and the 301st for 12-team, given 25 roster spots per team). The draftable threshold on each sheet is highlighted in red, and the bordering indicates the projected round structure of a snake-style draft, that is, a non-auction draft.
If performing this modeling yourself, you should check the player values around the draftable threshold to see if you screwed up. If the values don’t go from positive to negative near the correct threshold, start debugging.
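That sanity check is easy to automate. In this Python sketch the tolerance of 10 ranks is an arbitrary choice of mine, and the sample values are generated purely for illustration.

```python
def first_nonpositive_rank(values_above_replacement):
    """1-indexed rank at which value above replacement first drops to <= 0."""
    ordered = sorted(values_above_replacement, reverse=True)
    for rank, value in enumerate(ordered, start=1):
        if value <= 0:
            return rank
    return None  # every player is above replacement: something is off

def threshold_looks_right(values, teams, roster_size, tolerance=10):
    """Check the sign change lands near the first undrafted player."""
    expected = teams * roster_size + 1   # e.g. 10 * 25 + 1 = 251
    crossing = first_nonpositive_rank(values)
    return crossing is not None and abs(crossing - expected) <= tolerance

# Generated values that cross zero at exactly the 251st player.
good_values = [250.5 - rank for rank in range(1, 401)]
# Generated values that cross far too early, at the 101st player.
bad_values = [100.5 - rank for rank in range(1, 401)]

good_check = threshold_looks_right(good_values, teams=10, roster_size=25)
bad_check = threshold_looks_right(bad_values, teams=10, roster_size=25)
```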
I only recently began teaching myself how to code in R, so I’m sure there are better ways to go about a lot of this, and any suggestions are more than welcomed.
Although we think we’ve put together a pretty good model, it is by no means perfect. When determining replacement level, this model requires defining a single position for each player and has trouble accounting for multi-position eligibility, though in truth each player should be judged at the “weakest hitting” position he is eligible at. This is difficult to incorporate and somewhat subjective, as the “weakest” hitting position is affected by the setting of replacement level, so we’ve simply used the positions listed by Zips in our modeling. Therefore, when valuing uber-utility men such as Ben Zobrist or Josh Harrison, you may want to make note of their Zips position, as well as where they are eligible in your league, and potentially adjust their ranking accordingly. In addition, in order to define a position for DHs, I’ve lumped them in with the 1B, though this is a relatively minor issue as only a handful of DHs are relevant anyway.
The standing points gained model used here has an additional oversimplification. It assumes that the stats a player accrues are solely used to increase the team’s standing points, which is not entirely true. Each stat category has a varying amount of stats that must be accrued simply to obtain last place (remember that y-intercept from our linear regression?); only after achieving last place do additional stats contribute to standing points in the way we’ve assumed.
We were obviously not the first to try to quantitatively determine fantasy value above replacement (see, for example, this and this), and we’d be seriously remiss if we didn’t mention Tanner Bell, on whose work this model was largely based [22-24,1]. I still can’t figure out how he does this stuff in Excel though.
As the projection system used here is compiled from 7 sources, each of which updates its projections only occasionally, our rankings respond slowly to players whose playing time expectations have recently shifted. Put simply, players such as Cliff Lee, Yu Darvish and Marcus Stroman are all projected to be pitching this year in these rankings. To put it mildly, it doesn’t really look like that will happen, and I can’t really suggest drafting any of them.
Although I don’t believe this model is 100% mathematically rigorous, based on the discussion points above, it does possess a certain simplicity and common sense approach that appeals to me as an engineer. Generally speaking, don’t draft in a vacuum. Adjust these rankings as you see fit, because if you didn’t, it’d be our team, not yours.