Also see: Introducing BUBBA, Part One
Before I jump into specifics on BUBBA, I wanted to get the origin story of the name out of the way. I'm a little tired of the "forced acronym for any new metric" convention and wanted to go in the other direction. Given that my recent and current jobs have been primarily in scouting, and that BUBBA itself is heavily influenced by scouting information, I thought I should pick a name that fits more in that world than in the statistical research world. Also, given that my family is from and currently resides in Georgia/Alabama and I was looking for a name that would be more comfortable for a southern scout than a stat nerd, BUBBA was a natural fit.
So what does BUBBA stand for? It stands for BUBBA. Why is it capitalized? Because I said it is. You're welcome.
|1. Mike Trout
|2. Bryce Harper
|3. Miguel Cabrera|3 Yrs x 22M
|4. Oscar Taveras
|5. Jurickson Profar
|6. Wil Myers
|7. Dylan Bundy
|8. Justin Verlander|2 Yrs x 20M
|9. Mike Olt
Here are the players I'll use today to demonstrate BUBBA in action. The big takeaway number here is the "Asset Value": the one catch-all number that encompasses all the objective and some of the subjective information I've gathered on the player.
The first and most obvious use here is to judge trade proposals and I've worked up some guidelines for those purposes. There's a roughly 5% margin for error in the numbers, so if two players are within that range of each other (Taveras and Profar barely are), they can be considered of essentially equal value.
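As a rough sketch of that rule of thumb, the check below treats two asset values as equal when they differ by no more than 5% of the larger one. The function name, the choice of the larger value as the base, and the 71.0 comparison figure are assumptions made for illustration; only Profar's and Olt's values come from this article.

```python
def essentially_equal(value_a: float, value_b: float, margin: float = 0.05) -> bool:
    """Return True when two asset values (in $ millions) fall within the margin for error.

    Assumption: the gap is measured relative to the larger of the two values.
    """
    larger = max(value_a, value_b)
    if larger <= 0:
        raise ValueError("asset values must be positive")
    return abs(value_a - value_b) / larger <= margin


# Profar's $68.1 million is from the article; 71.0 is a made-up comparison value.
print(essentially_equal(68.1, 71.0))  # gap is about 4.1% of the larger value
print(essentially_equal(68.1, 38.6))  # Profar vs. Olt: far outside the margin
```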
The more difficult guideline to establish is a rough benchmark for the discount rate on multiplayer trades. The only way to do this is to 1) run historical asset values and evaluate historical trades or 2) make hypothetical trades and ask industry friends to evaluate them. There's a lot of 20/20 hindsight in running historical asset values for young players, so I opted to primarily use option two.
What we came up with was a premium of 35-40% in a 2-for-1 deal and 50-60% in a 3-for-1 deal. In practice, that means that fairly trading a player with an $80 million asset value for two other players should yield two notable assets ($10 million or more each) that add up to 35-40% above the single player's value (the original $80 million guy), or about $110 million.
To use a concrete example, this would suggest that, if the teams were motivated to consider such a deal (farfetched in this situation), Miguel Cabrera ($79.5 million) could equate to a package of Jurickson Profar ($68.1 million) and Mike Olt ($38.6 million). When I ran this particular hypothetical by some scouts and execs, they said it was about correct. Some suggested the Rangers would have to kick in a third notable player but admitted that impulse may be due to the Tigers having no reason to move Cabrera (for the sake of this example, we're assuming they have some motivation to trade him). The important part is that this dry run says these numbers pass the smell test, regardless of the fact that this deal would never happen.
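The arithmetic behind that hypothetical can be sketched in a few lines. The premium ranges, the $10 million floor and the three asset values are taken from the text above; the function and constant names are illustrative assumptions, not part of BUBBA.

```python
# Premium ranges from the article, keyed by the number of players coming back.
PREMIUM_RANGES = {2: (0.35, 0.40), 3: (0.50, 0.60)}
NOTABLE_ASSET_FLOOR = 10.0  # $ millions; each piece should be a notable asset


def package_premium(star_value: float, package: list[float]) -> float:
    """Fraction by which the package's total value exceeds the single player's value."""
    return sum(package) / star_value - 1.0


def target_range(star_value: float, n_players: int) -> tuple[float, float]:
    """Total package value (in $ millions) implied by the article's premium range."""
    low, high = PREMIUM_RANGES[n_players]
    return star_value * (1 + low), star_value * (1 + high)


cabrera, profar, olt = 79.5, 68.1, 38.6
package = [profar, olt]
assert all(value >= NOTABLE_ASSET_FLOOR for value in package)

print(f"premium: {package_premium(cabrera, package):.1%}")
low, high = target_range(cabrera, len(package))
print(f"target package: ${low:.1f}M to ${high:.1f}M")
```

The Profar and Olt package totals $106.7 million, a premium of roughly 34% over Cabrera. That sits a hair under the bottom of the 35-40% band, well inside the roughly 5% margin for error.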
How The Sausage Is Made
There are four inputs that drive these values beyond the obvious elements covered in yesterday's article: age relative to level, pitcher/hitter projection differences, long-term contract extension opportunities and elite player scarcity.
The impact of age relative to level is pretty easy to understand. Put up huge numbers when you're younger than everyone else you're facing and that's a good indicator of future production. For those wanting more concrete proof of this fact, check out Rany Jazayerli's groundbreaking study on the impact of draft day age for high school hitters. If Mike Olt were putting up the same numbers but instead was 20-22 years old like the hitters ranked ahead of him, BUBBA would add an eight-figure sum to his current asset value. Note: I will take any chance I get to anthropomorphize BUBBA.
Every one of the previous studies of prospect lists referenced in yesterday's article marvels at the stark contrast between the success rates of hitting prospects relative to pitching prospects. There is some nuance to the issue, but essentially every prospect list overrates pitchers as a whole, though the best pitchers in the game tend to come from high spots on prospect lists.
Since I converted some of the wisdom of these studies to the 20-80 scale, I can shed some more light on this issue. Every reputable prospect list I've seen has Profar, Taveras and Bundy in some order as their top three prospects and Myers a clear fourth, but I have Myers third and Bundy fourth. My projections agree with the consensus: I have Profar, Taveras and Bundy as 70s on the 20-80 scale and Myers as a 65, essentially exactly like everyone else does.
Why the difference in output? The track record of scouts over the last few decades tells us that when they put a 70 on a pitcher, all else being equal, he ends up, on average, being as valuable as a 60 or 65 hitter. Since Myers is a solid 65 and is ready to contribute at the big league level, he's on the high end of that range and is ranked high enough above Bundy to be out of the 5% margin for error.
Prospect lists and scouts have been treating the 20-80 scale as gospel for decades, and one of its big blind spots is that it doesn't acknowledge that a 70 pitching prospect doesn't hit his ceiling as often as a 70 hitting prospect. The two aren't of equivalent value, even when all else is equal.
This relationship between success rates of hitting and pitching prospects isn't linear, consistent or easy to explain. Depending on the quality of the player, his age, his minor league level and a few other factors, the gap in success rates may disappear or be even wider than in this Bundy/Myers example.
Extending A Hand
One factor that I think has been largely ignored in previous studies is accounting for the value of the opportunity to sign a top prospect to a long-term extension; call it the Evan Longoria effect. After accounting for inflation, time value of money and other factors, Longoria's initial six-year deal with three options will save the Rays $25-30 million. While not every player would've accepted that deal then and almost no player would now, Longoria admitted that it being his set-for-life first big payday heavily influenced his decision.
Having access to a top player during this time of his career (hopefully without Scott Boras in his ear) is undoubtedly worth millions to a club. The problem with trying to quantify this is how subjective it is to attempt to project the potential savings to the club and the chance the player accepts it.
After going through various extensions that have been signed and evaluating the money saved by the club, I made two matrices. One has some objective underpinnings: it projects the savings a club would glean from the average extension they would offer to a player based on his position, skill level and service time. The other is more subjective: the odds a player would sign such an extension, based on the same three criteria.
For top shelf prospects just entering the majors like Taveras, Profar, Bundy and Myers, my calculation of the value of this opportunity approaches $10 million. Obviously this amount changes as (service) time wears on and potential savings dwindle while free agency approaches. As I gather player/agent specific information about willingness to sign such a deal, I can move away from a generic value.
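One plausible way to combine the two matrices is a simple expected value: projected savings times the odds the player signs. The article doesn't spell out the exact combination or publish either matrix, so every entry below is a placeholder and the names are hypothetical. The placeholders are chosen only so that a top prospect with no service time lands near the $10 million mentioned above.

```python
from typing import NamedTuple


class ExtensionKey(NamedTuple):
    position: str       # e.g. "SS", "OF", "SP"
    grade: int          # 20-80 scouting grade
    service_years: int  # full years of MLB service time


# Placeholder entries only: the article does not publish either matrix.
PROJECTED_SAVINGS = {  # $ millions saved by the club on an average extension
    ExtensionKey("SS", 70, 0): 25.0,
    ExtensionKey("SS", 70, 3): 10.0,
}
SIGNING_ODDS = {  # chance the player accepts such an offer
    ExtensionKey("SS", 70, 0): 0.40,
    ExtensionKey("SS", 70, 3): 0.20,
}


def extension_bonus(key: ExtensionKey) -> float:
    """Expected value of the extension opportunity: savings times odds of signing."""
    return PROJECTED_SAVINGS[key] * SIGNING_ODDS[key]


print(extension_bonus(ExtensionKey("SS", 70, 0)))  # rookie: largest bonus
print(extension_bonus(ExtensionKey("SS", 70, 3)))  # shrinks as service time accrues
```

Keying both tables on the same three criteria mirrors the description above, and swapping a generic odds entry for player- or agent-specific information only means overriding one lookup.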
I calculated these extension bonus values for post-control players such as Cabrera and Verlander but I decided to omit them, as they are heavily player specific. These extension bonus figures are often in the low six figures as you're basically valuing an exclusive window to sign the player to a market deal with the player having heavy incentives to become a free agent.
Miguel Cabrera, The Exception
Another reason why I don't think anyone has tried an asset value system this comprehensive before is the unrealistic values the objective systems spit out for superstars. Take Miguel Cabrera for example: he's a 5-7 win player for the remaining three years on his contract, which will pay him a total of $65 million. With a win being valued by the league as a whole at a little over $5 million right now, the math is simple: roughly 30-something million in performance minus 20-something million in salary per year, times three years, equals roughly $30 million in asset value.
If that Cabrera asset value were accurate, that would mean the Rangers could easily acquire the best player in baseball on a three year, below-market deal in exchange for Mike Olt. That's obviously ridiculous but using the existing framework, it's hard to argue that either value should change significantly. The only conclusions to make are that the entire baseball industry is wrong or these frameworks have limitations in extreme circumstances.
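The conventional arithmetic for Cabrera can be reproduced with a short sketch. The inputs are assumptions chosen to match the rough figures in the text: 6 wins per year (the midpoint of 5-7), $5.3 million per win as a stand-in for "a little over $5 million," and the $65 million split evenly across three years. It ignores aging, inflation and discounting.

```python
def conventional_asset_value(wins_per_year, salary_per_year, dollars_per_win):
    """Surplus value in $ millions: market value of projected wins minus salary, summed by year."""
    if len(wins_per_year) != len(salary_per_year):
        raise ValueError("need one salary figure per projected season")
    return sum(w * dollars_per_win - s for w, s in zip(wins_per_year, salary_per_year))


# Assumed inputs (see the paragraph above); only the $65M total and 3 years are from the article.
years = 3
cabrera = conventional_asset_value(
    wins_per_year=[6.0] * years,
    salary_per_year=[65.0 / years] * years,
    dollars_per_win=5.3,
)
olt_bubba_value = 38.6  # Olt's asset value from the article

print(f"Cabrera, conventional framework: ${cabrera:.1f}M")
print(f"Less than Olt's asset value? {cabrera < olt_bubba_value}")
```

Under these assumptions the conventional figure comes out near $30 million, below Olt's $38.6 million, which is exactly the result that fails the smell test.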
The reason the framework breaks down is that it's made to apply to average teams and average situations. A team could be valuing a win as high as $10 million right now based on its payroll, team quality, revenues, market size, ballpark situation, farm system strength, etc. but won't have to pay more than $5 million per win on the market for a free agent, for a number of reasons that general economic theory can explain better than I can.
The roughly $5 million per win figure is the league average and it works for most situations. Most teams value a win right now within 20% of that value and that market drives player values. However, that all gets blown up when the best players in baseball are on the market.
Every study I've seen on player valuation says the evidence points to linear player valuation. This means that a 1 win player and an 8 win player both sign for the same $5 million per win market rate as free agents. There are two reasons why I think the evidence shows this: the limited sample size of elite players (by nature) and the fact that clubs choose which players get to free agency.
Matt Swartz has done work showing that the players that become free agents perform markedly worse than the ones teams sign to extensions to keep them off the market. This would explain why the data doesn't show that truly elite players are paid non-linear rates as free agents: so few of them become free agents (coming from an already small group of elite players to begin with) that it can't be seen. Along the same logical lines, Bill James famously compared finding clutch hitting in the data to finding a carcass in a dark forest:
"The forest is dark and deep, and there are many places for a body to decay without being found. We don't need that kind of evidence. We don't need to see statistics on them. We live in the forest with them; we know that they exist."
So, do I just smell a whiff of non-linear valuation and can't find it in the forest because it's too dark? Nope. At some point, a player is worth what the other 29 teams say he is worth in trade talks and actual trades, frameworks be damned. BUBBA is a system based on trade value and having an equal footing between different types of players, so what good would it be if the best players in baseball had BUBBA values that didn't make any sense in the real world?
Only some teams can afford to pay elite players a market rate and when these large market teams are competing for an elite player, they start paying at their specific, higher dollar per win rate rather than sticking to the league average rate. With free agents and lesser players, there are warts and alternatives and they keep things from getting out of hand price-wise. With elite players under contract that haven't been allowed to hit the market, there aren't alternatives and prices get exorbitant. The Giants trading top prospect Zack Wheeler for a few months of vintage Carlos Beltran to try to make the playoffs in 2011 is a great example of this dynamic in action.
In the cases of the top MLB assets according to BUBBA, many of them have never been available in trade talks, or only when a team has no choice but to trade them for non-baseball reasons. The same reason that executives have trouble nailing down what current day Miguel Cabrera would fetch in trade is the reason his asset value should be closer to $80 million than $30 million: guys like this are almost never available.
Rubber Meets The Road
Luckily for me, I have friends in the industry that are still privy to the conversations I was privy to when I was with clubs. When they tell me the Rangers would do backflips and send multiple first born to be named later to trade Olt for Cabrera, it's one data point among many that reaffirms my suspicions. When another handful of elite players under contract proved to be obviously more valuable than any academic framework can justify, my suspicions were confirmed.
Cabrera's trade value, as detailed above, is perceived to be and thus in actuality is in line with the $79.5 million asset value I put on him. So how did I justify getting to that value when something in the 30s seems to be more correct?
Without making this even longer than it has to be, I took a cross-section of elite and sub-elite players in baseball, made up a bunch of fake trades, surveyed people in the industry and found which players are valued higher on the trade market than with the conventional framework. Essentially, I found the boundaries where a sound valuation system breaks and poked around enough to make rules to adjust it.
While the adjustments are more sophisticated than this, in practice you can see what my research yielded. When you're dealing with Miguel Cabrera, a thirty-year-old third baseman that's below average defensively but arguably the best hitter in the game, on a three-year deal slightly below market value, you can increase his conventional asset value by 2-3 times to account for the scarcity of his eliteness.
You could say I'm just reverse-engineering what mistaken executives think players are worth but, intellectually, I think it's sound to adjust in this fashion. The existing framework, which can be successfully applied to 95%-plus of players, is based on an assumption (an invisible hand pushing players to sign for the league average dollar per win rate) that doesn't hold up for 100% of players. Figuring out anecdotally when it doesn't apply is the only way to adjust for that phenomenon when concrete data doesn't exist.
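As a sanity check on that 2-3 times figure, the sketch below compares Cabrera's $79.5 million asset value with the roughly $30 million the conventional framework produces. The function names are hypothetical, and the real adjustments are, as noted, more sophisticated than a flat multiplier.

```python
def implied_multiplier(trade_value: float, conventional_value: float) -> float:
    """How many times larger the trade-market value is than the conventional value."""
    return trade_value / conventional_value


def scarcity_adjusted_range(conventional_value: float, low: float = 2.0, high: float = 3.0):
    """Band of asset values ($ millions) implied by a 2-3x scarcity multiplier."""
    return conventional_value * low, conventional_value * high


conventional = 30.0  # rough conventional value for Cabrera from earlier in the article
bubba_value = 79.5   # Cabrera's asset value from the article

low, high = scarcity_adjusted_range(conventional)
print(f"implied multiplier: {implied_multiplier(bubba_value, conventional):.2f}x")
print(f"2-3x band: ${low:.0f}M to ${high:.0f}M, contains BUBBA value: {low <= bubba_value <= high}")
```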
You can argue that the framework isn't flawed, I shouldn't adjust it and that all players should be paid on a linear, fixed dollar per win basis and I can't tell you that you're wrong. Sticking completely to the framework is safe: it's more academic, more in line with previous research methods and a generally cleaner and easier to look at process that can be universally applied to all players. That said, it isn't always right to take the cleaner-looking and less challenging option: we aren't selling jeans here.