Sabermetrics 101: Introduction to the Basics

InsidethePark.com spoke to a few well-read sabermetric believers to get their take on some of the main saber stats, along with a working definition of each.


Equivalent Average – EqA
Baseball Prospectus

A measure of total offensive value per out, with corrections for league offensive level, home park, and team pitching. EqA considers batting as well as base running, but not the value of a position player's defense.

The EqA adjusted for all-time also has a correction for league difficulty. The scale is deliberately set to approximate that of batting average. Average is .260. EqA is derived from REqA, which is (H + TB + 1.5*(BB + HBP) + SB + SH + SF) divided by (AB + BB + HBP + SH + SF + CS + SB/3). REqA is then normalized to account for league difficulty and scale to create EqA.
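
As a rough illustration, the REqA formula above can be written out as a small Python function. This is only a sketch of the raw calculation: the function name and the sample stat line are invented for the example, and the normalization step that turns REqA into EqA is not shown because its details are not given here.

```python
def raw_eqa(h, tb, bb, hbp, sb, cs, sh, sf, ab):
    """Raw EqA (REqA) from a season stat line, using the formula quoted in the text."""
    numerator = h + tb + 1.5 * (bb + hbp) + sb + sh + sf
    denominator = ab + bb + hbp + sh + sf + cs + sb / 3
    if denominator == 0:
        raise ValueError("stat line has no plate appearances")
    return numerator / denominator


# Hypothetical stat line, invented for illustration only (not a real player).
line = dict(h=150, tb=250, bb=60, hbp=5, sb=10, cs=4, sh=2, sf=5, ab=550)
print(round(raw_eqa(**line), 3))
```

Note that the raw figure is not yet on the batting-average scale; that only happens after the league and scale adjustments described above.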

Because EqA is a rate stat, it is a great tool for evaluating hitters on their own, without regard to defense, health, or lineup position. It places one of the lower values on speed statistics of any stat listed here.

"A formula designed by Clay Davenport to account for total offensive production," said David Cameron of USSMariner.com and formerly of Baseball Prospectus. "The number correlates closer to run scoring than any other single offensive statistic available. It is designed to mirror batting average; .260 is league average, .300 is good, .330 is awesome, etc."

MLB Top 10: Bonds .454, Helton .344, Pujols .342, Edmonds .342, Hafner .337, Drew .335, Guerrero .334, Berkman .333, Beltre .331, Mora .330.

M's Top 5: Reed .343, Beltre .331, Ichiro .316, Ibanez .291, Jacobsen .291.


On-Base Plus Slugging (OPS)

A player's OPS is simply his on-base percentage added to his slugging percentage.

OPS+ is OPS measured against the league average and adjusted for ballpark factors. An OPS+ over 100 is better than average; an OPS+ under 100 is below average.
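
For readers who want to see the arithmetic, here is a minimal Python sketch. The on-base and slugging formulas are the standard ones. The `ops_plus` function uses a commonly cited simplified form and leaves out the ballpark adjustment entirely, so treat it as an assumption rather than the exact published method. All function names and sample numbers are invented for the example.

```python
def on_base_pct(h, bb, hbp, ab, sf):
    """Standard on-base percentage."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_pct(tb, ab):
    """Standard slugging percentage: total bases per at-bat."""
    return tb / ab


def ops(obp, slg):
    """OPS is simply OBP plus SLG."""
    return obp + slg


def ops_plus(obp, slg, lg_obp, lg_slg):
    """Simplified OPS+ (assumption: commonly cited form, no park adjustment)."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical stat line and league averages, for illustration only.
obp = on_base_pct(h=150, bb=60, hbp=5, ab=550, sf=5)
slg = slugging_pct(tb=250, ab=550)
print(round(ops(obp, slg), 3))
print(round(ops_plus(obp, slg, lg_obp=0.330, lg_slg=0.420)))
```

A hitter who exactly matches the league in both components comes out at 100 under this form, which is the "average" line described above.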

"EqA and OPS+ (park adjusted OPS) correlate extremely well with run scoring," said Cameron. "And it's a safe statement to make that we can effectively judge how well a batter helps his team score runs with almost near certainty."

AL Top 5: Hafner 158, Guerrero 154, M. Ramirez 152, Mora 149, Ortiz 145.

NL Top 5: Bonds 260, Pujols 175, Edmonds 173, Beltre 163, Berkman 161.

Top 5 M's: Beltre 163, Ichiro 135, Sexson 128, Jacobsen 122, Ibanez 121.


Equivalent Runs - EqR
Baseball Prospectus

EqR = 5 * OUT * EqA^2.5, where OUT is the number of outs. In the fielding charts, EqR is the estimated number of equivalent runs a player produced at the plate while playing that position in the field. In Adjusted Standings, EqR refers to the total number of equivalent runs scored by the team.
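
The formula is easy to try out. The sketch below is only an illustration: the function name and the out totals are made up, and it shows how two hitters with the same EqA end up with different EqR when one of them uses more outs, which is to say gets more playing time.

```python
def equivalent_runs(outs, eqa):
    """EqR = 5 * OUT * EqA^2.5, as given in the text."""
    return 5 * outs * eqa ** 2.5


# Same EqA, different playing time (both out totals are hypothetical).
leadoff = equivalent_runs(outs=450, eqa=0.300)
lower_in_order = equivalent_runs(outs=400, eqa=0.300)
print(round(leadoff, 1), round(lower_in_order, 1))
```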

MLB Top 10: Bonds 167.9, Pujols 140.6, Guerrero 134.1, Ichiro 128.7, Abreu 127.5, Beltre 127.2, Helton 125.7, Berkman 123.6, M. Ramirez 122.9.

M's Top 5: Ichiro 128.7, Beltre 127.2, Winn 98.0, Boone 82.5, Ibanez 78.0.

Another Baseball Prospectus statistic, EqR is a "counting" stat, so it also reflects a player's health and place in the lineup. A key example: if Ichiro batted lower in the order with the same EqA, he would have a lower EqR because he would get fewer at-bats. In that sense it incorporates some scouting judgment, since lineup placement and platooning feed into the total.


Runs Created Above Average – RCAA
Lee Sinins

A stat invented and tracked by Lee Sinins, the author of the Sabermetric Baseball Encyclopedia. Lee calculates each player's Runs Created, and then compares it to the league average, given that player's number of plate appearances. Lee uses a different version of Runs Created than the one at Hardball Times; the one used here is from the Hardball Times 2004 Handbook.

"RC is okay, but I prefer EqA for a few reasons," said Cameron. "It correlates better with run scoring, and it's a rate stat as opposed to a counting stat. I'm just not a big fan of counting stats in general, and RCAA doesn't tell us anything that other stats (don't) tell us, while simultaneously being impossible to calculate."

MLB Top 10: Bonds 152, Helton 78, Pujols 75, Edmonds 73, Abreu 69, Berkman 69, Beltre 64, Rolen 57, Drew 56.

M's Top 5: Beltre 64, Ichiro 56, Ibanez 16, Winn 7, Reed 6.

This is another "counting" statistic that does not include defense, but it does tell you how much better a player was than the league average on offense. A basic sabermetric rule of thumb is that every ten runs a team gains is worth about one win, so the difference between Beltre and the 2004 Mariners third basemen (Spiezio, Cabrera, Willie Bloomquist and Leone, though they played more than just third) is 95 RCAA, or a little over nine wins.
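
The ten-runs-per-win rule of thumb can be sketched in a couple of lines of Python. The function name and the default of 10 runs per win are illustrative choices, and the 95-run gap is the figure quoted above.

```python
RUNS_PER_WIN = 10.0  # rule of thumb from the text, not an exact constant


def runs_to_wins(run_difference, runs_per_win=RUNS_PER_WIN):
    """Convert a run difference into an approximate win difference."""
    return run_difference / runs_per_win


beltre_rcaa = 64         # from the list above
gap_to_incumbents = 95   # difference quoted in the text
implied_incumbent_rcaa = beltre_rcaa - gap_to_incumbents  # what the gap implies

print(implied_incumbent_rcaa, runs_to_wins(gap_to_incumbents))
```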


Fielding Win Shares per 1000 Innings Played – FWS/1000
Bill James

Fielding Win Shares were developed by Bill James to calculate the number of Wins that could be attributed to a player as a result of his contributions while fielding his position. They include the impact of a number of fielding statistics, such as range, assists, error rates, double plays and catchers' caught stealing. Some positions have built in higher ratings than others. The version used here is via hardballtimes.com.

The power of FWS/1000 is that it can help determine which players should have been used more in the field or as defensive replacements. Expressing Fielding Win Shares per thousand innings lets one compare the defensive skills of two players who had vastly different amounts of playing time.
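
A minimal sketch of the rate calculation follows. The function name and both sample players are hypothetical; the point is only that a part-timer with fewer total Fielding Win Shares can still rate higher per 1000 innings.

```python
def fws_per_1000(fielding_win_shares, innings):
    """Fielding Win Shares scaled to a 1000-inning rate."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return fielding_win_shares / innings * 1000


# Two made-up players with very different playing time.
regular = fws_per_1000(fielding_win_shares=5.0, innings=1250)
part_timer = fws_per_1000(fielding_win_shares=2.0, innings=400)
print(round(regular, 2), round(part_timer, 2))
```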

"The WS system has all kinds of flaws, and the fielding metrics are by far the worst aspect of Win Shares," said Cameron. "There's a good reason that WS never gained much notoriety, despite Neyer's attempts to shill for James' book in his column; everyone who put it to the test found gaping flaws in the system, and rightfully dismissed it as an inferior system to those already available. Some work has been done updating James' formulas that help, but the fielding system is still poor. I could really care less about Win Shares."


Runs Saved Above Average – RSAA
Lee Sinins

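This stat, which is also tracked and reported by Lee Sinins, is a measure of a pitcher's effectiveness and contribution. The formula is league-average Runs Allowed per Inning Pitched (RA/IP), minus the pitcher's RA/IP, times the pitcher's total innings pitched, so a positive number means the pitcher allowed fewer runs than average.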
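
Here is a rough Python sketch of that calculation. The sign is chosen so that a pitcher who allows fewer runs than the league average gets a positive number, which matches the leaderboards that follow. The function name and the sample inputs are made up, and any park adjustment is left out.

```python
def rsaa(runs_allowed, innings, lg_runs_per_inning):
    """Runs saved versus a league-average pitcher over the same innings.

    Assumption: no park adjustment, and innings given as a decimal number.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    pitcher_rate = runs_allowed / innings
    return (lg_runs_per_inning - pitcher_rate) * innings


# Hypothetical pitchers in a league that allows 0.5 runs per inning.
print(round(rsaa(runs_allowed=80, innings=200, lg_runs_per_inning=0.5), 1))
print(round(rsaa(runs_allowed=120, innings=200, lg_runs_per_inning=0.5), 1))
```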

MLB Top 10: Santana 54, R. Johnson 50, Sheets 45, Zambrano 42, Clemens 32, Peavy 31, Radke 31, Schmidt 31, Buehrle 29, Westbrook 28

M's Top 5: Madritsch 10, Guardado 7, Atchison 2, Villone 2, Sherrill 1

RSAA is the pitching counterpart to Sinins' RCAA, which makes the two statistics comparable. One could even use this stat to compare the value of an ace to that of an entire bullpen. Notable to the Mariners fan is that besides Mr. Madritsch, only the bullpen was any good last year.


Fielding Independent Pitching – FIP
Tangotiger (tangotiger.net)

This is a measure of all those things for which a pitcher is specifically responsible. The formula is (HR*13 + (BB+HBP)*3 - K*2)/IP, plus a league-specific factor (usually around 3.2) that puts the number on the same scale as ERA. FIP helps you understand how well a pitcher pitched, regardless of how well his fielders fielded. FIP was invented by Tangotiger. FIP is not league- or park-adjusted.
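
The FIP formula above translates directly into code. In this sketch the function name, the default constant of 3.2 (the "usually around" figure from the text) and the sample season are all illustrative assumptions.

```python
def fip(hr, bb, hbp, k, ip, league_constant=3.2):
    """Fielding Independent Pitching, using the weights given in the text.

    ip should be a true decimal (200 1/3 innings is 200.333..., not 200.1).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + league_constant


# Hypothetical season, for illustration only.
print(round(fip(hr=20, bb=50, hbp=5, k=200, ip=200), 2))
```

Because only home runs, walks, hit batters and strikeouts go in, two pitchers with the same FIP can have quite different ERAs depending on the defense behind them.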

AL: Santana 3.16, Schilling 3.34, Hudson 3.64, Radke 3.78, Martinez 3.82, Lieber 3.94, Harden 3.95, Escobar 3.96, Arroyo 4.06, Mussina 4.18

NL: Johnson 2.32, Sheets 2.67, Schmidt 2.94, Clemens 3.13, Peavy 3.16, Oswalt 3.18, Wright 3.30, Ol.Perez 3.46, Pavano 3.56, Zambrano 3.59, Davis 3.59

M's Top 5: Madritsch 3.63, Atchison 4.01, Soriano 4.19, Hasegawa 4.35, Thornton 4.55

The primary use of this stat is as a predictor of future success, as FIP correlates better from year to year than ERA does. Once again, for the Seattle organization, the bullpen was much better than the starting staff. This stat does put a bit more weight on groundball pitchers.


Earned Run Average +
Baseball-Reference

ERA+ is ERA measured against the league average and adjusted for ballpark factors. An ERA+ over 100 is better than average; an ERA+ under 100 is below average.
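
As a rough sketch, the basic idea can be written as league ERA divided by the pitcher's ERA, times 100. This simplified form is an assumption for illustration: it leaves out the ballpark adjustment altogether, and the function name and sample numbers are invented.

```python
def era_plus(era, league_era):
    """Simplified ERA+ (assumption: no park adjustment applied)."""
    if era <= 0:
        raise ValueError("ERA must be positive for this calculation")
    return 100 * league_era / era


# Hypothetical pitcher with a 3.00 ERA in a 4.50 ERA league.
print(round(era_plus(era=3.00, league_era=4.50)))
```

Since a lower ERA is better, the league figure goes on top, which is why better-than-average pitchers land above 100.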

AL Top 5: Santana 182, Schilling 150, Radke 136, Westbrook 134, Lopez 133

NL Top 5: Peavy 177, R.Johnson 171, Zambrano 165, Sheets 154, Clemens 145

M's Top 5: Guardado 155, Madritsch 132, Atchison 122, Sherrill 113, Villone 106

Once again the local theme is on relievers. ERA+ is great in its simplicity as most fans can understand ERA and how to compare two players vs. the league average.


Win Shares – WS
Bill James/Hardballtimes.com

Win Shares is a very complicated statistic that takes all the contributions a player makes toward his team's wins and distills them into a single number that represents the number of wins contributed to the team, times three. Hardball Times has tweaked James' original formula somewhat, and you can read more about its version of Win Shares at hardballtimes.com.

MLB Top 10: Bonds 53, Pujols 40, Rolen 38, Beltre 37, Abreu 37, Edmonds 36, Drew 34, Loretta 33, Dunn 32

M's Top 5: Beltre 37, Ichiro 27, Winn 18, Ibanez 13, Boone 9

Win Shares is the most recent of James' fusion statistics. It incorporates lineup position, runs, runs batted in, steals, defense and pitching ability. Every three Win Shares equal one win. When using Win Shares, it is important to compare the same component (total, offense, defense, pitching).
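
Since three Win Shares equal one win, converting the totals above is simple division. The function name below is just an illustrative choice; the two inputs are taken from the lists above.

```python
def win_shares_to_wins(win_shares):
    """Three Win Shares equal one win."""
    return win_shares / 3


print(round(win_shares_to_wins(53), 1))  # Bonds' total from the list above
print(round(win_shares_to_wins(37), 1))  # Beltre's total from the list above
```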


Value Over Replacement Player – VORP
Baseball Prospectus

The number of runs contributed beyond what a replacement-level player would contribute if given the same percentage of team plate appearances.

Cameron on VORP: "Keith Woolner's creation designed to reflect offensive value in relation to Runs Over Replacement Position. In other words, how many runs did a player add with his bat over what could be expected from a Triple-A veteran who could be obtained for the league minimum. If you're going to use a counting stat, this is the one to use."

M's Top 5: Beltre 89.1, Ichiro 80.9, Winn 34.6, Ibanez 30.1, Madritsch 28.0

VORP is the only stat listed here that measures a player against a replacement player. This has certain advantages: a league-average player isn't necessarily available to every club, but just about every club can find a "replacement level" player from within the organization, independent leagues, Japan or Mexico.


Defensive Independent Pitching Stats - DIPS
Voros McCracken, (Keith Woolner, Tom Tippett)

Cameron on DIPS: "The study of Defense Independent Pitching Stats, which stated that pitchers only have consistent control over the outcome of walks, strikeouts, and home runs, and the outcome of a ball in play is largely determined by the batter and the fielders. Voros' work has been updated and improved upon by Keith Woolner and Tom Tippett, and his original assessment has been muted slightly; some pitchers (knuckleballers especially) have displayed a consistent ability to turn balls in play into outs at a higher than average rate. However, as a general rule, DIPS holds up; pitchers have very little influence on whether a batted ball that does not leave the park turns into an out or a hit."


Overall, the stats-versus-scouting argument has been an ongoing discussion for years, recently revisited by Alan Schwarz at Baseball America.

Sabermetric stats are generally intended to supplement what a scout's eye might see. None are meant to replace actual scouting departments, though many believers have used them that way.

"I think there's been too much work on attempting to find "one number" that sums up all performance," said Cameron of Sabermetrics and stats in general. "There are many different facets to the game, and players succeed in different ways.

"Rather than attempting to sum their value into a singular number, I feel that we'd be better served finding better ways to analyze the different roles of players (hitting, baserunning, fielding, pitching, etc.) in a better way.

"Offensively, we've basically reached a ceiling; EqA and OPS+ (park adjusted OPS) correlate extremely well with run scoring, and it's a safe statement to make that we can effectively judge how well a batter helps his team score runs with almost near certainty."

Saber-style statistics for hitting have received more attention, which is why there are not as many options for measuring pitching or defense. Pitching analysis is gaining momentum, though, and could soon catch up with the offensive stats.

"The pitching statistics have improved quite a bit as well," said Cameron. "And we have a pretty solid understanding of how to value starting pitchers' contributions. The work on relief pitcher impact is a little behind that of starting pitchers, but we're getting there."

"The next frontier for statistical analysis is obviously defense," added Cameron. "We're still very far away from having anything resembling a very credible system for valuing individual defensive contributions. Things like defensive efficiency in combination with double play rates and extra-base hit ratios can approximate defensive performance on a team wide basis."

"As a whole, I think statistical analysis has done a few things very well," said Cameron. "Mainly the retroactive analysis of an offensive player's past performance.

"There are still some areas that statistical analysis does not do extremely well, however, defense being the main one, but projections for players who do not fall into the "normal" range of players also pose a problem, as so much of statistical analysis is based upon regression to a normal performance level as based on historical peers."



InsidethePark.com would like to thank David Cameron of USSMariner.com for his thoughts on sabermetric statistics as we embark on our journey to learn more about the matter and share our discoveries with InsidethePark.com readers.

Many thanks to our Special Assignment Consultant, Dave Clark, for his continued research on sabermetric stats, as well as several other aspects of the game of baseball.



