Numbers Game: Part Two

In the past decade, baseball has been swept up by a new trend in statistics and in how America views them. Is basketball making a similar transition? Kyle Lamb explores the subject in greater detail with Basketball on Paper author Dean Oliver, ESPN college basketball analyst Jay Bilas and Basketball Prospectus' John Gasaway.

It's not a new concept. The idea of tempo-free statistics, which evaluate basketball teams by how they fare per possession, was written about long ago.

"I learned about efficiency when I read Dean Smith's book 20-some years ago," said ESPN college basketball analyst Jay Bilas. "It's a way to accomplish things and a different way to look at things."

Efficiency, which measures points given an equal number of possessions (100), tells the story of not how often a team scores, but rather, how efficient a team is with their scoring opportunities.

The average college basketball team in the 2007-08 season currently averages 67.1 possessions per game, as estimated by a formula from Dean Oliver's book, Basketball on Paper. The formula adds field goal attempts (minus offensive rebounds), turnovers and 47.5 percent of free throw attempts, then averages the result for a team and its opponent. Oliver's research found this estimate closely tracks the actual number of possessions during the course of a game.
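As a rough sketch of that estimate, the Python below adds up the pieces described above for each side and averages the two. The function names and the box-score numbers are hypothetical, invented only for illustration; the 0.475 weight on free throw attempts is the one quoted above.

```python
def estimate_possessions(fga, oreb, turnovers, fta, ft_weight=0.475):
    """Estimate one team's possessions from its box-score totals."""
    return fga - oreb + turnovers + ft_weight * fta


def game_possessions(team, opponent):
    """Average the two sides' estimates to get one figure for the game."""
    return (estimate_possessions(**team) + estimate_possessions(**opponent)) / 2


# Hypothetical box-score lines, not taken from any real game.
team = {"fga": 55, "oreb": 10, "turnovers": 13, "fta": 20}
opponent = {"fga": 58, "oreb": 12, "turnovers": 11, "fta": 20}

print(game_possessions(team, opponent))
```

Averaging the two sides smooths out the error in either single estimate, since both teams get nearly the same number of trips in a game.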

Dividing the number of points scored by possessions gives the number of points per possession scored by a team. Currently, according to stats listed on BasketballState.com, 25-6 IUPUI leads the nation in points per possession, averaging 1.15 points on every trip down the floor. Meanwhile, the nation's leaders defensively are Memphis (0.84 points allowed per possession) and Wisconsin (0.85 points allowed per possession).
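In code, the calculation is a single division. The sketch below uses hypothetical function names and an invented game line (69 points on 60 possessions) chosen to land on the 1.15 figure mentioned above; it also shows the same rate scaled to 100 possessions.

```python
def points_per_possession(points, possessions):
    """Raw scoring rate: points scored on each trip down the floor."""
    return points / possessions


def efficiency(points, possessions):
    """The same rate expressed per 100 possessions."""
    return 100 * points_per_possession(points, possessions)


# Hypothetical game: 69 points scored on 60 possessions.
ppp = points_per_possession(69, 60)
print(f"{ppp:.2f} points per possession, {efficiency(69, 60):.1f} per 100")
```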

John Gasaway of Basketball Prospectus, formerly the Big Ten Wonk, has been tracking what is called efficiency margin, which is simply the amount per possession by which a team outscores its opponents.

"This makes the fourth season I've been looking at that in real time, and it tracks fairly well with success in March and into April," he said. "The first year that I looked at it, the best four teams in that category were Illinois, North Carolina, Michigan State and Louisville. All of them made the Final Four that year. I thought, 'Wow, I'll have to keep track of these stats and I'll make millions of dollars.'

"Of course, it's not that easy," Gasaway said, adding that it's not an exact science.

It's not exact, but it can be telling.

An extension of tempo-free statistics applies a more predictive tool developed by Bill James in his sabermetric work on baseball. The Pythagorean Log5 method takes the runs scored and runs allowed by two baseball teams and, using a fairly simple formula, finds the probability that team one would be expected to beat team two.

Ken Pomeroy, owner of KenPom.com and a contributor to Basketball Prospectus, has taken this method and tweaked it according to efficiency statistics.

In baseball, the expected winning percentage of each team must be found first. That formula is runs scored (squared) divided by the sum of runs scored (squared) and runs allowed (squared). The result is the expected winning percentage of that team over the course of a season. You would get the expected number of victories simply by multiplying the result by the number of games played that season (162 for baseball).
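A minimal sketch of the baseball version, assuming hypothetical season totals of 800 runs scored and 700 allowed (the function names are also just illustrative):

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored, runs_allowed, games=162):
    """Scale the expected winning percentage to a full season."""
    return pythagorean_win_pct(runs_scored, runs_allowed) * games


# Hypothetical season totals, invented for illustration.
pct = pythagorean_win_pct(800, 700)
print(f"{pct:.3f} expected winning percentage")
print(f"{expected_wins(800, 700):.1f} expected wins over 162 games")
```

The exponent is left as a parameter because the basketball versions of the formula swap in a different value.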

Pomeroy simply replaces runs scored and runs allowed with adjusted offensive and defensive efficiency. Efficiency, both on offense and defense, is the points per possession number multiplied by 100. Adjusted efficiency is an average over all the games played: for each game, the efficiency on offense or defense is multiplied by the national average and then divided by the opponent's corresponding adjusted efficiency for the entire season (for offensive efficiency, for instance, you would divide by the opponent's adjusted defensive efficiency).
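The adjustment described here might look something like the sketch below. It is only an illustration of the description above, not Pomeroy's actual code: the function name, the two game lines and the national average of 100.0 are all made up, and each opponent's adjusted defensive efficiency is simply taken as given.

```python
def adjusted_offensive_efficiency(games, national_avg):
    """Average of game efficiencies, each scaled by opponent quality.

    `games` is a list of (raw offensive efficiency in that game,
    opponent's season-long adjusted defensive efficiency) pairs.
    """
    adjusted = [raw * national_avg / opp_def for raw, opp_def in games]
    return sum(adjusted) / len(adjusted)


# Hypothetical schedule: 110.0 against an average defense, then
# 99.0 against a stingy defense that normally allows only 90.0.
games = [(110.0, 100.0), (99.0, 90.0)]
print(adjusted_offensive_efficiency(games, national_avg=100.0))
```

In this made-up example the second game looks worse in raw terms, but after accounting for the tougher defense both games grade out the same.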

As an example of the Pythagorean expected winning percentage, we use two teams: Michigan State and Ohio State.

Michigan St. owns an adjusted offensive efficiency of 116.8 and defensive efficiency of 90.1. Remember, this is essentially points per 100 possessions. Ohio State offensively is 111.2 and defensively 89.3.

James, for baseball, would simply square the runs scored and allowed. Oliver, in his book, applied this concept to basketball by using an exponent of 14, instead of two. Pomeroy has instead used an exponent of 11.5 for college basketball and uses the efficiency numbers instead of actual scoring differential.

So, for Michigan St., we get:

(116.8^11.5)/(116.8^11.5+90.1^11.5) = .951880

For Ohio State, we find:

(111.2^11.5)/(111.2^11.5+89.3^11.5) = .925689
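For anyone who wants to check the arithmetic, here is a short sketch that plugs the efficiency numbers quoted above into the same formula with the 11.5 exponent. The function name is hypothetical.

```python
def pythagorean_win_pct(off_eff, def_eff, exponent=11.5):
    """Expected winning percentage from adjusted efficiencies."""
    offense = off_eff ** exponent
    defense = def_eff ** exponent
    return offense / (offense + defense)


michigan_state = pythagorean_win_pct(116.8, 90.1)
ohio_state = pythagorean_win_pct(111.2, 89.3)

print(f"Michigan St.: {michigan_state:.4f}")
print(f"Ohio State:   {ohio_state:.4f}")
```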

To find the chances of one team beating another, we take Michigan St. as team A and Ohio State as team B, each represented by its expected winning percentage.

(A-A*B)/(A+B-2*A*B) = chances of team A (Michigan St.) winning.

Customized for our experiment, we find the following:

(.951880-.951880*.925689)/(.951880+.925689-2*.951880*.925689) = .6136, or a 61.4 percent chance that Michigan St. would beat Ohio State based on these latest numbers.
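The head-to-head step is easy to reproduce. This sketch (the function name is just a label for the formula above) takes the two expected winning percentages and returns team A's chances.

```python
def log5(a, b):
    """Chance that team A beats team B, given each team's
    expected winning percentage."""
    return (a - a * b) / (a + b - 2 * a * b)


michigan_state = 0.951880
ohio_state = 0.925689

chance = log5(michigan_state, ohio_state)
print(f"Michigan St. over Ohio State: {chance:.1%}")
```

Two handy sanity checks: two evenly matched teams come out at exactly 50 percent, and the two teams' chances against each other always add up to 100 percent.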

If this were a home game for either team, Pomeroy would take the original efficiency numbers before starting the process, and add 1.4 percent to the home team's offensive efficiency, while subtracting 1.4 percent from their defensive efficiency. Likewise, he would subtract 1.4 percent from the away team's offensive efficiency and add 1.4 percent to their defensive efficiency. So if these two teams played again in Columbus, as was the case Sunday, Ohio State's efficiency stats would have been adjusted to 112.76 on offense and 88.05 on defense.
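A sketch of that home-court tweak, using the 1.4 percent figure quoted above (the function and parameter names are hypothetical):

```python
def adjust_for_venue(off_eff, def_eff, is_home, bump=0.014):
    """Shift efficiencies for home court: the home team's offense goes up
    and its defensive number goes down; the road team gets the reverse."""
    sign = 1 if is_home else -1
    return off_eff * (1 + sign * bump), def_eff * (1 - sign * bump)


# Ohio State at home in Columbus, Michigan St. on the road.
osu_off, osu_def = adjust_for_venue(111.2, 89.3, is_home=True)
msu_off, msu_def = adjust_for_venue(116.8, 90.1, is_home=False)

print(f"Ohio State:   {osu_off:.2f} offense, {osu_def:.2f} defense")
print(f"Michigan St.: {msu_off:.2f} offense, {msu_def:.2f} defense")
```

The adjusted numbers would then be fed into the Pythagorean formula in place of the originals.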

A study of games over a two-day period last Wednesday and Thursday found the Pythagorean method correctly predicted the winner in 32 of the 36 games studied. Of course, given the limited sample size and the often noticeable gap between elite and average teams, these results perhaps aren't unexpected.

"Nobody can predict these games," Bilas said. "If they could, nobody would bother with this stuff, they'd just go to Vegas."

Without realizing it, Bilas was on to something.

Though no person and no mathematical equation could ever predict the winner of every single game, or account for the emotional factors and other trends that figure into each contest, it's this very concept that Las Vegas is built on.

Using efficiency numbers and tempo averages (number of possessions), Pomeroy is able to estimate a possible final score of two teams.

First, he finds the expected tempo by expressing each team's average tempo as a percentage of the national average. For instance, the national average, as mentioned earlier, is 67.1 possessions per 40 minutes. Ohio State averages 64.6 possessions per 40 minutes, which is 96.3 percent of the average. Michigan St. averages 64.9, which is 96.7 percent of the average. Multiplying the two percentages and then multiplying by 67.1 (the national average), we find an expected tempo of 62.5, or 62 possessions rounded down.

The same logic is then applied to efficiency. Each team's offensive and defensive efficiency is expressed as a percentage of the national average. For Ohio State, its offensive efficiency as a percentage of the national average (111.2, or 109.3 percent) is multiplied by Michigan State's defensive efficiency as a percentage of the national average (90.1, or 88.6 percent). The result is multiplied by the national efficiency average (101.7) and then by the expected tempo (62), and divided by 100.
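The tempo step can be sketched in a few lines. The function name is hypothetical; the inputs are the figures quoted above. Carried through without rounding the intermediate percentages, the result is roughly 62.5 possessions, which the article treats as 62.

```python
def expected_tempo(tempo_a, tempo_b, national_avg):
    """Scale the national average by each team's pace relative to it."""
    return (tempo_a / national_avg) * (tempo_b / national_avg) * national_avg


tempo = expected_tempo(64.6, 64.9, national_avg=67.1)
print(f"{tempo:.2f} possessions expected")
```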

For Ohio State this becomes: 109.3%*88.6%*101.7*62/100 = 61.08 or 61 points scored.

For Michigan State, the same calculation is applied using Michigan State's offensive efficiency and Ohio State's defensive efficiency.

This becomes: 114.9%*87.8%*101.7*62/100 = 63.59 or 64 points scored.
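Both score estimates follow the same recipe, so one small function covers them. This is a sketch with a hypothetical function name, using the efficiency figures, the 101.7 national average and the 62-possession tempo quoted above.

```python
def expected_points(off_eff, opp_def_eff, national_eff, tempo):
    """Projected points: offense and opposing defense, each relative to
    the national average, scaled to the expected number of possessions."""
    rate = (off_eff / national_eff) * (opp_def_eff / national_eff) * national_eff
    return rate * tempo / 100


ohio_state = expected_points(111.2, 90.1, national_eff=101.7, tempo=62)
michigan_state = expected_points(116.8, 89.3, national_eff=101.7, tempo=62)

print(f"Ohio State:   {ohio_state:.2f}")
print(f"Michigan St.: {michigan_state:.2f}")
```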

Michigan State would have a 61 percent chance of beating Ohio State, and the expected score would be 64-61. That's an expected scoring margin of three points.

Though the point spread for the two teams, which play Friday in the quarterfinals of the Big Ten Tournament in Indianapolis, is not yet available, it's said the Pythagorean is the concept applied by Vegas to produce the opening line. Then the oddsmakers adjust the line based on emotional factors, injuries, recent trends, etc., and ultimately the line moves as human betting pushes it one way or another.

To test this, I examined the 36 games last week, in which the Pythagorean correctly predicted 89 percent of the winners. Comparing the final point spreads with the Pythagorean's predicted scoring margins, the average difference between the two was just 1.5 points. In all but one of the 36 games, the favorite was the same. Only for Rice at East Carolina did the Pythagorean favorite differ from Vegas.

By strictly applying the Pythagorean against the spread, it fared well: a 25-11 record in that two-day span.

This obviously is inconclusive over the long haul, but 69 percent against the spread is considered successful, or financially beneficial to those interested in making a profit. However, the question has to be asked: why the success if the Pythagorean is used to find a baseline for the spread?

For starters, it's possible a different variation is used. Pomeroy uses efficiency statistics, which he believes are more accurate than point spread, because his adjusted efficiency takes into account the strength of the competition. Additionally, he has adjusted the exponent to what his studies have shown to be more accurate in retrospect. Third, changes are made to the line to account for potential factors (e.g., an injured player or a long losing streak). And lastly, this study was done with the final adjusted spread before the contest, meaning the human impact could have swayed the line a few points away from the original Pythagorean baseline.

The key, then, if there's any correlation of long-term success, would be to determine what's more accurate: human emotion or the computers.

The moral of the story is this: no one can know for sure who will win a game. It is, after all, the reason they play the games. But perhaps there is some potential value in trying to gain an advantage with these statistics if used in a predictive nature.

"It's added information that allows you to try to make good decisions," Bilas added. "You have to look at it in totality and it's another useful tool that you can use. It's not the only thing, and I've never heard of any of those guys that use it say that it is."

But Bilas surmised it wasn't totally useless.

"If your goal is to have your bracket pool win in your office, you can use that and perhaps it will give you a better chance statistically," he said.

There is, however, a bigger use for tempo-free statistics among coaches.

"You're not going to see some coach against North Carolina, trying to stop their secondary break, poring over efficiency stats, trying to figure it out," Bilas added. "That's not the way it works."

That's where fundamentals enter into the discussion. That's also where Oliver has spent years trying to come up with several formulas to measure and quantify what coaches are evaluating on the floor.

But Longwood head coach Mike Gillian finds a delicate balance between the two. Gillian is a regular visitor to Pomeroy's stats and uses them to support various areas of emphasis.

"The balance that truly exists is that the statistics themselves only can support or defend what your opinion is about what's going on out on the floor," Gillian explained. "One, you're talking about coaching. You're talking about what a coaching staff sees; how they fit into what you're doing offensively, defensively and how you can do that. As you get (the players) to do those things, and you get them to be in a position to be successful in one possession, within a series of possessions over the course of a game, over a game, over a season – that's coaching. The stats can only support or defend what you're trying to do (and how).

"If you do those things, the stats can benefit from it and will show it," Gillian said. "From a coach's perspective – there's only one stat that counts: whether you win or lose. But anything that gets you to that point that you can use will be most important."

Parallels have been drawn between tempo-free statistics in basketball and sabermetrics in baseball.

Gillian says the difference is that baseball's slower pace allows every pitch and every play to be charted. Basketball, on the other hand, is a faster-paced game where a lot goes unaccounted for in the box score.

Things like making the extra pass, denying an opponent the ball or setting up a screen for a teammate wishing to shoot the ball cannot completely be quantified.

But the emphasis, much like sabermetrics, is the same.

"We all look to make projections about what players will do. We all look at simple statistical approaches at how to make teams better," Oliver said of the concept behind tempo-free statistics.

According to Oliver, teams win basketball games by maximizing the four factors. Oliver says in his book that the four factors are shooting, rebounding, free throws and optimizing possessions (turnover rate).

To measure these factors, he uses effective field goal percentage (made field goals, plus an added 0.5 for every 3-pointer made, divided by total attempts), turnover rate (the percentage of possessions that end in a turnover), free throw rate (made free throws divided by field goal attempts) and offensive rebounding percentage (the share of available rebounds a team collects on the offensive end).

Oliver claims a team will win all of its games when taking all four categories, a huge majority when taking three and very few when taking just one of the four.
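The four measures can be sketched as follows. The function names and the box-score line are hypothetical. One assumption to flag: the text does not spell out the denominator for offensive rebounding percentage, so this sketch uses the common convention of offensive rebounds divided by offensive rebounds plus the opponent's defensive rebounds.

```python
def effective_fg_pct(fgm, three_pm, fga):
    """Field goal percentage with a half-shot bonus for each made three."""
    return (fgm + 0.5 * three_pm) / fga


def turnover_rate(turnovers, possessions):
    """Share of possessions that end in a turnover."""
    return turnovers / possessions


def free_throw_rate(ftm, fga):
    """Made free throws per field goal attempt."""
    return ftm / fga


def offensive_rebound_pct(oreb, opp_dreb):
    """Share of a team's own misses it rebounds (assumed denominator)."""
    return oreb / (oreb + opp_dreb)


# Hypothetical box-score line, invented for illustration.
print(f"eFG%: {effective_fg_pct(fgm=25, three_pm=6, fga=56):.3f}")
print(f"TO%:  {turnover_rate(turnovers=13, possessions=65):.3f}")
print(f"FTR:  {free_throw_rate(ftm=14, fga=56):.3f}")
print(f"OR%:  {offensive_rebound_pct(oreb=10, opp_dreb=30):.3f}")
```

Comparing each of the four numbers with the opponent's shows how many of the categories a team "took" in a given game.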

Putting his work to the test, a study of Ohio State's 2007-08 season, though a small sample in the grand scheme of college basketball, clearly supports his findings.

In 31 games this year, Ohio State has yet to take all four categories in a game. The reason rests largely on the fact that the Buckeyes have had a better offensive rebounding percentage in only six games, two of which were against notoriously poor rebounding team Northwestern.

Eight times this season, the Buckeyes have taken three of four factors in a game. In those eight, Ohio State boasts a 6-2 record. The two losses came against Texas A&M in the NIT finals and at Michigan St. Against the Aggies, Ohio State was just 14-of-58 from the field, a decisive disadvantage.

In 16 games, the Buckeyes split the four categories evenly with their opponent. Ohio State went 11-5 in those games.

The Buckeyes were just 1-4 when taking one category (Cleveland St. the lone victory) and 0-2 when taking none (Butler and North Carolina). Overall, Ohio State was 13-10 when taking two or fewer categories.

An interesting breakdown of those 23 games, however, takes the competition into account. By examining the RPI, Ohio State was just 4-8 in those 23 when playing a team ranked in the top 100 of the RPI. Against everyone else, they were 9-2. The lesson learned is that Oliver is probably on to something, and against a quality opponent, taking three of four categories greatly improves your chances.

One would think that, given their relevance, these stats may begin to appear in the mainstream more often.

That jury is still out.

Bilas doesn't see them popping up much on ESPN and elsewhere nationally, as many sabermetric stats have over the past few years. Gasaway and Gillian think it's coming sometime soon, and Oliver is not quite sure.

"I'm bipolar on this," said Oliver. "Some days I think it will be tomorrow. Others, I think never.

"The four factors could show up tomorrow," he adds. "It's easy and I developed it in part to tell a simple story."

As for the rest of the story – that might continue to be told in books 20 years from now.