Baseball Stats and Freakonomics Wannabes . . .

Saturday, December 22, 2007 | 08:31 AM

Much of investing relates to mathematics and the application of statistics. Markets are statistical data generating machines, and that data can be sliced and diced in a myriad of ways. We always pay close attention whenever we see an interesting application -- or misapplication -- of quantitative data that may be instructive or applicable to investing.

So I was particularly intrigued by a study on today's NY Times Op-Ed page that purported to look at the impact of steroids on the performance of baseball players, based on the Mitchell Report. The authors asked the question: "In a complex team sport like baseball, do the drugs make a difference sufficient to be detected in the players’ performance records?"

Their conclusion? The authors of More Juice, Less Punch found that steroids, human growth hormone and the like do not have a net benefit for major league players. Based on their review of pre- and post-steroid usage, the overall impact on players' stats was de minimis.

I remain unconvinced.

Ever since Freakonomics became a runaway economics best seller, there seem to be increasing attempts by "rogue economists" and others to discover the hidden, counter-intuitive side of everything. This column seems to be of that genre. The authors would have been better served by channeling the statistical approach of Moneyball instead.

When you come across broad attempts to explain complex systems, your inner mathematician should always be concerned that the methodology employed is sound, any initial assumptions made are justified, and the analytical steps taken are well supported.

In the present case, I suspect they are not. Consider the following statistical and analytical issues:

1. The authors of the Times Op-Ed looked at 48 batters and 23 pitchers named in the Mitchell Report. This may be too small a sample to draw any valid conclusion.

2. For pitchers, they studied ERA. Is ERA really where the main pitching advantage of juice would show up? That stat is a function of many things -- intelligence, pitch selection, opposing batter research, etc. -- not just physical power.

The authors ignored many other stats that might be more telling as to the impact of 'roids: Consider strikeouts, average pitch speed, average number of pitches thrown per game, total games pitched. These data points would have been quite instructive as to the impact of performance enhancing drugs (PEDs) on issues such as strength and durability, even injury recovery.

3. For hitters, they examined batting averages, home runs and slugging percentages. The same durability issues were overlooked -- games played and missed, total at bats, swings with ball contact, distance traveled by batted balls, etc.

And what about speed -- why not consider stolen bases? We know lots of runners and cyclists have been accused of using PEDs -- isn't this a valid data point to consider?

4. Dates: What were the Before & After dates? It appears that by drawing the line at the date of accusation, lots of PED usage will have taken place in the BEFORE data set. If the performance gains attributed to the AFTER period actually began during the BEFORE period, the entire statistical conclusion becomes indeterminate.

5. No control group: All players begin to show statistical deterioration as they age, get worn down, injured, etc. How can we tell what their stats would have looked like had they not been juiced?

Rather than comparing pre-accusation and post-accusation stats, perhaps a better comparison would have been to look at the group of players who used PEDs versus those who didn't as their careers wound down. How do the two groups compare in their mid 30s? Late 30s? Early 40s?

Note that even this grouping may be flawed, because of the self-selection factor of those who chose to use the drugs in the first place (more injury prone, weaker, slower, etc).

6. False Accusations: Are any of the players accused in the Mitchell Report not guilty of using PEDs? I have no idea, but it's a valid possibility. How might these false positives impact the authors' conclusions regarding stats?
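To make a few of these objections concrete, here is a rough Python sketch of issues 1, 4, 5 and 6. It is not a reconstruction of anything the Op-Ed authors did. Every number in it is hypothetical, the function names are my own, and the dilution function makes a deliberately crude assumption: players who were already using in the BEFORE window, or who never used at all, show zero before/after change.

```python
from statistics import NormalDist, mean


def power_paired(n, effect_in_sd_units, alpha=0.05):
    """Approximate power of a two-sided paired z-test (issue 1).

    effect_in_sd_units is the true average before/after change divided by
    the standard deviation of the per-player change (a hypothetical input).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_in_sd_units * n ** 0.5
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def diluted_effect(true_effect, share_already_using, share_falsely_accused):
    """Average change the study would observe (issues 4 and 6).

    Assumes (crudely) that early users and the falsely accused contribute
    zero before/after change, so only the remaining share shows the effect.
    """
    share_informative = 1 - share_already_using - share_falsely_accused
    return true_effect * share_informative


def diff_in_diff(users_before, users_after, control_before, control_after):
    """Change among users minus change among comparable non-users (issue 5)."""
    users_change = mean(users_after) - mean(users_before)
    control_change = mean(control_after) - mean(control_before)
    return users_change - control_change


if __name__ == "__main__":
    # Hypothetical: a true effect worth 0.2 standard deviations.
    print("power, 48 batters: ", round(power_paired(48, 0.2), 3))
    print("power, 23 pitchers:", round(power_paired(23, 0.2), 3))

    # Hypothetical: 40% were already using, 10% were wrongly named.
    print("observed effect:   ", diluted_effect(0.025, 0.4, 0.1))

    # Hypothetical slugging percentages for aging users vs. aging non-users.
    print("diff-in-diff:      ", diff_in_diff(
        [0.450, 0.460], [0.445, 0.455],
        [0.450, 0.460], [0.420, 0.430]))
```

With these made-up inputs, a modest real effect would be detected well under a third of the time with 48 batters, and less often with 23 pitchers. The made-up slugging numbers also show how users whose stats look flat before and after can still be beating the normal aging decline of a comparison group, which a simple before/after comparison would miss.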


I don't know what the total impact of steroids and human growth hormone was on baseball players' performance -- but based upon the above, neither do Professors Jonathan Cole and Stephen Stigler.

~~~

One last thought: Why hasn't Baseball Commissioner Bud Selig resigned or been fired? 

Shouldn't he -- like Merrill Lynch's O'Neal and Citigroup's Prince -- fall on his sword? This happened on his watch, and he apparently was asleep at the wheel. For this gross incompetency, Selig should be tossed aside like a used syringe.






Source:

More Juice, Less Punch
JONATHAN R. COLE and STEPHEN M. STIGLER
NYT, December 22, 2007
http://www.nytimes.com/2007/12/22/opinion/22cole.html

INDEPENDENT INVESTIGATION INTO THE ILLEGAL USE OF STEROIDS AND OTHER
PERFORMANCE ENHANCING SUBSTANCES BY PLAYERS IN MAJOR LEAGUE BASEBALL

GEORGE J. MITCHELL
DLA PIPER US LLP, December 13, 2007
http://assets.espn.go.com/media/pdf/071213/mitchell_report.pdf


Comments

Barry - thanks for this. The recent spate of longer and deeper posts is very helpful and much appreciated.
The book to read here is of course Moneyball for the background on sabermetrics, but the guys over at Baseball Prospectus would have been the folks to collaborate with, as they have the huge database and toolkit to investigate the problem in depth. But of course no right thinking academic would ask an amateur to help frame and investigate such a problem :).
Nonetheless, the use of math and analysis in the real world is powerful, but the catch is, as you imply, the initial framing of the problem and the translation of that problem into the proper "model". Back in the day we called that model specification error.
If you'd like to see multiple interesting examples where the math is real, the problems are real but the drama is well-done try NUMB3RS. Outstanding and on-line now :)

~~~

BR: Thanks for the kind words -- As I read the column, all I could think was they were going for a Freakonomics spin, when they should have been thinking Moneyball.

I should shoot this over to Michael Lewis . . .

Posted by: dblwyo | Dec 22, 2007 10:09:57 AM

The comments to this entry are closed.


