I always liked this guy


Thanks to MetsBlog for the link to an excellent op-ed by Doug Glanville, one of the geekiest*, most hilarious ballplayers I know of. He talks about the various measures of baseball success and, as expected, does a really nice job of it.

* “Geek” is actually a badge of honor.

Statistical Scapegoating


The good folks over at MetsBlog linked to a story on the Daily News blog “Surfing the Mets,” which suggests that Brian Schneider somehow tipped pitches, leading to the Mets’ collapse. The blogger’s source masturbates his ego about how “meaningful” and “timely” his “MoneyBall type analytics” are before finally spitting out some numbers:

These numbers are so stunning it suggests that the unfortunate Schneider was somehow tipping off pitches:

SANCHEZ caught by CASTRO 65 ABs, allows 0 HRs and OPP SLUG%=292
SCHNEIDER 108 ABs, allows 6 HRs and OPP SLUG%=454

HEILMAN caught by CASTRO 50 ABs, allows 0 HRs and OPP SLUG%=340
SCHNEIDER 195 ABs, allows 9 HRs and OPP SLUG%=456

FELICIANO caught by CASTRO 36 ABs, allows 0 HRs and OPP SLUG%=306
SCHNEIDER 134 ABs, allows 6 HRs and OPP SLUG%=455

WAGNER caught by CASTRO 49 ABs, allows 0 HRs and OPP SLUG%=204
SCHNEIDER 99 ABs, allows 3 HRs and OPP SLUG%=313

SANTANA caught by CASTRO 333 ABs, allows 6 HRs and OPP SLUG%=297
SCHNEIDER 524 ABs, allows 15 HRs and OPP SLUG%=401

PEDRO M caught by CASTRO 138 ABs, allows 2 HRs and OPP SLUG%=377
SCHNEIDER 185 ABs, allows 15 HRs and OPP SLUG%=600!!!!!

Same phenomenon holds with John Maine, Claudio Vargas and Nelson Figueroa. Fascinating, isn’t it?

I’d like to know how similar the numbers were when Maine was on the hill. Also, what did they show when Ollie, Big Pelf, Schoeneweis, or Joe Smith pitched? What were the numbers on Stokes, who pitched about the same number of innings as Vargas?

A reader by the name of Krudler was similarly suspicious:

I think the results you present are very interesting but they in no way prove that Schneider was any different than Castro. The reason why the analysis is inconclusive is that you present no information about the heterogeneity of the data. Consider the following hypothetical scenario:
- the Bullpen got worse as the season went on (due to worse “stuff”; fatigue, whatever)
- Schneider & Castro were equal in quality but Schneider happened to catch with greater frequency than Castro towards the end.

Under these conditions, there can be a statistical association between bad pitching and Schneider catching. The association does not arise because Schneider is worse than Castro, but because the frequencies with which each catcher plays are uneven across the season (Schneider by chance happened to catch more often towards the end because Castro was hurt).

There are ways to control for this bias in the analysis, but the statisticians don’t mention the effect which I think could lead to very spurious conclusions.
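
Krudler's hypothetical is easy to see with made-up numbers. Here's a minimal Python sketch, assuming a season split into an "early" and a "late" half. Every figure in it is invented for illustration; none of it is real Mets data. Both catchers allow exactly the same slugging within each half, yet the season-long line makes Schneider look worse simply because more of his at-bats come late.

```python
# HYPOTHETICAL data, invented for illustration only (not real Mets numbers).
# Each cell is (at_bats, total_bases) allowed with that catcher behind the plate.
# Within each period the two catchers are identical: .350 early, .450 late.
DATA = {
    "Castro":    {"early": (200, 70), "late": (60, 27)},
    "Schneider": {"early": (140, 49), "late": (300, 135)},
}

def slg(at_bats, total_bases):
    """Opponents' slugging percentage: total bases per at-bat."""
    return total_bases / at_bats

def pooled_slg(catcher):
    """Season-long slugging allowed, ignoring when the at-bats happened."""
    cells = DATA[catcher].values()
    at_bats = sum(ab for ab, _ in cells)
    total_bases = sum(tb for _, tb in cells)
    return slg(at_bats, total_bases)

def slg_by_period(catcher):
    """Slugging allowed within each period (the stratified view)."""
    return {period: slg(ab, tb) for period, (ab, tb) in DATA[catcher].items()}

for catcher in DATA:
    by_period = {p: round(v, 3) for p, v in slg_by_period(catcher).items()}
    print(catcher, "pooled:", round(pooled_slg(catcher), 3), "by period:", by_period)
```

Comparing the catchers within each period, rather than over the whole season, is one simple way to control for this kind of bias. Whether it would change the picture for the real numbers is something only the actual game logs could show.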

Seriously. Don’t F with the guy’s reputation if you can’t prove anything conclusively.

Oh, and Happy New Year, everyone!