October 19, 2005

Bill James

In mid-2004, Rich Lederer at Baseball Analysts posted some highlights from all of the Bill James Abstracts. His 12-part series begins here with 1977. I have copies of the 1982-1988 books -- but not the homemade 1977-1981 editions -- so it's great to read at least some of the earlier material. (Tables of contents for the abstracts, along with links to other James writings, are here.)

Here are some quotes from James about what he does (and doesn't do).

From 1979:
I am a mechanic with numbers, tinkering with the records of baseball games to see how the machinery of the baseball offense works. I do not start with the numbers any more than a mechanic starts with a monkey wrench. I start with the game, with the things that I see and the things that people say there. And I ask, "Is it true? Can you validate it? Can you measure it? How does it fit in with the rest of the machinery?" And for those answers, I go to the record books.

What is remarkable to me is that I have so little company. Baseball keeps copious records, and people talk about them and argue about them and think about them a great deal. Why doesn't anybody use them? Why doesn't anybody say, in the face of this contention or that one, "Prove it. Baseball's got a million records and if that is true you can prove it, so prove it." Why do people argue about which catcher throws best, rather than figure the catchers' records against base-stealers? I really don't know.

From 1980:
A year ago I wrote in this letter that what I do does not have a name and cannot be explained in a sentence or two. Well, now I have given it a name: Sabermetrics, the first part to honor the acronym of the Society for American Baseball Research, the second part to indicate measurement. Sabermetrics is the mathematical and statistical analysis of baseball records.

From 1981:
1) Sportswriting draws on the available evidence, and forces conclusions by selecting and arranging that evidence so that it points in the direction desired. Sabermetrics introduces new evidence, previously unknown data derived from original source material.

2) Sportswriting designs its analysis to fit the situation being discussed; sabermetrics designs methods which would be applicable not only in the present case but in any other comparable situation. The sportswriter says this player is better than that one because this player had 20 more home runs, 10 more doubles, and 40 more walks and those things are more important than that player's 60 extra base hits and 31 extra stolen bases, and besides, there is always defense and if all else fails team leadership. If player C is introduced into this discussion, he is a whole new article. Sabermetrics puts into place formulas, schematic designs, or theories of relationship which could compare not only this player to that one, but to any player who might be introduced into the discussion.

3) Sportswriters characteristically begin their analysis with a position on an issue; sabermetrics begins with the issue itself. The most over-used form in journalism is the diatribe, the endless impassioned and quasi-logical pitches for the cause of the day--Mike Norris for the Cy Young Award, Rickey Henderson for MVP, Gil Hodges for the Hall of Fame, everybody for lower salaries and let's all line up against the DH. Sportswriting "analysis" is largely an adversary process, with the most successful sportswriter being the one who is the most effective advocate of his position. I personally, of course, have positions which I advocate occasionally, but sabermetrics by its nature is unemotional, non-committal. The sportswriter attempts to be a good lawyer; the sabermetrician, a fair judge.

For that reason, good sabermetrics respects the validity of all types of evidence, including that which is beyond the scope of statistical validation.
Bad sabermetrics attempts to end the discussion by saying that I have studied the issue and this is the answer. Good sabermetrics attempts to contribute to the discussion in such a way as to enable it to move forward on a ground of common understanding.

James published one collection of his writings -- This Time, Let's Not Eat The Bones (1989) -- but I think that a reprint of the first five abstracts would find a decent-sized audience. (Reading the comments at Rich's posts, you can see that this is by no means an original thought.) The stats and discussions would be between 25 and 30 years old, but it would be dirt cheap to put together and there are many seamheads who would snap it up at first sight.


allan said...

thanks ben. ... stupid spam.

Anonymous said...

Let me be the first to say here, having learned of Dale Sveum's departure, YIPEEEEEEE!!!! The Sox just improved by a couple games without having to do a thing.