April 2, 2020

Sabermetrics Is The Revival Of An Old Discussion (Abolishing Fielding Percentage, 1877-1886)

[Draft Post, October 1, 2018]
The Newsletter of the Official Scoring Committee
Society for American Baseball Research (SABR)
Volume 2, Number 2, June 2017

Nineteenth Century Sabermetrics: Range Factor by Richard Hershberger

The National Association, the only professional league apart from the National League, in its convention of February 1879 voted to abolish the error column from the official scores of its games. What could possibly have motivated such a bizarre action? It turns out that this was a fairly mainstream idea at the time.

Baseball statistics underwent, as is well known, a revolution in the late 20th century, with the effects still being worked out. One common theme in the controversy (now mostly, and blessedly, past) was that the traditional baseball statistics—batting average, earned run average, and so forth—are straightforward, or even obvious metrics. Many on both sides in the debate agreed on this, while disagreeing whether this was a virtue or a failing.

The claim never really stood up to scrutiny. Batting average, the ne plus ultra of traditional stats, appears to be simplicity itself: Hits divided by At Bats. In reality a lot of complexity is hidden in those two terms. Hitting the ball and getting on base does not necessarily mean the batter got a Hit, nor does a turn at bat necessarily constitute an At Bat.

This observation merely scratches the surface. When we look closely at the history of the traditional statistics, they turn out to be the product of decades of discussion and experimentation by trial and error. They may seem obvious to us today, whether through long familiarity or simple hindsight. Either way, they were anything but obvious to even the closest observers of the day. The modern sabermetrics revolution is, it turns out, not a new phenomenon after all. It only seems that way because the discussion and experimentation had died down through much of the 20th century. Sabermetrics is not a new discussion. It is a revival of an old one.

Stew Thornley, our committee chair, has graciously invited me to write on the abstruse topic of scoring in the 19th century. My aim is to show the discussions and experiments that eventually led to the traditional statistics. My hope is that this will be a series, however irregular, and dependent upon your patience and tolerance. I begin in 1879 in medias res to show a path considered, but not taken: the elimination of the error from scoring, and a surprising end to that path: the invention in 1886 of Range Factor.

The error was already an old stat by 1879. It went back to the 1850s, as a disapproving cluck of the tongue at the errant fielder. The 1860s saw the rise of the base hit as a stat. The error came into prominence with it, taking on a new and important role of ensuring that the batter did not get any undeserved credit. The newly prominent error was then in the early 1870s reapplied to fielding in a more systematic manner, resulting in the Fielding Average.

It was with Fielding Average that problems arose. Fielding Average was early recognized as an imperfect tool. Here is a discussion that nicely states the problem:
The sharp bounder between first and second base, that Gerhardt or Dunlap would field in a majority of cases, would be a safe hit were some other player on second base. The question then arises whether it is justice to Gerhardt or Dunlap to charge them with an error when they fail to stop such balls, while a lazy or indifferent second baseman allowed them to be scored as base hits by making no effort to stop them. The same is true of every other in-field position. A hard hit grounder past third base may, by the exercise of great agility, be stopped and thrown to first base in time to retire the batsman. The fielder gets credit for an assist only, no matter though he make the brilliant play a half dozen consecutive times. The seventh time he fails and is charged with an error, while a less agile baseman would fail to make an error, even, and the seven batsmen would score base hits. This is a manifest injustice. Base hits should depend upon the merits of the batsman, not upon the demerits of the fielder. If the league managers can frame any rule to rectify this error, they should do so. (Detroit Free Press, October 18, 1881)
The writer's challenge for a rule to rectify the problem went unmet until, a century later, Bill James invented Range Factor. There also was an early recognition of the subjectivity of scoring errors, and its susceptibility to homerism:
A pitcher would be charged with earned runs and base-hits against him by one scorer, while another would charge the field with the errors, thereby relieving the pitcher. In fact, this error business is ... ill defined in its rules ... (New York Clipper, February 8, 1879)

Some action should be taken in regard to official scorers. They are appointed by the club managers, and are generally, no doubt, moral young men, who want to secure a dead-head ticket to the games; but ... [i]n plain words, official scorers are liable to stretch their elastic consciences in favor of their home club, and will continue to do so until there are some fixed and definite rules for their guidance. (Detroit Free Press, October 18, 1881)
For all the failings of Fielding Average, and the error tabulation underlying it, it was the best fielding metric they had. It was, absent anything better, generally considered the best way to assess a fielder, and negotiate pay accordingly. People respond to incentives. Some players adopted the simple stratagem of only fielding balls they were sure they could handle. These were known as "record players" and widely condemned, even as the incentives to record playing remained in place. The solution is to change the incentives—to create statistics that better reflect team play. Many innovations, such as scoring sacrifice hits, had improving incentives as the underlying goal. This was the background to the proposal to eliminate errors. Here is an early proposal for the elimination of the error:
The abolition of the error columns. Bold, daring fielding on the part of every fielder would liven up the game twenty per cent. Base ball patrons will remember how, at times, a remarkable play by a fielder in taking great chances has enthused the spectators and given vim to the sport. But, with the error column staring them in the face, alas, most players take but few wide chances. Nothing is so disgusting to a crowd of lookers-on as to see a player shirk a difficult play when it is patent to all that he feared there were too many chances for an error against him to induce him to attempt the play. With no error record to go against him no chance would be slighted by a player, for then he would have every thing to gain if he made the play and nothing to lose if he failed. Give him the benefit of his assists and put-outs as usual, but demolish that demoralizing factor, the error column. (Cincinnati Enquirer, December 4, 1877)
The idea was a regular topic of discussion among scoring aficionados. In the end nothing came of it. The National Association was well into its tailspin into oblivion and already was irrelevant. Newspaper reports of its games often followed the traditional practice and included errors.

The National League seems never to have seriously considered the idea. The idea popped up from time to time through the 1880s, but had acquired the status of "old chestnut"—a theoretical notion to be chewed on in the winter months, but not a practical proposal.

I wrote earlier that Fielding Average was the best measure available until Bill James invented Range Factor. This is not quite true. The flamboyant and contentious sportswriter O. P. Caylor was the leading advocate of abolishing the error. Here he runs the idea up the flagpole in 1886. His proposal doesn’t stop with eliminating the error. He has a positive proposal for its replacement:
Do you ask what I would have instead of the error column? This: I would give every fielder credit for all he did—every assist and every put-out—without recording his failures. Then every fielder would be interested in taking every chance, however desperate, without fear of loss by doing so. I would then make out the players' averages by the number of assists and put-outs he had, divided by the number of games he played, and compare every man's record only with the record of the other fellows of his position. (The Sporting Life, February 3, 1886)
Put-outs plus Assists, divided by Games played: this is Range Factor, invented by O. P. Caylor in 1886. Caylor in 1877 was the baseball writer for the Cincinnati Enquirer. It is entirely likely that he wrote the proposal to eliminate errors previously quoted. His thinking had progressed over the ensuing decade. In 1886 he had a flash of genius. Sadly, the idea was both after its time and ahead of it. Had he worked out Range Factor in 1877, when the idea of eliminating errors was a viable proposal, Range Factor might have been adopted and grown beloved over the years, finding its way onto the back of baseball cards. As it was, the idea disappeared almost as soon as it appeared, not to be seen again for nearly a century.
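To make the contrast concrete, here is a minimal sketch of Caylor's calculation next to Fielding Average. It assumes the usual modern definition of Fielding Average (put-outs plus assists, divided by put-outs plus assists plus errors). The two fielders and all of their season totals are invented purely for illustration and are not drawn from any historical record.

```python
from dataclasses import dataclass


@dataclass
class FieldingLine:
    """One fielder's season totals (hypothetical figures, for illustration only)."""
    name: str
    games: int
    putouts: int
    assists: int
    errors: int


def range_factor(line: FieldingLine) -> float:
    """Caylor's 1886 proposal: (put-outs + assists) / games played."""
    return (line.putouts + line.assists) / line.games


def fielding_average(line: FieldingLine) -> float:
    """Chances handled cleanly / total chances, with errors counted as chances."""
    chances = line.putouts + line.assists + line.errors
    return (line.putouts + line.assists) / chances


# An agile fielder who tries for everything, and a "record player" who only
# attempts the sure plays. Both sets of numbers are made up.
agile = FieldingLine("agile second baseman", games=100, putouts=300, assists=350, errors=50)
cautious = FieldingLine("record player", games=100, putouts=250, assists=270, errors=20)

for fielder in (agile, cautious):
    print(
        f"{fielder.name}: "
        f"fielding average {fielding_average(fielder):.3f}, "
        f"range factor {range_factor(fielder):.2f}"
    )
```

With these invented totals the record player comes out ahead on Fielding Average, while the agile fielder comes out ahead on Range Factor, which is exactly the reversal of incentives Caylor was after.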

Here we have a road not taken. While a missed opportunity, it shows the scope and the sophistication of the discussions. In later installments I will look at the roads that were taken and why.
Richard Hershberger is the author of Strike Four: The Evolution of Baseball (Introduction by John Thorn).


Paul Hickman said...

Roads taken - it is fascinating how rules & stats evolved in a lot of sports - ethics, morality & effort ! Concepts that changed themselves over the years .....

Was thinking today about 1918 & the Pandemic & presume it started in America after Sox win on Sept 11 think it was from memory ?

allan said...

No, it was going on well before the World Series. My book includes info from early summer in Kansas. (The three US studies below came out after the book.)


United States
"There have been statements that the epidemic originated in the United States. Historian Alfred W. Crosby stated in 2003 that the flu originated in Kansas, and popular author John M. Barry described a January 1918 outbreak in Haskell County, Kansas, as the point of origin in his 2004 article. A 2018 study of tissue slides and medical reports ... found evidence against the disease originating from Kansas as those cases were milder and had fewer deaths compared to the situation in New York City in the same time period. The study did find evidence through ... the virus likely had a North American origin, though it was not conclusive. In addition, the haemagglutinin glycoproteins of the virus suggest that it was around far prior to 1918 and other studies suggest that the reassortment of the H1N1 virus likely occurred in or around 1915."

United Kingdom
"The major UK troop staging and hospital camp in √Čtaples in France has been theorized by researchers as being at the center of the Spanish flu. The research was published in 1999 ... In late 1917, military pathologists reported the onset of a new disease with high mortality that they later recognized as the flu. The overcrowded camp and hospital was an ideal site for the spreading of a respiratory virus. The hospital treated thousands of victims of chemical attacks, and other casualties of war, and 100,000 soldiers passed through the camp every day. It also was home to a piggery, and poultry was regularly brought in for food supplies from surrounding villages. Oxford and his team postulated that a significant precursor virus, harbored in birds, mutated and then migrated to pigs kept near the front. A report published in 2016 ... found evidence that the 1918 virus had been circulating in the European armies for months and possibly years before the 1918 pandemic."

allan said...

The 1918 pandemic was called the "Spanish flu" because during World War I, there was a lot of censorship of newspapers in the US, England, Germany, and France. However, Spain stayed out of the war and its newspapers reported freely on the epidemic, supposedly creating the false impression of Spain as especially hard hit and earning it the name "Spanish flu".

Paul Hickman said...

My point is really about the difference between now & 1918 - fairly stark !

Back then there were no Aeroplanes & Mass Transit & Communication like there are now.

So the Pandemic will kind of "inch" its way around the world on each train & ship etc.

That's why it took several years back then & this thing has taken less than 6-8 weeks.

It's not the "Chinese virus" , it is the "get in a plane & fly around the World" virus !

If it had been "bad enough" in Boston & Chicago at the time, then it's possible other arrangements may have occurred ?

Mind you given the Wartime Attitude of the day, it was probably more a case of "power on through it", we started it, so let's finish it ? The Owners had more power then & the players way less & Occupational Health & Safety was a novel concept that hadn't occurred to anyone yet !

Also, remember it was the Second Wave in that Pandemic that killed the most people !

Given the fact that it started in Kansas in January 1918, it is quite possible it didn't actually "arrive" in either Boston or Chicago in a "major way", until around or after September 1918 ? The Fall is always referred to .......

Most information about 1918 talks of the War & doesn't really mention the Influenza much until the following year, possibly because History is always written after the fact ? And of course we all know what happened in 1919 with The Black Sox !

Obviously we were not around back then, but it would be interesting to find out what effects the Pandemic had on the 1918, 1919, 1920 season ?

Did teams "go on the road" so to speak, to avoid it ? Or perhaps play at different Parks or on different days or times ? etc.

I have read that, unsurprisingly, Attendances fell rather substantially & that 1918 Series was noted for that too.

Read these 2 bits :



That book was published a week ago !

Reasonable to presume it has been "in the works" for quite a while ? So Serendipity ain't the word ...... scarcely believable timing !!!!!!!!!!!

I am starting to hear a far more consistent line of :

No Vaccine = No Sport !

The word is 6-12-18 months, although 3 years has been mentioned too ........

Am beginning to think 2020 is long long gone Globally & it is more about 2021 & if I was the Tokyo Olympics Organisers I wouldn't be making too many concrete plans just yet - it may end up being Tokyo 2022 !!!!!!!!

Paul Hickman said...

Oh I forgot to mention,

Re your yearly W-L Contest :

I am going to win !

Because I am tipping The Red Sox will go 0-0 !!!!!!

Yahoo , I win a Book ! HaHa !