Friday, April 07, 2006

Accounting for Innings Pitched

Below is a series of questions from Rex Hamann that I think will be of interest to many researchers. I will try to answer each one as it comes. First, though, let me say that what Rex is doing is how many of us got started in baseball research; i.e., filling in the blanks that history has passed down to us, whether it is adding Innings Pitched columns, like Rex, or Extra Base Hits, or compiling a whole season of stats that never existed.

I use Marshall Wright's American Association roster book as a research tool for the writing I do on the American Association. Because he did not include the quantity of Innings Pitched for the 1903 season, I embarked upon a project to do so. I am 99 percent through with this project, but there are loose ends I'd like to take care of. The process has given me a strong sense of familiarity with the pitchers of that season in the American Association, and it's an experience I consider to be invaluable.

Rex is absolutely right. One never gets closer to players than when one compiles seasonal averages. Building those averages, brick by brick, day by day— that gives you a real feel for the season.

But I wonder about the credibility of my own information. For example, in a book called The Minor League Register, published in 1994, edited by Miles Wolff, they list Charles Chech as pitching 326 innings in 1903. The total I come up with is 304. How can it be that, after such a detailed examination as I have performed, these numbers should be so far off?

The discrepancy, as stated above, is large, and one should go back and check games pitched first: do they match up, or did you miss a couple of games? Rex did not mention how he compiled his stats, but I would assume he used a spreadsheet of some sort, so it would be easy to check that.

By the way, I would love to know where Mr. Wolff was able to find the innings information for the pitcher Chech. It would mean that there is a source I could have used other than to have generated the data on my own. But then, would that data have been trustworthy? I'm finding that it's not always possible to trust the data. At which point do we draw the line in trusting the data, especially for these earlier records?

I did a little digging and found that The Minor League Register picked up the record of Charlie Chech from the SABR publication Minor League Stars, Volume III. Looking over the contributors to that volume, it appears that the work on Chech was probably done by Ralph LinWeber, a prominent American Association researcher and expert who also lived in Minnesota.

My final question is this, at least with respect to the pitcher Chech: Should I go over my work to see how I might be able to account for the missing 22 innings, or should I realize that the 326 innings figure may be in error and to have faith in my own work as possibly setting a new standard of accuracy?

If games pitched are all accounted for, then at some point you’ll have to go back and satisfy your own mind. Let me give you an example of what I found with my own work, and why I now use a form of double entry bookkeeping when I compile averages. (I don’t think anybody else, save Bob Tiemann, does so when compiling averages.)

Not long after I did my first project, the 1903 Pacific Coast League, I decided that it would be interesting to see which were the best hitting parks in the league. So I decided to recompile all the averages for each day for each park. That is, I would take the totals for both teams for each day and sum them up. If Portland had 38 AB and Sacramento had 40 AB on July 23, the total would be 78 AB for that game played at Vaughn Street Park. I did that for every game for the whole season. When I was finished, I was shocked at what I found, even though I had been so sure of my work before that. For one example, the Innings Pitched total for all pitchers was 11,225.3, but the Innings Pitched at all the parks came to 11,191.0. I was also off by five strikeouts and four walks, which I suppose over the course of a long PCL season is not much, but it has always bothered me. (Pete Palmer once told me that when the American League came up with discrepancies at the end of the season, their statistician would fudge the numbers to make them balance.)
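For anyone who would rather do the park tally in code than in a stat program, here is a minimal sketch of the idea in Python. It is only an illustration, not the program I used: the record layout and the function name `park_totals` are made up for the example, and apart from the Portland and Sacramento line quoted above, the numbers are invented.

```python
from collections import defaultdict

# Hypothetical team lines: (date, park, team, at_bats).
# Only the July 23 figures come from the example in the text;
# the July 24 figures are invented for illustration.
team_lines = [
    ("1903-07-23", "Vaughn Street Park", "Portland", 38),
    ("1903-07-23", "Vaughn Street Park", "Sacramento", 40),
    ("1903-07-24", "Vaughn Street Park", "Portland", 35),
    ("1903-07-24", "Vaughn Street Park", "Sacramento", 33),
]


def park_totals(lines):
    """Sum at-bats for both teams, per game and per park."""
    by_game = defaultdict(int)
    by_park = defaultdict(int)
    for date, park, _team, at_bats in lines:
        by_game[(date, park)] += at_bats   # both teams combined, one game
        by_park[park] += at_bats           # running season total for the park
    return dict(by_game), dict(by_park)


by_game, by_park = park_totals(team_lines)
print(by_game[("1903-07-23", "Vaughn Street Park")])  # 38 + 40 = 78
print(by_park["Vaughn Street Park"])
```

The same loop works for any counting stat; at-bats are used here only because that is the example given above.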

After seeing that, I decided that I had to come up with a system to cut down on those discrepancies as much as humanly possible. The system I came up with is as old as the hills. I detailed it in an earlier post about compiling averages for the 1918 PCL season, but will restate it here.

What I do is compile a week’s worth of stats (I have been using a stat-compiling program, StatTrak from All-Pro software, since it was a DOS shareware program called Soffballs), then go back and compile all the same stats for each game, treating each game as if it were one single player. This is the same concept behind double entry bookkeeping. Then I go back and try to resolve any discrepancies between the two sets of data, and believe me, more often than not there is something that does not jibe. Once this is done, I go on to the next week. Entering the park part of the data is not as time-consuming as it may first appear. It’s only one number, not 18 or more numbers as with players, and takes me no more than a half-hour to do the whole week of park data. (At first, I tried doing a month’s worth of data, but resolving errors took me too long, so I scaled back to a week’s worth of data, and that seems about right.)
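The weekly check described above can be sketched in a few lines of Python. This is a rough illustration under my own assumptions, not how StatTrak works: the function names `sum_stats` and `reconcile` are hypothetical, the week of data is invented, and innings are stored as outs recorded (three per inning) so that thirds of an inning add up as whole numbers.

```python
def sum_stats(rows):
    """Add up every stat across a list of {stat: value} rows."""
    totals = {}
    for row in rows:
        for stat, value in row.items():
            totals[stat] = totals.get(stat, 0) + value
    return totals


def reconcile(player_totals, game_totals):
    """Return {stat: (player_side, game_side)} for each stat that disagrees."""
    stats = sorted(set(player_totals) | set(game_totals))
    return {
        stat: (player_totals.get(stat, 0), game_totals.get(stat, 0))
        for stat in stats
        if player_totals.get(stat, 0) != game_totals.get(stat, 0)
    }


# Invented week: first entry, one row per pitcher.
player_rows = [
    {"outs": 27, "SO": 5, "BB": 2},
    {"outs": 24, "SO": 3, "BB": 4},
    {"outs": 3, "SO": 1, "BB": 0},
]
# Second entry, one row per game, each game treated as a single "player".
game_rows = [
    {"outs": 27, "SO": 5, "BB": 2},
    {"outs": 27, "SO": 3, "BB": 4},
]

mismatches = reconcile(sum_stats(player_rows), sum_stats(game_rows))
print(mismatches)  # {'SO': (9, 8)}: one strikeout was missed on the game side
```

An empty result means the two entries balance for the week; anything else tells you which stat to go back and recheck in the box scores before moving on.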

What I am doing, in essence, is counting the same numbers in a different sequence, nothing more. However, I get the added benefit of winding up with Park Data for my effort.

Rex, I believe you should redo your data as outlined above, because you’ll either be pleasantly surprised at how good and how careful you are, or thankful that you didn’t publish your data with all those errors in it!

Not for nothing did they invent double entry bookkeeping!

Rex Hamann
14201 Crosstown Blvd. NW
Andover, Minnesota 55304

The American Association Almanac
A Baseball History Journal (1902-1952)
www.AmericanAssociationAlmanac.com
Subscriptions available...Be the first on your block!

By the way, everything I’ve seen done by Rex is first rate!
