Negro League Data

Compiling Negro League Data

To the best of our knowledge, Retrosheet is the first source for game-level Negro League data on the Internet. The data presented here has been compiled by Retrosheet from original sources: newspaper game stories printed at the time of these games. For each season, Retrosheet volunteers have gone through newspapers, mostly those available online but in some cases by physically visiting libraries, and identified a set of games which involved two teams of major-league caliber players. A game file was then created for each game found, recording as much statistical information as could be identified.

Retrosheet has tried to be fairly liberal in its standards for what constitutes a "major-league" team and whether or not to include games. For example, Retrosheet has included a few teams which were not members of a formal Negro League, including the 1946 Cincinnati Crescents and the 1942 Cincinnati Clowns. Retrosheet has also sought to include all games played between "major-league" teams, including exhibition games. Retrosheet has sought to identify the "gametype" for all of the games it has collected, so researchers are free to exclude exhibition games from their analysis if they desire. And, of course, researchers are also free to exclude games played by (and against) the Cincinnati Crescents or any other team. Basically, Retrosheet's standard comes down to two questions: were the two teams "major-league", and do we know the score, the date, and the location? If the answer to both is yes, the game is included.
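
As a rough illustration of how a researcher might apply such exclusions, the Python sketch below filters a few made-up game rows. Only the 'gametype' field name comes from the description above. The other column names, the team codes, and the specific gametype values ('regular', 'exhibition') are assumptions for illustration and may not match the actual files.

```python
import csv
import io

# Hypothetical sample rows; only the 'gametype' column name is taken from the text.
SAMPLE = """date,hometeam,visteam,gametype
1946-05-05,AAA,BBB,regular
1946-05-12,AAA,CCC,exhibition
1946-05-19,BBB,CCC,regular
"""

def drop_games(rows, excluded_gametypes=frozenset(), excluded_teams=frozenset()):
    """Return the rows whose gametype and teams are not in the excluded sets."""
    kept = []
    for row in rows:
        if row["gametype"] in excluded_gametypes:
            continue
        if row["hometeam"] in excluded_teams or row["visteam"] in excluded_teams:
            continue
        kept.append(row)
    return kept

games = list(csv.DictReader(io.StringIO(SAMPLE)))
no_exhibitions = drop_games(games, excluded_gametypes={"exhibition"})
without_ccc = drop_games(games, excluded_teams={"CCC"})
print(len(no_exhibitions), len(without_ccc))
```

Keeping the exclusions as arguments, rather than hard-coding them, leaves the inclusion decision with the researcher, which is the intent described above.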

Presenting Negro League Data

Retrosheet's goal in presenting Negro League data is twofold. First, we want to provide our best estimate of what actually happened in the games on which we report. Second, we want to convey the extent to which the data we have collected so far may be uncertain.

The level of detail at which Negro League data can be determined varies widely across games, and in many cases the data "known" is highly uncertain. For example, for many games, we have no box score but may have a reference to the fact that a particular player had at least one hit in the game. To convey this uncertainty, teams and players are given three sets of statistical lines for each game in the data files which are available for download. These are identified within the .csv files by the variable 'stattype'. For each game, each player's record will include three 'stattypes': 'value' (Retrosheet's best estimate), 'lower' (a lower bound), and 'upper' (an upper bound).

As an example of the latter two of these, we may know that a pitcher was knocked out of the game in the 5th inning and that the opposing team scored 4 runs in the 5th inning. In this case, the lower and upper bound for the pitcher's innings pitched would be 4 and 4.2 (that is, 4 2/3 innings), respectively, and the lower and upper bound for the pitcher's runs allowed would be 0 and 4 (plus whatever we know the pitcher allowed in his first four innings pitched).
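
The arithmetic in this example can be sketched in Python. Both functions are hypothetical helpers written for illustration, not part of any Retrosheet tool. The sketch assumes the pitcher had completed every inning before the one in which he was removed, and it counts innings in outs so that the box-score notation "4.2" (four innings plus two outs) is not mistaken for a decimal.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings notation ('4', '4.1', '4.2') to outs recorded."""
    whole, _, frac = ip.partition(".")
    extra = int(frac) if frac else 0
    if extra not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip!r}")
    return int(whole) * 3 + extra

def knocked_out_bounds(inning: int, runs_in_inning: int, runs_before: int = 0):
    """Lower/upper bounds for a pitcher removed during `inning`.

    Assumes he completed all earlier innings and allowed `runs_before`
    runs in them (hypothetical helper for illustration only).
    """
    completed_outs = (inning - 1) * 3
    return {
        # He recorded between 0 and 2 outs in the inning he was removed.
        "outs": (completed_outs, completed_outs + 2),
        # Anywhere from none to all of that inning's runs may be his.
        "runs": (runs_before, runs_before + runs_in_inning),
    }

bounds = knocked_out_bounds(inning=5, runs_in_inning=4)
print(bounds)
```

Here the bounds on outs, 12 and 14, correspond to 4 and 4.2 innings pitched, and the bounds on runs are 0 and 4.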

In most cases where we have some information, Retrosheet has attempted to make its best estimate of player statistics and has assigned these totals to the stattype 'value'. For example, if a game story mentions a player getting one hit, we will generally assign a 'value' of one, while the 'upper' statistic reflects that the player could have had more hits. In cases of genuine uncertainty, however, a player's 'value' stats for a game will be left blank.
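
When reading these fields, a blank 'value' is best kept distinct from zero. The Python sketch below shows one way to do that, along with a simple sanity check. The check assumes that a known 'value' always lies between 'lower' and 'upper'; that ordering is implied by the description here but is an assumption, as are the function names.

```python
from typing import Optional

def parse_stat(cell: str) -> Optional[int]:
    """Return the cell as an int, or None when it is blank (unknown)."""
    cell = cell.strip()
    return int(cell) if cell else None

def is_consistent(lower: Optional[int], value: Optional[int],
                  upper: Optional[int]) -> bool:
    """Check lower <= value <= upper, skipping any figure that is unknown."""
    known = [x for x in (lower, value, upper) if x is not None]
    return all(a <= b for a, b in zip(known, known[1:]))

# A game story mentions one hit: 'value' is 1, and 'upper' allows for more.
# The upper bound of 5 is made up for this example.
hits_lower, hits_value, hits_upper = parse_stat("1"), parse_stat("1"), parse_stat("5")
print(is_consistent(hits_lower, hits_value, hits_upper))
```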

The game and player pages on our website report the 'value' figures, where they exist, and blanks otherwise. This is done for aesthetic reasons as much as anything. If people wish to work with the Negro League statistics compiled by Retrosheet, they are strongly encouraged to download our data and make their own decisions regarding which 'stattypes' - 'value', 'lower', or 'upper' - are most appropriate for their analysis.
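
For readers who do download the data, the following Python sketch shows one possible way to total a statistic separately for each 'stattype' while counting blank entries instead of treating them as zero. The layout (one row per player, game, and stattype), the column names other than 'stattype', and the sample figures are all assumptions made for illustration. The actual files may be organized differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: one line per game, player, and stattype.
SAMPLE = """gameid,player,stattype,hits
G1,P1,value,1
G1,P1,lower,1
G1,P1,upper,4
G2,P1,value,
G2,P1,lower,0
G2,P1,upper,5
"""

def totals_by_stattype(rows, stat):
    """Sum `stat` per stattype and count the blank cells skipped for each."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        cell = row[stat].strip()
        if cell:
            totals[row["stattype"]] += int(cell)
        else:
            missing[row["stattype"]] += 1
    return dict(totals), dict(missing)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
totals, missing = totals_by_stattype(rows, "hits")
print(totals, missing)
```

Reporting the count of blanks alongside each total makes it visible when a 'value' sum covers fewer games than the 'lower' and 'upper' sums.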

We hope you enjoy our Negro League presentation. Please do not hesitate to let us know if you find any errors, additional sources for any of the games which we present, or additional games which we may be missing.

Retrosheet website last updated December 6, 2023.
All data contained at this site is copyright 1996-2023 by Retrosheet. All Rights Reserved.

Send comments and suggestions to Tom Thress.
Join the Retrosheet discussion group, RetroList.
Retrosheet is an all-volunteer organization and a 501(c)(3) charitable organization. To volunteer, please e-mail Tom Thress.