Data Files Available for Download
To keep things simple, all Negro League data can be downloaded as a single zip file. This file is approximately 18 MB and can be downloaded here.
The centerpiece of the Negro League data is a set of .csv files that summarize game-level data for all 5,255 Negro League games for which Retrosheet has compiled data. There are five such .csv files:
- gameinfo.csv - contains game-level information such as teams, attendance, umpires, etc.
- teamstats.csv - contains team-level statistics - line scores, lineups, and team statistics (batting, pitching, fielding)
- batting.csv - batting statistics by player by game
- pitching.csv - pitching statistics by player by game
- fielding.csv - fielding statistics by player by position by game
The columns are labeled and should be mostly self-explanatory. If they are not, the columns are defined in the document context.txt, which is included in the zip file (and can be read here).
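As a rough sketch of how one of the five .csv files might be read straight out of the zip file, the Python below uses only the standard library. The helper name read_csv_from_zip is hypothetical, and two things are assumptions: the folder layout inside the zip (the helper therefore matches on the bare file name) and the UTF-8 encoding.

```python
import csv
import io
import posixpath
import zipfile


def read_csv_from_zip(zip_source, file_name):
    """Return the rows (as dicts) of the first zip member named file_name.

    Matching is done on the bare file name because the folder layout
    inside the zip is an assumption, not something documented here.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            if posixpath.basename(member) == file_name:
                with archive.open(member) as raw:
                    # Encoding is assumed to be UTF-8.
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    return list(csv.DictReader(text))
    raise FileNotFoundError(file_name)
```

A call such as read_csv_from_zip("negro_leagues.zip", "gameinfo.csv") would then return one dict per row, keyed by the column labels; the zip file name in that call is a placeholder for wherever the download was saved.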
The level of detail at which Negro League data can be determined is highly variable across games and the data "known" is highly uncertain in many cases (e.g., for many games, we have no box score but may, for example, have a reference to the fact that a particular player had at least one hit in the game). To attempt to convey this uncertainty in our data, teams and players are given three sets of statistical lines for each game. These are identified within the .csv files by the variable 'stattype'. For each game, each player's record will include three 'stattypes':
- stattype 'value' is Retrosheet's best estimate of the relevant statistical total
- stattype 'lower' is the lower bound on a player's total
- stattype 'upper' is the upper bound on a player's total
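One possible way to work with the three statistical lines is to group rows by 'stattype' before doing any arithmetic. The Python sketch below does this for rows already loaded as dicts. Only the 'stattype' column and its three values come from the description above; the function name, the 'player' and 'hits' column names, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def group_by_stattype(rows):
    """Group row dicts by their 'stattype' field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["stattype"]].append(row)
    return dict(groups)


# Made-up rows for illustration; 'player' and 'hits' are hypothetical column names.
sample_rows = [
    {"player": "example01", "stattype": "value", "hits": "1"},
    {"player": "example01", "stattype": "lower", "hits": "1"},
    {"player": "example01", "stattype": "upper", "hits": "4"},
]

by_type = group_by_stattype(sample_rows)
bounds = (int(by_type["lower"][0]["hits"]), int(by_type["upper"][0]["hits"]))
print(bounds)  # (1, 4)
```

Keeping the three groups separate avoids accidentally summing a best estimate together with its own bounds, which would triple-count every game.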
As an example of the latter two of these, we may know that a pitcher was knocked out of the game in the 5th inning and that the opposing team scored 4 runs in the 5th inning. In this case, the lower and upper bounds for the pitcher's innings pitched would be 4 and 4.2 (that is, 4 2/3 innings), respectively, and the lower and upper bounds for the pitcher's runs allowed would be 0 and 4 (plus whatever we know the pitcher allowed in his first four innings pitched).
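Because the digit after the decimal point in innings pitched counts outs rather than tenths, bounds such as 4 and 4.2 are easier to compare once converted to outs. The Python sketch below shows that conversion for the example above. The function name is hypothetical, and whether the .csv files store innings in this notation is an assumption that should be checked against the column definitions.

```python
def innings_to_outs(innings):
    """Convert baseball innings notation (e.g. '4.2' = 4 innings + 2 outs) to outs."""
    whole, _, partial = str(innings).partition(".")
    extra_outs = int(partial) if partial else 0
    if extra_outs not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {innings!r}")
    return int(whole) * 3 + extra_outs


lower_outs = innings_to_outs("4")    # pitcher retired no one in the 5th
upper_outs = innings_to_outs("4.2")  # pitcher recorded two outs in the 5th
print(lower_outs, upper_outs)  # 12 14
```

The two-out gap between the bounds reflects what is unknown: how many batters the pitcher retired in the 5th inning before leaving the game.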
In addition to these five files which aggregate all Negro League games, we also have compiled separate logs by team (subsets of teamstats.csv divided by team-season), by ballpark (subsets of gameinfo.csv) and by player (subsets of batting.csv, pitching.csv, and fielding.csv). For ballparks and players, these aggregate across all seasons.
In addition to these .csvs, Retrosheet has also compiled event files (.evx files) and box-score files (.ebx files) for games for which sufficient data is available. These are grouped by season.
Finally, the zip file here includes roster files for all teams for whom Retrosheet has compiled rosters as well as our master files for people (biofile.csv), ballparks (ballparks.csv), and teams (teams.csv). These files include data for all people, teams, and sites across all Retrosheet games, not just Negro League games.
Download All Negro League Data (18 MB)
Download All Retrosheet Data (234 MB)
Recipients of Retrosheet data are free to make any desired use of the information, including (but not limited to) selling it, giving it away, or producing a commercial product based upon the data. Retrosheet has one requirement for any such transfer of data or product development, which is that the following statement must appear prominently:
The information used here was obtained free of
charge from and is copyrighted by Retrosheet. Interested
parties may contact Retrosheet at 20 Sunset Rd.,
Newark, DE 19711.
Retrosheet makes no guarantees of accuracy for the information that is supplied. Much effort is expended to make our website as correct as possible, but Retrosheet shall not be held responsible for any consequences arising from the use of the material presented here. All information is subject to corrections as additional data are received. We are grateful to anyone who discovers discrepancies and we appreciate learning of the details.
Retrosheet website last updated December 6, 2023.
Send comments and suggestions to Tom Thress: tthress-ATsign-retrosheet.org.
All data contained at this site is copyright 1996-2023 by Retrosheet. All Rights Reserved. Click here for information about the use of Retrosheet data
Join the Retrosheet Discussion group here: RetroList
Retrosheet is an all-volunteer organization and a 501(c)(3) charitable organization. To volunteer, please e-mail Tom Thress. To make a donation, you can visit here: Donation Page