- File format and geometry
- Geometry years
- Realigned boundaries
- Census Bureau TIGER/Line documentation
For an overview of the geographic units and years covered by NHGIS GIS files, see Data Availability.
File format and geometry
NHGIS provides its geometry files for geographic information systems (GIS) as shapefiles, a standard spatial data file format. The shapefile format was originally defined for use in Esri GIS applications, but the format has become an industry standard, and many GIS and mapping tools are able to read and write shapefile data.
Most NHGIS GIS files have polygon geometries representing the boundaries of census reporting areas.
NHGIS also supplies two types of point files:
- Centers of population for states, counties, tracts, and block groups
- Place points representing the functional centers of cities and villages
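As a rough illustration of the format, the sketch below reads the fixed 100-byte header at the start of a shapefile's main `.shp` file and reports whether the file holds polygons or points. The byte layout follows the published Esri shapefile specification. The function name `read_shp_header` and the returned dictionary keys are hypothetical, chosen only for this example. In practice most users would rely on a GIS library rather than parse the file by hand.

```python
import struct

# Shape type codes from the Esri shapefile specification (2D types only).
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def read_shp_header(header: bytes) -> dict:
    """Parse the 100-byte main-file header of a .shp file (hypothetical helper)."""
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    # Bytes 0-3: file code, big-endian, always 9994.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile main file")
    # Bytes 24-27: file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", header[24:28])
    # Bytes 28-35: version and shape type, little-endian.
    version, shape_code = struct.unpack("<ii", header[28:36])
    # Bytes 36-67: bounding box (xmin, ymin, xmax, ymax), little-endian doubles.
    bbox = struct.unpack("<4d", header[36:68])
    return {
        "version": version,
        "file_bytes": length_words * 2,
        "shape_type": SHAPE_TYPES.get(shape_code, f"other ({shape_code})"),
        "bbox": bbox,
    }
```

To use it on a downloaded file, open the `.shp` file in binary mode, read its first 100 bytes, and pass them to the function. A boundary file should report `Polygon`, and a center-of-population or place-point file should report `Point`.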
Geometry years
NHGIS generally identifies each GIS file by the survey year in which the file's represented areas were used for tabulations, which may differ from the vintage of the represented areas. For example:
- The 2012 boundary file for Core Based Statistical Areas (CBSAs) follows the official 2009 CBSA delineations, which are the delineations used in 2012 American Community Survey (ACS) tables.
- The 2010 and 2011 boundary files for Public Use Microdata Areas (PUMAs) identify 2000 PUMAs, which are the PUMAs used in 2010 and 2011 ACS tables.
This Census Bureau page identifies the vintages of geographic areas for each ACS survey year since 2009.
Note on 2009 census tracts and block groups:
To find the NHGIS boundary files for block groups and census tracts derived from 2009 TIGER/Line files, users should filter on the year 2000 in the Data Finder. NHGIS identifies these boundaries with 2000, not 2009, because they correspond to the boundaries of 2000 census units and are not completely consistent with the units identified in 2009 ACS tables. Most of the block group and census tract tables from the 2009 5-Year ACS Summary File correspond to the Census 2000 definitions, but according to ACS documentation, "in 19 counties from 8 different states, many of the census tracts and block groups used to tabulate and present the 2005-2009 ACS 5-year estimates are either those submitted to the Census Bureau for the 2010 Census, or a preliminary version of 2010 Census definitions." More information on these discrepancies, including a listing of affected counties, is available here.
Unfortunately, no TIGER/Line files represent the actual set of tracts and block groups identified in 2005-2009 ACS tables, so NHGIS does not provide boundary files for the "2009 vintage" of these units.
NHGIS boundary files are derived primarily from the U.S. Census Bureau's TIGER/Line files with numerous additions to represent historical (1790-1980) boundaries that do not appear in TIGER/Line files. For more recent boundary files (1990 or later), NHGIS typically makes only a few key changes to the TIGER/Line source:
- We merge files that the Census Bureau provides only for individual states or counties to produce new nationwide or statewide files
- We project the data into Esri's USA Contiguous Albers Equal Area Conic Projected Coordinate System
- We add a “GISJOIN” attribute field, which supplies standard identifiers that correspond to the “GISJOIN” identifiers in NHGIS data tables
- We rename files to use the NHGIS naming style and geographic-level codes
- We add NHGIS-specific metadata in the XML file that accompanies the shapefile
- Most substantially, we erase coastal water areas to produce polygons that terminate at the U.S. coasts and Great Lakes shores
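The role of the "GISJOIN" field can be sketched as a simple key-based join between a boundary file's attribute records and the rows of an NHGIS data table. This is a minimal sketch using plain dictionaries. The function name `join_on_gisjoin`, the identifier values, and the `NAME` and `TOTAL_POP` columns are hypothetical placeholders, not taken from any specific NHGIS file.

```python
def join_on_gisjoin(features, table_rows):
    """Attach data-table rows to boundary attribute records that share a GISJOIN.

    Both arguments are lists of dicts that each contain a "GISJOIN" key.
    Returns (joined_records, unmatched_ids).
    """
    # Index the data table by identifier for constant-time lookups.
    rows_by_id = {row["GISJOIN"]: row for row in table_rows}
    joined, unmatched = [], []
    for feature in features:
        row = rows_by_id.get(feature["GISJOIN"])
        if row is None:
            # Keep track of boundary units with no matching table row.
            unmatched.append(feature["GISJOIN"])
        else:
            joined.append({**feature, **row})
    return joined, unmatched


# Illustrative placeholder records (not real NHGIS data).
features = [
    {"GISJOIN": "G0100010", "NAME": "Example County A"},
    {"GISJOIN": "G0100030", "NAME": "Example County B"},
]
table_rows = [{"GISJOIN": "G0100010", "TOTAL_POP": 12345}]

joined, unmatched = join_on_gisjoin(features, table_rows)
```

Reporting the unmatched identifiers separately makes it easier to notice when a boundary file and a data table come from different years or geographic levels.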
1980 and earlier boundaries based on 2000 TIGER/Line files
Because the 2000 TIGER/Line files contain no identifiers for census areas from 1980 and earlier, NHGIS researchers obtained boundary definitions for those years by consulting other sources, including 1992 TIGER/Line data for 1980 census tracts; maps from printed census reports for 1910-1980 census tracts and other small areas; and the Map Guide to the U.S. Federal Censuses, 1790-1920, by William Thorndale and William Dollarhide (Genealogical Publishing Co., Baltimore, MD, 1987), for counties and states back to 1790. Where the historical boundaries follow 2000 TIGER/Line features, the original NHGIS boundary files re-use those TIGER/Line features. Elsewhere, NHGIS researchers digitized new boundaries. NHGIS boundary files based on these files are identified as "2000 TIGER/Line +" in the Basis column in the Select Data grid of the Data Finder.
1980 place and county subdivision boundaries
1980 boundaries for places and county subdivisions are derived from the U.S. Census Bureau's 1992 TIGER/Line files. NHGIS modified the TIGER/Line definitions only by erasing coastal water areas using the 2000 TIGER/Line coastal water definitions. NHGIS boundary files based on these files have a "1992 TIGER/Line +" Basis in the Data Finder.
2000 boundaries based on 2010 TIGER/Line files
NHGIS also provides 2000 boundaries derived from the 2010 TIGER/Line files, which have a "2010 TIGER/Line +" Basis in the Data Finder. For these, NHGIS modified the TIGER/Line definitions only by erasing 2010 coastal water areas. The 2000 boundaries derived from 2010 TIGER/Line will better align with 2010 and newer GIS boundary files.
2009 and later boundaries
NHGIS generally derived 2009 and later boundaries from the corresponding TIGER/Line release, as indicated by the Basis in the Data Finder. In each case, NHGIS modified the TIGER/Line geometry only by projecting the data and erasing coastal water areas. For the 2009 boundary files, NHGIS used 2010 TIGER/Line coastlines to erase coastal water areas.
Centers of population
NHGIS provides point files representing the 2000 and 2010 centers of population for states, counties, census tracts, and block groups. NHGIS derived these points from the U.S. Census Bureau's Centers of Population data files.
Each point represents the mean center of population within the corresponding area, computed as an average of census block locations, weighted by block population, using a simple spherical model of the Earth's surface. As described in the Census Bureau's Centers of Population Computation documentation: "The center of population is the point at which an imaginary, weightless, rigid, and flat (no elevation effects) surface representation of [a geographic area] would balance if weights of identical size were placed on it so that each weight represented the location [of] one person."
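The weighted average described above can be sketched as follows. This is a minimal illustration, not the Census Bureau's code: it assumes one common spherical formulation in which latitude is a plain population-weighted mean and longitude is additionally weighted by the cosine of latitude, to account for meridians converging toward the poles. The function name `mean_center` and the input layout are hypothetical.

```python
import math


def mean_center(blocks):
    """Population-weighted mean center of (lat_deg, lon_deg, population) tuples.

    Assumed formulation: latitude is a population-weighted mean, and
    longitude is weighted by population times cos(latitude).
    """
    sum_w = sum_w_lat = sum_w_cos = sum_w_lon_cos = 0.0
    for lat, lon, pop in blocks:
        cos_lat = math.cos(math.radians(lat))
        sum_w += pop
        sum_w_lat += pop * lat
        sum_w_cos += pop * cos_lat
        sum_w_lon_cos += pop * lon * cos_lat
    if sum_w == 0:
        raise ValueError("total population is zero; the center is undefined")
    return sum_w_lat / sum_w, sum_w_lon_cos / sum_w_cos


# Two hypothetical blocks on the same parallel; the more populous one pulls
# the center toward itself.
center_lat, center_lon = mean_center([(40.0, -100.0, 30), (40.0, -90.0, 10)])
```

For blocks that span only a small area, the cosine weights are nearly equal and the result is close to a simple weighted average of the coordinates.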
NHGIS undertook the following steps to generate its center-of-population shapefiles:
- Convert the Census Bureau's latitude/longitude coordinates to GIS features.
- Modify the feature attribute fields for consistency with other NHGIS files.
- Apply an equal-area conic projection for consistency with other NHGIS files.
- Attach metadata describing the file contents.
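The first and third steps above, turning latitude/longitude pairs into projected point coordinates, can be sketched with the forward equations of the Albers equal-area conic projection. This is a simplified sketch under several assumptions: it uses the spherical form of the projection rather than the ellipsoidal form a GIS would apply, an approximate authalic Earth radius, and default parameters (standard parallels 29.5 and 45.5, central meridian -96, latitude of origin 37.5) that reflect a common definition of the USA Contiguous Albers projection. Results will therefore differ somewhat from the coordinates in the NHGIS files. The function name `albers_forward` is hypothetical.

```python
import math


def albers_forward(lat_deg, lon_deg, lat0=37.5, lon0=-96.0,
                   sp1=29.5, sp2=45.5, radius=6371007.0):
    """Spherical Albers equal-area conic projection; returns (x, y) in meters.

    Default parameters are assumptions for illustration (see the text above).
    """
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)
    phi1, phi2 = math.radians(sp1), math.radians(sp2)

    # Cone constant and auxiliary constant from the two standard parallels.
    n = (math.sin(phi1) + math.sin(phi2)) / 2.0
    c = math.cos(phi1) ** 2 + 2.0 * n * math.sin(phi1)

    # Radial distance from the cone apex for the point and for the origin.
    rho = radius * math.sqrt(c - 2.0 * n * math.sin(phi)) / n
    rho0 = radius * math.sqrt(c - 2.0 * n * math.sin(phi0)) / n

    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


# A hypothetical center-of-population record converted to a projected point.
x, y = albers_forward(39.0, -94.5)
```

Points east of the central meridian get positive x values, and points north of the latitude of origin on the central meridian get positive y values.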
See the place points documentation page for complete information on the derivation of NHGIS place point files.
Realigned boundaries
Because the Census Bureau made major accuracy improvements to TIGER/Line features between the 2000 and 2008 TIGER/Line releases, the original NHGIS shapefiles based on 2000 TIGER/Line features are not comparable with newer TIGER/Line data. We therefore generated new 2008-based boundary files by systematically realigning the boundaries for tracts and counties to fit with 2008 TIGER/Line features, a process referred to as conflation. These conflated NHGIS boundary files are identified as "2008 TIGER/Line +" in the Basis column found in the Select Data grid.
The Census Bureau made additional improvements to TIGER/Line features after 2008, so the 2008 TIGER/Line-based files are not consistently comparable with 2010 and later TIGER/Line files. In general, most 2008-based boundaries align better than 2000-based boundaries with 2010 and later TIGER/Line files, but the 2008-based boundaries also include occasional gross inaccuracies.
For users who have no need to compare historical boundaries with boundaries from 2010 or later, we recommend using the original 2000-based NHGIS boundary files.
For users who do wish to compare or overlay historical boundaries with boundaries from 2010 or later, we recommend downloading and examining both the 2000- and 2008-based versions of historical boundaries in order to determine which is more suitable for your study area and analysis.
For users who wish to overlay 2000 boundaries with boundaries from 2010 or later, we recommend using the 2000 boundaries derived from 2010 TIGER/Line.
Census Bureau TIGER/Line documentation
- 1992 TIGER/Line
- 2000 TIGER/Line
- 2008 TIGER/Line
- 2009 TIGER/Line
- 2010 TIGER/Line
- 2011 TIGER/Line
- 2012 TIGER/Line
- 113th Congressional District (2013) TIGER/Line
- 2013 TIGER/Line
- 2014 TIGER/Line
- 2015 TIGER/Line
- 2016 TIGER/Line
- 2017 TIGER/Line