- What are time series tables?
- How to access time series tables
- An example: Persons by Sex tables
- Integration methods
- Table coverage
- Table layouts
- Current table details
- Webinar
- Credits
- References
What are time series tables?
An NHGIS time series table links together comparable statistics from multiple U.S. censuses in one downloadable bundle. A table comprises one or more related time series, each of which describes a single summary statistic (e.g., the count of occupied housing units) measured at multiple times (e.g., each census year from 1970 to 2020) at selected geographic levels (e.g., states or counties).
The set of covered statistics, years, and geographic levels varies from table to table according to which categories are available, and at which levels, at different times. Tables may also differ in the method of geographic integration they use to align geographic units across time.
How to access time series tables
To browse and access time series tables, enter the NHGIS Data Finder, choose a data filter of interest (geographic levels, years, topics), and then click on the TIME SERIES TABLES tab to see which tables are available for the selected filters.
Users may request any set of time series tables, for any set of years and geographic levels covered by those tables, in any of three possible layouts. NHGIS then delivers data for all selected years, covering all areas for the requested geographic levels (excluding Puerto Rico data prior to 2010).
An example: Persons by Sex tables
NHGIS provides three Persons by Sex time series tables, each of which contains two time series: Persons: Male and Persons: Female. The three tables differ in the geographic levels and years they cover, and in the type of geographic integration they use.
Two Persons by Sex tables use nominal integration, aligning geographic units across time simply by unit name or code without regard to any changes in unit boundaries. The first nominally integrated table (code A08) provides state or county data for censuses back to 1820. The second table (AV0) provides data for states, counties, county subdivisions, places, or census tracts from censuses back to 1970 and from the American Community Survey for all 5-year periods back to 2006-2010.
The third Persons by Sex table (CM0) provides geographically standardized 1990, 2000, and 2010 data for ten geographic levels, ranging from states down to block groups, according to their 2010 definitions.
Integration methods
Linking census summary statistics across time requires two types of integration: attribute integration, ensuring that the measured characteristics in a time series are comparable across time, and geographic integration, ensuring that the areas summarized by time series are comparable across time.
- Attribute integration: To define time series tables, NHGIS researchers create metadata specifying sets of comparable statistics from various source datasets. In many instances, generating a single time series (e.g., Persons: Under 5 years) requires aggregating multiple source statistics (e.g., summing Males under age 5 and Females under age 5) to produce a comparable statistic across all years. NHGIS researchers specify any needed operations in the metadata, and the extract system completes the computations in advance… saving NHGIS users from another big data processing hassle!
NHGIS researchers execute attribute integration one tabulation type at a time, where a single "tabulation type" includes all tables that summarize a particular feature (e.g., persons, families, housing units, etc.) using a particular aggregation method (e.g., counts, medians, quotients, etc.) broken down into categories for a particular set of characteristics (e.g., sex, age, race, sex by age, etc.). Example tabulation types include: Persons by Sex by Age, Median Age by Sex, and Per Capita Income.
For each tabulation type, NHGIS researchers define time series tables to cover as many categories as possible for different sets of years and geographic levels. As a result, a single time series (e.g., Males under age 5) may appear in multiple tables, but each complete table is unique in the combination of categories, years and geographic levels it covers, or in the method of geographic integration it uses.
NHGIS assigns a consistent label and code to each complete time series, and the labels supply key information about changes in measured concepts. For example, to indicate precisely the timing and nature of a discrepancy in educational attainment categories, one time series is labeled 4 or more years of college (until 1980) or bachelor's degree or higher (since 1990).
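As a minimal sketch of the aggregation step described above, the following Python snippet sums two source statistics into one time series value per year. The field names, the metadata structure, and the counts are all invented for illustration; they are not NHGIS codes, NHGIS data, or the actual NHGIS metadata format.

```python
# Hypothetical source statistics for one geographic unit, keyed by year.
# All field names and counts are made up for illustration.
source_stats = {
    1990: {"male_under_5": 120, "female_under_5": 115},
    2000: {"male_under_5": 130, "female_under_5": 128},
}

# Hypothetical metadata: each time series lists the source fields to sum.
series_definitions = {
    "persons_under_5": ["male_under_5", "female_under_5"],
}


def build_time_series(stats_by_year, definitions):
    """Return {series_name: {year: value}} by summing the listed source fields."""
    result = {}
    for name, fields in definitions.items():
        result[name] = {
            year: sum(stats[field] for field in fields)
            for year, stats in stats_by_year.items()
        }
    return result


time_series = build_time_series(source_stats, series_definitions)
print(time_series["persons_under_5"])  # {1990: 235, 2000: 258}
```

Keeping the list of source fields in data rather than in code mirrors the idea of specifying operations in metadata: adding a new series means adding a definition, not writing new logic.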
- Geographic integration: NHGIS time series tables link geographic units across time in one of two ways, nominal integration or geographic standardization. The NHGIS Data Finder indicates which type of geographic integration each time series table uses. Users may select and download tables of one or both types. When a data request includes tables of different integration types, NHGIS delivers separate data files for each type.
- Nominally integrated tables link geographic units across time according to their names and codes, disregarding any changes in unit boundaries. The identified geographic units match those from each census source, so the spatial definitions and total number of units may vary from one time to another (e.g., a city may annex land, a tract may be split in two, a new county may be created from parts of others, etc.). The tables include data for a particular geographic unit only at times when the unit's name or code was in use, resulting in truncated time series for some areas.
Nominal integration is useful for:
- Mapping spatial patterns at different times, in which case it is appropriate to map the geographic units in use at each time
- Measuring changes in areas where boundaries were stable, as they are for most states and counties between censuses
- Studying changes in characteristics of places and county subdivisions according to their legal definitions, including annexations, etc.
Users should be cautious when interpreting changes in nominally integrated time series because a single unit code may refer to distinctly different areas at different times. We recommend that users of nominally integrated tables inspect NHGIS boundary files (which are available for most years and levels covered by time series tables) to identify any boundary changes in areas of interest.
- Geographically standardized tables provide data from multiple times for a single census's geographic units. At this time, NHGIS's standardized time series tables provide 1990, 2000, 2010, and 2020 data for 2010 census units.
To allocate 1990, 2000, and 2020 summary data to 2010 census units, NHGIS starts from the smallest source units for which data are available: census blocks. Where a source block intersects multiple 2010 units, NHGIS applies interpolation to estimate how the source block's characteristics are distributed among the 2010 units.
The primary interpolation method that NHGIS uses is "target-density weighting" (TDW) (Schroeder 2007). TDW assumes that characteristics within each source zone have a distribution proportional to the densities of another characteristic among target zones. For example, if a 2020 block intersects two 2010 blocks, one of which was 10 times as dense as the other in 2010, then TDW assumes that the same 10:1 ratio holds within the 2020 block in 2020.
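The 10:1 example above can be worked through in a few lines of Python. This is a simplified sketch of the weighting idea only, not NHGIS's production method: it assumes each piece of the source zone is weighted by its intersection area times the density of the target zone it falls in. The function name, the input structure, and the numbers are hypothetical.

```python
def tdw_allocate(source_count, pieces):
    """Split source_count among the target zones that a source zone overlaps.

    pieces: list of (target_id, intersection_area, target_density) tuples,
    one per overlapped target zone. Each piece is weighted by
    intersection_area * target_density, so the density ratios observed among
    target zones are assumed to hold inside the source zone.
    """
    weights = {target_id: area * density for target_id, area, density in pieces}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one overlapped target zone needs a positive density")
    return {
        target_id: source_count * weight / total_weight
        for target_id, weight in weights.items()
    }


# Invented example: a 2020 block with 110 persons overlaps two 2010 blocks
# in equal areas; block "A" was 10 times as dense as block "B" in 2010.
allocation = tdw_allocate(110, [("A", 1.0, 10.0), ("B", 1.0, 1.0)])
print(allocation)
```

With equal intersection areas, the 110 persons are split 10:1, so block A receives about 100 and block B about 10.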
The interpolation from 1990 and 2000 blocks to 2010 units involves some more advanced modeling as documented in these pages:
We provide the interpolation weights used to construct standardized time series in the NHGIS Geographic Crosswalks.
For each standardized statistic in a time series table, NHGIS supplies lower and upper bounds based on the spatial relationship between the source units and standard units. For example, if there are three 2000 census blocks that straddle a 2010 census unit's boundary, then it is possible that either all or none of the three blocks' 2000 residents were located in the 2010 unit. The upper bound assumes that all residents and housing units in straddling blocks were located in the 2010 unit, and the lower bound assumes that none were. Bounds for 1990 estimates also take account of additional uncertainty due to accuracy improvements in the Census Bureau's geographic data files, which make it impossible to determine exact spatial relationships between all 1990 and 2010 units.
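The bounds logic for straddling blocks can be sketched as follows. This is an illustrative simplification with invented counts: it covers only the inside/straddling distinction described above and ignores the additional 1990 uncertainty. The function name and the relation labels are hypothetical.

```python
def count_bounds(blocks):
    """Return (lower, upper) bounds for a target unit's count.

    blocks: list of (count, relation) tuples for source blocks that touch
    the target unit, where relation is "inside" (wholly within the unit)
    or "straddles" (crosses the unit's boundary).
    """
    # Blocks wholly inside the unit count toward both bounds.
    lower = sum(count for count, relation in blocks if relation == "inside")
    # The upper bound additionally assumes every straddling block's
    # residents were located in the unit.
    upper = lower + sum(count for count, relation in blocks if relation == "straddles")
    return lower, upper


# Invented example: two blocks inside the unit, three straddling its boundary.
blocks = [
    (50, "inside"),
    (30, "inside"),
    (12, "straddles"),
    (8, "straddles"),
    (5, "straddles"),
]
lower, upper = count_bounds(blocks)
print(lower, upper)  # 80 105
```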
NHGIS has not yet implemented standardization for non-count statistics such as medians and quotients. Therefore, currently available standardized tables supply only count statistics.
NHGIS delivers standardized statistics with two decimal digits of precision in order to reduce the size of rounding errors when users sum estimates. Rounding errors may still occur, but in most settings, these errors will be small and can be cleanly eliminated by rounding sums to integers.
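A small invented example shows why two decimal digits help. Suppose a source block of 100 persons is split evenly among three target units, and a user later sums the three estimates. The values below are illustrative, not NHGIS data.

```python
# Invented example: 100 persons split evenly among three target units,
# delivered with two decimal digits of precision.
two_decimal_estimates = [33.33, 33.33, 33.33]

# The same estimates if they had been rounded to integers before delivery.
integer_estimates = [round(value) for value in two_decimal_estimates]

# Summing the two-decimal values and rounding the sum recovers the total.
sum_from_decimals = round(sum(two_decimal_estimates))

# Summing pre-rounded integers loses one person.
sum_from_integers = sum(integer_estimates)

print(sum_from_decimals, sum_from_integers)  # 100 99
```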
Table coverage
Time series tables are designed to encompass as many different statistics as possible for a predetermined set of topics, years, and geographic levels:
- Topic coverage: Nominally integrated time series cover a variety of topics from both 100%-count short-form sources (sex, age, race, Hispanic or Latino origin, household and group quarters type, housing occupancy and tenure, etc.) and sample-based long-form sources (education, income, poverty, marital status, place of birth, etc.).
At this time, geographically standardized tables cover only statistics that were published for 1990, 2000, or 2020 census blocks, which are a subset of all 100%-count statistics from 1990 Summary Tape File 1, 2000 Summary File 1, and the 2020 P.L. 94-171 Redistricting Data Summary File.
We plan to extend the standardized table collection to include data for long-form subjects by interpolating from 1990 and 2000 "block group parts" (the smallest units for which the Census Bureau published long-form decennial census data) and from 2010 and 2020 block groups (the smallest units identified in ACS data). In preparation, we have already constructed crosswalks from 1990 and 2000 block group parts to 2010 units and between 2010 and 2020 block groups. We provide instructions on our crosswalks page for users who would like to produce their own standardized long-form data.
Time Series Tables by Tabulation Type and Geographic Integration Method

| Tabulation Type | Nominally Integrated Tables | Tables Standardized to 2010 |
| --- | --- | --- |
| Total Persons | 3 | 1 |
| Persons by Urban/Rural Status | 2 | 1 |
| Persons by Sex | 2 | 1 |
| Persons by Age | 5 | 5 |
| Median Age of Persons | 1 | 0 |
| Persons by Sex by Age | 4 | 4 |
| Median Age of Persons by Sex | 1 | 0 |
| Persons by Race | 10 | 3 |
| Persons by Hispanic or Latino Origin | 4 | 1 |
| Persons by Hispanic or Latino Origin by Race | 7 | 5 |
| Persons by Race by Sex | 3 | 2 |
| Persons by Race by Age | 29 | 13 |
| Persons by Race by Sex by Age | 14 | 12 |
| Persons by Hispanic or Latino Origin by Sex | 1 | 1 |
| Persons by Hispanic or Latino Origin by Age | 4 | 2 |
| Persons by Hispanic or Latino Origin by Sex by Age | 2 | 2 |
| Persons by Hispanic or Latino Origin by Race by Sex | 1 | 1 |
| Persons by Hispanic or Latino Origin by Race by Age | 3 | 2 |
| Persons by Hispanic or Latino Origin by Race by Sex by Age | 8 | 1 |
| Persons by Household, Family and Group Quarters Type | 7 | 5 |
| Persons by Household and Group Quarters Type by Sex | 5 | 4 |
| Persons by Household and Group Quarters Type by Age | 11 | 7 |
| Persons by Household and Group Quarters Type by Sex by Age | 9 | 4 |
| Total Households | 1 | 1 |
| Households by Household Type | 1 | 1 |
| Households by Household Type by Household Size | 2 | 2 |
| Total Families | 1 | 1 |
| Persons in Families | 2 | 1 |
| Families by Family Type by Presence and Age of Own Children | 2 | 1 |
| Persons by Household Type by Relationship to Householder | 10 | 2 |
| Persons by Household Type by Relationship to Householder by Age | 11 | 6 |
| Persons by Marital Status by Sex (by Age) | 7 | 0 |
| Persons by Nativity | 1 | 0 |
| Persons by Nativity by Place of Birth | 13 | 0 |
| Persons by Educational Attainment | 5 | 0 |
| Persons by Educational Attainment by Sex (by Age) | 7 | 0 |
| Persons by Labor Force, Employment and Armed Forces Status (by Age) | 5 | 0 |
| Persons by Labor Force, Employment and Armed Forces Status by Sex (by Age) | 6 | 0 |
| Workers by Means of Transportation to Work | 4 | 0 |
| Workers (by Means of Transportation to Work) by Travel Time to Work | 2 | 0 |
| Aggregate Travel Time to Work | 1 | 0 |
| Households by Income in Previous Year | 4 | 0 |
| Median Household Income in Previous Year | 1 | 0 |
| Families by Income in Previous Year | 4 | 0 |
| Median Family Income in Previous Year | 1 | 0 |
| Per Capita Income in Previous Year | 1 | 0 |
| Persons for Whom Poverty Status is Determined | 1 | 0 |
| Persons by Poverty Status in Previous Year | 2 | 0 |
| Persons by Ratio of Income to Poverty Level in Previous Year | 3 | 0 |
| Persons by Poverty Status in Previous Year by Age | 6 | 0 |
| Total Housing Units | 1 | 1 |
| Housing Units by Urban/Rural Status | 2 | 1 |
| Housing Units by Occupancy/Vacancy/Tenure | 6 | 3 |
| Persons by Housing Tenure | 2 | 1 |
| Occupied Housing Units (by Tenure) by Race of Householder | 7 | 5 |
| Occupied Housing Units (by Tenure) by Hispanic or Latino Origin of Householder | 4 | 2 |
| Occupied Housing Units (by Tenure) by Hispanic or Latino Origin of Householder by Race of Householder | 5 | 4 |
| Occupied Housing Units (by Tenure) by Household Size | 6 | 2 |

- Year coverage: For most tabulation types, nominally integrated tables cover all census years back to 1970 or 1980, stopping there because available 1970 data cover a much larger range of topics and geographic levels than do 1960 data. One Total Population table goes back to 1790 and one Persons by Sex table goes back to 1820.
Nominally integrated time series include data from the American Community Survey (ACS) for topics covered only in long-form census data (and therefore not in 2010 or later decennial census data) and for a selection of short-form subjects. We use 5-year ACS data rather than 1-year because the 5-year data have more complete geographic coverage, including census tract data.
Tables that include ACS data include data for all available 5-year periods back to 2006-2010. NHGIS also includes ACS summary tables for 2005-2009, the first 5-year period for which ACS summary data were published, but time series tables do not yet include that period due to some data integration challenges that pertain to that period and not later years.
The year coverage for individual tables also depends on the ranges of time for which different aggregate statistics are available, and on the method of geographic integration used. For example, the Persons by Race Combination tables extend only as far back as 2000 because the 2000 census was the first to tabulate such counts.
At this time, geographically standardized tables cover only 1990, 2000, 2010, and 2020 data standardized to 2010 census units.
- Geographic coverage: Nominally integrated tables provide data for up to 8 geographic levels: nation, regions, divisions, states, counties, census tracts, county subdivisions, and places.
Geographically standardized tables provide data for 10 geographic levels: states, counties, census tracts, block groups, county subdivisions, places, congressional districts (as defined for the 110th-112th Congresses, 2007-2013), core based statistical areas (using 2009 metropolitan and micropolitan statistical area definitions, as in 2010 Census Summary Files), urban areas, and ZIP Code Tabulation Areas (ZCTAs).
The geographic levels available for nominally integrated tables are restricted according to the availability of statistics for all years covered by the table. For example, because the 1970 census summary files do not provide statistics at the nation, region, or division levels, tables covering 1970 do not provide data for any of these levels. Similarly, tables that use data from 1990 Summary Tape Files 2 or 4 do not provide statistics for the place level because the place data in those summary files were restricted to larger places.
Data for Puerto Rico are available only in nominally integrated tables for years after 2000. (At this time, NHGIS includes no source table data for Puerto Rico from 2000 or earlier.)
Table layouts
Time series tables can be downloaded in one of three layouts:
- Time varies by column: Data for different times are placed in separate columns within one file. The rows correspond to geographic units, and the columns correspond to particular times within a time series. E.g., one column reports Median Age in 2000 and another column reports Median Age in 2010.
- Time varies by row: Data for different times are placed in separate rows within one file. Each row represents a single geographic unit at a single time (e.g., Alabama in 1990 or Alabama in 2000), and each column corresponds to a single time series.
- Time varies by file: Data for different times are placed in different files. Within each file, the rows correspond to geographic units, and each column corresponds to a single time series instance at a single time.
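To make the difference between the first two layouts concrete, the sketch below reshapes a small "time varies by column" table into the "time varies by row" layout. The column names (`median_age_2000`, etc.) and all values are invented for illustration; actual NHGIS files use their own column codes, which are not shown here.

```python
# Hypothetical "time varies by column" data: one row per geographic unit,
# one column per time series per year. Names and values are invented.
by_column = [
    {"unit": "Alabama", "median_age_2000": 35.0, "median_age_2010": 37.0},
    {"unit": "Alaska", "median_age_2000": 32.0, "median_age_2010": 33.0},
]


def to_time_by_row(rows, series, years):
    """Reshape so each output row is one geographic unit at one time."""
    reshaped = []
    for row in rows:
        for year in years:
            reshaped.append(
                {
                    "unit": row["unit"],
                    "year": year,
                    # One column per time series, shared across all years.
                    series: row[f"{series}_{year}"],
                }
            )
    return reshaped


by_row = to_time_by_row(by_column, "median_age", [2000, 2010])
for record in by_row:
    print(record)
```

The by-column layout tends to suit computing change within a unit (one column minus another), while the by-row layout suits tools that expect a single "year" field, such as grouping or plotting by time.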
Current table details
For each time series table, NHGIS provides a complete listing of table contents, coverage, and sources, along with notes describing any known comparability issues and links to relevant source documentation. These details can be accessed by clicking on a table name in the Data Finder. The complete set of all table details is also available here:
Webinar
In this one-hour webinar given on November 20, 2019, NHGIS staff member Jonathan Schroeder presents an overview of time series tables, including comparisons with alternative sources of standardized time series:
Credits
The initial definition, documentation, and dissemination of NHGIS time series tables were a central component of the Integrated Spatio-Temporal Aggregate Data Series (ISTADS) project at the Minnesota Population Center, with funding provided by the Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD) at the National Institutes of Health. Two current grants from the National Science Foundation and the NICHD support the geographic standardization and expansion of NHGIS time series tables.
References
- Schroeder, J. P. (2007). "Target-density weighting interpolation and uncertainty evaluation for temporal analysis of census data." Geographical Analysis 39(3), 311–335. http://dx.doi.org/10.1111/j.1538-4632.2007.00706.x