
RESEARCH BRIEF Children  |   April 2024

CONTRIBUTING FACTORS OF THE 2020 U.S. CENSUS CHILDREN AND YOUNG CHILDREN UNDERCOUNT IN TEXAS

In partnership with Dr. Bill O’Hare from Count All Kids, this research report identifies the factors contributing to the 2020 Census undercount of Texas children and young children.

By: Dr. Francisco A. Castellanos-Sosa, Texas Census Institute, Senior Research Associate
Dr. William P. O’Hare, Count All Kids Campaign, Consultant


The Children’s Census Initiative

The Texas Census Institute created the Children’s Census Initiative to improve the accuracy with which the 2030 Census will count Texas children. The initiative comprises five related parts, each tackling a specific aspect of the child undercount to enable thorough analysis and informed decision-making. The first product of this initiative was a descriptive overview of net child undercount in Texas counties and regions.1 The second product served as a detailed overview of the counties with the highest numbers and rates of net child undercount.2 The third product described the net undercount of young children in Texas counties and regions. This fourth product studies the factors contributing to the undercount of Texas children and young children. The fifth product will measure the funding implications of the undercount and their short- and long-term effects on topics relevant to children and young children. This initiative will offer valuable insights and recommendations for addressing the U.S. Census child undercount and will give stakeholders the knowledge needed for effective decision-making and action.


Research Overview

The 2020 Census showed the largest undercount rate ever for young children (5.4%) and a 2.1% undercount rate for all children. Previous research indicated Texas had a 7.9% undercount rate for young children and 2.1% for all children. This study uses correlation and regression analysis to examine factors influencing undercount rates across Texas counties. Results indicate four factors affecting undercount rates for children and young children, with two factors having the most significant impact (non-geocoded addresses and the proportion of SNAP-eligible people who are young children). Additionally, ten factors influence children’s undercount rates only, and fourteen factors affect young children’s rates only.

Main Findings

Two factors have the largest effect on the undercount rate of both children and young children: the share of addresses not geocoded by the U.S. Census Bureau and the share of SNAP-eligible people who are young children.
Through different variables, higher levels of poverty and low income show a significant association with higher levels of undercount rates of children and young children.
The share of children that are not biologically related to, adopted by, or a stepchild of householders is related to higher levels of undercount rates of children.
The higher the share of young children with at least one foreign-born parent, the higher the undercount rate of young children.
Higher shares of vacant households and rented households with 7+ people are related to higher undercount rates of children.

Pros

This report uses estimates of the undercount of young children that are highly accurate at the county level.
This report uses an extensive array of 182 variables, a large increase over the most comparable previous study, which used 40.
The data required for this methodology is publicly available and independently verified.
This methodology avoids multicollinearity, that is, the inclusion of highly correlated variables, both within and between each of six dimensions: Socioeconomic Status, Race and Origin, Family Structure and Living Arrangements, Age Characteristics, Housing, and Census-and Geography-Related Features.

Cons

The exclusion of variables due to their high correlation with others within and between dimensions might hide additional angles of the reality behind the undercount of children and young children.
Although the 182 variables were not chosen arbitrarily, other relevant variables may have been unintentionally left out.
The results are not accompanied by ethnographic or on-the-ground research, which could help explain the results in greater detail.

Abstract

The 2020 Census undercount rate of 2.1% for all children (ages 0 to 17) was its largest ever, as was the 5.4% undercount rate for young children (ages 0 to 4). Previous research suggests the undercount in Texas is as high or higher: 2.1% for all children and 7.9% for young children; however, little is known about the factors contributing to variation in children’s undercount rates across Texas counties. This study analyzes the potential explanatory variables related to the variation in undercount rates for children and young children in Texas counties. Correlation and stepwise regression analysis uncover potential contributing factors related to child undercounts. The results suggest four factors contribute to the variation of both the undercount rate of young children and all children. Of these, two factors have the largest effect on the undercount rates of both groups of children (share of addresses not geocoded by the U.S. Census Bureau and share of SNAP-eligible people who are young children). We also find ten factors related only to the variation in children’s coverage rates and fourteen factors related only to the variation in the rates of young children.

1. Introduction

The U.S. Decennial Census is the backbone of our national statistical system, and the data from the census have many important uses. One of their most important uses, in terms of the economic impact, is to distribute federal funds. Villa Ross (2023, p. 2) recently found that at least “…353 assistance listings used Decennial Census Programs data in whole or in part to distribute more than $2.8 trillion in funds during fiscal year 2021.”[1] A recent study by the Project on Government Oversight (POGO) estimates that more than $150.3 billion in federal funds were distributed to Texas in Fiscal Year 2020 through 338 programs that use census-derived data to geographically allocate funds (Project on Government Oversight, 2023). The fiscal implications demonstrate the importance of census accuracy. Cities, towns, and counties which are undercounted in the census do not receive their fair share of government assistance.
Some groups experience net census undercounts and overcounts.[2] In other words, people in some areas might not be counted or might be double-counted, creating a net error that can be positive (overcount) or negative (undercount). One of the groups historically undercounted at a high rate is children (ages 0 to 17). In particular, young children (ages 0 to 4) have a very high net undercount rate, which has increased since 1980. Nationwide in 2020, children experienced an undercount rate of 2.1%, and young children experienced an undercount rate of 5.4% (Jensen, 2022; Jensen & Kennel, 2022). Despite the considerable and persistent undercount of children and young children in the United States, there has been no consensus on the factors contributing to this phenomenon.
This study examines the critical issue of undercounting children and young children in the U.S. Decennial Census by identifying factors contributing to the variation in coverage errors for children in Texas counties. The significance of this research lies in its potential to enhance the accuracy of future censuses, particularly the planning and execution of the 2030 Census.
In this study, we focus on two different populations of children. We examine the population of all children defined here as those from birth to age 17. In the Census Bureau’s terminology, this is 0 to 17. The second population we examine is young children, defined here as those from birth to age 4. In the Census Bureau’s terminology, this is 0 to 4.
This paper identifies potential factors contributing to variation in young children’s 2020 Census undercount rates in Texas counties. By doing so, it expands the work of O’Hare et al. (2019) in at least three ways. First, we use a larger set of potential causal variables. Studying more variables allows us to better understand which factors contribute most to child and young child undercount rates. Second, we use all counties in Texas regardless of population size. The O’Hare study only included larger counties. Third, we use the latest data available and provide an updated set of results compared to O’Hare et al.’s study.
This study is relevant for at least two reasons. First, the contributing factors identified in this study can be used to target outreach and resources during the planning and implementation of the 2030 Census to improve the count of children and young children. Second, data analysts and researchers, particularly those with local knowledge, can use these contributing factors to better understand why children and young children have such a high undercount rate in the census and in Texas.
This paper is organized as follows. This introduction makes up Section 1. Section 2 presents a literature review showing national undercount rates for all children and young children and the approaches used in studying young children undercounts at a sub-national level. Sections 3 and 4 present the data and methodologies. Section 5 shows the main results. Section 6, the final section, contains concluding remarks.

2. Literature Review

2.1 Approaches on Undercount of Children and Young Children

Recognition of undercounting in social science surveys is not new (Clogg et al., 1989; de la Puente, 1995; King & Magnuson, 1995; Martin & de la Puente, 1993; O’Hare, 2019; Tourangeau & Plewes, 2013; West & Fein, 1990); however, scholarly work has not yet achieved a consensus framework for studying its determinants.
It is important to note that the census coverage of children differs from that of adults or the whole population. Children cannot answer census questionnaires and have no direct participation in U.S. Census activities. Thus, the coverage error in children depends on whether an adult is available to provide information for the entire household and on the quality of the data provided by the adult.
Studies on sub-national accuracy assessment of the U.S. Census are limited. Moreover, exploring overall undercounting determinants has provided limited information on the causes of undercounting children (Cohn, 2011; Mayol-Garcia & Robinson, 2011; Robinson et al., 1993; Siegel et al., 1977).
Recent studies have provided insights into potential determinants of variation in young child undercounting in the 2010 Census (O’Hare, 2014, 2017; O’Hare et al., 2019). O’Hare (2014) suggests that at the state level, young child undercount rates are correlated most highly with the relative size of the Black population—Alone or in Combination—and the Hispanic population. He also found that characteristics such as linguistic isolation, the percentage of adults without a high school degree, and the unemployment rate were more highly related to variations in the young child undercount among states than were housing unit features.
O’Hare (2017) examined variations in young child undercount rates in the 2010 Census across counties. He found that the combined Black and Hispanic population was a key determinant of young child undercounts. King et al. (2018) also found that counties’ racial/Hispanic composition was closely related to county coverage rates for young children.
One of the most prominent research works on this issue is that of O’Hare et al. (2019). They used 2010 young child undercount rates for large counties and several potential causal variables to obtain county-level regression estimates showing the relationship between the potential causal variables and the young child undercount rate. They then combined the coefficients from that model with data from the 2013-2017 American Community Survey to predict the tract-level risk of young children being uncounted in the 2020 Census in 689 counties with at least 5,000 children under age 5. The authors explored 40 potential explanatory variables and clustered them into six domains:

  1. Race and Hispanic Origin;
  2. Socioeconomic Status;
  3. Family Structure and Living Arrangements;
  4. Other Demographic Measures;
  5. Housing; and
  6. Census Response/Return Rates.

O’Hare et al. (2019) examined the potential determinants of young child undercounts in the 2010 Census at the census tract level for large counties (those with 250,000 people or more in 2010). O’Hare et al. (2019) found a close association between young child undercount rates and:

percent of adults ages 18 to 34 with less than a high school diploma, GED, or alternative;
percent of children under age 18 living in a female-headed household with no spouse present;
percent of children under age 6 living with a grandparent householder;
percent of households that are linguistically isolated (no one ages 14+ speaks English “very well”);
percent of children under age 6 who are in immigrant families (the child is foreign-born or at least one parent is foreign-born); and
percent of people living in renter-occupied households.

The present analysis builds upon the above studies. We extend the previous work of O’Hare et al. (2019) by exploring multiple variables, organized in 6 domains, using all counties in Texas, studying not only young children (ages 0 to 4) but also children (ages 0 to 17). Moreover, our approach’s wide set of variables allows a clearer understanding of the mechanisms of the potential determinants of undercounting.
Note that all the studies reviewed above were focused on young children (ages 0 to 4). This study compares results for young children to those for all children. This analysis will help researchers understand if the same factors drive the coverage of all children as the coverage of young children. At present, it is unclear whether the determinants of variations in the net undercount rates for all children and young children are the same. This study will shed light on that question.

2.2 The Undercount of Children and Young Children in the United States

Before looking at the counties in Texas, a short U.S. overview of the issue is provided. Of the two main methods used by the Census Bureau to assess accuracy, Demographic Analysis (DA) and Post-Enumeration Survey (PES), only DA provides reliable data for young children. Because of correlation bias, the PES method greatly underestimates the undercount of young children (O’Hare et al., 2016).

The Census Bureau’s Demographic Analysis (DA) assesses the quality of the 2020 decennial census for children by age groups (see Table 1). Data in Table 1 indicate that the youngest children (ages 0 to 4) had the highest net undercount at 5.4 percent, and the oldest children (ages 10 to 17) had the smallest net undercount at 0.6 percent.[3]
One of the advantages of using DA to study the undercounting or overcounting of children in the Census is that it allows exploration by a single year of age. Figure 1 shows how the youngest children are highly undercounted. Children under 1 year have a 7.0% net undercount rate as measured by the Middle Series. As children grow, the undercount shrinks with each year of age until the 8-year-old age group, for which the undercount is 0.6%.

From the 8-year-old to the 17-year-old age group, the net undercount rates are relatively stable, with an average DA net coverage error of -0.7%. The two peaks at ages 10 and 15 can be explained by age heaping, a phenomenon in which people guessing another person’s age (for example, in a proxy response in the Census) often report ages that end in 0 or 5 (Jensen et al., 2023). The effect is most visible at age 10, with a smaller bump at age 15.

O’Hare (2023) presents a nationwide analysis of the country’s county-level estimates of young child undercounts using Census Counts and the April 1, 2020 version of the Vintage 2020 Estimates. The coverage estimates are derived by calculating the difference between the county-level Census Counts and 2020 Vintage Population Estimates such that when the census is lower than the Vintage Estimates, there is an undercount, and when the census is higher than the Vintage Estimates, there is an overcount. We focus on children (ages 0 to 17) and young children (ages 0 to 4) because these age groups experienced a statistical undercount, according to the U.S. Census Bureau Demographic Analysis (Jensen, 2022; Jensen & Kennel, 2022). This method for calculating net coverage has been widely used by others, and it is used in this study.
Net coverage refers to net undercounts and net overcounts. Net undercounts and overcounts are based on a balance between people missed in the census and people counted more than once or included in the census inappropriately. Net undercounts are not the same as the number or rate of people missed in the census.

The census coverage of young children is distinct from the coverage of other age groups. Figure 2 shows the census coverage rates for young children (ages 0 to 4), school-age children (ages 5 to 17), and adults (ages 18 or older) for each census from 1950 to 2020 (O’Hare, 2023b).[4] At least two key points are revealed in Figure 2. First, after 1980, the coverage of young children plunged from an undercount of 1.4 percent in 1980 to an undercount of 5.4 percent in 2020. This 40-year trend is strikingly clear.

If we do not do something different in 2030 than we have done over the past 40 years, there is no reason to expect the count of young children to improve in the next census. If the net undercount of young children increases between 2020 and 2030 as it did between 2010 and 2020, it will be over 6 percent in the 2030 Census. No other demographic group has experienced the deterioration of census quality that young children have experienced since 1980. That makes young children a special population in planning for the 2030 Census. The key point of Figure 2 is that the net coverage of young children will likely worsen in 2030 unless we take bold action to change the trajectory of the past 40 years.
Second, it is also important to note that while the coverage of young children has deteriorated since 1980, the census coverage of older children and adults has not. The coverage of adults continued to increase to the point that they had a small overcount in the 2020 Census. The coverage of children ages 5 to 17 remained relatively stable after 1980.

This means that over the past 40 years, the census experience of young children has been very different than that of adults and even different than that of older children. This is another reason young children deserve to be a separate focus in the census, separate from adults and older children.
Figure 3 shows the relationship across states between the state-level coverage of the total population and the coverage for young children in the 2020 Census. It shows almost no statistical relationship between the quality of the census count of the total population and the quality of the census count of young children across the states (with a correlation coefficient of -0.08, which is not statistically significant). In other words, states with better census coverage for their total population do not necessarily have better coverage for young children.

For example, the total population in Hawaii had a net overcount of 6.8 percent in the 2020 Census, while young children had a net undercount rate of 8.6 percent. Delaware had a net overcount of 5.5 percent for the total population but a net undercount of 6.1 percent for young children.
The trends over the past 40 years and the lack of a relationship between total population coverage and young child coverage in states show that improvements in the quality of census counts for the total population are unlikely to lead to improvements in the count of young children. There has been a significant improvement in the total population coverage over the past 40 years, but there has been no improvement in counting young children.
Given this disconnect between the quality of census data for the total population and the quality of census data for young children, it is critical to focus on young children separately from the total population and older children.

2.3 The Undercount of Children and Young Children in Texas

Of the 7,432,438 children in Texas reflected in the Vintage 2020 Population Estimates, the census undercounted 153,633 or 2.1 percent (Castellanos-Sosa & O’Hare, 2023a). Of 254 counties in Texas, 190 experienced an undercount (74.8%), and 64 showed an overcount (25.2%).
On the other hand, of the 1,975,115 young children in Texas reflected in the Vintage 2020 Population Estimates, the 2020 Census undercounted 155,855 young children (7.9%) (Castellanos-Sosa & O’Hare, 2023b). Many counties in Texas not only had an undercount of children and young children, but their undercount was quite high. In total, 184 Texas counties (72.4%) had a high net young child undercount (rate, number, or both).[5] These results suggest that 85.8% of Texas young children live in a county with a high undercount (rate, number, or both). Of the 184 Texas counties with a high net young child undercount, 31 had both a high rate and a high number, with a combined net undercount of 134,421 young children (9.5% of their young children).

3. Data

The undercount of children and young children is estimated using the Vintage 2020 Population Estimates and the 2020 Census counts (U.S. Census Bureau, 2021b, 2021c).[6] The Vintage 2020 Population Estimates for ages 0 to 4 are completely independent from the census. However, the Vintage Estimates for ages 10 to 17 are partly dependent on the results of the 2010 Census. So, the Vintage Estimates for ages 0 to 17 are not independent of the census, which may impact the coverage estimates’ reliability for ages 0 to 17.
The 2020 Census was implemented from January to October 2020. So, using data up to 2020, such as the 2016-2020 ACS 5-Year Estimates, allows us to include population characteristics just before and during the census. This helps us understand relevant factors, such as the conditions for the planning and implementation of the 2020 Census and the environment in which people did or did not participate. The use of post-COVID data was avoided as those data would undermine our objective of identifying the factors contributing to undercounting children and young children.
The potential explanatory variables used here come from the U.S. Census Bureau, Texas Health and Human Services, Social Capital Atlas, and Every Texan. In particular, we extracted 168 variables from the U.S. Census Bureau’s 2020 ACS 5-Year Estimates and the Decennial Demographic and Housing Characteristics file tables. Population and Population Density are from the 2020 Census Counts. Addresses unable to be geocoded in counties were obtained from the Local Update of Census Addresses (LUCA) program (U.S. Census Bureau, 2021a).
The authors focused on variables used by the U.S. Census Bureau’s Hard-to-Count (HTC) scores and Low Response Scores (LRS), as well as research literature related to the undercounting of children and young children.
The 182 potential explanatory variables used here are largely an extension of the 32 variables used by O’Hare et al. (2019). Based on the latest research reports on the topic, we expand the variables to include more elements used in previous research (O’Hare et al., 2019; Griffin & Konicki, 2017; Nickell, 1981; O’Hare et al., 2020; O’Hare, Griffin, et al., 2019). For instance, O’Hare et al. (2019) use the following three variables for young children in their Race and Hispanic Origin domain: Percent of Not-Hispanic black alone (ages 0 to 4); Percent of Hispanic or Latinos (ages 0 to 4); and Percent of minority (ages 0 to 4). In this paper, we expand their criterion and use eight variables instead: Percentage of young children that are American Indian or Alaska Native (Not Hispanic); Percentage of young children that are Asian (Not Hispanic); Percentage of young children that are Black (Not Hispanic); Percentage of young children that are Hispanic or Latino; Percentage of young children that are Native Hawaiian and other Pacific Islander (Not Hispanic); Percentage of young children that are some other race alone (Not Hispanic); Percentage of young children that are two or more races (Not Hispanic); and Percentage of young children that are White (Not Hispanic).
We created six variables using data from the Texas Health and Human Services.[7] Cohesiveness by Clustering and Volunteering variables were obtained from the Social Capital Atlas (Chetty et al., 2022a, 2022b). The variables from the Social Capital Atlas are not available for nine counties, representing 3.5% of the sample. To preserve the entire set of counties, each of the nine counties is assigned the average value of its surrounding counties. Three variables were obtained as published by Every Texan, which processed them using 2020 ACS 5-Year Estimates.[8]
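The neighbor-average fill for the nine counties missing Social Capital Atlas values can be sketched in Python. This is only an illustration under stated assumptions: the county labels, the index values, and the adjacency map are hypothetical, and the report does not say how surrounding counties were identified or whether neighbors with missing values were skipped, as they are here.

```python
from statistics import mean

def impute_from_neighbors(values, neighbors):
    """Fill missing county values (None) with the mean of the observed
    values of the surrounding counties listed in `neighbors`."""
    filled = dict(values)
    for county, value in values.items():
        if value is None:
            # Assumption: neighbors that are themselves missing are ignored.
            observed = [values[n] for n in neighbors.get(county, [])
                        if values.get(n) is not None]
            if observed:
                filled[county] = mean(observed)
    return filled

# Hypothetical counties and index values (not taken from the report).
values = {"A": 0.10, "B": 0.20, "C": None, "D": 0.30}
# Hypothetical adjacency: county C is surrounded by A, B, and D.
neighbors = {"C": ["A", "B", "D"]}

filled = impute_from_neighbors(values, neighbors)
print(round(filled["C"], 2))  # 0.2
```

Counties with observed values pass through unchanged, so the full set of counties is preserved for the later correlation and regression steps.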

4. Methodology

4.1 Undercounting and Overcounting at the County Level

The U.S. Census Bureau does not plan to produce sub-state measures of accuracy for all children in the 2020 Census using either of the two main methods it uses to assess census accuracy: the Post-Enumeration Survey (PES) and Demographic Analysis (DA). The Census Bureau will provide coverage rates for the population ages 0 to 4 for counties with at least 1,000 children in this age range.
It is unfortunate there are not more sub-national assessments of census quality because many stakeholders are seeking sub-state measures of census quality (Adlakha et al., 2003; American Statistical Association, 2021; Borusyak et al., 2021; Hill et al., 2022; National Academies of Sciences, Engineering, 2022; National Association of Latino Elected Officials Educational Fund, 2022; U.S. Census Bureau, 2014). This study responds to those requests for more sub-state quality measures for the 2020 Census.
It is important to study counties because state-wide numbers can mask big differences within a state. One set of counties may have high undercounts, which may be counterbalanced by another set of counties with high overcounts, leading to an average low overall coverage error for the state. For example, in the 2010 Census, nearly all the undercounts of young children (ages 0 to 4) in New York and Illinois were accounted for by large urban counties in those two states (U.S. Census Bureau, 2014).
This study employs a commonly used demographic benchmark to identify counties with net undercounts and overcounts of children in the 2020 Census. Specifically, the 2020 Decennial Census county-level counts of children (ages 0 to 17) and young children (ages 0 to 4) are compared to corresponding figures from the Census Bureau’s Vintage 2020 Population Estimates to ascertain census coverage for children and young children. The Vintage 2020 Population Estimates are based on the 2010 Census results with births, deaths, and net migration between 2010 and 2020 incorporated. When the 2020 Census count is smaller than the Vintage 2020 Estimate, we observe a net undercount, expressed as a negative value; when the 2020 Census count is higher than the Vintage 2020 Estimate, we observe a net overcount, expressed as a positive value.
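The comparison described above can be sketched in Python as a rough illustration. The function name is hypothetical, and dividing by the Vintage estimate is an assumption, although it is consistent with the Texas figures in Section 2.3. The census counts below are not quoted in the report; they are derived by subtracting the reported net undercounts from the Vintage 2020 totals.

```python
def net_coverage(census_count, vintage_estimate):
    """Return (net coverage error, net coverage rate in percent).

    Negative values indicate a net undercount and positive values a net
    overcount. The rate uses the Vintage estimate as the denominator
    (an assumption consistent with the figures in Section 2.3).
    """
    net_error = census_count - vintage_estimate
    rate = 100.0 * net_error / vintage_estimate
    return net_error, rate

# Texas totals from Section 2.3; census counts are derived, not quoted.
children_estimate = 7_432_438
children_census = children_estimate - 153_633
young_estimate = 1_975_115
young_census = young_estimate - 155_855

children_error, children_rate = net_coverage(children_census, children_estimate)
young_error, young_rate = net_coverage(young_census, young_estimate)

print(f"{children_rate:.1f}%")  # -2.1%
print(f"{young_rate:.1f}%")     # -7.9%
```

Applying the same function to each county's census count and Vintage estimate would yield county-level rates of the kind analyzed in this study.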
According to the U.S. Census Bureau, “Both the 2020 DA Estimates and the Vintage 2020 Population Estimates can be used as demographic benchmarks for evaluating certain aspects of the 2020 Census results” (Jensen & Johnson, 2021). This study is closely linked to Jensen and Johnson’s (2021) work since we compare 2020 Census results and Vintage 2020 Population Estimates to assess the 2020 Census data quality for young children at the county level. According to Jensen and Johnson (2021), “Increasingly, data users are comparing the Population Estimates to the results of the 2020 Census to try and understand the quality of the census results.” This study adds to that research stream and uses the methodology developed by O’Hare (2023) and used by several others.[9]
There are four main reasons why the Vintage Population Estimates are thought to be more accurate than the census counts for the population ages 0 to 4.

    1. There is a high net undercount of young children in the 2010 and 2020 censuses (Jensen, 2022).
    2. The Vintage Population Estimates for ages 0 to 4 are largely based on birth certificate data, which is widely considered reliable. Ninety-nine percent of the population ages 0 to 4 in the 2020 Population Estimates are based on birth certificates (U.S. Census Bureau, 2022).
    3. The data sources and methodology for producing Vintage Population Estimates are nearly identical to the Census Bureau’s Demographic Analysis method, which is the preferred method for estimating young child undercounts at the national level (U.S. Census Bureau, 2021b).
    4. The results of the Vintage Population Estimates for young children are nearly identical to the Census Bureau’s Demographic Analysis estimates at the national level, which underscores the suitability of using the Vintage Population Estimates to examine the subnational geographic distribution of the net undercount rates of young children (O’Hare, 2023c).

It is worth noting that some of the differences found here may be important even if differences do not reflect true undercounts. According to the U.S. Census Bureau, “…significant or unexpected differences can be useful for identifying areas for further investigation” (Hartley et al., 2021, p. 2). In other words, a difference between the census count and the Estimates may signal a problem with the underlying data.

4.2 Potential Explanatory Variables

As per O’Hare et al. (2019), our set of potential explanatory variables is sorted into six domains to reflect broad dimensions of the environment experienced by persons responding to the U.S. Census Bureau. These domains consider aspects that are inherent to the individual (Race and Origin and Age Characteristics), related to their lifetime achievements (Socioeconomic Status), linked to their interpersonal connections (Family Structure and Living Arrangements), descriptive of their housing units (Housing), and reflecting the geography and the implementation of the census itself (Census-and Geography-Related Features).[10]
For Race and Origin, many racial and ethnic minorities have long had high net undercounts in the census. This analysis aims to identify which groups have the highest disparities. On Socioeconomic Status, our research would underscore the intersectionality of undercounting with poverty, education, and income-related features—characteristics that usually raise systemic challenges. In Family Structure and Living Arrangements, household complexity influences census accuracy. For instance, O’Hare et al. (2019) found that the share of children under age 18 living in a female-headed household with no spouse present or the share of children under age 6 living with a grandparent householder was related to the variation in net undercount rates for young children. Similarly, Housing characteristics might shed light on factors influencing census accuracy, such as overcrowding and tenure.
It is not the children or young children that respond to the census survey, but an adult person in the housing unit. Therefore, the Age Characteristics of people living or caring for children are explored. Lastly, in terms of Census-and Geography-Related Features, the response rate of people and geographic accessibility of the housing units are incorporated to cover the data collection stage of the census implementation.
Figure 4 shows the number of variables in each domain and each domain’s share of all variables. For instance, Socioeconomic Status, the domain with the largest number of variables, comprises 95 potential explanatory variables, slightly more than half of the total (52%).

This is largely due to the many thresholds in which income can be split and the distinct levels to measure income (household and families). For instance, 20 variables account for ten income classifications for households and families’ income.[11] Moreover, Socioeconomic Status includes 20 variables related to computer and internet access, 17 associated with poverty, 15 to account for the value of the houses or monthly rent payments, and 14 to social security programs.

Rather than splitting the 95 Socioeconomic Status variables, we kept them within the same domain because income, poverty status, access to technology, social security programs, and the value of houses and monthly rent payments all result from the lifetime achievements of families.
The second-largest domain in terms of the number of variables is Race and Origin, with 39 variables (21.4%). It includes variables related to race, nativity, migration, and language. The other domains (Family Structure and Living Arrangements, Age Characteristics, Housing, and Census-and Geography-Related Features) together account for 48 variables (26.4%).
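The domain shares quoted above follow directly from the variable counts. The short Python sketch below reproduces the arithmetic. It assumes a total of 182 variables (the total reported in the Results section) and groups the four smaller domains together, because the text gives only their combined count.

```python
# Variable counts per domain as reported in the text. The four smaller
# domains are grouped because only their combined count is given.
domain_counts = {
    "Socioeconomic Status": 95,
    "Race and Origin": 39,
    "Other four domains (combined)": 48,
}

total = sum(domain_counts.values())  # 182 variables in all

# Share of each domain, in percent, rounded to one decimal place.
domain_shares = {
    name: round(100 * count / total, 1) for name, count in domain_counts.items()
}

for name, share in domain_shares.items():
    print(f"{name}: {domain_counts[name]} variables ({share}%)")
```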

4.3 Statistical Approach

We use a two-stage approach to select potential explanatory variables. The first stage helps us find variables statistically associated with undercounts but not highly correlated with other measures in the domain and across domains. The second stage consists of a stepwise regression analysis with those selected variables.
In the first stage, we refine the number of variables in three steps, as in O’Hare et al. (2019). First, we identify the variables within each domain that have a statistically significant correlation with the undercount rates for children and young children. Second, among these, we preserve the variables with a low correlation with other variables within each domain. Third, we keep only those variables with a low correlation with variables in other domains. The last two steps help us avoid multicollinearity problems.
The second stage consists of a stepwise forward regression analysis.[12] It consists of identifying the best one-variable model and extending it by comparing it against the best two-variable model, and so forth. We set the stepwise regression analysis to add more variables only if they are statistically significant, at least at the 10% level, and keep being significant at least at the 10% level when other variables are included (StataCorp, 2021). These are the steps used for the inclusion and exclusion of variables:

    1. Fit “empty” model.
    2. If the most-significant excluded term is “significant” at least at the 10% level, add it and reestimate; otherwise, stop.
    3. Do that again: if the most-significant excluded term is “significant” at least at the 10% level, add it and reestimate; otherwise, stop.
    4. Repeatedly, until neither of the next steps is possible,
      • if the least-significant included term is “insignificant” at the 10% level, remove it and reestimate; and
      • if the most-significant excluded term is “significant” at least at the 10% level, add it and reestimate.
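The selection steps listed above can be sketched in Python. This is a minimal illustration, not the authors' Stata procedure: it uses only the standard library, approximates p-values with a normal distribution instead of the t distribution, adds a hypothetical `max_steps` guard against add/remove cycles, and runs on synthetic data. All function and variable names are illustrative.

```python
import math
import random
from statistics import NormalDist


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def ols_p_values(columns, y):
    """Regress y on an intercept plus `columns`; return one p-value per column.

    Assumption: p-values use a normal approximation to the t distribution.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    sigma2 = rss / (n - k)
    p_values = []
    for j in range(1, k):  # skip the intercept
        z = abs(beta[j]) / math.sqrt(sigma2 * inv[j][j])
        p_values.append(2 * (1 - NormalDist().cdf(z)))
    return p_values


def forward_stepwise(candidates, y, p_enter=0.10, p_remove=0.10, max_steps=100):
    """Start from the empty model, then alternate removal and addition checks."""
    selected = []
    for _ in range(max_steps):  # guard against endless add/remove cycles
        changed = False
        if selected:
            # Drop the least-significant included term if it is insignificant.
            p = ols_p_values([candidates[v] for v in selected], y)
            worst_p, worst = max(zip(p, selected))
            if worst_p >= p_remove:
                selected.remove(worst)
                changed = True
        # Add the most-significant excluded term if it is significant.
        trials = []
        for name in candidates:
            if name not in selected:
                p = ols_p_values([candidates[v] for v in selected + [name]], y)
                trials.append((p[-1], name))
        if trials:
            best_p, best = min(trials)
            if best_p < p_enter:
                selected.append(best)
                changed = True
        if not changed:
            break
    return selected


# Synthetic data: y depends on x1 and x2 only; x3 is pure noise.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

chosen = forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(chosen)
```

Because the entry and removal thresholds are equal here, a term can in principle be added and dropped repeatedly, which is why the sketch caps the number of iterations.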

5. Results

5.1 Contributing Factors of Children and Young Children Undercount in Texas

Our undercount rate measures can be negative or positive. As described in the Methodology section, negative rates arise when the 2020 Census count is smaller than the Vintage 2020 Estimates, signaling an undercount. Conversely, positive rates arise when the 2020 Census count exceeds the Vintage 2020 Estimates, indicating an overcount.
All of the factors were constructed so that a higher value of the factor is expected to be associated with a lower value (worse) of the net undercount rate. Recall that net undercount rates are reflected as negative numbers. A worse net undercount rate is a lower number.
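The sign convention can be illustrated with a small Python sketch. It assumes the rate is the percentage difference between the census count and the population estimate, relative to the estimate. The exact formula is defined in the Methodology section, which is not reproduced here, so the function name `net_coverage_rate` and the county figures below are hypothetical.

```python
def net_coverage_rate(census_count, estimate):
    """Percentage difference between the census count and the estimate.

    Assumed formula (not confirmed by this section):
    100 * (census - estimate) / estimate.
    Negative values signal a net undercount, positive values a net overcount.
    """
    return 100.0 * (census_count - estimate) / estimate


# Hypothetical county A: census count below the estimate (net undercount).
rate_a = net_coverage_rate(9_500, 10_000)

# Hypothetical county B: census count above the estimate (net overcount).
rate_b = net_coverage_rate(10_300, 10_000)

print(rate_a, rate_b)
```

Under this convention, a county moving from -2% to -5% has a worse (lower) rate even though the size of the gap has grown.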
The first stage of our statistical approach finds variables that are statistically associated with undercount but not highly correlated with other measures within the same domain or across domains. The first step is to identify the variables that have a significant correlation with child and young child undercount rates. Correlation coefficients range from -1 to 1. A negative correlation coefficient suggests that an increase in the variable is associated with lower values of the undercount rate (that is, more undercounting). A positive correlation, therefore, suggests that an increase in the variable is associated with higher values of the undercount rate (that is, more overcounting). Of the 182 variables, 99 had a negative correlation with the undercount rate of children, and 102 had a negative correlation with the undercount rate of young children. To determine whether a correlation coefficient is significant (i.e., unlikely to be the result of pure chance), we test the null hypothesis of zero correlation against the alternative hypothesis that the correlation coefficient is different from zero.
For the counties’ child undercount rate, we find 92 variables with a statistically significant correlation. For the young child undercount rate, we find 90. After keeping only the variables whose correlation with other variables, both within their own domain and in other domains, is below 0.7, these numbers fall to 54 and 51 for the study of child and young child undercount, respectively.
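A minimal Python sketch of this screening stage follows. Several details are assumptions rather than the authors' procedure: the significance level (0.05, not stated in the text), the normal approximation used for the p-value, and the greedy rule that keeps the variable most strongly correlated with the outcome when two candidates correlate at 0.7 or above. The data and variable names are synthetic.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value for H0: zero correlation (normal approximation)."""
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - NormalDist().cdf(abs(t)))


def screen(variables, outcome, alpha=0.05, max_corr=0.7):
    """Keep significant variables that are not highly correlated with each other."""
    n = len(outcome)
    significant = []
    for name, values in variables.items():
        r = pearson(values, outcome)
        if correlation_p_value(r, n) < alpha:
            significant.append((abs(r), name))
    kept = []
    # Greedy rule (an assumption): strongest correlation with the outcome first.
    for _, name in sorted(significant, reverse=True):
        if all(abs(pearson(variables[name], variables[k])) < max_corr for k in kept):
            kept.append(name)
    return kept


# Synthetic, mutually uncorrelated patterns repeated to give 16 observations.
u = [1, 1, 1, 1, -1, -1, -1, -1] * 2
v = [1, 1, -1, -1, 1, 1, -1, -1] * 2
w = [1, -1, 1, -1, 1, -1, 1, -1] * 2

outcome = [-(a + b) for a, b in zip(u, v)]
variables = {
    "var_a": u,
    "var_b": v,
    "var_a_near_copy": [3 * a + c for a, c in zip(u, w)],  # correlates ~0.95 with var_a
    "var_unrelated": w,  # zero correlation with the outcome
}

kept = screen(variables, outcome)
print(sorted(kept))
```

In this toy example the near-copy is dropped for being too correlated with `var_a`, and the unrelated variable is dropped for lacking a significant correlation with the outcome.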
The second stage of our statistical approach consists of a stepwise forward regression analysis with the aforementioned selected variables. Table 2 shows the results from applying the stepwise forward regression analyses using the 54 and 51 potential contributing factors of the child and young child undercount. The dependent variable is the undercount rate for children in Model 1 and young children in Model 2. Of the 54 potential contributing factors used for children, only 14 are part of the final model. Of the 51 potential contributing factors for young children, only 18 are in the final model.
Our results suggest four factors contribute to both undercount and overcount rates of children and young children; 10 unique factors relate to the rates of children only; and 14 unique factors relate to the rates of young children only. The four variables contributing to the undercount and overcount of both children and young children are:

    • the share of addresses unable to be geocoded by the U.S. Census Bureau,
    • the share of 1-17 year-old people that moved into the county,
    • the share of SNAP-eligible individuals that are young children (ages 0 to 4), and
    • the share of occupied units paying rents between $1,500 and $1,999.

The share of addresses unable to be geocoded by the U.S. Census Bureau is the share of addresses in the Census Bureau’s Master Address File, used to take the 2020 Census, that could not be assigned a tract and block geocode at the beginning of the Local Update of Census Addresses (LUCA) operation (by January 2018). In other words, this variable reflects problems with the Census Bureau’s Master Address File at the county level.

Because the U.S. Census Bureau needs information on every household address, discrepancies in the Master Address File could easily lead to misleading outcomes during the imputation and filling process for the addresses that could not be geocoded (Cantwell, 2021).
Another important factor in the domain of Census-and Geography-Related Features, one that is particularly significant for the study of children (ages 0 to 17), is the nonresponse rate for reasons other than refusal. This is the proportion of housing units that did not participate in the 2020 ACS for any reason other than refusal.[13]

Although participation in the ACS is required by law, some housing units did not participate successfully for reasons such as the following: the unit (for example, nontraditional housing in sheds and outbuildings, trailer parks, multi-family dwellings, or multiple families in one dwelling) could not be located; no one was at home; the people interviewed faced language problems; residents were temporarily absent; or the maximum number of contact attempts was reached. In any case, a higher nonresponse rate for reasons other than refusal implies that more housing units did not complete the ACS interview and, as a result, the Census Bureau did not get accurate data from those housing units. People’s attitudes and motives for not responding to the ACS interview, even though a response is mandated by law, might be the same as those related to not responding to the 2020 Census. Nonresponse forces the U.S. Census Bureau to impute characteristics for those addresses and potentially endangers the accurate counting of certain population segments. Self-response is widely seen as yielding the most accurate data in the census. Counties where self-response rates are low are likely to have less accurate data.
Another predictor of an inaccurate count is the share of children ages 1 to 17 that moved into the county, i.e., the proportion of children ages 1 to 17 (from those in the same age range) that did not live in the same county one year ago and that moved to the county from another county within the same state, from another state, or that moved from abroad. This is a reflection of mobility and might relate to an overcount (due to double counting) of children since neighbors might provide information for the former occupants of a housing unit.
The share of SNAP-eligible individuals that are young children (ages 0 to 4) is the proportion of eligible individuals in the counties that are young children ages 0 to 4, according to the Texas Health & Human Services Commission. This variable mostly reflects the share of counties’ young children in low-income households, regardless of receipt of SNAP benefits. It is widely believed that adults in low-income households are not present at home or do not have the time to respond to the census questionnaire due to their workload, which would lead to the undercount of the children and young children living in the housing unit.
Occupied units paying rent between $1,500 and $1,999 could be considered middle income, with a potentially higher ability and motivation to participate in civic activities and to attend to the census questionnaire (Chetty et al., 2022a, 2022b). A higher share of people in higher income groups could result in lower undercount rates due to civic engagement levels.
While these four potential explanatory variables are closely related to variation in the coverage of children and young children, it is unclear if they are driving the coverage of children or the coverage of the total population, including children. In other words, they are not child-specific factors. Table 3 summarizes the variables that appear to contribute to the undercount of children compared to young children, that is, those contributing only to the undercount of children and those contributing only to the undercount of young children. These factors can operate through diverse mechanisms and deserve further research to identify how they relate to census coverage.

A standard way to know how well the data explain the variability of our dependent variables (undercount rates) is through the R² coefficient (the coefficient of determination). R² goes from 0 to 1, where 0 means the variables do not explain any of the variability of the undercount rates, and 1 means the variables perfectly explain the variability. The higher the R², the better the model is at explaining the data. The R² coefficients in this study (0.63 and 0.66) are higher than that of the best model in the closest related work (0.38; O’Hare et al., 2019).

Our R² values are moderately high by social science standards. Another way of thinking about R² is that these models explain roughly 63 to 66 percent of the variation in coverage rates of children and young children across Texas counties.
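The goodness-of-fit coefficient described above appears to be the coefficient of determination (R-squared). As a minimal Python sketch, assuming the unadjusted definition of one minus the ratio of the residual sum of squares to the total sum of squares, it can be computed as follows. The observed and predicted rates are made-up numbers, not values from this study.

```python
def r_squared(observed, predicted):
    """Unadjusted R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


# Hypothetical net coverage rates (percent) for four counties.
observed_rates = [-4.0, -2.0, 0.0, 2.0]
predicted_rates = [-3.0, -3.0, 1.0, 1.0]

fit = r_squared(observed_rates, predicted_rates)
print(round(fit, 2))  # 0.8: the predictions explain 80% of the variation
```

A model that always predicts the mean observed rate would score 0, and a model that reproduces every county's rate exactly would score 1.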

5.2 Understanding the link between the Contributing Factors and the Undercount of Children and Young Children in Texas Counties

Interpreting the results in Table 2 may be complicated because net undercounts are reflected as negative numbers. A negative sign in the coefficients from the regression suggests that the net undercount is worse when a given factor increases. A positive sign in the coefficients suggests that the net undercount is lower when a factor increases. The coefficients themselves express the magnitude of the changes.
This study finds that the set of explanatory variables for all children is different from the set of explanatory variables for young children. This underscores the extent to which distinct factors drive the coverage of older and younger children; therefore, this subsection contains two parts that explain the results of each Table 2 model separately.

5.2.1 Contributing Factors and the Undercount of Children in Texas Counties

Our findings suggest that two out of the eight variables in the Census-and Geography-Related Features domain worsen the undercount rate of children (ages 0 to 17) when they increase. In other words, the higher the nonresponse for reasons other than refusal, the worse the undercount rate of children. Similarly, the higher the share of addresses that the U.S. Census Bureau cannot geocode, the worse the undercount rate of children.
Two of the nineteen variables in the Family Structure and Living Arrangements domain seem to worsen the undercount rate of children (ages 0 to 17) when they increase. The higher the share of children who are not immediate family members, the worse the undercount rate of children.[14] Similarly, the higher the share of children that are not biological children, adopted children, or stepchildren, and that are other relatives, the worse the undercount rate of children.
In the Housing domain, our results suggest one factor contributes to a worse undercount rate of children (ages 0 to 17) when it increases. The higher the share of vacant households, the worse the undercount rate of children is.
Regarding the Socioeconomic Status domain, we find that three factors worsen the undercount rate of children (ages 0 to 17) when they increase. The higher the average share of SNAP-eligible individuals who are young children (ages 0 to 4), the worse the undercount rate of children. The higher the average share of families whose income was below the poverty level during the last 12 months with related children of the householder under 5 years, the worse the undercount rate of children. The higher the average share of married-couple families whose income was below the poverty level during the last 12 months with related children of the householder under 18 years, the worse the undercount rate of children. These could all be seen as different ways to measure poverty.

5.2.2 Contributing Factors and the Undercount of Young Children in Texas Counties

One out of the eight variables in the Census-and Geography-Related Features domain is a factor that worsens the undercount rate of young children (ages 0 to 4) when it increases. The higher the share of addresses unable to be geocoded by the U.S. Census Bureau, the worse the undercount rate of young children.
In the Race and Origin domain, our results suggest one factor contributes to a worse undercount rate of young children (ages 0 to 4) when it increases. The higher the share of young children that are Hispanic or Latino, the worse the undercount rate of young children.
In the Housing domain, our results suggest one factor contributes to a worse undercount rate of young children (ages 0 to 4) when it increases. The higher the share of rented households occupied by 7+ people, the worse the undercount rate of young children.
Regarding the Socioeconomic Status domain, our results suggest eight factors contribute to a worse undercount rate of young children (ages 0 to 4) when they increase. The higher the share of SNAP-eligible individuals who are young children, the worse the undercount rate of young children. The higher the share of married-couple families whose income was below the poverty level during the last 12 months with related children of the householder under 5 years only, the worse the undercount rate of young children. The higher the share of children (0-17) in male (no spouse) led households that received Supplemental Security Income, cash public assistance income, or food stamps/SNAP in the last year, the worse the undercount rate of young children. The higher the share of children 0-5 with all parents in the family in the labor force, the worse the undercount rate of young children. The higher the share of households with cash public assistance income, the worse the undercount rate of young children. The higher the share of households with Supplemental Security Income, the worse the undercount rate of young children. The higher the share of people (children under 5 years related to the householder) whose income was below the poverty level during the last 12 months, the worse the undercount rate of young children. The higher the share of households with income between $10,000 and $14,999, the worse the undercount rate of young children.

6. Concluding remarks

Our study extends the exploration of determinants of geographic variation in the coverage of children and young children in the decennial census. For instance, we found two operational measures (nonresponse rate for reasons other than refusal and share of addresses unable to be geocoded) were closely associated with the coverage rate of children and young children. Previous studies did not include operational measures.
One of the key findings of this study is that operational measures help explain variation in the coverage rates of children and young children. This has not been examined in previous studies. We also find that many of the factors that appear to drive variation in the coverage of young children are not the same as those that drive variation in all children’s coverage. This is consistent with other analyses showing that coverage of young children differs from that of older children or adults. By incorporating a more extensive set of variables, studying all counties in Texas, and using the latest available data, we provide more Texas-specific results and updated insights into the factors potentially contributing to the undercounting of American children.
In particular, we identify that the share of addresses unable to be geocoded by the U.S. Census Bureau, the share of children ages 1 to 17 that moved into the county, the share of SNAP-eligible individuals that are young children (ages 0 to 4), and the share of occupied units paying rent between $1,500 and $1,999 are factors that contributed to the variation of the net undercount rate of both children and young children, both positively and negatively.
The relevance of our findings extends beyond academic discourse. The identified contributing factors can guide targeted outreach and resource allocation for the 2030 Census, improving the accuracy of child and young child counts. Moreover, local data analysts and researchers can leverage this comprehensive set of determinants to gain deeper insights into the persistent issue of undercounting in the census. Hopefully, this study will be replicated for other states to discover whether the determinants found here are generalizable or are more Texas specific.
As we conclude, this research contributes to the ongoing dialogue on census accuracy assessment, emphasizing the need for continuous refinement of methodologies and a thorough understanding of factors influencing undercounting. The roadmap presented in this study lays the groundwork for future research, facilitating a more accurate and inclusive representation of children in the U.S. Census.

References

Adlakha, A. L., Robinson, J. G., West, K. K., & Bruce, A. (2003). Assessment of Consistency of Census Data with Demographic Benchmarks at the Subnational Level.

American Statistical Association. (2021). 2020 Census State Population Totals: A Report from the American Statistical Association Task Force on 2020 Census Quality Indicators. https://www.amstat.org/asa/files/pdfs/POL-CQI-Task-Force-final-report.pdf

Borusyak, K., Jaravel, X., & Spiess, J. (2021). Revisiting Event Study Designs: Robust and Efficient Estimation. https://doi.org/10.48550/arxiv.2108.12419

Cantwell, P. (2021). How We Complete the Census When Households or Group Quarters Don’t Respond. https://www.census.gov/newsroom/blogs/random-samplings/2021/04/imputation-when-households-or-group-quarters-dont-respond.html

Castellanos-Sosa, F. A., & O’Hare, W. P. (2023a). The 2020 Census Undercount of Children in Texas Counties (No. RB23–002). Texas Census Institute.

Castellanos-Sosa, F. A., & O’Hare, W. P. (2023b). The 2020 Census Undercount of Young Children in Texas Counties (No. RB23–004). Texas Census Institute.

Chetty, R., Jackson, M. O., Kuchler, T., Stroebel, J., Hendren, N., Fluegge, R., Gong, S., Gonzalez, F., Grondin, A., Jacob, M., Johnston, D., Koenen, M., Laguna-Muggenburg, E., Mudekereza, F., Rutter, T., Thor, N., Townsend, W., Zhang, R., Bailey, M., … Wernerfelt, N. (2022a). Social Capital I: Measurement and Associations with Economic Mobility. Nature, 608(7921), 108–121. https://doi.org/10.1038/s41586-022-04996-4

Chetty, R., Jackson, M. O., Kuchler, T., Stroebel, J., Hendren, N., Fluegge, R., Gong, S., Gonzalez, F., Grondin, A., Jacob, M., Johnston, D., Koenen, M., Laguna-Muggenburg, E., Mudekereza, F., Rutter, T., Thor, N., Townsend, W., Zhang, R., Bailey, M., … Wernerfelt, N. (2022b). Social Capital II: Determinants of Economic Connectedness. Nature, 608(7921), 122–134. https://doi.org/10.1038/s41586-022-04997-3

Clogg, C. C., Massagli, M. P., & Eliason, S. R. (1989). Population Undercount and Social Science Research. Social Indicators Research, 21(6), 559–598.

Cohn, D. (2011). State Population Estimates and Census 2010 Counts: Did They Match?

de la Puente, M. (1995). Using ethnography to explain why people are missed or erroneously included by the Census: Evidence from small area ethnographic studies (No. SM95-16; Census Working Papers).

Griffin, D., & Konicki, S. (2017). Investigating the 2010 Undercount of Young Children – Analysis of Census Coverage Measurement Results. https://www2.census.gov/programs-surveys/decennial/2020/program-management/final-analysis-reports/2020-report-2010-undercount-children-further_analysis_coverage_results.pdf

Hartley, C., Perry, M., & Rogers, L. (2021). A Preliminary Analysis of U.S. and State-Level Results From the 2020 Census (POP-WP104; Census Working Papers).

Hill, C., Heim, K., Hong, J., & Phan, N. (2022). Census Coverage Estimates for People in the United States by State and Census Operations (No. PES20-G-02RV). U.S. Census Bureau. https://www2.census.gov/programs-surveys/decennial/coverage-measurement/pes/census-coverage-estimates-for-people-in-the-united-states-by-state-and-census-operations.pdf

Jensen, E. B. (2022). Census Bureau Expands Focus on Improving Data for Young Children. United States Census Bureau. America Counts: Series. https://www.census.gov/library/stories/2022/03/despite-efforts-census-undercount-of-young-children-persists.html

Jensen, E. B., & Johnson, S. L. (2021). Using Demographic Benchmarks to Help Evaluate 2020 Census Results. United States Census Bureau. Random Samplings. https://www.census.gov/newsroom/blogs/random-samplings/2021/11/demographic-benchmarks-2020-census.html

Jensen, E. B., & Kennel, T. (2022). Who Was Undercounted, Overcounted in the 2020 Census? Detailed Coverage Estimates for the 2020 Census Released Today (America Counts: Stories Behind the Numbers).

Jensen, E. B., Roberts, A., & Rogers, L. (2023). Age heaping in the 2020 Census Demographic and Housing Characteristics File (DHC). https://www.census.gov/newsroom/blogs/random-samplings/2023/05/age-heaping-2020-census-dhc.html

King, H., Ihrke, D., & Jensen, E. (2018). Subnational estimates of net coverage error for the population aged 0 to 4 in the 2010 Census.

King, M. L., & Magnuson, D. L. (1995). Perspectives on Historical U.S. Census Undercounts. Social Science History, 19(4), 455–466.

Martin, E., & de la Puente, M. (1993). Research on sources of undercoverage within households (No. SM93-03; Census Working Papers).

Mayol-Garcia, Y., & Robinson, J. G. (2011). Census 2010 Counts Compared to the 2010 Population Estimates by Demographic Characteristics. Poster presented at the annual meetings of the Southern Demographic Association.

National Academies of Sciences, Engineering, and Medicine. (2022). Understanding the Quality of the 2020 Census: Interim Report. The National Academies Press. https://doi.org/10.17226/26529

National Association of Latino Elected Officials Educational Fund. (2022). NALEO Educational Fund Urges Census Bureau to Release More Data on State and Local Undercounts.

Nickell, S. (1981). Biases in Dynamic Models with Fixed Effects. Econometrica, 49(6), 1417–1426. http://www.jstor.org/stable/1911408

O’Hare, W. P. (2014). State-Level 2010 Census Coverage Rates for Young Children. Population Research and Policy Review, 33(6), 797–816.

O’Hare, W. P. (2017). Geographic Variation in 2010 U.S. Census Coverage Rates for Young Children: A Look at Counties. International Journal of Social Science Studies, 5(9). https://doi.org/10.11114/ijsss.v5i9.2611

O’Hare, W. P. (2019). Differential Undercounts in the U.S. Census. http://link.springer.com/10.1007/978-3-030-10973-8

O’Hare, W. P. (2023a). Counties with High Undercounts of Children in 2020 U.S. Census. https://2hj858.a2cdn1.secureserver.net/wp-content/uploads/2023/03/Counties-with-High-Undercounts-of-Children-in-2020-U.S.-Census.pdf

O’Hare, W. P. (2023b). County-level Coverage Rates of Young Children in the 2020 Census: The National-Level Data Do Not Tell the Full Story. https://countallkids.org/resources/county-level-coverage-rates-of-young-children-in-the-2020-census-the-national-level-data-do-not-tell-the-full-story/

O’Hare, W. P. (2023c). State Undercount Rates for Young Children in the 2020 Census. https://countallkids.org/resources/state-undercount-rates-for-young-children-in-the-2020-census/

O’Hare, W. P., Griffin, D., & Konicki, S. (2019). Investigating the 2010 Undercount of Young Children – Summary of Recent Research. https://www2.census.gov/programs-surveys/decennial/2020/program-management/final-analysis-reports/2020-report-2010-undercount-children-summary-recent-research.pdf

O’Hare, W. P., Jacobsen, L. A., Mather, M., & Vanorman, A. (2020). Predicting Tract-Level Net Undercount Risk for Young Children. https://www.prb.org/wp-content/uploads/2020/12/us-census-undercount-of-children-1.pdf

O’Hare, W. P., Jacobsen, L. A., Mather, M., Vanorman, A., & Pollard, K. (2019). What Factors Are Most Closely Associated With the Net Undercount of Young Children in the U.S. Census? https://www.prb.org/wp-content/uploads/2019/03/net-undercount-children-acs.pdf

O’Hare, W. P., Robinson, J. G., West, K., & Mule, T. (2016). Comparing the U.S. Decennial Census Coverage Estimates for Children from Demographic Analysis and Coverage Measurement Surveys. Population Research and Policy Review, 35(5), 685–704. https://doi.org/10.1007/s11113-016-9397-x

Project on Government Oversight. (2023). Dollars and Demographics: How Census Data Shapes Federal Funding Distribution.

Robinson, J. G., Ahmed, B., Das Gupta, P., & Woodrow, K. A. (1993). Estimation of population coverage in the 1990 United States Census based on demographic analysis. Journal of the American Statistical Association, 88(423), 1061–1071. https://doi.org/10.1080/01621459.1993.10476375

Siegel, J. S., Passel, J. S., Rives, N. W., & Robinson, J. G. (1977). Developmental Estimates of the Coverage of the Population of States in the 1970 Census: Demographic Analysis (Series P-23, No. 65; Current Population Reports. Special Studies).

StataCorp. (2021). Stata: Release 17.

Tourangeau, R., & Plewes, T. J. (2013). Nonresponse in social science surveys: A research agenda. In R. Tourangeau & T. J. Plewes (Eds.), Nonresponse in Social Science Surveys: A Research Agenda. The National Academies Press. https://doi.org/10.17226/18293

U.S. Census Bureau. (2014). The Undercount of Young Children.

U.S. Census Bureau. (2021a). Local Update of Census Addresses (LUCA) Operation. https://www.census.gov/programs-surveys/decennial-census/about/luca.html

U.S. Census Bureau. (2021b). Methodology for the United States Population Estimates: Vintage 2020.

U.S. Census Bureau. (2021c, August 12). 2020 Census: Redistricting File (Public Law 94-171) Dataset. https://www.census.gov/data/datasets/2020/dec/2020-census-redistricting-summary-file-dataset.html

U.S. Census Bureau. (2022). National Demographic Analysis Tables: 2020. https://www.census.gov/data/tables/2020/demo/popest/2020-demographic-analysis-tables.html

Villa Ross, C. (2023). Uses of Decennial Census Programs Data in Federal Funds Distribution: Fiscal Year 2021 (Census Working Papers). U.S. Census Bureau.

West, K. K., & Fein, D. J. (1990). Census Undercount: An Historical and Contemporary Sociological Issue. Sociological Inquiry, 60(2), 127–141. https://doi.org/10.1111/j.1475-682X.1990.tb00134.x

Author’s Message

The undercounts of children and young children are complex issues and, therefore, hard to solve. For instance, the undercount of young children has worsened every decade since 1980. This study presents a set of variables explaining the variation in the undercounting of children and young children across Texas counties. Our results could help identify counties more likely to experience undercounts in the 2030 Census if nothing is done to prevent them. Our results can also help target outreach efforts and awareness campaigns. On-the-ground research should be conducted to expand the understanding of the presented findings.

Acknowledgements: The authors appreciate the insightful support provided by Angela Broyles, Helen You, and Lloyd Potter in the development of this study.

FAQ

1) Why does the U.S. Census Bureau not publish undercounting and overcounting estimates at the county level for children?

The U.S. Census Bureau assesses the quality (undercounting or overcounting) of its Decennial Census using the Post-Enumeration Survey (PES) and the Demographic Analysis (DA).

In 2020, the PES was implemented by housing unit characteristics and only at the national and state levels. The PES uses the location of the housing units to obtain results at the subnational level, but it does not consider demographic characteristics such as age or gender. Moreover, because of “…the sample size for the 2020 PES and the assumptions required to make unbiased sub-state estimates, the Census Bureau was unable to include county or place estimates in the 2020 PES reports, as well” (U.S. Census Bureau, 2022).

On the other hand, the Demographic Analysis uses “…current and historical vital records, data on international migration, and Medicare records to produce national estimates of the population on April 1 by age, sex, the DA race categories, and Hispanic origin.” (U.S. Census Bureau, 2022). While the DA is rich in demographic characteristics, it cannot identify the current place of residence of the population since a great part of it is based on vital records. Therefore, due to its nature, the official undercounting or overcounting by demographic characteristics is estimated at the national level only.

Therefore, it is not possible to obtain an official undercounting and overcounting estimate at the county level for children.

2) Why are we using counties as geographies?

Counties are used here as the geographical level of study because they are political subdivisions small enough to capture within-state disparities, and large enough to group social representation.

3) How accurate or precise are our net child undercount estimates?

While there is no statistical measure of accuracy or precision for our estimates, our estimates of the undercount of young children are highly correlated with those recently released by the U.S. Census Bureau for 135 Texas counties. Our estimates have a correlation of 0.997 with those provided by the Bureau. However, we perform the analysis for all 254 Texas counties. So, we believe our net young children undercount estimates are highly accurate. Regarding the undercount of children (ages 0 to 17) there is no statistical measure of accuracy or precision, but they were built using a similar approach as that for our young children’s undercount estimates.