Summary: The Hispanic population in the United States is a large and diverse group of people with multidimensional identities. Existing survey instruments pose critical limits on research into this population and thus impact resulting policies. While the principle of racial and ethnic self-identification is important to respect and preserve, designing better surveys with more objective indicators of racial and ethnic background would provide a clearer picture of diverse subgroups and how they fare economically compared with one another and with other demographic groups. This is a critical step that would enable researchers to advance the collective understanding of the Hispanic population and thus allow policymakers to better address the challenges Hispanic people in the United States face.

Here are five important things that policymakers and researchers need to understand about the multidimensional identities of the Hispanic population in the United States:

  • Current survey definitions do not match the way many individuals think about race and ethnicity. For many people, race and ethnicity are inextricably linked, yet current survey instruments leave individuals with limited options to identify their racial and ethnic identities. This may lead them to be undercounted.
  • Current data on the Hispanic population is incomplete at best and thus cannot definitively measure disparities or confidently track socioeconomic progress. Many statistics used by researchers are based on studies in which individuals self-identify as Hispanic. But self-reporting of race and ethnicity is limited by survey terms that don’t match how Hispanics view themselves, and is affected by social and economic context.
  • Immigrants identify themselves differently across generations. Hispanic people who have been in the U.S. for more generations are more likely than recent immigrants to stop self-identifying as Hispanic, which makes it hard to track measures of well-being over time.
  • The factors that affect the experience of discrimination may also affect self-reported identities. Latino families and communities experience different employment, wage, and other outcomes depending on such factors as skin color, language, and immigration experience. Better systematic collection of such factors in survey instruments would help advance research toward developing policy measures to address bias and discrimination.
  • Surveys that collect additional, more objective indicators of racial and ethnic background, such as ancestry and the places of birth of antecedents, would help provide a clearer picture of disparities within and across demographic groups. While the principle of racial and ethnic self-identification is important to respect and preserve, additional information would go a long way toward tracking progress and challenges of Latinos over time.

Introduction: The Hispanic/Latino/Latinx population in the United States

The Hispanic/Latino/Latinx population is a large and growing share of the United States that defies a simple description.1 Numbering just over 60 million, these individuals—constituting roughly one-fifth of the U.S. population—come from diverse backgrounds and origins and face distinct challenges related to immigration status, discrimination, social factors, and economic vulnerabilities.2 Examining differences within this group, as well as comparing this group with others in the U.S. population, is a crucial first step toward addressing the challenges they face. But measuring disparities within and across groups depends critically on identifying who is, and who is not, Hispanic/Latino. Research uncovering the nuanced differences within this population is tremendously important, yet at the same time is severely limited by the fact that the data collected for this group of people are often inadequate to fully describe the complexity of their identities and experiences.

Current data collection methods can lead to overly broad categorizations or even mischaracterizations and generalizations that ultimately may impact policies aimed at remedying inter- and intra-group disparities. In this essay, I discuss the multidimensional identities of the Hispanic population in the United States as represented in current surveys and discuss some of the barriers inherent in moving toward a greater understanding of this population. I conclude with a list of proposed solutions.

Current survey definitions do not match the way many individuals think about race and ethnicity

The terms defining Hispanic/Latinx individuals represent one dimension of the challenge in tracking their progress over time and across generations because the terms themselves are not without controversy. The origins of the pan-ethnic Hispanic term can be traced to the late 1970s, when Congress mandated collection of data on immigrants from Spanish-speaking countries and their descendants residing in the U.S. (Lopez, Krogstad, and Passel 2021), as well as to the immigration waves and social movements that preceded it (Mora 2014). This motivation can still be seen quite literally in the so-called Hispanic origin question on many U.S. surveys, which asks whether a person is “of Hispanic, Latino or Spanish origin.” The federal government standards and related survey instructions define “Hispanic or Latino” as a “person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race” (OMB 1997, U.S. Census Bureau 2020). Importantly, the U.S. Census Bureau emphasizes directly on its surveys that ethnicity and race are separate concepts and that Hispanic individuals can be of any race. Thus, respondents are asked to state with which race they identify as well as whether they identify as Hispanic/Latino. For many individuals, however, race and ethnicity are inextricably linked, and many Hispanic individuals may not see themselves reflected in the options provided to the race question. Currently, the race options include white, Black or African American, American Indian or Alaska Native, Asian categories, and “some other race” (for which respondents are allowed to print their own responses) (U.S. Census Bureau 2020).

Analyzing responses to the race question on many U.S. surveys is also complicated by several features related to the American Indian race category (Antman and Duncan 2021). Though the current definition of American Indian race includes “original peoples of North and South America (including Central America),” it also suggests that individuals selecting this category maintain “tribal affiliation or community attachment” (OMB 1997). Additionally, respondents who indicate an American Indian race category are asked to provide the name of their enrolled or principal tribe. As a result, Hispanic/Latina(o) individuals who view their Latino identity as a race are left without an obvious choice on the race question and may be more likely to select “some other race” or leave the question blank instead. A clear reflection of the limitations of the race question is that recent analysis of the 2020 Census indicates that “some other race” is now the second largest racial group after “white” and respondents selecting the “some other race” category are overwhelmingly Hispanic (Wang 2021; Bahrampour 2021). This calls into question the value of a survey that is not actually recording the race(s) with which a large share of people identify. As a consequence, it also poses limits to our understanding of race from a research and societal perspective.

Self-reporting of race and ethnicity is affected by social and economic context

Responses to the race and Hispanic origin questions are also closely linked in their connection to the principle of self-identification, meaning that there is no right or wrong answer to these questions; racial and ethnic identities are based on how people identify themselves. This underscores the social construction of race and ethnicity: an individual’s view of race and ethnicity may vary by social and economic context, as may the terms they use. With respect to ethnicity, Hispanic immigrants often prefer to identify themselves with country-specific labels as opposed to a pan-ethnic label (Lopez, Gonzalez-Barrera, and López 2017). The Hispanic origin question specifically asks respondents to indicate whether they are Mexican/Mexican-American, Puerto Rican, or Cuban, as those are the three largest origins within the Hispanic group, and individuals from other Hispanic origin groups may indicate their origins separately.3 While the term Latinx does not appear on official surveys, this gender-neutral term is often used in academic and media circles in the spirit of inclusion. Nevertheless, it should be emphasized that Latinx is not a widely adopted term that Latinos use to identify themselves, and in fact, many have not even heard of the term (Noe-Bustamante, Mora, and Lopez 2020). As language continually evolves, survey developers should be mindful of the terms people choose to identify themselves and consider how those terms vary across members of the population.

Immigrants identify themselves differently across generations

At the same time, changing terminologies may create additional challenges for researchers wishing to track the progress of specific immigrant groups over time. The importance of the connection to the immigrant experience is particularly salient in this regard, as descendants of immigrants may choose to identify themselves differently across generations. Note that researchers typically define Hispanic generations in the following way: first-generation immigrants are Hispanics who are foreign born, the second generation are U.S.-born children with at least one immigrant parent, and the third and higher-order generations are U.S.-born children of U.S.-born parents (Antman, Duncan, and Trejo 2020), where all individual respondents identify as Hispanic.4

However, surveys may also collect additional information that researchers might define as more objective measures of race and ethnic background. For example, the American Community Survey (ACS), collected by the U.S. Census Bureau, asks questions about an individual’s ancestry while the Census Bureau’s Current Population Survey (CPS) asks about parental countries of birth. Using the more objective measures of ancestry in the ACS and CPS, researchers can then compare the population of Hispanics defined by self-identification with the population of individuals with Hispanic ancestry. As one might expect, these comparisons suggest that a larger share of individuals have Hispanic ancestry than self-identify as Hispanic, and the share that identifies as Hispanic or Latino falls considerably across immigrant generations. By the fourth or higher generation, only about half of those individuals who have Hispanic ancestry identify as Hispanic (Lopez, Gonzalez-Barrera, and López 2017). Importantly, recent studies suggest that individuals who do not identify as Hispanic despite having Hispanic ancestry have higher socioeconomic status than those who self-identify, so analyses limited to self-identified Hispanics may overstate perceived disparities in education (Duncan and Trejo 2011) and health (Antman, Duncan, and Trejo 2020) across generations and ethnic groups. In sum, most statistics based on studies that focus only on individuals who self-identify as Hispanic are at best incomplete, and as a result may mischaracterize the population of individuals with Hispanic ancestry and understate conventional measures of socioeconomic progress for this group.
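The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the record layout, field names, and sample records are hypothetical and do not correspond to actual ACS or CPS variables or data. It assigns each respondent to a generation using own and parental place of birth, then computes, among respondents with Hispanic ancestry, the share in each generation who also self-identify as Hispanic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Respondent:
    """Hypothetical survey record; fields are illustrative, not real ACS/CPS variables."""
    foreign_born: bool
    mother_foreign_born: bool
    father_foreign_born: bool
    hispanic_ancestry: bool      # more objective, ancestry-based indicator
    identifies_hispanic: bool    # self-identification on the Hispanic origin question


def generation(r):
    """Assign a generation from own and parental place of birth."""
    if r.foreign_born:
        return "first"
    if r.mother_foreign_born or r.father_foreign_born:
        return "second"
    # Parental birthplace alone cannot separate third from later generations.
    return "third or higher"


def identification_rates(respondents):
    """Share of respondents with Hispanic ancestry who self-identify, by generation."""
    counts = defaultdict(lambda: [0, 0])  # generation -> [self-identifiers, total]
    for r in respondents:
        if not r.hispanic_ancestry:
            continue
        g = generation(r)
        counts[g][1] += 1
        if r.identifies_hispanic:
            counts[g][0] += 1
    return {g: ident / total for g, (ident, total) in counts.items()}


# Made-up records for illustration only; they are not survey data.
sample = [
    Respondent(True, True, True, True, True),
    Respondent(False, True, False, True, True),
    Respondent(False, False, False, True, True),
    Respondent(False, False, False, True, False),
]
rates = identification_rates(sample)
```

A declining rate across generations in real data would be the pattern of ethnic attrition discussed above. Note that the sketch starts from ancestry rather than self-identification, which is what makes respondents who no longer identify as Hispanic visible at all.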

Factors that affect the experience of discrimination may also affect self-reported identities

Just as immigrant generation has been found to play an important role in racial and ethnic identities, researchers may also consider the impact of the historical treatment of Hispanics in the U.S. on how individuals perceive race and ethnicity. For instance, historical accounts suggest Mexican Americans were routinely segregated into separate schools in certain areas. Unfortunately, few official records document the practice, making it difficult to evaluate its long-run impacts. Despite these challenges, Antman and Cortes (2021) examine the effects of the legal desegregation of so-called Mexican schools in the U.S. and how that policy change affected educational outcomes for this population. More research could examine the impacts of this history of discrimination on self-identification and other socioeconomic outcomes, in particular for Latino communities that have been in the U.S. for generations, and have experienced long-standing structural discrimination as a result.

Similarly, other factors that affect the experience of discrimination for the Latinx population should be studied further. For example, although the majority of Hispanic adults say they have experienced discrimination on the basis of race or ethnicity, this share is higher among Hispanics with darker skin color (Gonzalez-Barrera 2019). However, few nationally representative surveys collect information on observable markers of race and ethnicity such as skin color, so we know relatively little about how discrimination might affect people’s willingness to identify with a racial or ethnic group (Golash-Boza and Darity 2008).

Language use is another area that may affect identity formation and change, as Spanish language use declines and English language dominance rises steadily over immigrant generations (Lopez, Gonzalez-Barrera, and López 2017). More generally, the link between immigrant experience and Hispanic self-identification also remains understudied despite significant evidence that immigration policy affects Hispanic groups, especially the educational investments and labor market outcomes of likely undocumented individuals (e.g., Amuedo-Dorantes and Antman 2017, 2021). At the same time, it should be emphasized that collecting additional data on factors such as skin color and immigration status would be highly controversial due to concerns over how the data might be used. Moreover, to ensure that individuals from vulnerable populations are accurately represented in official survey counts, we need to better understand how adding questions on these sensitive topics might discourage these individuals from participating in surveys. Thus, addressing these concerns prior to data collection should remain paramount. Nevertheless, these additional factors add important dimensions that researchers do not presently have the capacity to analyze in many surveys and these analyses could make substantial contributions to our understanding of the complex relationships between race, ethnicity, ancestry, language, immigration, discrimination, and a whole host of economic and social outcomes.

Conclusion: We know enough about this population to know that we need more precise survey instruments

The Hispanic population in the U.S. represents a large group of individuals from diverse backgrounds and origins who face many distinct challenges. Despite relatively high labor force participation rates (BLS 2019), Hispanic individuals have higher rates of unemployment, lower earnings, lower levels of education, and higher rates of poverty than their white counterparts, as well as less access to health care (Gould, Perez, and Wilson 2020). All of these factors suggest Hispanic individuals are among the demographic groups that are especially vulnerable and that are likely to be hit especially hard during recessions, such as the one that occurred in the wake of the COVID-19 pandemic (Gould, Perez, and Wilson 2020). To fully understand and address the challenges faced by Latinos in the U.S., it is critical for policymakers to have accurate data on this population. While the principle of racial and ethnic self-identification is important to respect and preserve, relying solely on self-identification to measure disparities within and across demographic groups may misrepresent the challenges and progress of Latinos over time and generations.

More generally, surveys should allow people to self-identify racially and ethnically with a choice of categories that are meaningful to them and that can also be aggregated into groups that accord with socially recognized demographic groups as needed. At the same time, surveys should also collect additional, more objective indicators of racial and ethnic background such as ancestry and the places of birth of antecedents. Of course, many people, especially disenfranchised individuals and those impacted by forced migration and resettlement, may not even be aware of their ancestries. One way to address this limitation would be by linking surveys over time and across generations so that respondents could be connected directly to their ancestors and would not have to rely on their memories and any biases associated with selective recall. The extent to which surveys can be linked accurately across years for the same person, across generations of the same family, and even potentially across countries, remains an open question. Nevertheless, data linkages would help us build a much better understanding of demographic groups and enable researchers to more fully track progress for individuals and pinpoint areas of ongoing disparity. The potential value of these linkages is especially high for the Hispanic population in the U.S., which continues to be shaped by migration. 

While collecting information on an individual’s own view of their racial and ethnic identities as well as their racial and ethnic heritage is critically important, a more complete understanding of inequality and bias in the U.S. also requires collecting information on how others may view an individual’s race and ethnicity, such as asking about an individual’s “street race” or socially ascribed race (López et al. 2017).5 Moreover, better systematic collection of factors such as skin color, language, and immigration experience would help us advance research much further in this regard and is an important step toward developing policy measures to address bias and discrimination. Finally, building our knowledge of the historical experience of all groups, including U.S.-born Latinos and Hispanic immigrants, is vital to understanding the challenges these groups have faced over time and the extent to which those challenges have been overcome, and to identifying continued barriers to progress. Together, these measures would go a long way toward enhancing our understanding of the role that race and ethnicity continue to play in the United States today.


This essay is based in part on a lecture the author delivered during an online workshop on Contemporary Social Issues and the Latinx Experience in the United States hosted by the Economic Policy Institute and supported by the Program on Race, Ethnicity and the Economy (PREE) on December 2, 2020. The author thanks PREE Program Director Valerie Rawlston Wilson, Kyle Moore, and workshop participants for their feedback.


1. The terms Hispanic, Latina/o, and Latinx are used interchangeably throughout this essay and are discussed in greater detail below.

2. The U.S. Hispanic population reached just over 60 million and made up about 18 percent of the U.S. population in 2019 (Noe-Bustamante, Lopez, and Krogstad 2020).

3. Many researchers focus on the three largest subgroups, and in particular the Mexican origin group, which remains the largest Hispanic origin group, making up about 62 percent of the Hispanic population (Noe-Bustamante 2019).

4. Each of these generational groupings constitutes roughly one-third of the U.S. Hispanic population (Gonzalez-Barrera 2020).

5. While surveys such as the Latino National Health and Immigration Survey ask respondents questions on this topic (López et al. 2017), the answers are self-reported. Independent supplemental measures of race and ethnicity, such as the more objective indicators described above, are therefore still useful.


