Data Opportunities for Studying the Sexual and Reproductive Health of Immigrants in the United States
This paper aims to identify, review, and evaluate publicly available national- and local-level data sources that collect information on the sexual and reproductive health (SRH) of immigrants in the United States. We reviewed public-use sources from the last 30 years that include information on immigration, SRH, health service utilization, and race/ethnicity. For each source, we evaluated the strengths and challenges of the study design and content as they relate to studying immigrant SRH. We identified and reviewed 22 national and seven local sources. At the national level, the National Longitudinal Study of Adolescent to Adult Health and the National Survey of Family Growth contained the most information; at the local level, the New York City Community Health Survey was the most robust. These sources present opportunities to advance research, improve public health surveillance, and inform policies and programs related to the SRH of this rapidly growing and often underserved population.
Immigrants, sexual health, reproductive health, database
Since the passing of the Immigration and Nationality Act of 1965, the foreign-born population of the United States has grown from 9.6 million (5% of the total population) to a record estimated 43.7 million (14%) in 2016.1,2 The foreign-born population lives in every state in the country, with over half living in California, New York, Texas, and Florida.3 (In this paper, we refer to the foreign-born population as immigrants.) With the rapid growth of the immigrant population and the country's changing political climate, researchers and advocates alike have called for more research on the health behaviors, needs, and outcomes of immigrants.4–7 Literature suggests that immigrants, broadly, have better birth and maternal health outcomes and lower overall mortality rates than the U.S.-born population.8,9 At the same time, their access to health care is often challenged due to lower rates of health insurance coverage, lack of familiarity with the health system, and linguistic barriers.8,9 Despite the vast heterogeneity of the immigrant population, research rarely disaggregates immigrant data by factors such as race/ethnicity or length of stay.5–7 Furthermore, the sexual and reproductive health (SRH) of many immigrant groups is not well-documented in the current public health literature. Research in the immigrant population has been constrained by data limitations, particularly at the subgroup level. Few data sources collect disaggregated race/ethnicity data for subgroup analyses, provide comparable definitions of immigration status across surveys, use multilingual survey tools, and/or employ methodological techniques such as oversampling immigrants or linking individual- and population-level data sources to facilitate analyses.4,5,10,11 Consequently, we do not know the extent to which many SRH measures differ by nativity status (foreign- versus U.S.-born) and between ethnic subgroups.
Race and ethnicity data are particularly important given that they provide additional depth and context to the immigrant experience of specific groups. Other facets of nativity status may also influence SRH and health care access, including length of stay, language skills, and documentation status.12–18 Cumulatively, these factors may influence immigrants' ability to seek out health resources and navigate the health system. In fact, small-scale studies suggest that immigrant women are less likely to seek out SRH-related cancer screenings.19–21 Additionally, existing literature suggests that outcomes and behaviors vary by immigrant generation. For example, for a variety of health indicators, the protective immigrant health effect decreases with each subsequent generation.22 However, less is known about how SRH measures change from one generation to the next since data across immigrant generations are typically unavailable.
In order to develop effective, evidence-based programs and policies that support immigrant SRH, additional research is needed. The goal of this paper is to review and summarize existing publicly-available quantitative data sources that include information on nativity status, race/ethnicity, SRH behaviors and outcomes, and health care utilization. We focus on national datasets and highlight select local data sources as these may be more responsive to research questions at the state and city levels. We evaluate the information collected for each data source, propose recommendations to address existing data challenges, and highlight opportunities to advance future study of immigrant SRH. This article should serve as a resource for researchers and advocates interested in further supporting the SRH of immigrant populations in the U.S. by using robust data to inform the policy and programmatic decisions made on behalf of these populations.
Data sources are included in this review if they are publicly available and contain at least 1) one measure of immigration, 2) one measure of sexual and/or reproductive health, and 3) data collected between 1987 and 2017. We focus on sources from the last 30 years because demographic trends show that immigrant populations in the U.S. have rapidly and steadily increased within this time frame.1 We group data sources into two categories: national level and local level (i.e., state, county, and city data). To identify national-level data, we reviewed publicly-available data sources from select federal agencies such as the Centers for Disease Control and Prevention, the U.S. Census Bureau, and the Bureau of Labor Statistics, as well as academic institutions such as the UNC Carolina Population Center and Princeton University. Eligible local-level surveys were identified by focusing on data sources from states with the largest immigrant populations, including California, Texas, New York, Florida, New Jersey, Illinois, and Massachusetts.2 We identify only a shortlist of these local data sources, primarily from state departments of health, to provide examples of the utility of non-national data in assessing immigrant SRH. In contrast, we aim to provide a comprehensive list of relevant national-level data sources.
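The three inclusion criteria above can be read as a simple screening rule. The following is a minimal sketch in Python, not the procedure the authors actually used; the `DataSource` fields, the `is_eligible` function, and the four example sources are all hypothetical and exist only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical summary record for one candidate data source."""
    name: str
    publicly_available: bool
    immigration_measures: int  # number of immigration measures found
    srh_measures: int          # number of SRH measures found
    first_year: int            # first year of data collection
    last_year: int             # last year of data collection


def is_eligible(src: DataSource, start: int = 1987, end: int = 2017) -> bool:
    """Apply the stated criteria: public, >=1 immigration measure,
    >=1 SRH measure, and some data collected within [start, end]."""
    collected_in_window = src.first_year <= end and src.last_year >= start
    return (
        src.publicly_available
        and src.immigration_measures >= 1
        and src.srh_measures >= 1
        and collected_in_window
    )


# Illustrative, made-up candidates (not sources from this review).
examples = [
    DataSource("Survey A", True, 2, 3, 1994, 2008),
    DataSource("Survey B", True, 0, 4, 2000, 2015),   # no immigration measure
    DataSource("Survey C", True, 1, 1, 1970, 1985),   # outside the time window
    DataSource("Survey D", False, 3, 3, 2010, 2016),  # not publicly available
]
eligible = [s.name for s in examples if is_eligible(s)]
```

Here only "Survey A" passes all three checks. The time-window test is written as an overlap check, which is one reasonable reading of "data collected between 1987 and 2017" for surveys that began earlier.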
For each data source included in this review, we provide a brief description of the survey, including its aims, design, and sample size. As a point of reference, we also provide an example publication that uses the data source. We identify measures of immigration, race and ethnicity, SRH, and health service utilization. Each of these categories is detailed below.
Survey aims and design
For each data source, we include its full name, the affiliated research institution, and a summary of the survey's main purpose. We also indicate the timeframe covered by the data source, the number of survey waves or rounds, and the intervals at which they occurred. Throughout this paper, we refer to most data sources by their acronyms, which are spelled out in Tables 1 and 2. We indicate the survey design (e.g., cross-sectional or longitudinal), population, sampling frame (e.g., neighborhoods, households, clinics), and the representativeness of the sample. Each survey that we included is conducted in languages other than English.
We include the rounded sample size from the most recent data available at the time of this review and any oversampling that occurred. We also report the number of women aged 15–44 and the foreign-born sample from the data source's website and/or codebook, when available. Although imprecise, these figures can provide a sense of the number of immigrant women of reproductive age (15–44) in the sample. All reported sample sizes, unless otherwise indicated, are rounded to the nearest hundred and include both men and women.
For each data source, we include a published study as an example of quantitative research that uses the corresponding data to study immigrant health. When possible, we list articles specifically about immigrant SRH.
We identify measures of immigration related to immigrant status and context such as nativity (i.e., born in or outside of the U.S.), citizenship, documentation (e.g., green card or visa status), length of stay in the U.S., year/age of entry, country or region of birth, and language spoken at home.
Race and ethnicity
We report racial categories from each of the surveys using the 2015 guidelines issued by the Office of Management and Budget (OMB).23 The OMB recommends that the minimum standard for measures of race should include the distinct categories of American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or other Pacific Islander, and White. For ethnicity, although the OMB minimum standard only includes Hispanic/Latino and non-Hispanic/Latino origin, we highlight additional ethnicity information when provided. In surveys that collected this additional information, ethnicity categories typically reflected regions or countries of origin (e.g., Chinese, Ethiopian). We also identify data sources that report on ancestry, a component of immigrant history and potential identifier of ethnicity or country of origin.
Sexual and reproductive health
This review focuses on the following six SRH domains, each representing multiple measures of SRH: (1) pregnancy intention (i.e., wantedness of past, current, or future pregnancy), (2) fertility history (e.g., number of births or children, maternal age at first birth, history of abortion, and other pregnancy outcomes), (3) contraceptive use (e.g., current method(s) used and history of method use), (4) HIV/STI history (e.g., history of HIV/STI testing or treatment, HPV vaccination, and STI diagnoses), (5) sexual behavior (e.g., age at first sexual encounter, timing of last sexual encounter, and other information about sexual encounters), and (6) sexual orientation.
Health service utilization
We also examine each data source for two key domains of health service utilization: health insurance status and source of care. Source of care typically assesses the type of health provider visited for services, such as private doctor, public clinic, hospital, or urgent care. Follow-up questions related to each source of care may include the type and quality of care provided. This information is important when considering if and how immigrant populations access health services generally and SRH care specifically.
Data sources and sample size
Based on the selection criteria described earlier, we identified 29 publicly available data sources that collected data on immigration, race/ethnicity, SRH, and health service utilization. Table 1 reports on 22 publicly available nationally representative data sources, including 14 longitudinal and eight cross-sectional surveys. We identified the immigrant sample size in the most recent rounds of 14 of these data sources. Three surveys (National Survey of Reproductive and Contraceptive Knowledge [Fog Zone],24 Panel Study of Income Dynamics [PSID]—Child Development Survey [CDS],25 and National Longitudinal Survey of Youth [NLSY97]26) had fewer than 500 immigrants in their most recent samples. Another three (National Longitudinal Survey of Youth [NLSY79],27 Fragile Families and Child Wellbeing Study [FFCWS],28 and Panel Study of Income Dynamics [PSID]—Individual and Family Data [Main]29) surveyed 500–1,000 immigrants. Five surveys (National Longitudinal Study of Adolescent to Adult Health [Add Health],30 Abortion Patient Survey [APS],31 National Survey of Family Growth [NSFG],32 National Survey of Families and Households [NSFH],33 and National Health and Nutrition Examination Survey [NHANES]34) included 1,000–2,000 immigrants in their samples. Three surveys had sample sizes as large as 13,000 (Current Population Survey [CPS]35), 18,000 (National Health Interview Survey [NHIS]36), and 82,000 (American Community Survey [ACS]37). Note that some of these surveys have multiple waves of data; in these cases, pooling waves of data would further increase the sample size of immigrants.
We also identified seven examples of (1) state-level data from New York and California (Pregnancy Risk Assessment Monitoring System [PRAMS],38 California Health Interview Survey [CHIS],39 California Women's Health Survey [CWHS],40 and California Maternal Infant Health Assessment [MIHA]41); (2) county-level data from Los Angeles (LA County Health Survey [LACHS]42); and (3) city-level data from El Paso (Border Contraceptive Access Study [BCAS]43) and New York City (New York City Community Health Survey [NYC CHS]44) (Table 2). Six surveys are cross-sectional and all but one (BCAS) are representative of their target population. The BCAS is also longitudinal in design. We identified the immigrant sample size in three of the local data sources; these ranged from approximately 700 foreign-born respondents in BCAS to 4,000 or more in the NYC CHS and CHIS.
At the national level, with the exception of five surveys (National Vital Statistics System: Birth Data [NVSS—Birth Data],45 National Longitudinal Survey of Youth [NLSY79 Child/YA],46 PSID-CDS, Panel Study of Income Dynamics [PSID]—Child Development Survey 2014 [CDS-2014],47 and Panel Study of Income Dynamics [PSID]—Transition to Adulthood Supplement [TAS]48), all data sources included at least two measures of immigration. The most common were nativity (included in 19 surveys), language spoken at home, country or region of birth, and year of entry (each in 13 surveys) (Table 3). Fourteen data sources (Add Health, CPS, Early Childhood Longitudinal Study—Birth cohort [ECLS-B],49 Early Childhood Longitudinal Study—Kindergarten [ECLS-K],50 FFCWS, NHANES, New Immigrant Survey [NIS],51 NLSY79, NLSY79 Child/YA, NLSY97, NSFH, PSID-Main, Survey of Income and Program Participation [SIPP],52 and NVSS) collected additional data on respondents' parents' country of birth, year of entry, citizenship, and language spoken at home, allowing for research on second-generation immigrants. Without information about parents' country of birth, it is difficult to determine second-generation status, which is defined as being born in the U.S. to foreign-born parents (first-generation immigrants). The NIS, NLSY79, and PSID-Main collected data on documentation status.
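The dependence of second-generation status on parental nativity can be made concrete with a small classification sketch. This is an illustration only, not a coding scheme taken from any of the surveys reviewed: the function name and category labels are hypothetical, and the choice to treat one foreign-born parent as sufficient is an assumption (some analyses require both parents to be foreign-born).

```python
from typing import Optional


def immigrant_generation(
    respondent_foreign_born: bool,
    mother_foreign_born: Optional[bool],
    father_foreign_born: Optional[bool],
) -> str:
    """Classify generational status from respondent and parent nativity.

    None means the survey did not collect (or the respondent did not
    report) that parent's nativity.
    Assumption: at least one foreign-born parent is enough for "second".
    """
    if respondent_foreign_born:
        return "first"
    parents = [mother_foreign_born, father_foreign_born]
    if any(p is True for p in parents):
        return "second"
    if any(p is None for p in parents):
        # U.S.-born respondent with missing parental nativity:
        # second generation cannot be ruled in or out.
        return "unknown"
    return "third_or_higher"
```

For a U.S.-born respondent in a survey that records no parental nativity, the function returns "unknown", which is the situation the paragraph above describes.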
At the state and county level, all seven data sources included at least two measures of immigration; the most commonly collected data were on country of birth, length of stay, and language(s) spoken at home, each included in five surveys (Table 4). The CHIS included the most measures of immigration, including citizenship, language spoken at home, length of stay, country of birth, and documentation status. Four surveys (CWHS, NYC CHS, PRAMS, and BCAS) included a measure of nativity and only NYC CHS and BCAS asked about parents' nativity (Table 4).
Race and ethnicity
Apart from eight data sources (Add Health, FFCWS, NSFH, Fog Zone, PSID-CDS, BCAS, CWHS, and MIHA), all of the surveys we examined followed the basic OMB guidelines for collecting race and ethnicity data (Tables 3 and 4).23 Of the surveys that met OMB guidelines, all but two (APS and ECLS) included at least two more-detailed categories for ethnicity. Hispanic subgroups were most frequently disaggregated in comparison with other ethnic groups, followed by Asian and Pacific Islander subgroups. Distinctions for Asian origin focused primarily on select countries such as China, Japan, Korea, the Philippines, India, and Vietnam. Pacific Islander origins include Native Hawaiian, Guamanian, and Samoan. African and European origin were rarely disaggregated. Seven surveys at the national level (ACS, Add Health, FFCWS, NHANES, NLSY79, NLSY79 Child/YA, and PSID-Main) and three at the local level (CHIS, LACHS, NYC CHS) had a measure of self-reported ancestry (Tables 3 and 4).
Sexual and reproductive health and health service utilization
With the exception of six data sources (ACS, CPS, ECLS-K, NIS, SIPP, and NVSS), all of the included national surveys collected data on two or more SRH domains relevant to this review (Table 3). The most common domain measured was fertility history; in contrast, data on sexual orientation were collected by only four data sources (NHANES, NHIS, Add Health, and NSFG). All of the surveys except the NSFH collected information on health service utilization. Overall, the NSFG and Add Health, followed by NHANES, NHIS, and the Fog Zone, collected the most detailed information across multiple SRH domains and on health care utilization.
At the state, county, and city level, each of the seven data sources included measures of at least two SRH domains; the most common were pregnancy intention, contraceptive use, and sexual behavior, addressed in at least four of seven surveys (Table 4). Five of the seven surveys collected information on health insurance status and source of care. Overall, the CWHS and NYC CHS contained the most information related to SRH and health care utilization.
In general, at the national level, Add Health and NSFG contained the most detailed information on immigration and SRH, followed by NHANES, NHIS, and PSID (Table 3). In contrast, NSFH and NVSS collected the least information on immigration and SRH. At the state and county level, the NYC CHS included the most information for both SRH and immigration, followed by CWHS and CHIS; however, CHIS included more information on immigration (compared with SRH) whereas CWHS collected notably more data on SRH than immigration (Table 4).
Notably, questions that measured immigration, SRH, and health service utilization varied between data sources. In some cases, questions varied between survey waves as a result of evolving categories and definitions of particular measures. Similarly, the breadth and depth of available SRH information often varied between data sources depending on the purpose of the data source, with some including SRH-related questions in optional modules rather than mandatory core sections of a survey.
This analysis identifies national-, state-, and local-level data sources that can facilitate further examination of immigrant SRH. We highlight large publicly-available data sources with notable immigrant sample sizes and key information on immigration and SRH to encourage broader use of these data. At the same time, we note potential challenges across data sources. For example, despite being publicly available, some data sources restricted access to disaggregated data by ethnicity due to small sample sizes; however, organizations such as the National Center for Health Statistics can grant access to these restricted data after an application process, allowing for subgroup analyses. The type of information on immigration, race and ethnicity, and SRH also varied between data sources and over time. Race/ethnicity categories also change over time, often reflecting shifting OMB guidelines, and measures of immigration may vary based on published literature identifying new characteristics of the immigrant context. In these cases, pooled analyses across survey waves or surveillance of SRH outcomes over time can be challenging. Furthermore, few data sources collect detailed data on both immigration and SRH, suggesting that researchers may face a trade-off in each data source. Despite these potential challenges, the data sources presented in this paper are critical to initiating and advancing much-needed research on immigrant SRH.
This study has several limitations. Although we attempt to provide a far-reaching list of data sources relevant to the study of immigrant SRH, this paper is not intended as a formal systematic review, so neither the national nor local lists are exhaustive. We also do not include qualitative data sources in this paper. Given our objective to highlight publicly available data sources to study immigrant SRH, we chose to focus on quantitative data, which are often more readily accessible than data from qualitative studies. That being said, we acknowledge the need for qualitative and mixed-method studies to identify the range of SRH issues that affect immigrant communities. For example, qualitative research can help explore how individual motivations, interpersonal dynamics, and social context influence immigrant women's decision-making and outcomes related to SRH, information that is critical to understanding immigrants' access to and use of SRH services. These data can also help shape quantitative research that is community-relevant.
This study also does not assess the specificity of the information collected on immigration status and SRH across surveys. For instance, while we highlight which surveys assessed fertility history, the specific measures that represent this domain in each survey may vary. Some surveys may exclude abortion or miscarriage from fertility history or change the number of fertility-related measures collected over time, but these details are not reported here. In an effort to hone the focus of this review, we do not address all measures of SRH or immigration such as reproductive cancer screening or acculturation. These data are available in some of the data sources we reviewed, such as CHIS and Add Health. We encourage future research to examine how acculturation, assimilation, and identity affect a range of SRH behaviors and outcomes.
We do not present specific linkages between data sources, although data from many of the sources we reviewed can be linked to other information such as contextual data. Typically, a study's website or data user manuals will indicate the linkage capabilities of a data source. Incorporating additional information such as census or birth data in the study of individual-level outcomes can help assess the impact of contextual factors such as affordable housing or neighborhood-level poverty.
Finally, we do not include measures related to undocumented status in this paper, given the minority of surveys that collect these data and limitations in the quality of these data. For multiple reasons, including safety and confidentiality, undocumented status is rarely asked of survey respondents. Especially in the current political climate, individuals may be particularly hesitant to reveal or accurately report their documentation status.53 Without these data, it is and will continue to be challenging to assess the health status of this population and meet their varied and changing needs. We encourage researchers to continue employing innovative methods to estimate undocumented status.54
This paper brings to light areas for improvement in future study design, data collection, and analysis in order to advance our understanding of immigrant SRH. For example, research on immigrants and their generational- as well as ethnicity-specific subgroups is often limited due to small sample sizes. In response, we echo previous recommendations to employ oversampling techniques and pool waves of data, when appropriate, to power subgroup analyses sufficiently.5,6 We also encourage wider use of local-level data sources, in addition to national datasets, to study immigrant SRH. State- or county-level data may better address research questions on specific immigrant populations given geographic clustering of these groups. Increased and ongoing collection of longitudinal data could also assess changes in health behaviors and outcomes over immigrant generations and include data on recently emigrated families.
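The effect of pooling waves on subgroup sample sizes can be sketched with a few lines of Python. This is a rough illustration under stated assumptions, not an analysis plan: the record layout, the subgroup labels, the synthetic counts, and the `MIN_CELL_SIZE` threshold of 100 are all hypothetical. It also assumes independent cross-sectional samples; in a longitudinal panel the same respondents recur across waves, so simple summing would double-count them, and survey weights would need adjusting in any real pooled analysis.

```python
from collections import Counter

MIN_CELL_SIZE = 100  # hypothetical minimum n for a subgroup analysis


def subgroup_counts(records):
    """Count foreign-born respondents per subgroup in one wave."""
    return Counter(r["subgroup"] for r in records if r["foreign_born"])


def pool_waves(waves):
    """Sum foreign-born subgroup counts across all waves."""
    pooled = Counter()
    for records in waves.values():
        pooled.update(subgroup_counts(records))
    return pooled


def meets_minimum(counts, min_n=MIN_CELL_SIZE):
    """Flag which subgroups reach the minimum cell size."""
    return {group: n >= min_n for group, n in counts.items()}


# Synthetic example data (not drawn from any survey in this review).
waves = {
    "wave_1": [{"subgroup": "A", "foreign_born": True}] * 60
    + [{"subgroup": "B", "foreign_born": True}] * 30
    + [{"subgroup": "A", "foreign_born": False}] * 200,
    "wave_2": [{"subgroup": "A", "foreign_born": True}] * 55
    + [{"subgroup": "B", "foreign_born": True}] * 25,
}

single_wave = meets_minimum(subgroup_counts(waves["wave_1"]))
pooled = pool_waves(waves)
pooled_ok = meets_minimum(pooled)
```

In this synthetic case, subgroup A falls short of the threshold in a single wave but clears it once the two waves are pooled, while subgroup B remains too small either way, which is where oversampling would be the more useful remedy.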
There is also a need for more comprehensive data on immigrants and their SRH. Future health surveys should consider collecting a minimum set of data on immigrant status, race and ethnicity, and SRH. For example, factors related to generational status such as year of arrival, length of stay, respondent and parent nativity, and ancestry may be useful to assess immigration context.12–18,55 Measures of language proficiency should also be collected more consistently and with increased specificity. Indeed, limited English proficiency can impede access to care and exacerbate other nonfinancial barriers.9 We also recommend collecting race, ethnicity, and country of origin data with as much specificity and granularity as possible given the heterogeneity of the immigrant population and of immigrants' experiences.
Furthermore, there is a distinct lack of data on SRH coverage, care, behaviors, and outcomes in the immigrant population; only two (NSFG and Add Health) of the national surveys reviewed in this paper collected information on each of these topics. Without these data, disparities in SRH service use and access may remain hidden within the immigrant population and between immigrants and non-immigrants. To address this gap, future research efforts on immigrant health, generally, and SRH, specifically, should collect data related to use of and access to health insurance, contraceptive counseling and services, screening for reproductive cancers, STI prevention and treatment, gynecological and obstetric services, and abortion. These data are particularly important given mounting legal and logistical barriers to obtaining reproductive health care in the U.S., combined with enforcement of immigration policies, which may uniquely target and affect immigrants' access to SRH care. Furthermore, data efforts that allow for the exploration of how individual, interpersonal, community, and structural factors influence different immigrant women's SRH experiences, decisions, and outcomes could also help inform policies and protocols that both enhance and safeguard immigrant women's access to and use of SRH services as well as their wellbeing more broadly.
In general, there is a critical need for data sources to include measures of immigration and SRH. However, it is also important to note that data collection, on any topic, can be highly sensitive for immigrants in the United States. Fear of legal backlash in the current political environment may deter immigrants, regardless of documentation status, from disclosing personal information. Study participation may be further limited due to linguistic barriers or, specifically in SRH research, the cultural stigma associated with topics such as sex, pregnancy, and abortion.56 In order to mitigate some of these concerns and increase the quality and quantity of immigrant SRH data, future research may consider adapting the language, tone of survey questions, and translation of instruments to meet the needs of specific immigrant groups more adequately, while also protecting their confidentiality. Involving and engaging immigrant populations throughout the research process (from formulating research questions and survey instruments to data collection, analysis, and dissemination) is also critical to developing relevant and culturally competent research.7,57,58 These efforts can contribute to a better understanding of the sexual and reproductive health of immigrant groups, which, ultimately, will help inform programs and policies that aim to improve the overall health and wellbeing of the U.S. population.
ATHENA TAPALES was a Senior Research Scientist with the Guttmacher Institute during the writing of this manuscript; she is now a consultant. SHEILA DESAI is a Senior Research Associate with the Guttmacher Institute. ELLIE LEONG is a Senior Research Assistant with the Guttmacher Institute.
Support for this work was provided by the Guttmacher Center for Population Research Innovation and Dissemination (NIH grant 5 R24 HD074034). The authors thank Dr. Liza Fuentes, Dr. Laura D. Lindberg and Dr. Megan L. Kavanaugh for their insightful comments on earlier versions of the manuscript.