General Information

AskCHIS Neighborhood Edition (NE) is an online data dissemination and visualization platform that provides population health estimates at sub-county geographic regions. With AskCHIS NE, you can access and visualize authoritative health data at the census tract, ZIP code, city, county, and legislative district geographic levels.

Estimates are powered by data from the California Health Interview Survey (CHIS) and are created through a sophisticated modeling technique called small area estimation (SAE). Before using estimates from AskCHIS NE, we recommend reading more about our methodology and data limitations.

AskCHIS NE is compatible with most modern browsers, including Microsoft Edge, Google Chrome, Mozilla Firefox, and Safari.

If you feel that some features are not working as expected, please contact us to report an issue.


Data Limitations and Use

Health estimates available in AskCHIS NE are model-based small area estimates (SAEs) except state-level estimates and some county estimates. These estimates are the result of complex statistical modeling that uses relevant characteristics of populations and geographical areas to predict health conditions for small geographic units (cities, ZIP codes, some counties). SAEs are not direct estimates (estimates produced directly from survey data, such as those provided through AskCHIS). While direct estimates are produced solely using survey data and design weights, the model-based estimates in AskCHIS NE also rely on secondary data describing characteristics of both geographic regions and populations. AskCHIS NE users should decide on the appropriateness of using model-based SAEs based on the strengths and limitations briefly discussed below and elaborated in the methodology documentation.

In developing final local-level estimates, data from the California Health Interview Survey (CHIS) were used as the primary data source for modeling. Area-level data providing contextual information was provided by the American Community Survey (ACS). Population characteristics were provided by Nielsen-Claritas Pop-Facts (Claritas) data.

Health estimates in AskCHIS NE are subject to errors that can impact data accuracy: sampling error and non-sampling error of the input data, and model error.

Sampling errors occur because inferences about the entire population are based on information obtained from only a sample of that population. If the sample is representative, as with CHIS, then the sampling errors are reduced. The models for AskCHIS NE health estimates are built on samples of California residents from CHIS and ACS instead of on information from all members of the California population.

Non-sampling errors include coverage errors, measurement errors (respondent, interviewer, questionnaire, collection method, etc.), non-response errors, and processing errors. CHIS and ACS data, as with all survey data, are subject to these errors. Non-sampling errors are partially corrected through post-collection data cleaning and weighting processes. Claritas data are based on Census data and administrative data, and thus also subject to non-sampling errors.

Generally, model errors occur when the statistical model does not account for all the information contributing to variation of the dependent variable. In AskCHIS NE, we reduce model error by borrowing strength from several data sources to inform our statistical models.

Demographic variables available in AskCHIS NE were produced using data from the American Community Survey (ACS). These data were adjusted to match the CHIS population, which excludes the population living in group quarters (such as prisons, hospitals, dormitories, etc.). The demographic variables are included in AskCHIS NE to provide additional context to health estimates and may not be generalized to the entire population of California. AskCHIS NE is a public health surveillance tool, not an official source of demographic information. Demographic information available in AskCHIS NE is not meant to replace data from the U.S. Census.

Your use of estimates, data, and features from AskCHIS Neighborhood Edition signifies agreement that The Regents of the University of California, UCLA, the UCLA Center for Health Policy Research, and the California Health Interview Survey shall not be liable for any activity involving these data, estimates, or features of them for any purpose.


Methodology Brief

Survey Data: The California Health Interview Survey (CHIS) is the nation’s largest state health survey (over 40,000 adult, child, and teen respondents) and has been conducted every other year from 2001 through 2009, and every year since 2011. CHIS provides important information on the health, health behaviors, and access to health care services of Californians. Conducted and disseminated by the UCLA Center for Health Policy Research (CHPR) since 2001, CHIS data and analytic results are used extensively in California in policy development, service planning, and research, and CHIS is recognized and valued nationally as a model population-based health survey. The current AskCHIS NE estimates are based on CHIS 2019-2020 data.

Population data: Nielsen Claritas. These data consist of projected population data provided by Nielsen, a San Diego-based private marketing research firm. Total population and household estimates are based on estimates produced by the Census Bureau, as well as information from state and local agencies. The Claritas data have been augmented using modeled distributions of income-to-poverty ratios in CHIS 2015-2016. The resulting dataset was further adjusted to multiple CHIS weighting dimensions using iterative proportional fitting so that it represents the population of the CHIS sample design.
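The adjustment to weighting dimensions described above can be sketched in a few lines. This is a minimal illustration of iterative proportional fitting on a two-way table, not the CHIS production procedure: the function name `rake`, the two-dimensional layout, and the example margins are all assumptions made for illustration, and every row and column is assumed to have a positive total.

```python
def rake(table, row_targets, col_targets, iterations=100, tol=1e-9):
    """Iteratively rescale a 2-D table until its margins match the targets.

    Hypothetical helper: assumes every row and column has a positive sum
    and that the row targets and column targets add up to the same total.
    """
    fitted = [[float(v) for v in row] for row in table]
    for _ in range(iterations):
        # Step 1: scale each row so its sum matches the row target.
        for i, target in enumerate(row_targets):
            row_sum = sum(fitted[i])
            fitted[i] = [v * target / row_sum for v in fitted[i]]
        # Step 2: scale each column so its sum matches the column target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in fitted)
            for row in fitted:
                row[j] *= target / col_sum
        # Stop once the row sums still match after the column step.
        if all(abs(sum(fitted[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return fitted


# Made-up counts for two age groups (rows) by two regions (columns).
fitted = rake([[40, 30], [35, 50]], row_targets=[80, 90], col_targets=[70, 100])
```

After fitting, the row sums and column sums of `fitted` agree with the supplied targets, while the interior cells keep the association pattern of the starting table.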

Contextual data: American Community Survey (ACS). Publicly available ACS 2016-2020 5-Year summary tables were downloaded at the census tract level. We used 236 variables and classified them into 22 sociodemographic categories. Principal component analysis was conducted on each of the 22 sets of variables. The first principal component of each set was used for a second principal component analysis, and the two principal components with the largest variance were eventually used as contextual variables in the model for health estimates.

Models are first built using both survey data and contextual data. These models began with a unit-level generalized linear mixed model that includes individual level fixed predictors to capture individual effects and a random effect at the survey strata level to take into account the survey design. In addition, a non-parametric function of census tract level auxiliary variables was added to the unit-level parametric model to better reflect the non-linear association between the contextual variables and the health indicator or outcome.

The estimated model parameters were then applied to the population dataset with the same set of independent variables, and merged with contextual variables to obtain the predicted probabilities at the individual level. Finally, individual-level predicted values were aggregated into area level estimates for different sets of areas, such as ZIP codes, cities, and legislative districts.
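The prediction and aggregation step can be illustrated with a simplified sketch. It assumes a logit link and fixed effects only; the stratum random effect and the non-parametric contextual term described earlier are omitted, and the field names (`area`, `weight`), function names, and coefficient values are hypothetical.

```python
import math
from collections import defaultdict


def predict_probability(coefficients, intercept, person):
    """Inverse-logit of a linear predictor built from fixed effects only."""
    linear = intercept + sum(
        coefficients[name] * person[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


def aggregate_by_area(population, coefficients, intercept):
    """Weighted mean of individual predicted probabilities within each area."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for person in population:
        p = predict_probability(coefficients, intercept, person)
        weighted_sum[person["area"]] += person["weight"] * p
        weight_total[person["area"]] += person["weight"]
    return {area: weighted_sum[area] / weight_total[area] for area in weighted_sum}


# Made-up population records: one predictor "x" and a population weight.
population = [
    {"area": "A", "weight": 2.0, "x": 0.0},
    {"area": "A", "weight": 1.0, "x": 1.0},
    {"area": "B", "weight": 1.0, "x": 0.0},
]
estimates = aggregate_by_area(population, {"x": 0.5}, intercept=-2.0)
```

The same individual-level predictions can be regrouped by any area key (ZIP code, city, legislative district) simply by changing the field used for grouping.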

The estimates were calibrated through a two-step process. First, a random intercept from each stratum was included in the model to take the sampling design into account and to "soft-calibrate" modeled estimates to approach direct estimates from CHIS when aggregated to the stratum level. Second, when specific predicted values for some geographic levels fell outside acceptable limits relative to the observed direct estimates, the modeled estimates were adjusted through "hard-calibration," applying proportions of direct estimates to the modeled estimates at the stratum level, and the variances of the small area estimates were adjusted accordingly.
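One way to picture the "hard-calibration" step is as a ratio adjustment within a stratum. The sketch below is an assumption-laden illustration, not the documented procedure: the `tolerance` parameter stands in for the "acceptable limits," which are not specified here, and the function name and inputs are hypothetical. The variance adjustment is not shown.

```python
def hard_calibrate(area_estimates, area_populations, direct_estimate, tolerance=0.1):
    """Scale small-area estimates so their stratum aggregate matches the direct estimate.

    tolerance is a hypothetical relative limit; estimates are left unchanged
    when the modeled stratum aggregate is already within it.
    """
    total_pop = sum(area_populations[a] for a in area_estimates)
    # Population-weighted aggregate of the modeled estimates for the stratum.
    modeled = sum(
        area_estimates[a] * area_populations[a] for a in area_estimates
    ) / total_pop
    if abs(modeled - direct_estimate) <= tolerance * direct_estimate:
        return dict(area_estimates)
    ratio = direct_estimate / modeled
    return {a: est * ratio for a, est in area_estimates.items()}


# Made-up stratum with two areas; the modeled aggregate (0.15) is too far
# from the direct estimate (0.12), so every area is scaled by 0.12 / 0.15.
calibrated = hard_calibrate(
    {"a": 0.10, "b": 0.20}, {"a": 1000, "b": 1000}, direct_estimate=0.12
)
```

After scaling, the population-weighted aggregate of the calibrated estimates equals the direct stratum estimate.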

The calibrated modeled estimates were validated in the following ways. First, they were checked against the observed values at larger geographic levels (stratum level) where the direct estimates were stable. Scatter plots of predicted and observed estimates at the stratum level were used for this purpose. Second, the modeled estimates were compared to external information that was not used in our estimation process, including direct estimates from previous CHIS cycles or relevant non-CHIS sources. Finally, local area subject-matter experts were consulted to examine modeled estimates.

The coefficient of variation (CV) was calculated for each estimate to assess statistical stability. The coefficient of variation is defined as the ratio between the standard error of the point estimate and the point estimate. A point estimate with CV ≥ 30% is considered unstable. Unstable estimates and estimates for areas with a population universe of less than 1,000 are suppressed.
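The stability rule above translates directly into a short check. The thresholds (CV of 30% and a population universe of 1,000) come from the text; the function names and the handling of a zero point estimate are assumptions made for this sketch.

```python
def coefficient_of_variation(point_estimate, standard_error):
    """Ratio of the standard error to the point estimate."""
    if point_estimate == 0:
        # Assumption: treat a zero estimate as maximally unstable.
        return float("inf")
    return standard_error / point_estimate


def is_suppressed(point_estimate, standard_error, population_universe,
                  cv_threshold=0.30, min_population=1000):
    """True when the estimate is unstable or the population universe is too small."""
    cv = coefficient_of_variation(point_estimate, standard_error)
    return cv >= cv_threshold or population_universe < min_population


print(is_suppressed(0.10, 0.02, 5000))  # CV = 20%, large universe
print(is_suppressed(0.10, 0.04, 5000))  # CV = 40%
print(is_suppressed(0.10, 0.01, 800))   # universe below 1,000
```

Only the first call describes an estimate that would be displayed; the other two meet one of the suppression conditions.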

For unstable estimates, or estimates for areas with a population universe of less than 1,000, geographic locations may be combined to produce stable estimates or to achieve a sufficiently large population. The pooled point estimate and variance are population-weighted averages of the original point and variance estimates. The confidence intervals and coefficients of variation are adjusted accordingly.
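The pooling rule can be sketched as follows. The population-weighted averages of the point and variance estimates follow the description above literally; the normal-approximation 95% interval (z = 1.96), the function name, and the input layout are assumptions for illustration, since the exact interval construction is not stated here.

```python
import math


def pool_areas(areas, z=1.96):
    """Combine areas using population-weighted averages of estimate and variance."""
    total_pop = sum(a["population"] for a in areas)
    estimate = sum(a["estimate"] * a["population"] for a in areas) / total_pop
    variance = sum(a["variance"] * a["population"] for a in areas) / total_pop
    se = math.sqrt(variance)
    return {
        "population": total_pop,
        "estimate": estimate,
        "variance": variance,
        # Assumed normal-approximation interval, for illustration only.
        "ci": (estimate - z * se, estimate + z * se),
        "cv": se / estimate,
    }


# Two made-up areas, each below the 1,000 population threshold on its own.
pooled = pool_areas([
    {"estimate": 0.10, "variance": 0.0004, "population": 600},
    {"estimate": 0.20, "variance": 0.0009, "population": 900},
])
```

In this made-up example the pooled universe is 1,500, so the combined area clears the population threshold even though neither input area does.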


GIS Information

AskCHIS NE produces thematic web maps using standard shapefiles provided by the U.S. Census through the TIGER/Line® service. To create an on-screen map, 2013 TIGER/Line shapefiles for the state of California, counties, cities, ZIP code tabulation areas, as well as State Assembly, State Senate, and the 113th U.S. Congressional Districts were used (released August 22, 2013).

5-Digit ZIP Code Tabulation Areas (ZCTAs) are approximate representations of U.S. Postal Service 5-digit service areas. The Census Bureau defines ZCTAs by allocating each block that contains addresses to a single ZIP Code Tabulation Area, usually to the ZCTA that reflects the most frequently occurring ZIP code for the addresses within that tabulation block. AskCHIS NE provides data at the ZCTA level.
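The allocation rule can be pictured with a small sketch. This is a simplification of the Census Bureau's process, which the text describes only as what "usually" happens: the function name, the input layout, and the block IDs are hypothetical, and ties are broken arbitrarily by whichever ZIP code appears first.

```python
from collections import Counter


def assign_blocks_to_zcta(block_addresses):
    """Map each block ID to the most frequent ZIP code among its addresses."""
    assignment = {}
    for block_id, zip_codes in block_addresses.items():
        if zip_codes:  # blocks without addresses are not handled in this sketch
            assignment[block_id] = Counter(zip_codes).most_common(1)[0][0]
    return assignment


# Made-up blocks: each list holds the ZIP codes of the addresses in that block.
blocks = {
    "block_1": ["90024", "90024", "90025"],
    "block_2": ["90095"],
    "block_3": [],
}
zcta_by_block = assign_blocks_to_zcta(blocks)
```

Here the address in `block_1` with ZIP code 90025 ends up in ZCTA 90024, which shows how a resident's ZCTA can differ from their mailing ZIP code.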

What does it mean for my ZIP code?

Some census blocks that do not contain addresses but are completely surrounded by a single ZCTA are assigned to the surrounding ZCTA. This means that residents in a specific ZIP code may have been distributed to a ZCTA different from their actual ZIP code. Per the recommendation of the U.S. Census, data users should not use ZCTAs to identify the official USPS ZIP code for mail delivery as there may be ZIP codes that are primarily non-residential that may not have a corresponding ZCTA. The USPS makes periodic changes to ZIP codes to support more efficient mail delivery.

AskCHIS NE provides data at both state (senate and assembly) and federal legislative districts. State legislative districts are the areas from which members are elected to the state legislature. In California, the upper state legislative district is the California Senate and the lower district is the California Assembly. At the federal level, the application provides estimates for California based on the 116th U.S. Congressional districts.

Determining your legislative district can sometimes be difficult as all three district types (senate, assembly, and congressional) have boundaries that cross cities and ZIP codes. AskCHIS NE provides a legislative lookup tool to quickly and easily determine legislative districts. To use this feature, enter a full address, including ZIP code (5- or 9-digit). Note: if you enter only a ZIP code or city, the tool will not yield any results.

AskCHIS NE provides users the ability to combine same-type geographic entities into larger geographic areas. This feature is particularly useful when searching for health estimates in geographic areas where the population is below 1,000 or estimates are statistically unstable. When combining geographies, the system automatically recalculates the point estimate, 95% confidence interval, as well as the total population universe. The map also updates to show one continuous geographic entity. Note: Users can also combine non-contiguous same-type geographic entities.


Frequently Asked Questions

All health estimates in this version of AskCHIS Neighborhood Edition are based on data from the 2011-2012, 2013-2014, 2015-2016, 2017-2018, and 2019-2020 California Health Interview Survey (CHIS). Sociodemographic indicators come from the 2008-2012, 2010-2014, 2012-2016, 2014-2018, and 2016-2020 American Community Survey (ACS) 5-year summary tables.

Model-based small area estimates (SAEs) are the result of statistical modeling that uses relevant characteristics of populations and geographical areas to predict health conditions for small geographic units (cities, ZIP codes, legislative districts, and some counties). SAEs are not direct estimates (estimates produced directly from survey data, such as those provided through AskCHIS). While direct estimates are produced solely using survey data and design weights, the model-based estimates in AskCHIS NE also rely on secondary data describing characteristics of both geographic regions and populations.

Results in AskCHIS NE are statistical estimates. The point estimate is a single number that summarizes the sample, for example, 8.4% of adults in California were diagnosed with diabetes. Because the estimated value is based on the CHIS sample and statistical models, it has a degree of uncertainty, and the confidence interval (CI) shows the range where the actual value may lie. A 95% CI means that if we were to repeat our CHIS sampling and modeling approach a large number of times, about 95% of the resulting intervals would contain the actual value.
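The repeated-sampling reading of a 95% CI can be illustrated with a small simulation. This is a teaching sketch only: it assumes simple random sampling and a normal-approximation (Wald) interval, neither of which reflects the CHIS design or the AskCHIS NE models, and the true value, sample size, and function names are made up.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a sample proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(true_p=0.084, n=1000, repetitions=500, seed=0):
    """Share of repeated samples whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one simple random sample of size n from a population
        # in which the true proportion is true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_interval(successes, n)
        hits += lower <= true_p <= upper
    return hits / repetitions


print(coverage())
```

The printed share should land close to 0.95: most, but not all, of the simulated intervals contain the true value.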

5-Digit ZIP Code Tabulation Areas (ZCTAs) are approximate representations of U.S. Postal Service 5-digit service areas. The Census Bureau defines ZCTAs by allocating each Census block that contains addresses to a single ZIP Code Tabulation Area, usually to the ZCTA that reflects the most frequently occurring ZIP code for the addresses within that tabulation block. For more information, please visit our GIS Information page and information available from the Census Bureau.

Legislative district (Assembly, Senate, U.S. Congressional) boundaries cross ZIP code, city, and county boundaries. For this reason, our legislative lookup tool requires that users enter their full address (e.g., 10960 Wilshire Blvd, Los Angeles, CA 90024). If you have entered your full address and are not getting any results, please contact us at askchis@ucla.edu.

While AskCHIS NE has data on all ZCTAs and cities in California, two factors may influence our ability to display the estimates:

  1. A small population (under 1,000): currently, the application only shows estimates for geographic entities with populations above 1,000. If your ZCTA/city has a population below this threshold, the easiest way to obtain data is to combine it with a neighboring ZCTA/city and obtain a pooled estimate.
  2. A high coefficient of variation: high coefficients of variation denote statistical instability.

The population estimates in AskCHIS NE represent the CHIS 2011-2012, 2013-2014, 2015-2016, 2017-2018, and 2019-2020 population sample, which excludes Californians living in group quarters (such as prisons, nursing homes, and dormitories). The population estimates are generated based on Nielsen-Claritas Pop-Facts (demographic projection), non-group quarter proportions from Census 2010 at the census tract level, and CHIS control totals (based on the California Department of Finance). CHIS uses a weighting methodology that forces CHIS estimates to be consistent with official population estimates at the county level from the California Department of Finance. These benchmark population estimates are called population control totals. Further information on the CHIS control totals is available in the CHIS 2011-2012, 2013-2014, 2015-2016, 2017-2018, and 2019-2020 Methods Report #5. Due to differences in the target population as well as in data sources for control totals, our population estimates may differ from other sources.

A population universe is the population for which the health estimate is defined. For example, the population universe for flu vaccine among children ages 6 months to 11 years is the total number of children ages 6 months to 11 years within each individual location. For each health indicator in AskCHIS NE, the point estimate represents the proportion of the population with the condition among the specified population universe. The population universe is the denominator, whereas the number of people with the condition is the numerator.
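The numerator and denominator relationship is simple arithmetic, sketched here with made-up numbers. The function names are hypothetical and the counts do not come from AskCHIS NE.

```python
def proportion(numerator, universe):
    """Share of the population universe that has the condition."""
    if universe <= 0:
        raise ValueError("population universe must be positive")
    return numerator / universe


def estimated_count(point_estimate, universe):
    """Approximate number of people with the condition in the universe."""
    return point_estimate * universe


# Made-up example: 420 people with the condition in a universe of 5,000.
share = proportion(420, 5000)          # about 0.084, i.e. 8.4%
count = estimated_count(share, 5000)   # back to roughly 420 people
```

Read in the other direction, multiplying a displayed point estimate by its population universe gives a rough count of people with the condition in that area.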

The application allows users to export the data behind the table with a limit of 5 total geographic areas. This limitation is mainly due to screen space constraints related to data display. If you'd like access to more of the data that backs AskCHIS NE, please contact our Data Access Center at dacchpr@ucla.edu.

While the initial set of health estimates comprises 15-20 health indicators, we look forward to working with the public health community to support adding more indicators to AskCHIS NE. If your organization has a need for a specific health indicator, please contact us at askchis@ucla.edu. You can also find data on other CHIS indicators by using AskCHIS or the CHIS Public Use Files.