Authors: Emily Molfino*, U.S. Census Bureau, Shawn Bucholtz, Housing and Urban Development, Jed Kolko, Indeed
Topics: Applied Geography, Quantitative Methods, Urban Geography
Keywords: urban, suburban, machine learning, american housing survey
Session Type: Paper
Start / End Time: 8:00 AM / 9:15 AM
Room: Plaza Court 4, Sheraton, Concourse Level
Classifying areas as urban, suburban, or rural is of interest to researchers and policymakers alike. Yet even local residents may disagree on how to classify an area, given their own context and lived experience. These subjective perceptions, we argue, form a crucial element of any classification of an area as urban, suburban, or rural. Data for creating such an indicator were not available at a national scale until the 2017 American Housing Survey (AHS) asked respondents which type of area they live in. This nationally representative survey allows us to build an urban, suburban, and rural indicator at a granular level that captures both socioeconomic and respondent-level elements, combining objective metrics with subjective perceptions. We employ classification models to create small area estimates of urban, suburban, and rural areas. Specifically, we tune and train a random forest classifier on weighted AHS respondent data and additional tract-level estimates. We then use the trained model to predict classifications for the 2013-2017 American Community Survey (ACS) microdata records. These predictions are aggregated to provide an indicator of how the majority of households in an area (tract, county, or congressional district) view where they live. This approach provides an illustrative example of using existing federal data to create innovative new data products of substantial interest to researchers and policymakers alike.
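The final aggregation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_label_by_area`, the record layout, the tract identifiers, and the weights are all hypothetical. It also assumes that household weights are summed within each area and that the label with the largest weighted share wins (a plurality rule), neither of which the abstract specifies.

```python
from collections import defaultdict


def majority_label_by_area(records):
    """Return the label with the largest weighted share in each area.

    `records` is an iterable of (area_id, predicted_label, household_weight)
    tuples. This layout is an assumption made for illustration only.
    """
    # Sum household weights per (area, label) pair.
    totals = defaultdict(lambda: defaultdict(float))
    for area_id, label, weight in records:
        totals[area_id][label] += weight

    result = {}
    for area_id, label_weights in totals.items():
        # Iterate labels in sorted order so ties resolve deterministically
        # (the alphabetically first label wins a tie).
        result[area_id] = max(sorted(label_weights), key=lambda lab: label_weights[lab])
    return result


# Hypothetical household-level predictions (not real survey data).
predictions = [
    ("tract_A", "urban", 120.0),
    ("tract_A", "suburban", 80.0),
    ("tract_A", "urban", 50.0),
    ("tract_B", "rural", 200.0),
    ("tract_B", "suburban", 150.0),
    ("tract_B", "suburban", 100.0),
]

area_labels = majority_label_by_area(predictions)
print(area_labels)
```

The same function would apply at the county or congressional district level by swapping the area identifier; whether weights should enter the aggregation at all is a design choice the abstract leaves open.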