Risk Factors and Geographic Variations of Late-Stage Breast Cancer Diagnosis among U.S. Counties, a Machine Learning Approach

Authors: Weichuan Dong*, Kent State University
Topics: Health and Medical, Medical and Health Geography, United States
Keywords: Health Geography, Breast Cancer, Disparity, Machine Learning
Session Type: Virtual Guided Poster
Day: 4/9/2021
Start / End Time: 8:00 AM / 9:15 AM
Room: Virtual 53
Presentation File: No File Uploaded


Understanding the risk factors and geographic patterns of late-stage breast cancer (LSBC) at diagnosis is critical for promoting cancer screening. However, studies identifying geographic areas with higher rates of LSBC have produced mixed results. Because geographies vary widely among U.S. counties, population characteristics are likely to show nonlinear associations with LSBC owing to their uneven distribution across space. These complexities may lead to inaccurate conclusions if inappropriate models are used. Researchers have attempted to avoid these complications by analyzing different regions independently, but a more unified model is needed to cover all risk factors and geographies together. Using SEER cancer registry data, we classified counties into six phenotypes based on LSBC and associated characteristics, using the classification and regression tree (CART) machine learning method. Among the characteristics examined, uninsured rate, obesity, income, education, and top-tier occupations were the strongest predictors of LSBC. Mapping the phenotypes showed that favored phenotypes were dominant in some states, whereas unfavored phenotypes were dominant in others, indicating geographic clustering across the study area (P<0.001). Our findings suggest that disparities in LSBC exist at the state level, with the associated characteristics varying widely across space. The study also showed that phenotypes classified by an aspatial machine learning method nonetheless exhibited spatial clustering. Therefore, geography may have served as a medium in which the patterns of LSBC formed, and the associated characteristics may reflect the mechanisms that produced the clustering.
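
As a rough illustration of how a regression tree can partition counties into phenotypes, the sketch below grows a small CART-style tree in pure Python and treats each terminal leaf as one phenotype. It is not the authors' implementation: the county records are synthetic, the field names (`uninsured`, `obesity`, `late_stage`) and the settings (`max_depth`, `min_leaf`) are hypothetical, and no SEER data are used.

```python
# Minimal CART-style regression tree in pure Python (illustrative sketch).
# The county records below are synthetic and the field names are hypothetical.

def sse(values):
    """Sum of squared errors around the mean; the impurity used for splits."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target, features, min_leaf):
    """Return the (feature, threshold) that most reduces SSE, or None."""
    best, best_cost = None, sse([r[target] for r in rows])
    for f in features:
        levels = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2
            left = [r[target] for r in rows if r[f] <= thr]
            right = [r[target] for r in rows if r[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (f, thr), cost
    return best


def grow(rows, target, features, max_depth=2, min_leaf=2, depth=0):
    """Recursively partition rows; each leaf is one candidate phenotype."""
    split = None
    if depth < max_depth:
        split = best_split(rows, target, features, min_leaf)
    if split is None:
        ys = [r[target] for r in rows]
        return {"leaf": True, "mean": sum(ys) / len(ys), "n": len(ys)}
    f, thr = split
    left = [r for r in rows if r[f] <= thr]
    right = [r for r in rows if r[f] > thr]
    return {
        "leaf": False, "feature": f, "threshold": thr,
        "left": grow(left, target, features, max_depth, min_leaf, depth + 1),
        "right": grow(right, target, features, max_depth, min_leaf, depth + 1),
    }


def phenotype(tree, row):
    """Follow the splits and return the leaf's path label, e.g. 'RL'."""
    path = ""
    while not tree["leaf"]:
        go_left = row[tree["feature"]] <= tree["threshold"]
        path += "L" if go_left else "R"
        tree = tree["left"] if go_left else tree["right"]
    return path


# Synthetic counties: percent uninsured, percent obese, percent late-stage.
counties = [
    {"uninsured": 5, "obesity": 25, "late_stage": 20},
    {"uninsured": 6, "obesity": 35, "late_stage": 24},
    {"uninsured": 7, "obesity": 25, "late_stage": 21},
    {"uninsured": 8, "obesity": 35, "late_stage": 25},
    {"uninsured": 15, "obesity": 25, "late_stage": 30},
    {"uninsured": 16, "obesity": 35, "late_stage": 36},
    {"uninsured": 17, "obesity": 25, "late_stage": 31},
    {"uninsured": 18, "obesity": 35, "late_stage": 37},
]

tree = grow(counties, "late_stage", ["uninsured", "obesity"])
labels = [phenotype(tree, c) for c in counties]
print("root split:", tree["feature"], "<=", tree["threshold"])
print("phenotype labels:", labels)
```

Each leaf label can then be joined back to its county and mapped. In practice an established implementation, for example scikit-learn's `DecisionTreeRegressor`, would likely replace this hand-rolled tree; how the study itself arrived at six phenotypes is not specified here.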
