Spatio-Semantic Comparison Used in POI Deduplication and Dataset Enhancement

Authors: Joseph Bentley, Oak Ridge National Laboratory; Gautam Thakur*, Oak Ridge National Laboratory; Kelly Sims, Oak Ridge National Laboratory; Jamie Wray, Oak Ridge National Laboratory; Chantelle Fortier, Oak Ridge National Laboratory; Kevin Sparks, Oak Ridge National Laboratory; David Sheldon, Oak Ridge National Laboratory
Topics: Geographic Information Science and Systems, Temporal GIS, Urban Geography
Keywords: Points of Interest, PlanetSense, Platial
Session Type: Virtual Paper
Day: 4/9/2021
Start / End Time: 8:00 AM / 9:15 AM
Room: Virtual 22
Presentation File: No File Uploaded


Points of Interest (POIs) guide daily decision-making by offering valuable information about a location and about the region and population associated with it. Their uses range from publicly available resources such as Google Maps to large-scale projects such as FEMA relief efforts and urban planning. Although POI datasets have a wide range of potential applications, data coreference (two or more records that represent the same establishment) frequently creates barriers for analysts. By assessing spatial, nominal, and categorical similarity between POIs, our research attempts to remove redundant information while, where possible, improving the accuracy of specific data through cross-validation. In a collection of 137,990 POIs drawn from eight different sources within the borders of Mali, 79,384 POIs were identified as exact or near-exact copies, and an additional 3,998 POIs were used to generate a semi-verified subset of 1,644 POIs. The result is a 32% reduction in overall POI count (from 137,990 to 93,562). These methods improve the reliability of a wide range of geospatial analyses, potentially revealing new insights about the world.
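
The abstract does not describe how the three similarity signals are computed or combined, so the following is only a minimal sketch of one plausible spatio-semantic comparison, not the authors' method. It assumes great-circle (haversine) distance for the spatial test, a character-level ratio from Python's difflib for the nominal test, and a case-insensitive exact match for the categorical test. The POI record, the function names, and both thresholds (100 m and 0.85) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


@dataclass
class POI:
    """Hypothetical minimal POI record; real sources carry many more fields."""
    name: str
    category: str
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_m(a: POI, b: POI) -> float:
    """Great-circle distance in metres between two POIs (spatial similarity)."""
    earth_radius_m = 6_371_000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(h))


def name_similarity(a: POI, b: POI) -> float:
    """Character-level similarity in [0, 1] between names (nominal similarity)."""
    left = a.name.casefold().strip()
    right = b.name.casefold().strip()
    return SequenceMatcher(None, left, right).ratio()


def is_probable_duplicate(a: POI, b: POI,
                          max_dist_m: float = 100.0,
                          min_name_sim: float = 0.85) -> bool:
    """Flag a pair as coreferent only if all three tests pass.

    The thresholds are assumed values, not figures from the abstract.
    """
    same_category = a.category.casefold() == b.category.casefold()
    return (haversine_m(a, b) <= max_dist_m
            and name_similarity(a, b) >= min_name_sim
            and same_category)
```

Requiring all three tests to pass is a deliberately conservative rule that favors keeping distinct establishments apart. A weighted score across the three signals would be an equally reasonable alternative, and comparing every pair is only practical for small collections, so a dataset of this size would likely need spatial indexing or blocking first.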
