Authors: Alexandre Sorokine*, Oak Ridge National Laboratory, Jason Kaufman, Oak Ridge National Laboratory, Jacob Arndt, Oak Ridge National Laboratory, Robert Stewart, Oak Ridge National Laboratory
Topics: Cyberinfrastructure, Geographic Information Science and Systems
Keywords: geospatial ontology, semantics, machine learning
Session Type: Virtual Paper
Start / End Time: 11:10 AM / 12:25 PM
Room: Virtual 46
Integrating and converting large, semantically diverse geodatasets is challenging. For example, Volunteered Geographic Information (VGI) such as OpenStreetMap and similar products often contains thousands of tags with a wide range of meanings. More formalized data such as topographic maps and navigation charts have extensive, well-documented feature catalogs. However, these catalogs are not uniform across the vendors, agencies, or countries that create these products. Here we investigate an automated approach that combines manually created rules with machine learning for data transformation, rule generation, and validation of the conversion results. The proposed approach is built around an extensive database of rules formalized from expert knowledge and recommendations. The rules are applied automatically to create output representations. Rules are validated by matching the converted features to independent datasets manually created by experts and comparing the results. Validation results are used to ensure rule correctness, find shortcomings in the expert knowledge, and create new rules deduced from the matched features. To improve the rules, we use several machine learning algorithms and natural language processing methods. We investigate issues that arise from incompatible feature definitions and how to transform one semantic model into another. The outcome of the study will improve understanding of the challenges of transforming one dataset into another and can be used to automate geodata conflation and conversion operations.
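The rule application and validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table, the target class names (ROAD, RIVER, BUILDING), and the functions apply_rules and validate are hypothetical, and feature matching is assumed to have been done already, so that source features and expert-labeled features share an identifier.

```python
from collections import defaultdict

# Hypothetical rules: (source tag key, source tag value) -> target feature class.
RULES = {
    ("highway", "residential"): "ROAD",
    ("waterway", "river"): "RIVER",
    ("building", "yes"): "BUILDING",
}


def apply_rules(tags, rules=RULES):
    """Return (target_class, rule_key) for the first matching rule.

    Returns (None, None) when no rule covers any of the feature's tags.
    """
    for key, value in tags.items():
        target = rules.get((key, value))
        if target is not None:
            return target, (key, value)
    return None, None


def validate(features, expert_labels, rules=RULES):
    """Compare rule output with expert labels for matched features.

    features:      dict of feature id -> tag dict (source data)
    expert_labels: dict of feature id -> class assigned by an expert
    Assumes matching is already done, so ids are shared (an assumption).
    Returns per-rule [agreements, total] counts and the ids of matched
    features that no rule covers.
    """
    stats = defaultdict(lambda: [0, 0])
    uncovered = []
    for fid, tags in features.items():
        if fid not in expert_labels:
            continue  # no independent reference for this feature
        predicted, rule = apply_rules(tags, rules)
        if rule is None:
            uncovered.append(fid)
            continue
        stats[rule][1] += 1
        if predicted == expert_labels[fid]:
            stats[rule][0] += 1
    return dict(stats), uncovered


# Made-up example data, for illustration only.
features = {
    1: {"highway": "residential"},
    2: {"waterway": "river"},
    3: {"amenity": "cafe"},
    4: {"building": "yes"},
}
expert_labels = {1: "ROAD", 2: "CANAL", 3: "RESTAURANT", 4: "BUILDING"}
stats, uncovered = validate(features, expert_labels)
```

In this sketch, a rule with a low agreement count would be a candidate for review, and the uncovered features, paired with their expert labels, are the kind of matched examples from which new rules could be deduced.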