A Spark-based Framework for Big GeoSpatial Raster Data Management and Mining

Authors: Fei Hu*, George Mason University, Chaowei Yang, George Mason University
Topics: Geographic Information Science and Systems, Remote Sensing, Cyberinfrastructure
Keywords: Big Data, Spark, HDFS, Remote Sensing, CNN, Deep Learning
Session Type: Paper
Day: 4/11/2018
Start / End Time: 8:00 AM / 9:40 AM
Room: Studio 7, Marriott, 2nd Floor
Presentation File: No File Uploaded


Earth observation and simulation are generating large volumes of geospatial raster data at an unprecedented rate. However, the large data volume and complex content pose grand challenges for both data engineers and data scientists who need to process and analyze these data efficiently. This paper proposes a Spark-based framework built on the Hadoop Distributed File System (HDFS) to improve the efficiency of parallel geospatial raster data processing, and integrates TensorFlow to support CNN-based image processing models. The experimental results demonstrate that the proposed system can not only query big geospatial raster images in a scalable fashion but also speed up CNN model inference.
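
The abstract does not describe the framework's internals, so the following is only a minimal sketch of the general pattern it names: select the raster tiles that intersect a query window, then apply a model to each selected tile. Everything here is an assumption made for illustration: the `Tile` layout, the bounding-box convention, and the helper names are hypothetical, plain Python lists stand in for a distributed Spark collection, and `mean_pixel` stands in for CNN inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

# (min_lon, min_lat, max_lon, max_lat); hypothetical convention, not from the paper
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Tile:
    """One raster tile: its spatial extent plus a small grid of pixel values."""
    tile_id: str
    bbox: BBox
    pixels: Tuple[Tuple[float, ...], ...]


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_tiles(tiles: Iterable[Tile], window: BBox) -> List[Tile]:
    """Keep only tiles whose extent intersects the query window.

    In a distributed setting this step would be a filter over the tile collection.
    """
    return [t for t in tiles if intersects(t.bbox, window)]


def run_model(tiles: Iterable[Tile], model: Callable[[Tile], float]) -> Dict[str, float]:
    """Apply a per-tile model and collect the results by tile id.

    In a distributed setting this step would be a map over the filtered tiles.
    """
    return {t.tile_id: model(t) for t in tiles}


def mean_pixel(tile: Tile) -> float:
    """Stand-in for CNN inference: the mean pixel value of the tile."""
    values = [v for row in tile.pixels for v in row]
    return sum(values) / len(values)


# Three adjacent tiles with made-up extents and pixel values.
tiles = [
    Tile("t0", (0.0, 0.0, 10.0, 10.0), ((1.0, 2.0), (3.0, 4.0))),
    Tile("t1", (10.0, 0.0, 20.0, 10.0), ((5.0, 5.0), (5.0, 5.0))),
    Tile("t2", (20.0, 0.0, 30.0, 10.0), ((9.0, 9.0), (9.0, 9.0))),
]

selected = query_tiles(tiles, window=(12.0, 2.0, 25.0, 8.0))
results = run_model(selected, mean_pixel)
print(results)
```

Because both steps operate on one tile at a time, they need no shared state, which is what makes this kind of workload a natural fit for data-parallel engines such as Spark.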
