
A Probabilistic Principal Components Analysis (PPCA) Approach to Impute Missing Values in Spatiotemporal Datasets

Authors: Behnam Nikparvar*, University of North Carolina at Charlotte; Jean-Claude Thill, University of North Carolina at Charlotte
Topics: Spatial Analysis & Modeling, Geographic Information Science and Systems, Temporal GIS
Keywords: Missing data, Probabilistic Principal Components Analysis, Spatiotemporal Analysis
Session Type: Paper


Missing-data imputation has received relatively little attention in spatiotemporal analysis. Some studies discard incomplete data or impute values from simple statistics (e.g., the nearest neighbor or the average in either time or space). Others use interpolation or prediction methods. Discarding values is not feasible when a large share of the elements in the observation matrix is missing. Interpolation methods disregard stochasticity, and prediction methods ignore data from the future. Additionally, the imputation is based on either the spatial or the temporal variation of the data, not both. In this study, we use probabilistic principal components analysis (PPCA) to impute missing values while considering the spatial and temporal variations in the observation matrix simultaneously. We determine the principal axes of the observation matrix by maximum-likelihood estimation, using the expectation-maximization (EM) algorithm in an iterative process. We implement and evaluate the model on nighttime light data, which have frequently been used as a proxy for population change and economic development. Nighttime light data are prone to several sources of error, including cloud cover, which produces images with many missing-value pixels. Our results show that simple imputation methods produce a high root-mean-square error (RMSE), especially when a large portion of the spatiotemporal observation matrix is missing, whereas the RMSE of the PPCA-based imputation method does not change significantly.
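The EM procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic EM-style fit of a PPCA model to a matrix with missing entries, using coordinate-wise M-step updates. The function name `ppca_impute`, the parameters `q` and `n_iter`, the layout (rows as time steps, columns as locations), and the synthetic low-rank test data are all assumptions made for illustration, and the sketch assumes every column has at least one observed value.

```python
import numpy as np

def ppca_impute(X, q=2, n_iter=100, seed=0):
    """Fill NaNs in X (rows = time steps, columns = locations) using PPCA.

    Hypothetical helper for illustration. Model: x = W z + mu + noise,
    with z ~ N(0, I) and isotropic noise variance sigma2.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    obs = ~np.isnan(X)                  # True where a value was observed
    mu = np.nanmean(X, axis=0)          # initial column means
    W = rng.normal(size=(d, q))         # random initial loadings
    sigma2 = 1.0
    Z = np.zeros((n, q))                # posterior means of latent scores
    S = np.zeros((n, q, q))             # posterior covariances of latent scores

    for _ in range(n_iter):
        # E-step: posterior of each row's latent vector from its observed entries
        for i in range(n):
            o = obs[i]
            Wo = W[o]
            Minv = np.linalg.inv(Wo.T @ Wo + sigma2 * np.eye(q))
            Z[i] = Minv @ Wo.T @ (X[i, o] - mu[o])
            S[i] = sigma2 * Minv

        # M-step (coordinate-wise): update mean and loadings of each column
        # from the rows in which that column was observed
        for j in range(d):
            r = obs[:, j]
            mu[j] = np.mean(X[r, j] - Z[r] @ W[j])
            A = Z[r].T @ Z[r] + S[r].sum(axis=0)
            b = Z[r].T @ (X[r, j] - mu[j])
            W[j] = np.linalg.solve(A, b)

        # Noise variance: squared residual plus latent uncertainty, observed cells only
        resid = X - (Z @ W.T + mu)                    # NaN where missing
        extra = np.einsum('jk,ikl,jl->ij', W, S, W)   # w_j^T S_i w_j for every cell
        sigma2 = max(np.sum(resid[obs] ** 2 + extra[obs]) / obs.sum(), 1e-8)

    X_hat = Z @ W.T + mu                # model reconstruction for every cell
    return np.where(obs, X, X_hat)      # keep observed values, fill the rest


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic example (not nighttime light data): rank-2 signal plus small noise
rng = np.random.default_rng(42)
truth = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 15))
truth += 0.1 * rng.normal(size=truth.shape)
missing = rng.random(truth.shape) < 0.3           # hide about 30% of the cells
X_obs = np.where(missing, np.nan, truth)

filled_ppca = ppca_impute(X_obs, q=2)
filled_mean = np.where(missing, np.nanmean(X_obs, axis=0), X_obs)  # column-mean baseline

print("RMSE, column-mean imputation:", rmse(filled_mean[missing], truth[missing]))
print("RMSE, PPCA imputation:       ", rmse(filled_ppca[missing], truth[missing]))
```

Because each row's latent vector is estimated from whatever columns are observed in that row, and each column's loadings from whatever rows are observed in that column, the fit draws on variation along both axes of the matrix at once. The column-mean baseline is included only as an example of the "simple statistics" the abstract contrasts against.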
