Authors: Marina Toger*, Uppsala University, Ian Shuttleworth, Queen's University Belfast, John Östh, Uppsala University
Topics: Quantitative Methods
Keywords: Big data, Mobile phones, temporal variability
Session Type: Paper
Start / End Time: 3:20 PM / 5:00 PM
Room: Bayside A, Sheraton, 4th Floor
Mobile phone data, with file sizes scaling into terabytes, easily overwhelm the analytical capacity available to researchers. In addition, data access is often limited to particular subsets, restricting analyses to single days, weeks, or cities. As a result, it is frequently impossible to set a particular analysis in context and to know how typical it is compared with other days or months. This matters both for academic referees questioning research on mobile phone data and for analysts deciding how to sample, how much data to process, and which events are really anomalous. This paper provides an overview of a large mobile phone dataset covering over 500 days of events recorded at 5-minute intervals around the clock, in order to answer these basic but necessary questions. By profiling the temporal variability of the data at hourly, daily and monthly levels, we show that file size is a robust proxy for the number of events a file contains. We then apply time-series analysis to isolate temporal signals and periodicity. Finally, we set confidence limits for identifying anomalous events in the data, such as the Stockholm lorry attack in April 2017 and the three days that followed. We recommend an approach to mobile phone data and suggest that, ideally, data should be sampled across the day, across the working week, and across the year to work towards a representative average. Where this is impossible, however, the amount of temporal variability is such that data from most weekdays give a fair picture of other days in their general structure.
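As an illustration of what setting confidence limits on daily activity might look like, the following minimal Python sketch compares each day's event count with the other days that fall on the same weekday and flags days lying more than a chosen number of standard deviations from their mean. It is not the authors' method: the function name `flag_anomalous_days`, the threshold `z=3.0`, the leave-one-out comparison, and the synthetic counts are all assumptions made for this example, and the abstract does not specify how its limits were computed.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, stdev


def flag_anomalous_days(daily_counts, z=3.0):
    """Flag days whose count is far from other days on the same weekday.

    daily_counts maps datetime.date -> event count (or file size used
    as a proxy). Each day is compared with the mean and sample standard
    deviation of the *other* days sharing its weekday (leave-one-out),
    so a single extreme day cannot hide itself by inflating the spread.
    Hypothetical helper, not taken from the paper.
    """
    by_weekday = defaultdict(list)
    for day, count in daily_counts.items():
        by_weekday[day.weekday()].append((day, count))

    flagged = []
    for day, count in sorted(daily_counts.items()):
        others = [c for d, c in by_weekday[day.weekday()] if d != day]
        if len(others) < 2:
            continue  # too few comparison days to estimate a spread
        centre, spread = mean(others), stdev(others)
        if abs(count - centre) > z * spread:
            flagged.append(day)
    return flagged


# Synthetic data (assumed, for illustration only): eight weeks of daily
# counts with a weekday/weekend difference and a small regular wiggle.
start = date(2017, 3, 6)
daily_counts = {}
for offset in range(8 * 7):
    day = start + timedelta(days=offset)
    base = 700 if day.weekday() >= 5 else 1000
    daily_counts[day] = base + (offset % 5) * 10

# Inject one artificially low day to stand in for an unusual event.
anomaly_day = start + timedelta(weeks=3, days=2)
daily_counts[anomaly_day] = 400

flagged = flag_anomalous_days(daily_counts)
print(flagged)
```

Grouping by weekday is one simple way to respect the weekly periodicity mentioned above. An analysis at 5-minute or hourly resolution would need a finer baseline, for example one per hour of the week.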