Note: All session times are in Mountain Daylight Time.

Predicting dialect use – Insights from a large-scale project combining historical and current data

Authors: Péter Jeszenszky*, University of Berne, Switzerland, Geographisches Institut; Carina Steiner, University of Berne; Adrian Leemann, University of Berne
Topics: Spatial Analysis & Modeling, Geographic Information Science and Systems, Cultural Geography
Keywords: linguistic geography, dialectology, language change, spatiotemporal change
Session Type: Paper
Day: 4/10/2020
Start / End Time: 11:10 AM / 12:25 PM
Room: Virtual Track 4


Web-crawled corpora of social media communication (e.g. tweets, personal messages) can be used with increasing reliability to estimate people’s social traits, age, emotions, attitudes, and whereabouts solely from their writing style and word choice. But would it be possible to estimate language use from known social traits and location (history)? To address this question, our research combines language data and metadata from different systematic dialect surveys in German-speaking Switzerland, a region known for its uniquely diverse dialects, which are associated with high prestige. We include historical data from two comprehensive surveys, one from around the 1950s (variables on all linguistic levels) and one from the 2000s (morphosyntactic variables). In our current project, ‘Swiss German Dialects Across Time and Space’ (SDATS), we partially replicate these surveys and generate a contemporary corpus from the same locations, enabling us to identify spatio-temporal trends. As the linguistic surveys also contain socio-demographic metadata, we are able to quantify language change and connect it to spatial and socio-demographic factors. For our estimations we use redundancy analysis, multidimensional clustering methods, and local and regionalized logistic regression predictors. Based on the ‘apparent time hypothesis’ and the effects of the aforementioned factors, it becomes possible to estimate the probability that a given unsurveyed person uses a particular dialectal variant. Besides supporting the study of language change, this enables the creation of dialectal speaker profiles for all Swiss German municipalities, which in the long run would have forensic value and could benefit tools for regionalized natural language processing.
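
To make the prediction step concrete, the following is a minimal sketch of how a logistic regression can turn socio-demographic and spatial predictors into a probability of dialect variant use. It is not the project's implementation: it fits a single global model rather than the local and regionalized predictors named in the abstract, the data are synthetic, and the predictor names (`age`, `east`, `north`) and all coefficients are assumptions chosen only for illustration. Age stands in for time here, in the spirit of the apparent time hypothesis.

```python
import numpy as np


def add_intercept(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic regression weights by plain batch gradient descent."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))   # current predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)   # gradient of the mean log-loss
    return w


def predict_proba(w, X):
    """Probability that each row's speaker uses the variant coded as 1."""
    return 1.0 / (1.0 + np.exp(-add_intercept(X) @ w))


# Synthetic stand-in for survey data (assumption: NOT real SDATS data).
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(0, 1, n)    # hypothetical: speaker age rescaled to [0, 1]
east = rng.uniform(0, 1, n)   # hypothetical: rescaled easting of the locality
north = rng.uniform(0, 1, n)  # hypothetical: rescaled northing of the locality
X = np.column_stack([age, east, north])

# Assumed ground truth: older and more easterly speakers favour the variant.
true_logit = -2.0 + 3.0 * age + 1.5 * east
y = (rng.uniform(0, 1, n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

w = fit_logistic(X, y)

# Estimated probabilities for two unsurveyed speakers at the same location.
older = predict_proba(w, np.array([[0.9, 0.5, 0.5]]))[0]
younger = predict_proba(w, np.array([[0.1, 0.5, 0.5]]))[0]
print(f"older speaker: {older:.2f}, younger speaker: {younger:.2f}")
```

In this toy setup, the gap between the two estimated probabilities is what an apparent-time reading would interpret as change in progress. A localized variant of the same idea would fit separate or spatially weighted models per region instead of one set of weights.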
