Wednesday, November 25, 2020

Investigating network connectivity for different modes

 Suggested by: Martin Loidl 

Short description: Connectivity is a fundamental spatial parameter. In road networks, connectivity determines the opportunities for moving through space comfortably. The literature suggests that network connectivity has a direct effect on mobility behaviour.
Especially in urban environments, connectivity varies between modes. Cities are planned and built for the predominant mode (from walkable cities to car-oriented cities). Based on OpenStreetMap (OSM), different subsets of urban road networks should be investigated for their connectivity. We hypothesize that the connectivity for different modes correlates significantly with the respective modal share.


In this research, OSM data should be acquired, processed and analyzed (semi-) automatically, in order to facilitate the calculation of connectivity indices for different modes and to compare these indices among various cities. Moreover, connectivity indices should be related to modal split statistics for the respective regions.
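As a starting point, connectivity indices for mode-specific subsets of a network could be computed along the following lines. This is only a minimal sketch: it assumes the classic graph-theoretic alpha, beta and gamma indices for planar graphs (the topic does not prescribe particular indices), and the function name, the toy edge lists and the "car"/"bike" subsets are hypothetical stand-ins for networks extracted from OSM.

```python
from collections import defaultdict


def connectivity_indices(edges):
    """Return simple connectivity indices for an undirected network.

    edges: iterable of (node_a, node_b) pairs, e.g. street segments between
    intersections. The alpha and gamma formulas assume a planar graph.
    """
    adjacency = defaultdict(set)
    edge_set = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        edge_set.add(frozenset((a, b)))  # collapses duplicate segments
        adjacency[a].add(b)
        adjacency[b].add(a)

    v = len(adjacency)
    e = len(edge_set)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    return {
        "nodes": v,
        "edges": e,
        "components": components,
        # beta: edges per node
        "beta": e / v if v else 0.0,
        # gamma: observed edges relative to the planar maximum 3(v - 2)
        "gamma": e / (3 * (v - 2)) if v > 2 else 0.0,
        # alpha: observed circuits relative to the planar maximum 2v - 5
        "alpha": (e - v + components) / (2 * v - 5) if v > 2 else 0.0,
    }


# Toy example (hypothetical): a 2 x 3 grid of intersections open to cars,
# of which only a fragmented subset is suitable for cycling.
car_edges = [(1, 2), (2, 3), (4, 5), (5, 6), (1, 4), (2, 5), (3, 6)]
bike_edges = [(1, 2), (2, 3), (4, 5)]

car = connectivity_indices(car_edges)
bike = connectivity_indices(bike_edges)
print("car: ", car)
print("bike:", bike)
```

In this toy case the bicycle subset falls apart into two components and contains no circuits, which is the kind of difference between modes that the indices are meant to expose. Comparing such values across cities and relating them to modal split statistics would be a separate step.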

References, suggested reading:
  • LOWRY, M. & LOH, T. H. 2017. Quantifying bicycle network connectivity. Preventive Medicine, 95, S134-S140.
  • ABAD, L. & VAN DER MEER, L. 2018. Quantifying Bicycle Network Connectivity in Lisbon Using Open Data. Information, 9.
  • PORTA, S., CRUCITTI, P. & LATORA, V. 2006. The network analysis of urban streets: A dual approach. Physica A: Statistical Mechanics and its Applications, 369, 853-866.
  • HOCHMAIR, H. 2020. Von A nach B? – Erreichbarkeits- und Konnektivitätsanalysen in Verkehrsnetzwerken. In: ZAGEL, B. & LOIDL, M. (Eds.) Geo-IT in Mobilität und Verkehr. Berlin und Offenbach: Wichmann Verlag / VDE.
Related to projects: POSITIM

Start/finish: anytime

Prerequisites/qualifications: Profound skills in data modelling and processing. Scripting, database and spatial analytics (network analysis) skills.

Tuesday, November 10, 2020

Crowdsourcing data from fitness devices

 Suggested by: Martin Loidl

Short description: An increasing number of people use fitness devices to monitor their health status, training and physical condition. This so-called quantified-self movement generates enormous amounts of data, which are useful for research in different data-heavy domains, such as mobility research, digital health, urban science or public health. However, most of these data are locked in the proprietary environments of fitness device producers. Depending on their business model, producers offer analysis services to consumers or run secondary data-driven businesses. Currently, research based on these data can only be done in close collaboration with producers.
A crowdsourcing approach would unlock the potential of these individual data for research purposes independently of proprietary systems. For this, users would need to donate their data to the crowdsourcing platform. This could be done by exporting data from the respective proprietary system and uploading the extract to the crowdsourcing platform.

We are interested in the conceptual, technical, organizational and legal framework for setting up a crowdsourcing platform for data from fitness devices. Thus, we invite students to address the following aspects in their research:
  • Defining a common data(base) schema for harmonizing data from different vendors.
  • Migrating different export formats into a common database.
  • (Semi-)automatic data processing (validity check, cleaning).
  • Ensuring privacy (GDPR) and data security.
  • Motivation for data donors.
We can provide a test data set from a popular fitness watch vendor.
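The first three aspects (common schema, migration of export formats, validity checks) could be prototyped roughly as follows. This is a minimal sketch under stated assumptions: the `TrackPoint` schema, the two vendor formats ("vendor A" and "vendor B"), their field names, and the plausibility thresholds are all hypothetical and do not describe any real vendor export.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class TrackPoint:
    """One harmonized sample in a (hypothetical) common schema."""
    donor_id: str                 # pseudonymous identifier, not a real name
    timestamp: datetime           # expected to be time-zone aware
    lat: float                    # decimal degrees, WGS84
    lon: float                    # decimal degrees, WGS84
    heart_rate_bpm: Optional[int] = None


def from_vendor_a(donor_id, row):
    """Hypothetical vendor A: ISO 8601 strings and short field names."""
    hr = row.get("hr")
    return TrackPoint(
        donor_id=donor_id,
        timestamp=datetime.fromisoformat(row["time"]),
        lat=float(row["lat"]),
        lon=float(row["lon"]),
        heart_rate_bpm=int(hr) if hr not in (None, "") else None,
    )


def from_vendor_b(donor_id, row):
    """Hypothetical vendor B: Unix epoch seconds and long field names."""
    hr = row.get("heartRate")
    return TrackPoint(
        donor_id=donor_id,
        timestamp=datetime.fromtimestamp(row["epoch_s"], tz=timezone.utc),
        lat=float(row["latitude"]),
        lon=float(row["longitude"]),
        heart_rate_bpm=int(hr) if hr is not None else None,
    )


def validate(point):
    """Return a list of problems; an empty list means the point passed."""
    problems = []
    if not -90.0 <= point.lat <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= point.lon <= 180.0:
        problems.append("longitude out of range")
    if point.timestamp.tzinfo is None:
        problems.append("timestamp lacks a time zone")
    hr = point.heart_rate_bpm
    # 25-250 bpm is an assumed plausibility window, not a clinical standard.
    if hr is not None and not 25 <= hr <= 250:
        problems.append("implausible heart rate")
    return problems


# The same sample as it might appear in the two hypothetical exports.
row_a = {"time": "2020-11-10T08:00:00+00:00", "lat": "47.8095",
         "lon": "13.0550", "hr": "132"}
row_b = {"epoch_s": 1604995200, "latitude": 47.8095,
         "longitude": 13.0550, "heartRate": 132}

point_a = from_vendor_a("donor-001", row_a)
point_b = from_vendor_b("donor-001", row_b)
bad = TrackPoint("donor-001", datetime(2020, 11, 10, 8, 0), 95.0, 13.0, 300)

print(point_a == point_b)   # both exports map to the same harmonized record
print(validate(point_a))
print(validate(bad))
```

Keeping one small converter per vendor and a single shared validator means that a new export format only requires a new converter. Privacy measures such as pseudonymizing `donor_id` or obfuscating locations are deliberately left out of this sketch.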

References, suggested reading:
  • SWAN, M. 2013. The Quantified Self: Fundamental Disruption in Big Data Science and Biological Discovery. Big Data, 1, 85-99.
  • GRASER, A. 2019. MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum, 54-68.
  • LOIDL, M., STUTZ, P., FERNANDEZ LAPUENTE DE BATTRE, M. D., SCHMIED, C., REICH, B., BOHM, P., SEDLACEK, N., NIEBAUER, J. & NIEDERSEER, D. 2020. Merging self-reported with technically sensed data for tracking mobility behavior in a naturalistic intervention study. Insights from the GISMO study. Scandinavian Journal of Medicine & Science in Sports, 30, 41-49.
  • KOUNADI, O. & RESCH, B. 2018. A Geoprivacy by Design Guideline for Research Campaigns That Use Participatory Sensing Data. Journal of Empirical Research on Human Research Ethics, 13, 203-222.
Related to projects: Bicycle Observatory

Start/finish: anytime

Prerequisites/qualifications: Profound skills in data modelling, management and processing. Scripting and database skills.

Monday, July 20, 2020

Spatio-temporal Patterns in Epidemiological Spread of COVID-19

Correlating User-generated Data with Public Health Data

Suggested by: Bernd Resch and Stefan Kienberger

Short description: Predicting the spread of epidemics in space and time is challenging because official health data are often not publicly available, partly outdated, and limited in their spatial and temporal resolution. Thus, this thesis will develop methods for identifying spatio-temporal patterns in COVID-19, dengue and chikungunya cases. The developed methodology will identify hot spots in space and time based on user-generated data (e.g., Twitter) and correlate the results with official and ancillary data (e.g., socio-economic and environmental data). Depending on the progress of the thesis, additional goals comprise implementing a tool for dynamic visualisation, developing a method for forecasting the further spread of a disease, and investigating relationships with socio-economic information. The results can potentially support health institutions in planning vaccination and preparing for a rapidly spreading epidemic. The thesis has extraordinary practical relevance due to the COVID-19 pandemic, because chikungunya is currently spreading over South, Central and North America, and because dengue fever is prevalent in many tropical and sub-tropical regions across the globe and increasingly affects urban agglomerations.
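To illustrate the basic idea of hot-spot identification and correlation with official data, the following minimal sketch bins geotagged posts into space-time cells, flags cells with unusually high counts, and computes a Pearson correlation with official case counts. It is a deliberately simple stand-in: the z-score rule is not an established hot-spot statistic, and all function names, the cell size, the threshold and the toy numbers are hypothetical.

```python
from collections import Counter
from math import floor, sqrt


def space_time_bin(lat, lon, day, cell_deg=1.0, window_days=7):
    """Map a post to a (row, column, time window) cell index."""
    return (floor(lat / cell_deg), floor(lon / cell_deg), day // window_days)


def count_by_bin(records, cell_deg=1.0, window_days=7):
    """Count posts per space-time cell; records are (lat, lon, day) tuples."""
    return Counter(
        space_time_bin(lat, lon, day, cell_deg, window_days)
        for lat, lon, day in records
    )


def hot_spots(counts, z_threshold=1.5):
    """Return cells whose count lies z_threshold sample SDs above the mean."""
    values = list(counts.values())
    if len(values) < 2:
        return []
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    if sd == 0:
        return []
    return sorted(k for k, v in counts.items() if (v - mean) / sd >= z_threshold)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


# Toy posts (lat, lon, day since start); values are invented for illustration.
posts = (
    [(47.8, 13.0, 1)] * 2 + [(48.2, 16.4, 3)] * 1 +
    [(47.8, 13.0, 8)] * 9 + [(48.2, 16.4, 10)] * 2 + [(47.3, 11.4, 12)]
)
post_counts = count_by_bin(posts)
spots = hot_spots(post_counts)

# Hypothetical official case counts for the same space-time cells.
official = {
    (47, 13, 0): 10, (48, 16, 0): 6,
    (47, 13, 1): 40, (48, 16, 1): 12, (47, 11, 1): 5,
}
cells = sorted(official)
r = pearson([post_counts.get(c, 0) for c in cells], [official[c] for c in cells])

print("hot spots:", spots)
print("correlation with official counts:", round(r, 3))
```

Cells are taken from the official data so that areas without any posts enter the correlation as zeros. A real analysis would also need to account for population density and for spatial and temporal autocorrelation.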

The master's thesis will be carried out together with Harvard University’s School of Public Health (HSPH) and the Center for Geographic Analysis (CGA).


Hagenlocher, M., Delmelle, E., Casas, I., and Kienberger, S. (2013) Assessing socioeconomic vulnerability to dengue fever in Cali, Colombia: statistical vs expert-based modeling. International Journal of Health Geographics, 12(1), 36.
Jaenisch, T. and Patz, J. (2002) Assessment of Associations between Climate and Infectious Diseases; a Comparison of the Reports of the Intergovernmental Panel on Climate Change (IPCC), the National Research Council (NRC), and United States Global Change Research Program (USGCRP). Global Change & Human Health, 3(1), 2-7.
Resch, B. (2013) People as Sensors and Collective Sensing - Contextual Observations Complementing Geo-Sensor Network Measurements. In: Krisp, J. (2013) Advances in Location-Based Services, ISBN 978-3-642-34202-8, Springer, Berlin Heidelberg, pp. 391-406.

Start date: ASAP

Prerequisites/qualifications: experience with analysing VGI, data visualisation, and spatio-temporal pattern detection

Micro-economic Innovation Indicators in Web Data

Suggested by: Bernd Resch and Jan Kinne (Leibniz Centre for European Economic Research)

Short description:

The location pattern of any industry is the product of a large number of individual decisions. Industrial location analysis investigates these location decisions and seeks to detect location determinants that trigger and influence such decisions. These determinants are generally referred to as location factors. A thorough understanding of the impact of location factors on firms’ location decisions and firm performance can have important implications for stakeholders. Managers and entrepreneurs can integrate valuable information into the decision-making process when choosing the location of a new venture. Some location factor-firm relationships which are relevant at the macro level (aggregate) may not be so at the micro level (ecological fallacy). Suitable data for microgeographic analysis has become available only recently through the emergence of Volunteered Geographic Information (VGI) and the increasing availability of official (open) geodata.

Kinne and Resch (2018) combined open geodata, Volunteered Geographic Information (VGI), and a comprehensive firm dataset (the Mannheim Enterprise Panel - MUP) containing approximately three million firm observations to empirically estimate the relationship between a set of location factors and the number of local software firms in Germany (see figure). They concluded that the microgeographic level of analysis provided new insights into the firm site selection process. However, they also pointed out the particular requirements for the statistical model and the data employed in a microgeographic location analysis, such as the need for high-resolution geodata, which was not available in all domains. They showed that this problem was most severe in cities, which often feature segregated populations and districts with very different socio-economic profiles. In the context of this master's thesis, the research conducted by Kinne and Resch in Germany shall be extended to the geo-economic context of the USA, where higher-resolution socio-economic geodata are available. Furthermore, a comparison between the results for Germany and the USA shall be carried out.
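To give a feel for what estimating the relationship between a location factor and a local firm count can look like, the sketch below fits a Poisson regression with a single covariate by Newton-Raphson. This is an illustrative simplification and not the model specification used by Kinne and Resch (2018); the covariate ("distance to the nearest university"), the function name and the toy counts are hypothetical.

```python
from math import exp, log


def fit_poisson(xs, ys, iterations=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by maximum likelihood (Newton-Raphson).

    xs: covariate values per grid cell; ys: non-negative counts per cell.
    Returns the intercept b0 and the slope b1.
    """
    b0 = log(sum(ys) / len(ys))  # start from the intercept-only model
    b1 = 0.0
    for _ in range(iterations):
        mus = [exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood.
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Fisher information (negative Hessian), a symmetric 2 x 2 matrix.
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        if det == 0:
            raise ValueError("covariate has no variation")
        # Newton step: solve the 2 x 2 system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data (invented): distance to the nearest university in km and the
# number of software firms observed in each grid cell.
distance_km = [0, 1, 2, 3, 4, 5]
firm_count = [12, 9, 7, 4, 3, 2]

intercept, slope = fit_poisson(distance_km, firm_count)
print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
print("expected firms at 0 km:", round(exp(intercept), 1))
```

A negative slope here means that the expected firm count declines with distance; `exp(slope)` is the multiplicative change per additional kilometre. A full microgeographic analysis would involve many location factors and would have to deal with issues such as excess zeros and spatial dependence.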


Kinne, Jan and Bernd Resch (2018), Analyzing and Predicting Micro-Location Patterns of Software Firms, ISPRS International Journal of Geo-Information 7, 1.

Rammer, Christian, Jan Kinne and Knut Blind (2019), Knowledge Proximity and Firm Innovation: A Microgeographic Analysis for Berlin, Urban Studies.

Start date: ASAP

Prerequisites/qualifications: experience with analysing web data and social media, interest in economic geography/economics, OpenStreetMap, regression analysis (optional)