Wednesday, October 29, 2025

Modeling Place Vulnerability to Explosive Disease Outbreaks

Suggested by: Christian Neuwirth (Z_GIS – Spatial Simulation)
 
Short description:

In addition to the basic reproduction number, R0, the overdispersion parameter, k, plays a crucial role in characterizing the spread of infectious diseases. Estimates for COVID-19 indicate that the dispersion parameter k is approximately 0.1, suggesting that 80% of transmissions have been caused by only 10% of infectious individuals [1]. 
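To make the link between k and superspreading concrete, the following minimal sketch draws secondary cases from a negative binomial distribution (sampled as a gamma-Poisson mixture) and reports the share of transmissions caused by the most infectious 10% of individuals. The value R0 = 2.5, the sample size, and all function names are illustrative assumptions, not part of the cited estimates.

```python
import math
import random

def sample_secondary_cases(r0, k, rng):
    """Draw one individual's number of secondary cases.

    Negative binomial with mean r0 and dispersion k, sampled as a
    gamma-Poisson mixture: infectiousness nu ~ Gamma(shape=k, scale=r0/k),
    then cases ~ Poisson(nu).
    """
    nu = rng.gammavariate(k, r0 / k)
    # Knuth's multiplication method for a Poisson draw (adequate for moderate nu).
    threshold = math.exp(-nu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def share_from_top(r0, k, top_fraction=0.10, n=20_000, seed=1):
    """Share of all transmissions caused by the top_fraction most infectious."""
    rng = random.Random(seed)
    cases = sorted((sample_secondary_cases(r0, k, rng) for _ in range(n)),
                   reverse=True)
    top_n = int(n * top_fraction)
    return sum(cases[:top_n]) / sum(cases)

# Assumed illustrative values: R0 = 2.5, strong (k = 0.1) vs. weak (k = 10) overdispersion.
print("k = 0.1:", round(share_from_top(2.5, 0.1), 2))
print("k = 10 :", round(share_from_top(2.5, 10.0), 2))
```

With a small k, most individuals infect nobody and a small minority accounts for most transmissions; with a large k, the distribution approaches a Poisson and the top 10% account for a much smaller share.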

Observed overdispersion can arise from various factors. For instance, the same pathogen may exhibit different behaviors across individuals, e.g. the infectious period is better represented as a distribution rather than a fixed constant [2]. 

Additionally, observed overdispersion in disease transmission may stem from overdispersion in social contact networks. For example, a French social contact survey [3] demonstrated that a small number of individuals account for a disproportionately large share of overall social contacts, while many individuals have few or no social interactions.
Modeling experiments indicate that outbreaks within such social networks tend to be particularly explosive (Fig. 1).


 
Figure 1. The blue curves represent simulated outbreaks in empirical social networks exhibiting overdispersion, while the red curves depict outbreaks in networks where every individual has an equal number of social contacts. The basic reproduction numbers are as follows: R0=1.8 (A), R0=2.5 (B), R0=3.1 (C), and R0=3.7 (D).
 

Hypothesis: Overdispersion in social contact networks is influenced by the physical structure of space, such as transportation infrastructure and other elements of the built environment. For instance, recent investigations showed that hierarchical cities are more vulnerable to the rapid spread of infectious diseases than decentralized cities [4]. In other words, overdispersion in physical structures translates into overdispersion in social contact networks, which in turn leads to overdispersion in disease transmission and explosive outbreaks.

The aim of this thesis is to quantify the vulnerability of locations to epidemic outbreaks by analyzing their structural properties.

Method: (1) Quantify the overdispersion parameter k in physical infrastructures using data from OpenStreetMap or open air travel network data (with the appropriate scale to be determined), (2) Run network simulations in an SIR model (the model is available) using the empirical parameter k as an input, (3) Compare the epidemic doubling time in the simulation with the empirical COVID-19 excess mortality doubling time at selected sites using a ranking scale approach.
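Two of the quantities in the method can be sketched in a few lines. The example below assumes that k is estimated with a simple method-of-moments estimator from per-node counts (for example, node degrees in an infrastructure network) and that doubling time is derived from a log-linear fit; the thesis may well use other estimators, and the sample numbers are invented for illustration.

```python
import math
import statistics

def estimate_k_moments(counts):
    """Method-of-moments estimate of the negative binomial dispersion k.

    For a negative binomial, variance = mean + mean**2 / k, hence
    k = mean**2 / (variance - mean). Only defined when variance > mean.
    """
    mean = statistics.mean(counts)
    variance = statistics.variance(counts)  # sample variance
    if variance <= mean:
        return math.inf  # no overdispersion relative to a Poisson
    return mean ** 2 / (variance - mean)

def doubling_time(series, dt=1.0):
    """Doubling time from a least-squares fit of log(series) against time.

    Assumes a strictly positive, growing series sampled every dt time units.
    """
    times = [i * dt for i in range(len(series))]
    logs = [math.log(v) for v in series]
    t_bar, y_bar = statistics.mean(times), statistics.mean(logs)
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
             / sum((t - t_bar) ** 2 for t in times))
    return math.log(2) / slope

# Invented example data: a few hubs among many poorly connected nodes.
degrees = [0, 0, 0, 0, 1, 1, 2, 12]
print("k estimate:", round(estimate_k_moments(degrees), 3))

# Invented weekly counts that double every week (dt = 7 days).
print("doubling time (days):", round(doubling_time([5, 10, 20], dt=7), 2))
```

Sites could then be ranked by simulated and by empirical doubling time, and the two rankings compared.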

Start: ASAP

Prerequisites/qualifications: 
Interest in spatial simulation and scripting (NetLogo, Python, R or GAMA)

Please contact Christian Neuwirth in case of interest: christian.neuwirth@plus.ac.at

References:

  1. K. Sneppen, B. F. Nielsen, R. J. Taylor, and L. Simonsen, “Overdispersion in COVID-19 increases the effectiveness of limiting nonrepetitive contacts for transmission control,” Proceedings of the National Academy of Sciences, vol. 118, no. 14, p. e2016623118, 2021.
  2. A. L. Lloyd, “Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods,” Proceedings of the Royal Society of London. Series B: Biological Sciences, vol. 268, no. 1470, pp. 985–993, 2001.
  3. G. Béraud et al., “The French connection: the first large population-based contact survey in France relevant for the spread of infectious diseases,” PloS one, vol. 10, no. 7, p. e0133203, 2015.
  4. J. Aguilar et al., “Impact of urban structure on infectious disease spreading,” Scientific reports, vol. 12, no. 1, p. 3816, 2022.
  5. O. Wegehaupt, A. Endo, and A. Vassall, “Superspreading, overdispersion and their implications in the SARS-CoV-2 (COVID-19) pandemic: a systematic review and meta-analysis of the literature,” BMC Public Health, vol. 23, no. 1, p. 1003, 2023.

Thursday, October 16, 2025

Smallholder Farming and Global Crop Masks

Suggested by: Sophia Klaußner, Lorenz Wendt (CDL GEOHUM)

Short description: 
Studies have shown that land cover models do not reliably identify cropland in Sub-Saharan Africa. This has grave implications, because early warning systems and the distribution of support in emergencies are significantly delayed as a result.

In this study, the student will investigate an area with smallholder farming to evaluate the accuracy of global land cover models for smallholder agriculture and derive usability guidance. They will create a validation dataset, evaluate the performance of several global models on small-scale agriculture, and explore the implications. They will also create a decision aid for choosing a land cover model in different contexts.
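As a minimal sketch of what such an evaluation could look like, the example below compares a binary crop mask against validation labels at sample points. The metric selection and the sample labels are assumptions for illustration; the study may use other accuracy measures or sampling designs.

```python
def cropland_accuracy(reference, predicted):
    """Score a binary crop mask (1 = cropland, 0 = other) at validation points."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # cropland correctly mapped
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # falsely mapped as cropland
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # cropland missed by the map
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # user's accuracy
    recall = tp / (tp + fn) if tp + fn else 0.0     # producer's accuracy
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "overall_accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Invented validation sample: the map misses three of four smallholder fields.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(cropland_accuracy(reference, predicted))
```

In this invented sample the overall accuracy looks moderate while recall for cropland is low, which is why class-specific measures matter when cropland is the minority class.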

Suggested Reading:

  • Dlamini, L., Crespo, O., Van Dam, J., & Kooistra, L. (2023). A global systematic review of improving crop model estimations by assimilating remote sensing data: Implications for small-scale agricultural systems. Remote Sensing, 15(16), 4066. https://doi.org/10.3390/rs15164066
  • Kerner, H., Nakalembe, C., Yang, A., Zvonkov, I., McWeeny, R., Tseng, G., & Becker-Reshef, I. (2024). How accurate are existing land cover maps for agriculture in Sub-Saharan Africa? Scientific Data, 11(1), 486. https://doi.org/10.1038/s41597-024-03306-z
  • Ketema, H., Wei, W., Legesse, A., Wolde, Z., Temesgen, H., Yimer, F., & Mamo, A. (2020). Quantifying smallholder farmers’ managed land use/land cover dynamics and its drivers in contrasting agro-ecological zones of the East African Rift. Global Ecology and Conservation, 21, e00898. https://doi.org/10.1016/j.gecco.2019.e00898

Interpretable Multimodal Machine Learning for Predicting and Explaining Livelihood Vulnerability to Drought in East Africa

Suggested By: Leizel De la Cruz, Lorenz Wendt

Objective:
To develop an interpretable machine learning framework for assessing livelihood vulnerability to drought in the arid regions of East Africa by integrating multimodal data (including Earth Observation, socio-economic and ancillary data) to enhance vulnerability prediction and identify the most influential underlying factors using SHapley Additive exPlanations (SHAP), thereby providing actionable insights for enhanced early warning systems.

Short Description:
Assessing livelihood vulnerability to drought is critical for proactive resilience-building in arid and semi-arid areas of East Africa. Traditional index-based methods usually depend on linear assumptions and expert-weighted indicators, which may overlook the complex, non-linear relationships among the factors contributing to vulnerability. This study proposes a data-driven approach that uses machine learning (ML) for prediction and the SHapley Additive exPlanations (SHAP) framework to explain and interpret the model output and identify the most influential predictors.

This research will first integrate different datasets, including satellite drought indices, household survey data, and other livelihood-related information, to develop a comprehensive dataset of predictors. Different ML models (e.g. regression, XGBoost, random forest) will be trained on historical livelihood vulnerability at district level, and the best-performing model will be identified. The novelty of this research lies in applying SHAP analysis to move beyond the "black box" nature of ML: the approach quantifies the contribution of each identified factor (e.g. rainfall anomaly, SPEI, NDVI, market price fluctuation, distance to market) to the model’s predictions of vulnerability. By making complex model outputs transparent and actionable, this research may provide a strong decision-support tool for enhancing drought resilience and reducing impacts on livelihoods in highly climate-sensitive regions.
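The idea behind SHAP can be illustrated with an exact Shapley value computation on a toy model. The sketch below does not use the SHAP library; the model, the three features, and their values are hypothetical and chosen only so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline value. The cost
    grows exponentially with the number of features, so this is only
    practical for a handful of predictors.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi += weight * (value(coalition | {i}) - value(coalition))
        phis.append(phi)
    return phis

# Hypothetical toy "vulnerability score" with one interaction term.
feature_names = ["rainfall_anomaly", "ndvi", "distance_to_market"]

def toy_model(z):
    return 2 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]          # hypothetical district
baseline = [0.0, 0.0, 0.0]   # hypothetical reference district
for name, phi in zip(feature_names, shapley_values(toy_model, x, baseline)):
    print(f"{name}: {phi:.2f}")
```

The contributions sum to the difference between the prediction for the district and the prediction for the baseline, and the interaction term is split equally between the two features involved. SHAP implementations approximate the same quantity efficiently for real models.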

Start: Anytime

Relevant Studies:

  1. Crausbay, Shelley D., Kimberly R. Hall, Molly S. Cross, Meghan Halabisky, Imtiaz Rangwala, Jesse Anderson, and Ann Schwend. 2024. “A Flexible Data-Driven Approach to Co-Producing Drought Vulnerability Assessments.” Ecosphere 15(10): e70040. https://doi.org/10.1002/ecs2.70040
  2. Enenkel, M., Steiner, C., Mistelbauer, T., Dorigo, W., Wagner, W., See, L., Atzberger, C., Schneider, S., & Rogenhofer, E. (2016). A Combined Satellite-Derived Drought Indicator to Support Humanitarian Aid Organizations. Remote Sensing, 8(4), 340. https://doi.org/10.3390/rs8040340
  3. IPCC, 2022: Climate Change 2022: Impacts, Adaptation, and Vulnerability. Contribution of Working Group II to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [H.-O. Pörtner, D.C. Roberts, M. Tignor, E.S. Poloczanska, K. Mintenbeck, A. Alegría, M. Craig, S. Langsdorf, S. Löschke, V. Möller, A. Okem, B. Rama (eds.)]. Cambridge University Press, Cambridge, UK and New York, NY, USA, 3056 pp., doi:10.1017/9781009325844.
  4. Lu, R., Liu, S., Duan, H., Kang, W., & Zhi, Y. (2024). Combining the SHAP Method and Machine Learning Algorithm for Desert Type Extraction and Change Analysis on the Qinghai–Tibetan Plateau. Remote Sensing, 16(23), 4414. https://doi.org/10.3390/rs16234414

Disaggregating Aggregated [Socioeconomic] Data into Grid-Level Representations Using PyInterpolate and Complementary Approaches for Humanitarian Applications

Suggested By: Khizer Zakir, Lorenz Wendt
 

Objective:
To investigate how geostatistical interpolation methods, as implemented in PyInterpolate, can be combined with complementary disaggregation approaches (dasymetric mapping, population weighting, machine learning-based downscaling) to transform aggregated socioeconomic data, such as demographics, education, health, or disease prevalence, into fine-grained, grid-level representations. These representations may take the form of regular pixels or hexagonal H3 cells, depending on the resolution and structure most suitable for machine learning models. The thesis will quantitatively evaluate how such disaggregation improves the applicability and performance of ML models in humanitarian scenarios. In addition, the thesis could address the uncertainty and explainability of such approaches.
 

Short Description:
Many critical datasets relevant to humanitarian decision-making, such as population demographics, education indicators, healthcare access, or disease spread, are typically available only at coarse administrative levels (country, province, district). However, state-of-the-art machine learning models for spatial analysis generally operate on high-resolution gridded data, especially when integrating with environmental or remote sensing datasets. This mismatch in spatial resolution poses a barrier to building comprehensive, data-driven humanitarian models.

This thesis proposes to bridge this gap by studying the use of PyInterpolate and related interpolation/disaggregation systems to generate grid-level approximations of aggregated socioeconomic data. Both pixel grids and H3 hexagonal grids will be evaluated for their suitability in integrating heterogeneous datasets. The study will further assess the uncertainty of disaggregation outputs and their downstream impact on ML-based predictions.

Such an approach can be particularly important in humanitarian applications, where access to high-resolution socioeconomic data is scarce or delayed. Potential applications include:

  • Disease spread modeling, where fine-scale integration of demographic and health data can improve outbreak prediction.
  • Migration and human mobility studies, where disaggregated socioeconomic data can be compared and integrated with environmental drivers (floods, droughts, land degradation) at grid level to better understand displacement dynamics and population movements during crises.
  • Disaster preparedness and response, where combining socioeconomic vulnerability layers with hazard data enables better risk assessments.
  • Resource allocation and crisis monitoring, where timely, high-resolution information supports more equitable and effective humanitarian interventions.

 

Suggested Methodology:

  • Geostatistical interpolation with PyInterpolate (kriging-based techniques).
  • Covariate-driven disaggregation, using auxiliary layers such as land use, night-time lights, road networks, or population density as predictors of within-unit variation.
  • Uncertainty quantification, with a particular focus on Bayesian approaches (Bayesian kriging, Bayesian hierarchical models, or probabilistic ML) to explicitly model uncertainty in disaggregation outputs and evaluate their downstream impact on ML-based predictions.
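The covariate-driven step can be illustrated with the simplest possible case: proportional (dasymetric-style) weighting that preserves each district total. The sketch below is plain Python and does not use PyInterpolate; the cell identifiers, district totals, and weights (for example, night-time light intensity) are invented for illustration.

```python
from collections import defaultdict

def disaggregate(district_totals, cells):
    """Split district totals across grid cells in proportion to a covariate.

    cells is a list of (cell_id, district, weight) tuples. If all weights in
    a district are zero, the total is spread uniformly over its cells.
    """
    weight_sums = defaultdict(float)
    cell_counts = defaultdict(int)
    for _, district, weight in cells:
        weight_sums[district] += weight
        cell_counts[district] += 1

    estimates = {}
    for cell_id, district, weight in cells:
        total = district_totals[district]
        if weight_sums[district] > 0:
            estimates[cell_id] = total * weight / weight_sums[district]
        else:
            estimates[cell_id] = total / cell_counts[district]
    return estimates

# Invented example: two districts, weights taken from an auxiliary layer.
district_totals = {"A": 1000, "B": 90}
cells = [
    ("a1", "A", 0.0), ("a2", "A", 1.0), ("a3", "A", 3.0), ("a4", "A", 6.0),
    ("b1", "B", 0.0), ("b2", "B", 0.0), ("b3", "B", 0.0),
]
print(disaggregate(district_totals, cells))
```

Because the cell estimates in each district add up to the district total, this kind of weighting can serve as a baseline against which kriging-based or Bayesian approaches are compared.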

Start: Anytime

Relevant Studies:

  1. Moliński, S., (2022). Pyinterpolate: Spatial interpolation in Python for point measurements and aggregated datasets. Journal of Open Source Software, 7(70), 2869, https://doi.org/10.21105/joss.02869 
  2. Stevens, Forrest R., Andrea E. Gaughan, Catherine Linard, and Andrew J. Tatem., (2015). “Disaggregating Census Data for Population Mapping Using Random Forests with Remotely-Sensed and Ancillary Data.” PLoS ONE 10 (2): e0107042. https://doi.org/10.1371/journal.pone.0107042 
  3. Wardrop, N. A., et al., (2018). “Spatially Disaggregated Population Estimates in the Absence of National Population and Housing Census Data.” Proceedings of the National Academy of Sciences 115 (14): 3529–37. https://doi.org/10.1073/pnas.1715305115 


Using Location Embeddings for Improving Transferability of Flood Mapping Models

Suggested By: Bruno Menini Matosak, Getachew Gella (CDL GEOHUM)

Keywords: Deep Learning; Flood Mapping; Transferability; Foundation Models

Objective: To quantitatively evaluate how the utilization of location embeddings during training and inference may improve the transferability of flood mapping models.

Short Description: Floods are hazardous events that impact millions of people annually. In this context, it is essential that humanitarian and hazard responders are informed about the extent of affected areas in a timely manner. To achieve this, reducing the time needed to select and train a model is crucial. If a flood mapping model is highly transferable, it can be used successfully in a wider variety of regions, making the process of mapping floods in these areas more straightforward. This approach could be particularly important for regions where there is little or no reference data available for training.

In this proposed master's thesis, the student will study how the inclusion of location embeddings generated from the SatCLIP foundation model affects the transferability of common flood mapping models based on convolutional neural networks and image transformers.
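One possible way to quantify transferability is to compare segmentation quality in regions seen during training with quality in held-out regions. The sketch below assumes intersection over union (IoU) as the metric and a simple mean difference as the "transferability gap"; both choices and all numbers are illustrative assumptions, not a prescribed evaluation protocol.

```python
import statistics

def iou(reference, predicted):
    """Intersection over union for flattened binary flood masks (1 = water)."""
    intersection = sum(1 for r, p in zip(reference, predicted) if r and p)
    union = sum(1 for r, p in zip(reference, predicted) if r or p)
    return intersection / union if union else 1.0

def transferability_gap(scores_seen, scores_unseen):
    """Mean IoU in training regions minus mean IoU in held-out regions.

    A smaller gap suggests a model that transfers better to new regions.
    """
    return statistics.mean(scores_seen) - statistics.mean(scores_unseen)

# Invented masks: one of three labelled or predicted flood pixels overlaps.
print("IoU:", round(iou([1, 1, 0, 0], [1, 0, 1, 0]), 3))

# Invented per-region IoU scores for a model without and with location embeddings.
gap_without = transferability_gap([0.80, 0.90], [0.50, 0.70])
gap_with = transferability_gap([0.80, 0.90], [0.65, 0.75])
print("gap without embeddings:", round(gap_without, 2))
print("gap with embeddings:   ", round(gap_with, 2))
```

Running the same comparison for each model architecture, with and without location embeddings, would give a simple table of gaps to compare.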


Start: Anytime

Relevant Studies:

  1. Klemmer, Konstantin, Esther Rolf, Caleb Robinson, Lester Mackey, and Marc Rußwurm. 2024. “SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery.” arXiv:2311.17179. Preprint, arXiv, April 12. https://doi.org/10.48550/arXiv.2311.17179.
  2. Bentivoglio, Roberto, Elvin Isufi, Sebastian Nicolaas Jonkman, and Riccardo Taormina. 2022. “Deep Learning Methods for Flood Mapping: A Review of Existing Applications and Future Research Directions.” Hydrology and Earth System Sciences 26 (16): 4345–78. https://doi.org/10.5194/hess-26-4345-2022.

Monday, September 22, 2025

Design for Discovery: Enhancing the Usability of MAP-VERSE

 

Would you like to contribute to open science? Do you like UI/UX design? Are you maybe interested in user testing? 


The MAP-VERSE (MAP Usability – Validated Empirical Research by Systematic Evaluation) metadata repository (https://map-verse.github.io/) was established by an international research initiative (https://map-verse.github.io/Repository/page/about/) that believes in open science. The platform provides researchers with structured access to best-practice datasets, specifically from eye tracking, neuroimaging (EEG, fMRI), and human sensing (EDA, cardiovascular activity, skin temperature), collected across various geospatial tasks in the lab, online, in virtual environments, or in real-world scenarios (see Keskin et al. 2025, https://doi.org/10.5194/agile-giss-6-30-2025).

The BSc thesis will focus on the usability of the MAP-VERSE platform through: 

  • User Interface (UI): adding new functionalities such as (but not limited to):
    • Enabling search by researcher names, research areas, or cartographic stimulus type
    • Visualizing links between studies using the same datasets or follow-up studies
  • User Experience (UX): usability testing with researchers via online questionnaires
    • Experimental design, data collection, and analysis of user studies

 

The student is expected to improve the UI of the platform and gather insights on the UX that could inform further development.

This research is planned to be co-supervised by Assoc. Prof. Dr. Vassilios Krassanakis from the University of West Attica, and (when necessary) in collaboration with Tong Qin and Bing He from the MAP-VERSE initiative. 

 

For more information: 
Contact: Dr. Merve Keskin, merve.keskin@plus.ac.at 
Start: As soon as possible 
Prerequisites/qualification: (not necessary but useful) familiarity with website creation on GitHub (e.g. Hugo), interest in empirical user testing  
Keywords: usability, website creation, metadata repository, open science 

Making Research Data Discoverable: Building MAP-VERSE’s Metadata Search Tools

Would you like to contribute to open science? Do you like a bit of coding? Are you maybe interested in empirical studies?

The MAP-VERSE (MAP Usability – Validated Empirical Research by Systematic Evaluation) metadata repository (https://map-verse.github.io/) was established by an international research initiative (https://map-verse.github.io/Repository/page/about/) that believes in open science. The platform provides researchers with structured access to best-practice datasets, specifically from eye tracking, neuroimaging (EEG, fMRI), and human sensing (EDA, cardiovascular activity, skin temperature), collected across various geospatial tasks in the lab, online, in virtual environments, or in real-world scenarios (see Keskin et al. 2025, https://doi.org/10.5194/agile-giss-6-30-2025).

The MSc thesis will focus on systematizing the data discovery functionality of MAP-VERSE by developing:


    • Strategies for expanding dataset diversity (e.g. including studies using thematic maps, dashboards, mobile maps, etc.), and  
    • Tools for relevant data collection such as API querying and, when necessary, web crawling on large open-access research data repositories (e.g., Harvard Dataverse, Zenodo, arXiv).  

The student is expected to develop a metadata search and dataset validation tool to ensure metadata consistency and accuracy before inclusion in MAP-VERSE. 
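A minimal sketch of the two building blocks could look as follows: composing a search request for the Zenodo records API and checking a candidate record before inclusion. The list of required fields and the DOI pattern are hypothetical choices for illustration and do not reflect the actual MAP-VERSE metadata schema; the sketch only builds the request URL and does not send it.

```python
import re
from urllib.parse import urlencode

ZENODO_RECORDS_URL = "https://zenodo.org/api/records"

# Hypothetical minimal schema; the real MAP-VERSE schema may differ.
REQUIRED_FIELDS = ("title", "creators", "doi")
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def build_search_url(query, size=10):
    """Compose a search URL for the Zenodo records endpoint."""
    return f"{ZENODO_RECORDS_URL}?{urlencode({'q': query, 'size': size})}"

def validate_record(record):
    """Return a list of problems found in one metadata record (empty if none)."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not record.get(name)]
    doi = record.get("doi")
    if doi and not DOI_PATTERN.match(doi):
        problems.append(f"malformed DOI: {doi}")
    return problems

print(build_search_url("eye tracking map", size=5))

candidate = {
    "title": "Example eye-tracking dataset",  # invented title
    "creators": ["Example Author"],           # invented creator
    "doi": "10.5194/agile-giss-6-30-2025",    # DOI cited in the text above
}
print(validate_record(candidate))
print(validate_record({"title": "Incomplete entry", "doi": "not-a-doi"}))
```

Records that return an empty list would pass this basic check; everything else would be flagged for manual review before it enters the repository.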

This research is planned to be co-supervised by Assoc. Prof. Dr. Vassilios Krassanakis from the University of West Attica and (when necessary) in collaboration with Tong Qin and Bing He from the MAP-VERSE initiative.  

 

For more information: 
Contact: Dr. Merve Keskin, merve.keskin@plus.ac.at 
Start: As soon as possible 
Prerequisites/qualification: Python and HTML basics  
Keywords: knowledge discovery, development, metadata repository, open science