Last year, I received a grant from the Marie Skłodowska-Curie Actions Postdoctoral Fellowships (MSCA-PF) program for a project called PRISM: PReservation and RecognItion of Spatial patterns using Machine learning. Between August 2024 and August 2026, I am based in the Remote Sensing and Spatial Modeling group at the University of Muenster, Germany. The project’s primary goal is to develop and compare methods for validating and including spatial patterns in machine learning. At the same time, MSCA-PF is an opportunity to meet new people, learn new things, and share my experiences with others.
Before starting the project, I wrote a blog post about my insights from the application process. This post is a summary of my first eight months at the University of Muenster, including the project’s progress and various activities I took part in.
Starting the project
Moving to a new country (especially with a family and without knowing the language) and starting a position at another institution involves significant overhead. Thus, the first several weeks were spent on settling in, getting to know the new environment, and meeting new people. It was also about getting the necessary paperwork done (including the Anmeldung, a bank account, health insurance, and so on), registering the kid in kindergarten, furnishing the apartment, getting an internet connection, and so on. Fortunately, people in my research group are friendly and very helpful (which matters a lot – thank you!) and Muenster is a beautiful city. Now, after a few months, those initial challenges feel more distant and less daunting…
I started my stay at the University of Muenster by giving two talks for my research group. The first was a general introduction to myself, my research interests, previous work, and the skills I can bring to the group. The second one was more about the PRISM project, where I presented two elements: a research project and other planned activities.
Late summer meetings
Next, in early September 2024, I took part in the Earth Observation Network (EON) summer school in the Harz mountains. It is a joint initiative of the University of Muenster, the University of Marburg, and the HAWK University of Applied Sciences and Arts in Göttingen. Students of landscape ecology, ecological informatics, and forestry learn about forest management (from the perspective of managing areas both inside and outside the national park), how to perform field measurements (including traditional methods, drones, sensors, and satellite data), and how to analyze the data using programming languages. During the summer school, I gave a short workshop about “Describing and comparing spatial patterns”.
Later that month, I participated in the Spatial Data Science Across Languages (SDSL) workshop in Prague. This was the second edition of the event, which brings together researchers and developers working in different programming languages used in data science for geographical applications, including R, Python, Julia, and Rust. It aims to bridge the gap between these communities, talk about their differences and similarities, and find ways to discuss, collaborate, and synchronize efforts.[1] There, I gave a talk, “Learning resources and teaching methods”, about existing open-source resources for teaching spatial data science in R, Python, and Julia and how various methods can be used to teach spatial data science. I also introduced the geocompx project, which is a community-driven initiative to create a collection of open-source resources for learning and teaching geocomputation in multiple programming languages.
Autumn months
A lot of my time in autumn was spent directly on the PRISM project. I started by reading the literature on spatial patterns and machine learning, focusing on the existing methods for validating and including spatial patterns in machine learning. During that time, I also worked on a few deliverables required by the project, including a Data Management Plan (DMP) and a Career Development Plan (CDP).[2]
I was also invited to give a talk at the “SAIL – Success Factors in Acquiring InternationaL Projects” program at the University of Muenster. The program is aimed at doctoral candidates and postdocs and is focused on providing them with the necessary skills to apply for international grants. There, I spoke about my experience with the MSCA-PF application process and how I prepared for it.[3]
In autumn, I also published a blog series on comparing spatial patterns in raster data and made a list of Geospatial Conferences that will take place in 2025.
New R packages
During that time, I also started developing two R packages: simsam and spatialexplain. Given my project’s goals, I needed a reproducible, flexible, and efficient way to simulate spatial data of a given structure and properties. Thus, the goal of simsam is to provide tools for simulating and sampling spatial data. It allows users to create sets of spatial raster data with specified properties (e.g., to be used later as covariates in machine learning) and also to blend them with other data (e.g., to be used as a response variable). Then, the sampling function selects points from the raster data, which can be used to create training and testing datasets for machine learning. That function wraps several existing functions and adds the option to create clustered points. The simsam package also provides tools for creating various spatial proxies: coordinates, Euclidean Distance Fields (EDFs), and Oblique Geographic Coordinates (OGCs) based on the provided raster object. To learn more about the package, you can check the package’s vignettes.
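To make the idea of spatial proxies a bit more concrete, here is a minimal sketch in plain Python. It is not the simsam API (simsam is an R package); the function names `cell_centers`, `edf_proxies`, and `ogc_proxies` are hypothetical, and the choice of reference points (four corners plus the center) and of evenly spaced rotation angles is an assumption made for illustration only.

```python
import math

def cell_centers(nrows, ncols, resolution=1.0):
    """Return (x, y) cell-center coordinates, row by row from the top-left cell."""
    centers = []
    for row in range(nrows):
        for col in range(ncols):
            x = (col + 0.5) * resolution
            y = (nrows - row - 0.5) * resolution
            centers.append((x, y))
    return centers

def edf_proxies(centers, nrows, ncols, resolution=1.0):
    """Euclidean distance fields: distance from each cell to a few reference points.

    Assumption: the reference points are the four raster corners and the center.
    """
    width, height = ncols * resolution, nrows * resolution
    refs = [(0.0, 0.0), (width, 0.0), (0.0, height), (width, height),
            (width / 2, height / 2)]
    return [[math.dist(c, r) for r in refs] for c in centers]

def ogc_proxies(centers, n_angles=4):
    """Oblique coordinates: position of each cell along axes rotated by several angles.

    Assumption: angles are evenly spaced in [0, pi).
    """
    angles = [math.pi * k / n_angles for k in range(n_angles)]
    return [[x * math.cos(a) + y * math.sin(a) for a in angles]
            for x, y in centers]

# Example: proxies for a small 2 x 2 raster
centers = cell_centers(2, 2)
edfs = edf_proxies(centers, 2, 2)
ogcs = ogc_proxies(centers, n_angles=2)
```

Each cell ends up with a vector of proxy values that could be stacked next to the other covariates. With `n_angles=2`, the oblique coordinates reduce to the plain x and y coordinates, which shows how the coordinate proxies are a special case of the oblique ones.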
The spatialexplain package is, on the other hand, based on the needs of the people in my research group. Some of them are trying to model various environmental properties using machine learning, and they need a way to explain the models, particularly including the spatial context.
This package provides model-agnostic tools for mapping how the input variables affect the model predictions, based on methods such as SHAP, LIME, Break Down, and Oscillations. In short, it provides a wrapper around the DALEX package, which is a well-known package for model-agnostic explanations of machine learning models, but computes the explanation for each cell in the raster data.[4] The outcome of each function is a set of maps that show each predictor’s contribution to the model predictions. The package also provides a few vignettes to help users understand how to use the package and how to interpret the results.
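The per-cell idea can be illustrated with a small sketch in plain Python. It does not use spatialexplain or DALEX (both are R packages), and the function name `explain_cells` is hypothetical. To stay self-contained, it assumes a simple linear model, for which an additive contribution of each predictor relative to the average prediction can be written down directly; real model-agnostic methods estimate such contributions for arbitrary models.

```python
def explain_cells(rasters, coefs, intercept):
    """Compute per-cell additive contributions for a linear model.

    rasters: dict mapping predictor name -> list of cell values (flattened raster)
    coefs: dict mapping predictor name -> model coefficient
    Returns the baseline (prediction at the mean of all predictors) and,
    for each predictor, a list of contributions (one value per cell).
    """
    names = list(rasters)
    n_cells = len(rasters[names[0]])
    means = {n: sum(rasters[n]) / n_cells for n in names}
    baseline = intercept + sum(coefs[n] * means[n] for n in names)
    contributions = {
        n: [coefs[n] * (value - means[n]) for value in rasters[n]]
        for n in names
    }
    return baseline, contributions

# Example with two made-up predictors on a four-cell raster
rasters = {"elev": [1.0, 2.0, 3.0, 4.0], "temp": [10.0, 10.0, 20.0, 20.0]}
coefs = {"elev": 2.0, "temp": -0.5}
baseline, contributions = explain_cells(rasters, coefs, intercept=1.0)

# For every cell, baseline + contributions adds up to the model prediction
cell = 0
prediction = 1.0 + 2.0 * rasters["elev"][cell] - 0.5 * rasters["temp"][cell]
reconstructed = baseline + sum(contributions[n][cell] for n in contributions)
```

Each list in `contributions` can be reshaped back to the raster dimensions to obtain one map per predictor.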
Both packages are already used in our current research, but if you have any comments or suggestions on improving them, please let me know.
Talks and workshops
In November, I went to Seville to give a keynote talk and a workshop at the IIIRqueR conference (3rd Spanish Congress of R Users).[5] The conference was organized by the Spanish R community and was focused on the use of R in various fields, but it also excelled in making connections between the people from the R community. My talk “R’s Geospatial Kaleidoscope: Exploring Perspectives, Strengths, and Challenges” was a tour of different perspectives on spatial data analysis in R, including geographical, statistical, computational, visual, and domain-specific ones. Each of these perspectives has its own contexts, challenges, and achievements – I tried to show some of them and explain how they together create a mosaic of the R spatial landscape.
The workshop, “Machine learning approaches for working with spatial data”, started by giving a very simplified spatial machine learning workflow, including data preparation, model training, and prediction. Then, step by step, I included more and more details, focusing on how machine learning of spatial data differs from traditional machine learning. This included explicit spatial feature engineering, spatial hyperparameter tuning, feature selection, cross-validation, and spatial model explanations. The code examples were based on the mlr3 workflow, but the general concepts can be applied to other machine learning frameworks.
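One way spatial cross-validation differs from the usual random split can be sketched in a few lines of plain Python. This is not the mlr3 code used in the workshop; the function names `spatial_block_folds` and `train_test_splits` are hypothetical, and square blocks assigned to folds in a round-robin fashion are just one simple assumption among many possible blocking strategies.

```python
def spatial_block_folds(points, block_size, n_folds):
    """Assign each (x, y) point to a fold based on the square block it falls in.

    Points in the same block always share a fold, so nearby observations
    are not split between training and testing data.
    """
    block_ids = [(int(x // block_size), int(y // block_size)) for x, y in points]
    unique_blocks = sorted(set(block_ids))
    block_to_fold = {block: i % n_folds for i, block in enumerate(unique_blocks)}
    return [block_to_fold[block] for block in block_ids]

def train_test_splits(folds, n_folds):
    """Yield (train_indices, test_indices) for each fold."""
    for k in range(n_folds):
        test = [i for i, fold in enumerate(folds) if fold == k]
        train = [i for i, fold in enumerate(folds) if fold != k]
        yield train, test

# Example: six points falling into four 5 x 5 blocks, split into two folds
points = [(0.5, 0.5), (0.7, 0.2), (5.5, 0.5), (5.1, 0.9), (0.5, 5.5), (5.5, 5.5)]
folds = spatial_block_folds(points, block_size=5, n_folds=2)
splits = list(train_test_splits(folds, n_folds=2))
```

In a random split, the first two points (which are very close to each other) could easily end up on opposite sides of the train/test boundary; here they are always kept together.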
I started the following month by giving a talk at the GI-Forum – a seminar organized by the Institute for Geoinformatics at the University of Muenster. The talk, entitled “Exploration of spatial patterns in raster data”, was a summary of my previous work on describing, comparing, and analyzing spatial patterns in raster data. It was an opportunity to introduce my work to the local community and inform them about the PRISM project.[6]
Winter months
The remaining winter months were spent working on various ideas related to the PRISM project – I hope to share some of the results in the future. This work also included reading related literature, writing software, and submitting abstracts to conferences.[7] We were also visited by guests from other institutions, including Lorena Abad from the University of Salzburg (who works on concepts of vector data cubes for multidimensional data) and Martijn Tennekes from Statistics Netherlands (of the tmap package fame). I also tried to be useful to my research group by helping with the ongoing projects, providing some feedback on their work, and advising the local students.[8]
Teaching and local conferences
After a few gray months, spring finally came to Muenster, and my family and I spent a lot of time outside, also attending various local events. I also had my first opportunity to teach at the University of Muenster. In early March, Hanna Meyer and I gave a week-long course on spatial data analysis in R to students from the Institute for Landscape Ecology and the Institute for Geoinformatics. It started by introducing the students to the basic concepts of spatial data (data models, sources, and visualization) and then moved to more advanced topics, such as spatial data processing (both vector and raster), spatial transformations, and automation of spatial data processing. The course was a mix of lectures and hands-on exercises, where students could practice the concepts they learned. The last part of the course was dedicated to project work, where students could work on their own projects and present them at the end of the course. I was impressed by how much the students learned in such a short time and how well they were able to apply the concepts to their projects. You can find my slides about “Best practices in code organization” and “landscape metrics” online.
The month of March was also marked by other events. The first one was the FOSSGIS-Konferenz 2025 – a conference focused on open-source software for geoinformatics aimed at the German-speaking community. There, we presented a poster “An Inventory of Spatial Machine Learning Packages in R” – it was our group’s joint effort to summarize and compare the machine learning frameworks caret, mlr3, and tidymodels in R, and a few other packages that can be used for spatial machine learning tasks. The poster was accompanied by code examples demonstrating how to use these packages for spatial machine learning tasks. It was also a chance to meet the local community members and draw some good energy from them.
Finally, there was an event that, I think, perfectly concludes my first eight months at the University of Muenster: the Advances in Spatial Machine Learning 2025 workshop organized by Hanna Meyer and myself. It was a two-day event to which we invited leading researchers in spatial machine learning to talk about the current challenges and future directions of this evolving field. We focused on several key topics: validation and preservation of spatial patterns, analysis and communication of prediction uncertainties, comparability between different machine learning algorithms and frameworks, and standardized documentation protocols. The workshop was structured around a series of presentations and discussions, where participants shared their insights and current challenges in spatial machine learning. I learned a ton, and we plan to write down the workshop’s outcomes in a paper. The workshop was also an excellent opportunity to see some familiar faces and meet new people.
Looking ahead
The next few months will build on the progress already made. I plan to continue advancing the PRISM project, focusing on software development and writing reports summarizing my findings. I also plan to attend local courses and do some more teaching.
Moreover, I will attend two conferences in the summer. The first one is the Association of Geographic Information Laboratories in Europe (AGILE) Conference 2025 in Dresden, where I will give a talk about spatial autocorrelation in machine learning. The second one is the Living Planet Symposium 2025 in Vienna, where my group will present a poster on the standardized documentation of machine learning models. I look forward to meeting some of you there!
This blog post summarizes the first third of my MSCA-PF project. Thanks for reading it – I hope you found some of it interesting or useful.
Footnotes
[1] You can read more about the outcomes of these meetings at https://arxiv.org/pdf/2503.16686.
[2] The DMP describes how I will manage the data generated during the project, while the CDP describes my career goals and how I plan to achieve them.
[3] Again, you can read more about it in my previous blog post.
[4] This operation can be computationally expensive; thus, the package offers an option of making the explanations on an aggregated version of the raster data.
[5] Many thanks to Francisco Rodriguez Sanchez and the rest of the organizers for the invitation and their hospitality.
[6] In my opinion, it was my weakest talk that year, but I hope it was still useful for the audience.
[7] And the Geocomputation with Python book was published then.
[8] Additionally, I was doing general academic work, including reviewing conference abstracts, software, and journal papers.
Citation
@online{nowosad2025,
author = {Nowosad, Jakub},
title = {The {PRISM} Project’s Progress Report: 8 Months into the
{MSCA-PF} Journey},
date = {2025-04-13},
url = {https://jakubnowosad.com/posts/2025-04-13-msca-bp2/},
langid = {en}
}