Investigating Moran’s I Properties for Spatial Machine Learning

Jakub Nowosad, Hanna Meyer

The 28th AGILE Conference

2025-06-11

Spatial Machine Learning

Traditional machine learning models (e.g., SVM, RF, GBM) lack inherent spatial awareness

Ignoring spatial structure can lead to poor predictive performance, biased predictions, or poor generalization

Incorporating spatial information:

Add spatial proxies (e.g., coordinates, Euclidean distances) as predictors
Use distance-based spatial predictors or spatial weighting matrices
Apply spatially-aware cross-validation for feature selection and tuning
Use spatially-enhanced models (e.g., Geographical RF, RF-GLS)
Use spatially-aware metrics (e.g., Moran’s I) to understand spatial autocorrelation and assess model performance
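The first option in the list above, spatial proxies, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `add_spatial_proxies`, the dictionary layout of the samples, and the choice of reference points are all assumptions made for the example.

```python
import math

def add_spatial_proxies(rows, reference_points):
    """Append coordinates and Euclidean distances to reference points as predictors.

    rows: list of dicts with keys "x", "y", and "covariates" (list of floats).
    reference_points: list of (x, y) tuples, e.g. corners of the study area
    (a hypothetical choice for this sketch).
    """
    augmented = []
    for row in rows:
        distances = [math.hypot(row["x"] - rx, row["y"] - ry)
                     for rx, ry in reference_points]
        # Original covariates, then x, y, then one distance per reference point
        augmented.append(row["covariates"] + [row["x"], row["y"]] + distances)
    return augmented

samples = [
    {"x": 0.0, "y": 0.0, "covariates": [1.5, 2.0]},
    {"x": 3.0, "y": 4.0, "covariates": [0.5, 1.0]},
]
corners = [(0.0, 0.0), (10.0, 0.0)]
features = add_spatial_proxies(samples, corners)
```

Each row of `features` can then be passed to any non-spatial learner, which sees location only through these extra columns.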

Moran’s I for SML

Moran’s I is used to assess spatial autocorrelation before and after modeling

Pre-modeling: helps understand spatial structure in the data
Post-modeling: applied to residuals to assess model performance

What are the properties and limitations of Moran’s I when applied to spatial machine learning?

Moran’s I

\[ I = \frac{n}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}} \times \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x}) (x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

\(n\): number of observations
\(x_i\), \(x_j\): values of the observations at locations \(i\) and \(j\)
\(\bar{x}\): mean value of the observations
\(w_{ij}\): spatial weight between the observations at locations \(i\) and \(j\)

The spatial weights define which observations are treated as neighbors.

Many weighting schemes exist (e.g., contiguity, distance bands, k nearest neighbors), and the choice affects the value of Moran’s I.
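The formula above translates directly into code. The sketch below assumes the weights are given as a full n × n matrix (a list of lists); the function name `morans_i` and the four-location toy example are illustrative only.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations from the mean
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four locations on a line; each location neighbors the adjacent ones
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, 5.0, 5.0], chain)    # about 0.33: similar values adjoin
alternating = morans_i([1.0, 5.0, 1.0, 5.0], chain)  # about -1.0: neighbors always differ
```

Changing `chain` to a different neighborhood definition changes both results, which is why the weight definition matters when reporting Moran’s I.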

Simulation Setup

Three ranges of spatial autocorrelation (10, 50, 100 units)

\[ Y = X_1 + X_2 \cdot X_3 + X_4 + X_5 \cdot X_6 + \mathcal{E} \]

The simulation for each range was repeated 100 times.
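The outcome equation above can be applied cell by cell to flattened rasters. In this sketch the covariates and the error term are independent random draws used purely as placeholders; generating fields with a given autocorrelation range is outside its scope, and the name `simulate_outcome`, the seed, and the noise scale are assumptions.

```python
import random

def simulate_outcome(covariates, noise):
    """Compute Y = X1 + X2*X3 + X4 + X5*X6 + E for every cell.

    covariates: list of six equally long lists (X1..X6 as flattened rasters).
    noise: list of the same length (the error term E).
    """
    x1, x2, x3, x4, x5, x6 = covariates
    return [a + b * c + d + e * f + eps
            for a, b, c, d, e, f, eps in zip(x1, x2, x3, x4, x5, x6, noise)]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n_cells = 25
# Placeholder layers: NOT spatially autocorrelated, unlike the simulated rasters
covariates = [[rng.gauss(0, 1) for _ in range(n_cells)] for _ in range(6)]
noise = [rng.gauss(0, 0.1) for _ in range(n_cells)]
y = simulate_outcome(covariates, noise)
```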

Modeling Setup

Four training set sizes
Two training set sampling types
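The two sampling types can be illustrated as random versus clustered selection of grid cells. This is a sketch under assumptions: the clustering scheme (cells within a fixed radius of a few random centers), the grid size, and all parameter values are hypothetical and not taken from the study.

```python
import random

def random_sample(n, width, height, rng):
    """Draw n distinct cells uniformly from a width x height grid."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    return rng.sample(cells, n)

def clustered_sample(n, width, height, n_clusters, radius, rng):
    """Draw n distinct cells lying within `radius` cells of a few random centers."""
    centers = [(rng.randrange(width), rng.randrange(height))
               for _ in range(n_clusters)]
    candidates = [(x, y) for x in range(width) for y in range(height)
                  if any((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
                         for cx, cy in centers)]
    return rng.sample(candidates, n)

rng = random.Random(1)
train_random = random_sample(50, 100, 100, rng)
train_clustered = clustered_sample(50, 100, 100, n_clusters=5, radius=10, rng=rng)
```

Both functions return the same number of points, but the clustered points are packed much more densely within a small part of the grid, so their nearest neighbors are closer together.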

Random Forest modeling approach:

Extracted covariate and outcome values from rasters for training samples
Trained Random Forest (RF) models with 500 trees
Tuned mtry parameter (values: 2, 3, 4, 5, 6)
Selected final model based on lowest RMSE from out-of-bag (OOB) samples

Total number of models: 2400
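The total is consistent with a full factorial design over the settings listed earlier, assuming one final model per combination. In the sketch below the training set sizes are placeholder labels, because the slides give only their count.

```python
from itertools import product

ranges = [10, 50, 100]                 # autocorrelation ranges from the simulation setup
repetitions = range(1, 101)            # 100 repetitions
train_sizes = ["size_1", "size_2", "size_3", "size_4"]  # placeholders: only the count is given
sampling_types = ["random", "clustered"]

# One final model per combination of range, repetition, size, and sampling type
design = list(product(ranges, repetitions, train_sizes, sampling_types))
n_models = len(design)  # 3 * 100 * 4 * 2 = 2400
```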

Validation Setup

Complete validation raster
Four test set sizes
Two test set sampling types

Model Evaluation Metrics

RMSE

\[ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \]
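A direct transcription of the RMSE formula, with a small made-up example to show the arithmetic (the function name and the numbers are illustrative only):

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))

# Errors are 1, 0, and -2, so RMSE = sqrt((1 + 0 + 4) / 3), about 1.291
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```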

Moran’s I

\[ I = \frac{n}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}} \times \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x}) (x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

Here, we focus on the residuals of the model predictions, and thus:

\[ x_i = y_i - \hat{y}_i \]

Neighbors for Moran’s I were defined as the eight closest cells (complete raster) or the eight closest point samples (training and testing sets).
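Moran’s I of residuals with an eight-nearest-neighbor definition can be sketched as follows. Binary weights (1 for each of the k nearest points, 0 otherwise) are an assumption of this sketch; with a fixed k, binary and row-standardized weights give the same value of I. The function names and the 5 × 5 toy grid are illustrative.

```python
def knn_weights(coords, k=8):
    """Binary weights: w[i][j] = 1 if j is among the k nearest points to i."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (coords[j][0] - xi) ** 2 + (coords[j][1] - yi) ** 2,
        )
        for j in others[:k]:
            weights[i][j] = 1.0
    return weights

def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / w_sum) * (num / sum(d * d for d in dev))

def residual_morans_i(coords, observed, predicted, k=8):
    """Moran's I of the residuals x_i = y_i - y_hat_i with k-nearest-neighbor weights."""
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    return morans_i(residuals, knn_weights(coords, k))

# Toy case: the outcome has a west-east trend that a constant prediction misses,
# so the residuals keep the trend and are positively autocorrelated
coords = [(x, y) for x in range(5) for y in range(5)]
observed = [float(x) for x, _ in coords]
predicted = [2.0] * len(coords)
i_residuals = residual_morans_i(coords, observed, predicted)
```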

Model 45: range 100, 500 random training samples, 500 random testing samples

RMSE of the training sample follows the RMSE of the complete raster

Moran’s I of the training sample is much lower than that of the complete raster (but slightly higher than that of the testing sample)

The variability of the testing-sample RMSE around the complete-raster RMSE decreases as the sample size increases

The variability of the testing-sample Moran’s I around the complete-raster Moran’s I also decreases as the sample size increases, but its values shift as well

More testing samples result in higher values of Moran’s I
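One way to see why sample density matters: with a fixed number of nearest neighbors, sparser samples pair each point with neighbors that are farther away and therefore less similar. The sketch below is a deterministic toy, not the study's simulation. It subsamples a smooth synthetic field on a regular stride (an assumption made for reproducibility; the study draws random and clustered samples), and the field, grid size, and function names are hypothetical.

```python
import heapq
import math

def morans_i_knn(coords, values, k=8):
    """Moran's I with binary k-nearest-neighbor weights (w_ij = 1 for the k nearest j)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = 0.0
    for i, (xi, yi) in enumerate(coords):
        nearest = heapq.nsmallest(
            k, (j for j in range(n) if j != i),
            key=lambda j: (coords[j][0] - xi) ** 2 + (coords[j][1] - yi) ** 2,
        )
        num += sum(dev[i] * dev[j] for j in nearest)
    den = sum(d * d for d in dev)
    # The sum of all weights is n * k because every point has exactly k neighbors
    return (n / (n * k)) * (num / den)

def smooth_field(x, y, wavelength=16):
    """Synthetic smooth surface standing in for spatially structured residuals."""
    return (math.sin(2 * math.pi * x / wavelength)
            + math.sin(2 * math.pi * y / wavelength))

def subsample(size, stride):
    """Keep every `stride`-th cell of a size x size grid in both directions."""
    coords = [(x, y) for x in range(0, size, stride) for y in range(0, size, stride)]
    return coords, [smooth_field(x, y) for x, y in coords]

dense_i = morans_i_knn(*subsample(32, 1))   # 1024 points, neighbors one cell apart
sparse_i = morans_i_knn(*subsample(32, 4))  # 64 points, neighbors four cells apart
```

In this toy setting `dense_i` is larger than `sparse_i`, even though both are computed from the same underlying surface.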

Moran’s I values are different between random and cluster sampling of the testing set

For cluster sampling of the testing set, the correlation between Moran’s I and RMSE values increases with the sample size

For random sampling of the testing set, the correlation is low or non-existent

Conclusions

Moran’s I is highly sensitive to spatial weight definitions (e.g., neighborhood choice), so please report the definition used
In spatial ML, Moran’s I can be useful for assessing the spatial autocorrelation of residuals in the testing set
However, unlike RMSE, Moran’s I for the testing set does not reflect overall prediction performance
Instead, it is influenced by the sampling strategy and sample size (sampling density). It indicates how well the model captures spatial structure in the testing set, typically at a much finer scale than the resolution of the complete raster
Therefore, Moran’s I should not be used to compare performance across different studies. However, it may be useful for comparing models within the same study

Contact

Website: https://jakubnowosad.com

Mastodon: fosstodon.org/@nowosad

Resources

Slides: https://jakubnowosad.com/agile-gi2025/

Paper: https://doi.org/10.5194/agile-giss-6-40-2025

Code examples: https://github.com/Nowosad/moran-i-spatial-ml-prelim