
2026-05-06, EGU 2026, Vienna

Moran’s I measures similarity among neighboring residuals
\[ I = \frac{n}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}} \times \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x}) (x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]
The spatial weights \(w_{ij}\) define which observations are considered neighbors.
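
The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the talk; the function name `morans_i` and the toy weight matrix are assumptions made for the example.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x (length n) and an n x n weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    num = np.sum(w * np.outer(z, z))      # sum_i sum_j w_ij z_i z_j
    den = np.sum(z ** 2)                  # sum_i z_i^2
    return (x.size / w.sum()) * (num / den)

# Toy example: four points on a line, each linked to its adjacent points
x = [1.0, 2.0, 3.0, 4.0]
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(morans_i(x, w))
```

For this smoothly increasing toy series the result is positive (1/3), because adjacent values deviate from the mean in the same direction more often than not.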

kNN: fixed number of neighbors, variable geographic extent
Larger k captures broader spatial structure
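
As a rough sketch of how kNN weights can be built, the snippet below marks the k nearest points of each observation with a weight of 1. The function name `knn_weights` and the toy coordinates are assumptions for illustration; the talk's software may construct the weights differently.

```python
import numpy as np

def knn_weights(coords, k):
    """Binary n x n matrix: w[i, j] = 1 if j is among the k nearest points to i."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    w = np.zeros(dist.shape)
    rows = np.repeat(np.arange(len(coords)), k)
    w[rows, nearest.ravel()] = 1.0
    return w

# Toy example: unevenly spaced points on a line
coords = [[0, 0], [1, 0], [3, 0], [7, 0]]
w_knn = knn_weights(coords, k=1)
print(w_knn.sum(axis=1))   # every point has exactly k neighbors
```

Every row sums to k, but the distance to the neighbor differs (1 unit for the first point, 4 units for the last), and the matrix is generally not symmetric.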


Distance-based: fixed geographic extent, variable number of neighbors
Smaller distance captures finer spatial structure, larger distance captures broader structure
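
A distance-based weight matrix can be sketched in the same spirit: all points within a chosen threshold are neighbors. The function name `distance_band_weights`, the threshold, and the toy coordinates are assumptions for illustration only.

```python
import numpy as np

def distance_band_weights(coords, threshold):
    """Binary n x n matrix: w[i, j] = 1 if points i and j are within threshold."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)                     # a point is not its own neighbor
    return w

# Toy example: unevenly spaced points on a line
coords = [[0, 0], [1, 0], [3, 0], [7, 0]]
w_dist = distance_band_weights(coords, threshold=2.5)
print(w_dist.sum(axis=1))   # number of neighbors per point
```

Here the geographic extent is fixed, but the neighbor count varies from point to point; the isolated point at x = 7 has no neighbors at all for this threshold.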

Semivariogram represents dissimilarity of observations as a function of distance, capturing spatial structure across distances
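
A minimal sketch of an empirical semivariogram, assuming the classical estimator (half the mean squared difference of all point pairs falling in a distance bin). The function name `empirical_semivariogram` and the binning scheme are assumptions; the slides do not specify the estimator or the bins.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Semivariance per distance bin (bin_edges[b], bin_edges[b + 1]]."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)     # every pair of points once
    dist = np.sqrt(((coords[i] - coords[j]) ** 2).sum(axis=1))
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = (dist > bin_edges[b]) & (dist <= bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sqdiff[in_bin].mean()
    return gamma

# Toy example: four points on a line with increasing values
coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(coords, values, bin_edges=[0, 1, 2, 3])
print(gamma)
```

In this toy case the semivariance grows with distance, which is the pattern expected when nearby observations are more alike than distant ones.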

SSVR (Kerry and Oliver, 2008): share of spatially structured variance
An overall summary; ignores at which distances the structure occurs
Closer to 1 means stronger spatial structure
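
One common way to write such a share, assumed here because the slides do not state the formula, uses the parameters of a fitted variogram model, with \(c_0\) the nugget and \(c_1\) the partial sill (both symbols are introduced only for this illustration):
\[ \mathrm{SSVR} = \frac{c_1}{c_0 + c_1} \]
Under this assumed form, a pure nugget model (\(c_1 = 0\)) gives 0 and a model without a nugget (\(c_0 = 0\)) gives 1.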

AUC of variogram (Poggio et al., 2019)
Integrates the spatial structure across distances, providing a single summary metric
Larger AUC values indicate stronger spatial dependence
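
The integration step can be sketched with the trapezoidal rule over an empirical semivariogram. This only illustrates the integration itself: the function name `variogram_auc` is hypothetical, and the normalisation or reference level that makes larger values correspond to stronger spatial dependence is not given in the slides and is not reproduced here.

```python
import numpy as np

def variogram_auc(lags, gamma):
    """Area under the semivariogram curve via the trapezoidal rule."""
    lags = np.asarray(lags, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    keep = ~np.isnan(gamma)                      # drop empty distance bins
    lags, gamma = lags[keep], gamma[keep]
    return float(np.sum(0.5 * (gamma[1:] + gamma[:-1]) * np.diff(lags)))

# Toy example: semivariance rising linearly over three lags
lags = [1.0, 2.0, 3.0]
gamma = [2.0, 4.0, 6.0]
print(variogram_auc(lags, gamma))
```

Because the area depends on every lag, the result reflects structure at all distances rather than at one chosen neighborhood.
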
Simulated rasters with three autocorrelation ranges (10, 50, 100 units)
Random forest models fitted on samples of 500 points

Diagnostic metrics:
\[ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \]
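
The RMSE formula translates directly into code. A minimal sketch, with the function name `rmse` and the toy values chosen only for illustration:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy example: residuals are 1, 0 and -2
y = [3.0, 5.0, 7.0]
y_hat = [2.0, 5.0, 9.0]
print(rmse(y, y_hat))
```
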
Metrics calculated on testing residuals are mostly comparable to those calculated on complete rasters
An exception is Moran’s I (kNN), with testing values much lower than complete values
For most metrics, variability decreases as test set size grows, while mean values stay stable
The exception is again Moran’s I (kNN), whose mean values increase strongly with larger test set size

Clustered sampling affects all metrics, leading to higher variability and often incorrect mean values.
Based on the results for range = 100 and testing size = 500 with random sampling
| Metric | Testing | Complete |
|---|---|---|
| Moran's I (kNN) | 0.36 | 0.43 |
| Moran's I (distance) | 0.25 | 0.37 |
| SSVR | 0.39 | 0.32 |
| AUC | 0.98 | 0.90 |
Variogram AUC shows the strongest correlation with RMSE. As a multiscale summary of spatial structure, it captures the overall spatial autocorrelation of residuals, which is closely related to model performance.
| Metric | What it tells | Pitfall |
|---|---|---|
| Moran’s I (kNN) | Autocorrelation at some distance (?) (under fixed neighbor count) | Highly sensitive to k, sample size, and sampling design |
| Moran’s I (distance) | Autocorrelation within a chosen distance range | Sensitive to distance thresholds; more computationally demanding |
| SSVR | Share of spatially structured variance | Requires variogram fit; unstable with small n |
| Variogram AUC | Overall spatial structure across distances | Can track RMSE closely (maybe redundant?) |
Website: https://jakubnowosad.com
Slides: https://jakubnowosad.com/egu2026
Software: http://jakubnowosad.com/sacmetrics/


