EON Summer School 2024
2024-09-05
Website: https://jakubnowosad.com/
Discovering and describing patterns is a vital part of many spatial analyses. However, spatial data is gathered in many ways and stored in many forms, which requires different approaches to understanding spatial patterns.
Discovering and describing spatial patterns is an important part of many geographical studies, and spatial patterns are linked to natural and social processes.
Evaluation of the susceptibility of forest landscapes to agricultural expansion
Bourgoin et al., 2020, 10.1016/j.jag.2019.101958
Reinterpretation of histological images as categorized rasters and their use for disease classification (e.g., liver cancer)
Kendall et al., 2020, 10.1038/s41598-020-74691-9
Spatial patterns can be quantified using landscape metrics (O’Neill et al. 1988; Turner and Gardner 1991; Li and Reynolds 1993; He et al. 2000; Jaeger 2000; Kot et al. 2006; McGarigal 2014).
Software such as FRAGSTATS, GuidosToolbox, or landscapemetrics has proven useful in many scientific studies (> 12,000 citations).
There is a relationship between an area’s pattern composition and configuration and ecosystem characteristics, such as vegetation diversity, animal distributions, and water quality within this area (Hunsaker and Levine, 1995; Fahrig and Nuttle, 2005; Klingbeil and Willig, 2009; Holzschuh et al., 2010; Fahrig et al., 2011; Carrara et al., 2015; Arroyo-Rodríguez et al., 2016; Duflot et al., 2017; and many others).
I randomly selected 16 rasters with different proportions of forest (green) areas:
Important considerations:
Helpful resources:
SHDI:
AI:
Chapter “The landscapemetrics and motif packages for measuring landscape patterns and processes”
library(landscapemetrics)
library(terra)
r9 = rast("exdata/r9.tif")
r1 = rast("exdata/r1.tif")
plot(r1); plot(r9)
lsm_l_shdi(r9)
# A tibble: 1 × 6
  layer level     class    id metric value
  <int> <chr>     <int> <int> <chr>  <dbl>
1     1 landscape    NA    NA shdi    1.06
lsm_l_ai(r9)
# A tibble: 1 × 6
  layer level     class    id metric value
  <int> <chr>     <int> <int> <chr>  <dbl>
1     1 landscape    NA    NA ai      82.1
calculate_lsm(r9, what = c("lsm_l_ai", "lsm_l_shdi"))
# A tibble: 2 × 6
  layer level     class    id metric value
  <int> <chr>     <int> <int> <chr>  <dbl>
1     1 landscape    NA    NA ai     82.1
2     1 landscape    NA    NA shdi    1.06
calculate_lsm(c(r1, r9), what = c("lsm_l_ai", "lsm_l_shdi"))
# A tibble: 4 × 6
  layer level     class    id metric   value
  <int> <chr>     <int> <int> <chr>    <dbl>
1     1 landscape    NA    NA ai     98.7
2     1 landscape    NA    NA shdi    0.0811
3     2 landscape    NA    NA ai     82.1
4     2 landscape    NA    NA shdi    1.06
mat_window = matrix(1, nrow = 11, ncol = 11)
w_result = window_lsm(r9, window = mat_window, what = "lsm_l_ai")
plot(r9); plot(w_result$layer_1$lsm_l_ai)
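The SHDI values reported above depend only on the proportions of the classes. The following is a minimal base-R sketch of that calculation, assuming the usual definition SHDI = -Σ p_i ln(p_i) with the natural logarithm; `toy` is a hypothetical matrix standing in for a categorical raster and is not part of the workshop data.

```r
# Hypothetical 4 x 4 categorical raster with three classes (illustration only)
toy = matrix(c(1, 1, 2, 2,
               1, 1, 2, 2,
               1, 1, 3, 3,
               1, 1, 3, 3), nrow = 4, byrow = TRUE)

# Proportion of cells that belong to each class
p = as.numeric(table(toy)) / length(toy)

# Shannon's diversity index (assumption: natural logarithm)
shdi = -sum(p * log(p))
shdi
```

A raster dominated by a single class gives a value close to 0, while more classes with more even proportions give larger values.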
https://r-spatialecology.github.io/landscapemetrics/
list_lsm()
# A tibble: 133 × 5
   metric name                                 type           level function_name
   <chr>  <chr>                                <chr>          <chr> <chr>
 1 area   patch area                           area and edge… patch lsm_p_area
 2 cai    core area index                      core area met… patch lsm_p_cai
 3 circle related circumscribing circle        shape metric   patch lsm_p_circle
 4 contig contiguity index                     shape metric   patch lsm_p_contig
 5 core   core area                            core area met… patch lsm_p_core
 6 enn    euclidean nearest neighbor distance  aggregation m… patch lsm_p_enn
 7 frac   fractal dimension index              shape metric   patch lsm_p_frac
 8 gyrate radius of gyration                   area and edge… patch lsm_p_gyrate
 9 ncore  number of core areas                 core area met… patch lsm_p_ncore
10 para   perimeter-area ratio                 shape metric   patch lsm_p_para
# ℹ 123 more rows
1. Read the exdata/lc_small.tif raster and visualize it. What is the location of the data? What are the extent of the data and its spatial resolution? How many categories does it contain?
2. Read the exdata/lc_small2.tif raster, calculate AI and TE for this raster, and compare the results with the previous raster.
3. Using the read_sf() function from the sf package, read the exdata/points.gpkg file. Next, calculate SHDI and AI of an area of 3000 meters from each sampling point (see the sample_lsm() function).

Type | Landscape-level metrics |
---|---|
Shape | PAFRAG; CONTIG AM; CONTIG RA |
Aggregation | AI; CONTAG; IJI; PLATJ; PD; DIVISION; LPI |
Connectivity | COHESION |
Diversity | SHDI; SIDI; MSIDI; SHEI; SIEI; MSIEI |
PC1:
PC2:
The result makes it possible to distinguish between:
However, there are still some problems here…
PC1:
PC2:
Issues with the PCA approach:
Entropy:
Relative mutual information:
A 2D parametrization of categorical rasters’ configurations based on two weakly correlated IT metrics groups similar patterns into distinct regions of the parameter space.
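To make the two IT metrics more concrete, here is a minimal base-R sketch that derives marginal entropy and relative mutual information from a co-occurrence matrix of adjacent cell pairs. The matrix `coma` is hypothetical, and the sketch assumes base-2 logarithms and the definitions I(y, x) = H(x) + H(y) - H(x, y) and U = I(y, x) / H(y); lsm_l_ent() and lsm_l_relmutinf() have their own neighborhood and ordering options, so their results may differ in detail.

```r
# Hypothetical co-occurrence counts of adjacent cell pairs for three classes
coma = matrix(c(40,  5,  5,
                 5, 20,  5,
                 5,  5, 10), nrow = 3, byrow = TRUE)

# Shannon entropy in bits; zero probabilities contribute nothing
entropy = function(p) {
  p = p[p > 0]
  -sum(p * log2(p))
}

joint = coma / sum(coma)  # joint probabilities of (focal, neighbor) classes
px = rowSums(joint)       # marginal distribution of focal classes
py = colSums(joint)       # marginal distribution of neighbor classes

ent = entropy(py)                                    # marginal entropy H(y)
mutinf = entropy(px) + entropy(py) - entropy(joint)  # mutual information I(y, x)
relmutinf = mutinf / ent                             # relative mutual information U
c(ent = ent, relmutinf = relmutinf)
```

Marginal entropy describes the diversity of categories (composition), while relative mutual information describes how strongly neighboring categories depend on each other (configuration).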
Land cover data:
Parametrization using two IT metrics:
1. Marginal entropy and relative mutual information can be calculated with lsm_l_ent() and lsm_l_relmutinf(). Calculate both of these metrics for the exdata/lc_small.tif raster.
2. Read the exdata/lc_europe.tif raster using rast() from the terra package and the exdata/polygons.gpkg vector data using the read_sf() function from the sf package. Calculate the marginal entropy and relative mutual information for each polygon using the sample_lsm() function.
3. Create a regular grid of squares with the st_make_grid() function from the sf package for the area from the exdata/polygons.gpkg file. Calculate the marginal entropy and relative mutual information for each square using the sample_lsm() function. Visualize the results.

These metrics still leave some questions open…
Parametrization using two IT metrics:
In recent years, the ideas of analyzing spatial patterns have been extended through an approach called pattern-based spatial analysis (Long et al. 2010; Cardille et al. 2010; Cardille et al. 2012; Jasiewicz et al. 2013; Jasiewicz et al. 2015).
The fundamental idea is to divide data into a large number of smaller areas (local landscapes).
Next, represent each area using a statistical description of the spatial pattern - a spatial signature.
Spatial signatures can be compared using a large number of existing distance or dissimilarity measures (Lin 1991; Cha 2007).
This approach enables spatial analyses such as searching, change detection, clustering, or segmentation.
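The first two steps (dividing the data into local landscapes and describing each with a signature) can be sketched in base R. This is only an illustration under stated assumptions: `r` is a hypothetical random categorical raster stored as a matrix, `size` is a made-up window size, and the signature used here is the simplest possible one (class proportions), which captures composition but not configuration.

```r
set.seed(1)
# Hypothetical 20 x 20 categorical raster with three classes
r = matrix(sample(1:3, 400, replace = TRUE), nrow = 20)
size = 10  # side length of one local landscape, in cells

# Identifier of the local landscape that each cell falls into
block_row = (row(r) - 1) %/% size
block_col = (col(r) - 1) %/% size
block_id = block_row * (ncol(r) %/% size) + block_col + 1

# Composition signature: class proportions within each local landscape
signatures = t(sapply(split(r, block_id), function(v) {
  as.numeric(table(factor(v, levels = 1:3))) / length(v)
}))
signatures
```

Each row of `signatures` describes one local landscape and sums to 1, so the rows can be compared directly with a distance or dissimilarity measure.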
Most landscape metrics are single numbers representing specific features of a local landscape.
Spatial signatures, on the other hand, are multi-element representations of landscape composition and configuration.
The basic signature is the co-occurrence matrix:
| | agriculture | forest | grassland | water |
---|---|---|---|---|
agriculture | 272 | 218 | 4 | 0 |
forest | 218 | 38778 | 32 | 12 |
grassland | 4 | 32 | 16 | 0 |
water | 0 | 12 | 0 | 2 |
A spatial signature should allow simplification to the form of a normalized vector.
272 | 218 | 4 | 0 | 218 | 38778 | 32 | 12 | 4 | 32 | 16 | 0 | 0 | 12 | 0 | 2 |
136 | 218 | 19389 | 4 | 32 | 8 | 0 | 12 | 0 | 1 |
0.0069 | 0.011 | 0.9792 | 0.0002 | 0.0016 | 0.0004 | 0 | 0.0006 | 0 | 0.0001 |
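The three rows above can be reproduced from the co-occurrence matrix in a few lines of base R. This sketch assumes, as the numbers suggest, that each unordered pair of categories is kept once (upper triangle of the matrix), that the diagonal counts are halved because same-category pairs are counted in both directions, and that the vector is then divided by its sum.

```r
classes = c("agriculture", "forest", "grassland", "water")
coma = matrix(c(272,   218,  4,  0,
                218, 38778, 32, 12,
                  4,    32, 16,  0,
                  0,    12,  0,  2),
              nrow = 4, byrow = TRUE, dimnames = list(classes, classes))

# Same-category pairs are counted twice on the diagonal, so halve them
diag(coma) = diag(coma) / 2

# Keep each unordered pair of categories once (upper triangle, column by column)
cove_counts = coma[upper.tri(coma, diag = TRUE)]

# Normalize so that the vector sums to 1
cove = cove_counts / sum(cove_counts)
round(cove, 4)
```

The normalized vector no longer depends on the size of the area, which is what makes signatures of different local landscapes comparable.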
Measuring the distance between two signatures in the form of normalized vectors allows determining dissimilarity between spatial structures.
0.0069 | 0.011 | 0.9792 | 0.0002 | 0.0016 | 0.0004 | 0 | 0.0006 | 0 | 0.0001 |
0.1282 | 0.0609 | 0.8105 | 0.0002 | 0.0002 | 0.0001 | 0 | 0 | 0 | 0 |
\[JSD(A, B) = H(\frac{A + B}{2}) - \frac{1}{2}[H(A) + H(B)]\]
Jensen-Shannon distance between the above rasters: 0.0684
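The formula above translates directly into base R. The sketch below assumes that H is the Shannon entropy with a base-2 logarithm and uses the two rounded vectors shown above as `A` and `B`; the helper names `entropy` and `jsd` are introduced here for illustration only.

```r
# Shannon entropy in bits; zero probabilities contribute nothing
entropy = function(p) {
  p = p[p > 0]
  -sum(p * log2(p))
}

# JSD(A, B) = H((A + B) / 2) - (H(A) + H(B)) / 2
jsd = function(a, b) {
  entropy((a + b) / 2) - (entropy(a) + entropy(b)) / 2
}

A = c(0.0069, 0.011, 0.9792, 0.0002, 0.0016, 0.0004, 0, 0.0006, 0, 0.0001)
B = c(0.1282, 0.0609, 0.8105, 0.0002, 0.0002, 0.0001, 0, 0, 0, 0)
jsd(A, B)
```

With these rounded inputs the result is close to the 0.0684 reported above; identical signatures give 0, and with base-2 logarithms the value cannot exceed 1.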
Measuring the distance between two signatures in the form of normalized vectors allows determining dissimilarity between spatial structures.
0.0069 | 0.011 | 0.9792 | 0.0002 | 0.0016 | 0.0004 | 0 | 0 | 0 | 0 | 0 | 0.0006 | 0 | 0 | 0.0001 |
0.2033 | 0.1335 | 0.2944 | 0.1747 | 0.0562 | 0.1307 | 0.0035 | 0.0002 | 0.0004 | 0.0015 | 0.0007 | 0.0005 | 0 | 0 | 0.0005 |
\[JSD(A, B) = H(\frac{A + B}{2}) - \frac{1}{2}[H(A) + H(B)]\]
Jensen-Shannon distance between the above rasters: 0.444
Knowing the distance between spatial signatures can be used in several contexts (Nowosad, 2021, 10.1007/s10980-020-01135-0):
one-to-many
finding similar spatial structures
one-to-one
quantitative assessment of changes in spatial structures
many-to-many
clustering similar spatial structures
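The three contexts can be illustrated with a small distance matrix. This is a hedged sketch in base R: the four signatures and their names are hypothetical, the Jensen-Shannon divergence with base-2 logarithms is just one possible dissimilarity measure, and average-linkage hierarchical clustering is an arbitrary choice made for the example.

```r
entropy = function(p) {
  p = p[p > 0]
  -sum(p * log2(p))
}
jsd = function(a, b) entropy((a + b) / 2) - (entropy(a) + entropy(b)) / 2

# Hypothetical normalized signatures of four local landscapes
signatures = rbind(
  forest_a = c(0.90, 0.05, 0.05),
  forest_b = c(0.85, 0.10, 0.05),
  urban_a  = c(0.10, 0.10, 0.80),
  urban_b  = c(0.05, 0.15, 0.80)
)

# Pairwise dissimilarity matrix
n = nrow(signatures)
d = matrix(0, n, n, dimnames = list(rownames(signatures), rownames(signatures)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] = jsd(signatures[i, ], signatures[j, ])
  }
}

# One-to-many: rank all other areas by similarity to a query area
search_result = sort(d["forest_a", -1])

# One-to-one: dissimilarity between two specific areas (or two dates)
change = d["forest_a", "urban_a"]

# Many-to-many: group areas with similar signatures
clusters = cutree(hclust(as.dist(d), method = "average"), k = 2)
clusters
```

In this toy case the search ranks `forest_b` as the most similar area to `forest_a`, and the clustering separates the two forest-dominated areas from the two urban-dominated ones.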
Finding areas with similar topography to the Suwalski Landscape Park.
The map above shows that many areas in the Amazon have undergone significant land cover changes between 1992 and 2018.
The challenge now is to determine which areas have changed the most.
Areas with the greatest change have the highest dissimilarity values.
Importantly, changes in both category and spatial configuration are measured.
Areas in Africa with similar spatial structures for two themes have been identified - land cover and landforms.
The quality of each cluster can be assessed using metrics:
library(terra)
library(motif)
r9 = rast("exdata/r9.tif")
r9_sign_coma = lsp_signature(r9, type = "coma")
r9_sign_coma
# A tibble: 1 × 3
     id na_prop signature
* <int>   <dbl> <list>
1     1       0 <int [7 × 7]>
r9_sign_coma$signature
[[1]]
     1     2    3   4  5 6   7
1 7162  1176  998 129  4 5  30
2 1176 21110  984 104 12 2  63
3  998   984 3198  79  0 1  88
4  129   104   79 234  0 0   0
5    4    12    0   0 12 0   0
6    5     2    1   0  0 0   0
7   30    63   88   0  0 0 534
r9_sign_cove = lsp_signature(r9, type = "cove")
r9_sign_cove
# A tibble: 1 × 3
     id na_prop signature
* <int>   <dbl> <list>
1     1       0 <dbl [1 × 28]>
r9_sign_cove$signature
[[1]]
          [,1]       [,2]      [,3]       [,4]       [,5]       [,6]
[1,] 0.1808586 0.05939394 0.5330808 0.05040404 0.04969697 0.08075758
            [,7]        [,8]        [,9]       [,10]        [,11]        [,12]
[1,] 0.006515152 0.005252525 0.003989899 0.005909091 0.0002020202 0.0006060606
     [,13] [,14]        [,15]        [,16]        [,17]         [,18] [,19]
[1,]     0     0 0.0003030303 0.0002525253 0.0001010101 0.00005050505     0
     [,20] [,21]       [,22]       [,23]       [,24] [,25] [,26] [,27]
[1,]     0     0 0.001515152 0.003181818 0.004444444     0     0     0
          [,28]
[1,] 0.01348485
1. Read the exdata/harz_borders.gpkg file using the read_sf() function from the sf package.
2. Read the exdata/lc_europe.tif file using the rast() function from the terra package. Visualize both datasets.

Using spatial signatures to compare and evaluate digital soil maps (Rossiter et al., 2022, 10.5194/soil-8-559-2022)
Changes in spatial patterns resulting from the inclusion of small woody elements in land use maps (Golicz et al., 2021, 10.3390/land10101028)
Depending on the problem:
Challenges:
WorldClim version 2.1 climate data
for 1970-2000
CMIP6 downscaled future climate projection for 2061-2080 [model: CNRM-ESM2-1; ssp: “585”]
Minimum temperature (°C)
https://jakubnowosad.com/spquery/
How to find and compare areas with similar spatial patterns in non-categorical rasters (e.g., raster time-series)?
https://jakubnowosad.com/patternogram/, https://jakubnowosad.com/ecem-2023
How to detect and describe a range of spatial similarity (spatial autocorrelation) for multiple variables?
It can be used to:
https://jakubnowosad.com/supercells/, https://jakubnowosad.com/foss4g-2022/
supercells: an extension of SLIC (Simple Linear Iterative Clustering; Achanta et al. (2012), doi:10.1109/TPAMI.2012.120) that can be applied to non-imagery geospatial rasters that carry:
Segmentation/regionalization: partitioning space into smaller segments while minimizing internal inhomogeneity and maximizing external isolation
A way to improve the output and reduce the cost of segmentation.
Great Britain. WorldClim gridded climate data was normalized to be between 0 and 1.
The goal: to regionalize Great Britain’s climates
The extended SLIC workflow uses the dynamic time warping (DTW) distance function rather than the Euclidean distance.
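To show why DTW can suit time series better than the Euclidean distance, here is a minimal textbook DTW sketch in base R. It assumes the absolute difference as the local cost and no warping-window constraint; `dtw_distance` is a hypothetical helper written for this illustration, and the implementation used in the actual workflow may differ.

```r
# Textbook dynamic time warping between two numeric series
dtw_distance = function(a, b) {
  n = length(a)
  m = length(b)
  # cost[i + 1, j + 1]: cheapest alignment of a[1:i] with b[1:j]
  cost = matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] = 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d = abs(a[i] - b[j])  # local cost (assumption: absolute difference)
      cost[i + 1, j + 1] = d + min(cost[i, j + 1],  # a[i] also matches b[j - 1]'s partner
                                   cost[i + 1, j],  # b[j] also matches a[i - 1]'s partner
                                   cost[i, j])      # both series advance together
    }
  }
  cost[n + 1, m + 1]
}

# Two hypothetical series with the same shape, shifted by one time step
a = c(0, 1, 2, 1, 0, 0)
b = c(0, 0, 1, 2, 1, 0)

euclidean = sqrt(sum((a - b)^2))
dtw = dtw_distance(a, b)
c(euclidean = euclidean, dtw = dtw)
```

The Euclidean distance treats the shifted series as different, whereas DTW aligns the peaks and treats them as the same shape, which is useful when similar seasonal cycles are slightly offset in time.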
Extended SLIC: a more homogeneous regionalization.
Original SLIC: more isolated regions.
SLIC | Inhomogeneity | Isolation |
---|---|---|
extended | 0.30 | 0.59 |
original | 0.37 | 0.75 |
The raster of time series was compressed from 24 dimensions to three principal components, preserving 99% of the variability.
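library(terra)
library(supercells)
# cmip_tmin_pl: multi-layer raster of minimum temperature, assumed to be loaded beforehand
# Version 1: about 150 supercells, compactness of 4
mintemp_zones = supercells(cmip_tmin_pl, k = 150, compactness = 4)
# plot the first layer and overlay the supercell borders
plot(cmip_tmin_pl[[1]]); plot(mintemp_zones, add = TRUE, col = NA)
# Version 2: fewer, larger supercells (k = 50), same compactness
mintemp_zones = supercells(cmip_tmin_pl, k = 50, compactness = 4)
plot(cmip_tmin_pl[[1]]); plot(mintemp_zones, add = TRUE, col = NA)
# Version 3: k = 150 again, lower compactness (less regular shapes that follow the values more closely)
mintemp_zones = supercells(cmip_tmin_pl, k = 150, compactness = 1)
plot(cmip_tmin_pl[[1]]); plot(mintemp_zones, add = TRUE, col = NA)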
Mastodon: fosstodon.org/@nowosad