Learning outcomes

### Get data 'shipped' with packages

### Get data from data provider packages

### Get data from OSM

### Get data from websites

]

---

# Setup: Packages we'll be using

.pull-left[

- spData contains example spatial datasets
- osmdata gets small OpenStreetMap (OSM) datasets
- osmextract gets large OSM datasets
- nzelect gets official voting data from New Zealand

]

.pull-right[

```r
# Install packages
# install.packages("remotes") # you'll need the remotes package
pkgs = c(
  "spData",
  "osmdata",
  "osmextract",
  "nzelect"
)
```

```r
remotes::install_cran(pkgs)
```

```r
library(osmextract)
# Data (c) OpenStreetMap contributors, ODbL 1.0.
# Check the package website for more details.
```

]

---

# Getting data that 'ship' with packages

.left-column[

## Why use example datasets?

- Reproducibility
- Avoid sharing sensitive data
<!-- - Speed of execution -->
<!-- Documentation -->
- Any other reasons?
<!-- Encourages generalisation of code to work with multiple datasets -->

]

```r
# ?datasets
# library(help = "datasets")
# lots of great example datasets, use them!
world_phones_new = datasets::WorldPhones
class(world_phones_new)
```

```
## [1] "matrix" "array"
```

```r
class(spData::nz)
```

```
## [1] "sf"         "data.frame"
```

--

### Documenting datasets

See

--

### Exercise

Take a look at the help for `datasets` and plot two examples.
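---

# Exploring the example datasets

A minimal sketch of one way to start the exercise, assuming a standard R session in which the `datasets` package is attached (the default). The object names `available`, `example_names` and `example_classes` are illustrative and are not part of the original slides.

```r
# List the datasets that ship with the 'datasets' package
available = data(package = "datasets")$results
head(available[, c("Item", "Title")])

# Check the class of a few datasets before plotting them
example_names = c("AirPassengers", "WorldPhones", "cars")
example_classes = sapply(
  example_names,
  function(x) class(get(x, envir = as.environment("package:datasets")))[1]
)
example_classes

# plot() draws a scatterplot for the two-column data frame ...
plot(cars)
# ... and matplot() draws one line per column of the matrix
matplot(WorldPhones, type = "l")
```

Checking the class first is a quick way to anticipate what a plotting call will produce for each dataset.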
---

# Plotting air passengers

```r
?AirPassengers
```

```r
AirPassengers
```

```
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1949 112 118 132 129 121 135 148 148 136 119 104 118
## 1950 115 126 141 135 125 149 170 170 158 133 114 140
## 1951 145 150 178 163 172 178 199 199 184 162 146 166
## 1952 171 180 193 181 183 218 230 242 209 191 172 194
## 1953 196 196 236 235 229 243 264 272 237 211 180 201
## 1954 204 188 235 227 234 264 302 293 259 229 203 229
## 1955 242 233 267 269 270 315 364 347 312 274 237 278
## 1956 284 277 317 313 318 374 413 405 355 306 271 306
## 1957 315 301 356 348 355 422 465 467 404 347 305 336
## 1958 340 318 362 348 363 435 491 505 404 359 310 337
## 1959 360 342 406 396 420 472 548 559 463 407 362 405
## 1960 417 391 419 461 472 535 622 606 508 461 390 432
```

```r
class(AirPassengers)
```

```
## [1] "ts"
```

```r
plot(AirPassengers)
```

---

# A couple of other datasets

```r
plot(UKDriverDeaths)
plot(CO2)
```

<!-- --- -->
<!-- # Joining non-geographic data to geometries -->
<!-- Idea: create animation from the WorldPhones dataset -->

---

# Getting data from New Zealand I

.pull-left[

```r
library(spData)
plot(nz)
```

]

.pull-right[

```r
plot(nz$geom)
plot(nz_height, add = TRUE)
```

```
## Warning in plot.sf(nz_height, add = TRUE): ignoring all but the first attribute
```

]

---

# Getting data from New Zealand II

.pull-left[

```r
library(nzelect)
# ?voting_places
nz_lonlat = sf::st_transform(nz, 4326)
names(voting_places)
```

```
##  [1] "electorate_number"   "electorate"
##  [3] "voting_place_id"     "voting_place_suburb"
##  [5] "northing"            "easting"
##  [7] "longitude"           "latitude"
##  [9] "voting_place"        "election_year"
## [11] "coordinate_system"   "TA2014_NAM"
## [13] "REGC2014_N"          "AU2014"
## [15] "MB2014"
```

```r
# lon/lat coordinates, assumed to be WGS 84 (EPSG:4326)
voting_places_sf = sf::st_as_sf(voting_places, coords = c("longitude", "latitude"), crs = 4326)
```

]

.pull-right[

```r
plot(sf::st_geometry(nz_lonlat))
plot(voting_places_sf, add = TRUE)
```

```
## Warning in
## plot.sf(voting_places_sf, add = TRUE):
## ignoring all but the first attribute
```

]

---

# Exercise

.left-column[

### Read the documentation on the site

### Install the nzcensus package and use it to plot nz boundaries

]

.right-column[

]

---

# Getting data from New Zealand III

```r
library(osmdata)
```

```r
schools_nz_osm = opq(bbox = sf::st_bbox(nz_lonlat)) %>%
  add_osm_feature(key = "amenity", value = "school") %>%
  osmdata_sf()
```

```r
schools_nz_osm
```

```
## Object of class 'osmdata' with:
##                  $bbox : -47.2828524160058,166.426302914574,-34.4145187168353,178.550373601283
##         $overpass_call : The call submitted to the overpass API
##                  $meta : metadata including timestamp and version numbers
##            $osm_points : 'sf' Simple Features Collection with 28532 points
##             $osm_lines : 'sf' Simple Features Collection with 21 linestrings
##          $osm_polygons : 'sf' Simple Features Collection with 2532 polygons
##        $osm_multilines : NULL
##     $osm_multipolygons : 'sf' Simple Features Collection with 15 multipolygons
```

```r
schools_nz_polygons = schools_nz_osm$osm_polygons
```

---

# Working with messy data

```r
library(tidyverse)
schools_nz_polygons %>%
  sf::st_drop_geometry() %>%
  skimr::skim_without_charts() %>%
  print(include_summary = FALSE)
```

```
## 1 1 1 1 7 1 97 115 1 1
## 9-12 9-13 9-15 special
## 1 217 2 22
```

```r
pryr::object_size(schools_nz_polygons)
```

```
## 6.92 MB
```

```r
schools_nz_subset = schools_nz_polygons %>%
  select(osm_id, LINZ2OSM.dataset, MOE.years, name)
pryr::object_size(schools_nz_subset)
```

```
## 4.59 MB
```

---

# Combining different types of spatial data

.pull-left[

```r
osm_combine = function(osm_points, osm_polygons) {
  require(sf)
  if(nrow(osm_polygons) > 0) {
    # deduplicate points
    osm_points_in_polygons = osm_points[osm_polygons, ]
    # mapview::mapview(osm_points_in_polygons)
    osm_points_not_in_polygons = osm_points[!osm_points$osm_id %in% osm_points_in_polygons$osm_id,]
    osm_polygons_centroids = sf::st_centroid(osm_polygons)
    # convert polygons
    # to points and join together
    # setdiff(names(osm_points), names(osm_polygons_centroids))
    names_in_both = intersect(names(osm_points), names(osm_polygons_centroids))
    osm_points = rbind(osm_points_not_in_polygons[names_in_both], osm_polygons_centroids[names_in_both])
  }
  osm_points
}
```

]

.pull-right[

```r
schools_combined = osm_combine(
  osm_points = schools_nz_osm$osm_points,
  osm_polygons = schools_nz_subset
)
```

```
## Warning in st_centroid.sf(osm_polygons):
## st_centroid assumes attributes are constant over
## geometries of x
```

```r
nrow(schools_combined) / nrow(schools_nz_subset)
```

```
## [1] 1.078594
```

]

## Exercise

Use the example code above to get and clean data representing all supermarkets in New Zealand.

---

# Getting data from New Zealand IV

```r
library(osmextract)
```

```r
place_name = "isle of wight"
# place_name = "new zealand" # warning: downloads ~1/4 GB compressed OSM data!
et = c("amenity")
q_points = "SELECT * FROM points WHERE amenity IN ('school')"
oe_school_points = oe_get(place_name, query = q_points, extra_tags = et)
```

```
## The input place was matched with: Isle of Wight
```

```
## Warning: The query selected a layer which is
## different from layer argument. We will replace the
## layer argument.
```

```
## File downloaded!
```

```
## Start with the vectortranslate operations on the input file!
```

```
## 0...10...20...30...40...50...60...70...80...90...100 - done.
```

```
## Finished the vectortranslate operations on the input file!
```

```
## Reading layer `points' from data source
##   `/tmp/Rtmp1E7844/geofabrik_isle-of-wight-latest.gpkg'
##   using driver `GPKG'
## Simple feature collection with 5 features and 11 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -1.304616 ymin: 50.69791 xmax: -1.167812 ymax: 50.7524
## Geodetic CRS:  WGS 84
```

---

# National OSM datasets

```r
mapview::mapview(oe_school_points)
```

---

# Filtering

```r
library(tidyverse)
nz_north = nz %>%
  filter(Island == "North")
plot(nz$geom)
plot(nz_north, add = TRUE, col = "red")
```

```
## Warning in plot.sf(nz_north, add = TRUE, col =
## "red"): ignoring all but the first attribute
```

---

# Mutating

```r
nz_density = nz %>%
  mutate(Density = Population / Land_area)
nz_density %>%
  select(Density) %>%
  plot()
```

---

# Geographic joins

```r
schools_nz_polygons_centroids = sf::st_centroid(schools_nz_polygons)
```

```
## Warning in st_centroid.sf(schools_nz_polygons):
## st_centroid assumes attributes are constant over
## geometries of x
```

```r
ncol(schools_nz_polygons_centroids)
```

```
## [1] 81
```

```r
schools_nz_polygons_joined = sf::st_join(
  schools_nz_polygons_centroids,
  nz_lonlat %>% select(Name)
)
ncol(schools_nz_polygons_joined)
```

```
## [1] 82
```

---

# Geometric operations

```r
nz_simple = rmapshaper::ms_simplify(nz, keep = 0.02)
nz_buffer = sf::st_buffer(sf::st_union(nz), dist = 22000)
plot(sf::st_geometry(nz_simple))
plot(sf::st_geometry(nz_buffer))
plot(nz, add = TRUE)
```

---

# Geometry exercises

.left-column[

See

Attempt the exercises

]

.right-column[

]

---

# Exercises

### Based on what you have learned in this workshop, find and download data that interests you in an area of the world you are researching or are familiar with

```r
gzones = osmextract::geofabrik_zones
gzones_france = gzones %>%
  filter(parent == "france")
gzones_france$name
```

```
##  [1] "Alsace"
##  [2] "Aquitaine"
##  [3] "Auvergne"
##  [4] "Basse-Normandie"
##  [5] "Bourgogne"
##  [6] "Bretagne"
##  [7]
"Centre" ## [8] "Champagne Ardenne" ## [9] "Corse" ## [10] "Franche Comte" ## [11] "Guadeloupe" ## [12] "Guyane" ## [13] "Haute-Normandie" ## [14] "Ile-de-France" ## [15] "Languedoc-Roussillon" ## [16] "Limousin" ## [17] "Lorraine" ## [18] "Martinique" ## [19] "Mayotte" ## [20] "Midi-Pyrenees" ## [21] "Nord-Pas-de-Calais" ## [22] "Pays de la Loire" ## [23] "Picardie" ## [24] "Poitou-Charentes" ## [25] "Provence Alpes-Cote-d'Azur" ## [26] "Reunion" ## [27] "Rhone-Alpes" ```