Documentation update of osmenrich (#18)
* Add hyperlink to Map Features on the OSM website (#5)

* Add links to sf and osmdata

* Remove comment statement

* Fix undefined variable in example (#7)

* Remove words like 'our' and 'we' (#12)

* Remove irrelevant information to avoid confusion (#8)

* Replace dontrun by donttest (#13)

* Clarify title enrich_osm (#14)

* Move details of enrich_opq to enrich_osm (#10)

* Correct README text (#16)

* 📝 Text changes

* 📝 Update using review

* Update readme, add wastebasket figure

Co-authored-by: Jonathan de Bruin <jonathandebruinos@gmail.com>
Co-authored-by: Leonardo Vida <lleonardovida@gmail.com>
3 people authored Feb 18, 2021
1 parent 3d92db2 commit ed8cd78
Showing 5 changed files with 85 additions and 79 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,5 +1,8 @@
# From: https://github.com/github/gitignore/blob/master/R.gitignore

# R osm cache
rosm.cache/

# History files
.Rhistory
.Rapp.history
38 changes: 35 additions & 3 deletions R/enrich.R
@@ -1,5 +1,5 @@
#' @name enrich
#' @title Enrich OSM Data
#' @title Enrich `sf` object with OSM data
#' @description Perform enriched query on osm and add as new column.
#'
#' The enrichment call works in the following way: an `enriched_overpass_query`
@@ -18,10 +18,42 @@
#' @param ... `enriched_overpass_query` column or columns to add
#' @param .verbose `bool` whether to print info during enrichment
#'
#' @details
#' `Type` represents the feature type to be considered. Usually this would be
#' points, but polygons and multipolygons are also possible. This argument can
#' also be a vector of multiple types. Non-point types will be converted to
#' points using the st_centroid function from the sf package (NB this does not
#' necessarily work well for all features!) Available options are:
#' - points
#' - lines
#' - polygons
#' - multilines
#' - multipolygons
#'
#' `Distance` represents the metric used to compute the distances between the
#' rows in the dataset and the osm features. `Duration` represents the metric
#' that indicates the average duration to cover the distances between the
#' rows in the dataset and the osm features. The following metrics are
#' available in this package, assuming that the OSRM server is set up as
#' suggested in the guide at:
#' https://github.com/sodascience/osmenrich_docker:
#' - spherical ("as the crow flies")
#' - distance_by_foot
#' - duration_by_foot
#' - distance_by_car
#' - duration_by_car
#' - distance_by_bike
#' - duration_by_bike
#'
#' `Kernel` is a kernel function from the osmenrich package to be used in weighting
#' the features and the radius/distance where features are considered. For
#' simply counting the number of occurrences within a radius, use kernel_uniform
#' with radius r.
#'
#' @examples
#' \dontrun{
#' \donttest{
#'
#' #' # Enrich data creating new column `waste_baskets`
#' # Enrich data creating new column `waste_baskets`
#' sf_enriched <- dataset %>%
#' enrich_osm(
#' name = "waste_baskets",
32 changes: 0 additions & 32 deletions R/opqenrich.R
@@ -12,38 +12,6 @@
#' @param .verbose `bool` whether to print info during enrichment
#' @param ... arguments passed to the kernel function
#'
#' @details
#' `Type` represents the feature type to be considered. Usually this would be
#' points, but polygons and multipolygons are also possible. This argument can
#' also be a vector of multiple types. Non-point types will be converted to
#' points using the st_centroid function from the sf package (NB this does not
#' necessarily work well for all features!) Available options are:
#' - points
#' - lines
#' - polygons
#' - multilines
#' - multipolygons
#'
#' `Distance` represents the metric used to compute the distances between the
#' rows in the dataset and the osm features. `Duration` represents the metric
#' that indicates the average duration to cover the distances between the
#' rows in the dataset and the osm features. The following metrics are
#' available in this package, assuming that the OSRM server is setup as
#' suggested in our guide at:
#' https://github.com/sodascience/osmenrich_docker:
#' - spherical ("as the crow flies")
#' - distance_by_foot
#' - duration_by_foot
#' - distance_by_car
#' - duration_by_car
#' - distance_by_bike
#' - duration_by_bike
#'
#' `Kernel` is a kernel function from the osmenrich package to be used in weighing
#' the features and the radius/distance where features are considered. For
#' simply counting the number of occurrences within a radius, use kernel_uniform
#' with radius r.
#'
#' @importFrom methods is
#' @rdname enrich_opq
#'
91 changes: 47 additions & 44 deletions README.md
@@ -12,15 +12,16 @@
</p>
<br/>

# Enrich geocoded data using openstreetmaps
# Enrich geocoded data using OpenStreetMap

![Github Action test](https://github.com/sodascience/osmenrich/workflows/R-CMD-check/badge.svg) [![DOI](https://zenodo.org/badge/337555188.svg)](https://zenodo.org/badge/latestdoi/337555188)


The goal of `osmenrich` is to easily enrich geocoded data
(`latitude`/`longitude`) with geographic features from OpenStreetMap (OSM).
The main language of the package is `R` and this package is designed to work
with the `sf` and `osmdata` packages for collecting and manipulating geodata.
with the [`sf`](https://r-spatial.github.io/sf/) and [`osmdata`](
https://cran.r-project.org/web/packages/osmdata/vignettes/osmdata.html)
packages for collecting and manipulating geodata.

## Installation

@@ -54,84 +55,87 @@ dataset:
# Import libraries
library(tidyverse)
library(sf)
library(osmdata)
library(osmenrich)

# Create an example dataset to enrich
sf_example <-
tribble(
~person, ~id, ~lat, ~lon, ~val,
"Alice", 1, 52.12, 5.09, 5L,
"Bob", 2, 52.13, 5.08, 2L
~person, ~lat, ~lon,
"Alice", 52.12, 5.09,
"Bob", 52.13, 5.08,
) %>%
sf::st_as_sf(coords = c("lon", "lat"), crs = 4326)
sf::st_as_sf(
coords = c("lon", "lat"),
crs = 4326
)

# Print it
sf_example
#> Simple feature collection with 2 features and 3 fields
#> Simple feature collection with 2 features and 1 field
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 5.08 ymin: 52.12 xmax: 5.09 ymax: 52.13
#> CRS: EPSG:4326
#> # A tibble: 2 x 4
#> person id val geometry
#> * <chr> <dbl> <int> <POINT [°]>
#> 1 Alice 1 5 (5.09 52.12)
#> 2 Bob 2 2 (5.08 52.13)
#> # A tibble: 2 x 2
#> person geometry
#> * <chr> <POINT [°]>
#> 1 Alice (5.09 52.12)
#> 2 Bob (5.08 52.13)
```

To enrich the `sf_example` dataset with "waste baskets" in a 100m radius, we
create a query using the `enrich_osm()` function. This function uses the
To enrich the `sf_example` dataset with "waste baskets" in a 500 m radius, you
can create a query using the `enrich_osm()` function. This function uses the
bounding box created by the points present in the example dataset and searches
for the specified `key = "amenity"` and `value = "waste_basket`. We also add a
for the specified `key = "amenity"` and `value = "waste_basket"`. You can also add a
custom `name` for the newly created column and specify the radius (`r`) used
in the search.
in the search. See
[Map Features on the website of OSM](https://wiki.openstreetmap.org/wiki/Map_features)
for a complete list of `key` and `value` combinations.

```r
# Simple OSMEnrich query
sf_example_simple <- sf_example %>%
sf_example_enriched <- sf_example %>%
enrich_osm(
name = "waste_baskets",
name = "n_waste_baskets",
key = "amenity",
value = "waste_basket",
r = 100
r = 500
)
#> Downloading data for waste_baskets... Done.
#> Downloaded 26 points, 0 lines, 0 polygons, 0 mlines, 0 mpolygons.
#> Computing distance matrix for wastebaskets...Done.
#> Adding waste_baskets to data.

#> Downloaded 147 points, 0 lines, 0 polygons, 0 mlines, 0 mpolygons.
#> Computing distance matrix for waste_baskets...Done.
```

The resulting enriched dataset is a `sf` object and can be printed as usual
and we can inspect the newly added column `waste_baskets`.
The resulting enriched dataset `sf_example_enriched` is an `sf` object and can be printed as usual
to inspect the newly added column `n_waste_baskets`.

```r
sf_example_enriched
#> Simple feature collection with 2 features and 4 fields
#> Simple feature collection with 2 features and 2 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 5.08 ymin: 52.12 xmax: 5.09 ymax: 52.13
#> CRS: EPSG:4326
#> A tibble: 2 x 5
#> person id val geometry waste_baskets
#> * <chr> <dbl> <int> <POINT [°]> <int>
#> 1 Alice 1 5 (5.09 52.12) 3
#> 2 Bob 2 2 (5.08 52.13) 0
#> geographic CRS: WGS 84
#> # A tibble: 2 x 3
#> person geometry n_waste_baskets
#> * <chr> <POINT [°]> <int>
#> 1 Alice (5.09 52.12) 75
#> 2 Bob (5.08 52.13) 1
```

The `n_waste_baskets` column now holds the number of waste baskets within a 500-meter radius of Alice and of Bob:
![](man/figures/example_wastebaskets_r500.png)
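
As a rough illustration of what such a count means, the base R sketch below counts made-up feature locations within 500 m of each person, using the haversine ("as the crow flies") distance and a uniform weight. The helper names `haversine_m` and `uniform_weight` and the feature coordinates are hypothetical; they are not part of `osmenrich`, and the package's own implementation may differ.

```r
# Haversine distance in meters between one point and a set of points,
# all given as longitude/latitude in degrees (hypothetical helper).
haversine_m <- function(lon1, lat1, lon2, lat2, radius_earth = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_earth * asin(pmin(1, sqrt(a)))
}

# Uniform weighting: 1 for features within radius r, 0 otherwise
# (hypothetical helper, written only for this illustration).
uniform_weight <- function(d, r) as.integer(d <= r)

# The two example persons from the dataset above
people <- data.frame(
  person = c("Alice", "Bob"),
  lat = c(52.12, 52.13),
  lon = c(5.09, 5.08)
)

# Made-up feature locations; real ones would be downloaded from OSM
features <- data.frame(
  lat = c(52.1210, 52.1185, 52.1400),
  lon = c(5.0910, 5.0890, 5.1200)
)

# For each person, sum the uniform weights of all features
people$n_features <- vapply(seq_len(nrow(people)), function(i) {
  d <- haversine_m(people$lon[i], people$lat[i], features$lon, features$lat)
  sum(uniform_weight(d, r = 500))
}, integer(1))

people
```

With these made-up coordinates, two features fall within 500 m of Alice and none within 500 m of Bob. Replacing the uniform weight with a distance-decaying one would give nearby features more influence than distant ones.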

## Local API setup

OSM enrichment can ask for a lot of data, which can overload public APIs. If
you intend to enrich large amounts of data or compute routing distances (e.g.,
driving duration) between many points, you should set up a local API endpoint.

We provide a `docker-compose` workflow for this in the separate
Multiple `docker-compose` workflows for doing this are available in the separate
[osmenrich_docker
repository](https://github.com/sodascience/osmenrich_docker). Use the `README`
on the repository for setup instructions.

in the repository to select the workflow that fits your desired outcome.

<img src="man/figures/docker.png" width="250px"></img>

@@ -142,15 +146,14 @@ Contributions are what make the open source community an amazing place to
learn, inspire, and create. Any contributions you make are **greatly
appreciated**.

In this project we use the
[Gitflow workflow](https://nvie.com/posts/a-successful-git-branching-model/)
to help us with continious development. Instead of having a single
`master`/`main` branch we use two branches to record the history of the
In this project, the
[Gitflow workflow](https://nvie.com/posts/a-successful-git-branching-model/)
is used. Instead of having a single `master`/`main` branch, the project uses
two branches to record the history of the
project: `develop` and `master`. The `master` branch is used only for the
official releases of the project, while the `develop` branch is used to
integrate the new features developed. Finally, `feature` branches are used to
develop new features or additions to the project that will be `rebased and
squash` in the `develop` branch.
develop new features or additions to the project that will be rebased and
squashed into the `develop` branch.

The workflow to contribute with Gitflow becomes:

@@ -176,7 +179,7 @@ Enrich sf Data with Geographic Features from OpenStreetMaps (Version v1.0). Zeno
This package is developed and maintained by the [ODISSEI Social Data Science
(SoDa)](https://odissei-data.nl/nl/soda/) team.

Do you have questions, suggestions, or remarks? File an issue in our issue
Do you have questions, suggestions, or remarks? File an issue in the issue
tracker or feel free to contact [Erik-Jan van
Kesteren](https://github.com/vankesteren)
([@ejvankesteren](https://twitter.com/ejvankesteren)) or [Leonardo
Binary file added man/figures/example_wastebaskets_r500.png