diff --git a/docs/articles/joss/paper.md b/docs/articles/joss/paper.md
index bb080ed..3b13f1e 100644
--- a/docs/articles/joss/paper.md
+++ b/docs/articles/joss/paper.md
@@ -46,29 +46,30 @@ editor_options:
 # Summary
 
 Climate change is a growing concern for city planners as Urban Heat Islands have an impact on
-mortality @clarke1972some, health in general [@lowe2016energy] and consumption of energy
+mortality [@clarke1972some], health in general [@lowe2016energy] and consumption of energy
 for building cooling [@malys2012microclimate] among other effects.
 A first step towards large scale study of urban climate is to define classes
 based on logical division of the landscape, such as Local Climate Zones (LCZ) defined by [@stewart2012local].
 
-The lczexplore package aims at comparing different LCZ classifications,
-but can be used to compare any pair of classifications on geographical units.
-
 A spatial comparison is performed by producing agreement maps between classifications,
 agreement statistics and a confusion matrix to help qualify and quantify the misclassifications.
 
+The lczexplore package aims at comparing different LCZ classifications,
+but can be used to compare any pair of classifications on geographical units.
+
 This software is available as a free and opensource R package.
 
 # Statement of need
 
 ## Comparing maps
 
 As stated in [@visser2006map] comparing maps is an important issue in environmental research.
-The four main reasons to compare categorical variables on geographical units are:
-- to assess the differences between maps generated by models under different scenarios and assumptions,
-- to detect temporal changes,
-- to calibrate or validate models,
-- to perform uncertainty and sensitivity analysis.
+The four main reasons to compare categorical variables on geographical units are:
+
+- to assess the differences between maps generated by models under different scenarios and assumptions,
+- to detect temporal changes,
+- to calibrate or validate models,
+- to perform uncertainty and sensitivity analysis.
 
 ## Comparing specifically LCZ maps
 
@@ -85,16 +86,17 @@ apprehend the intensity of the Urban Heat Island [@kotharkar2018evaluating].
 
 Several methods aim to classify a territory into LCZ, but only few workflows allow an automatic
 classification for any given area.
-@quan2021systematic distinguishes two main streams of production of these LCZ:
+@quan2021systematic distinguishes two main streams of production of these LCZ:
+
 - the raster stream processes remotely sensed information, and applies machine learning algorithms trained using
   local experts' knowledge. In this way, the WUDAPT community [@chingWUDAPTUrbanWeather2018a] produced thousands of
   city-based LCZ maps (accessible via the LCZ Generator ([@demuzereLCZGeneratorWeb2021])) but also large-scale maps for
   Europe, the continental United States and the whole world ([@demuzere2019mapping],
-  [@demuzere2020combining], [@demuzere2022global]).
+  [@demuzere2020combining], [@demuzere2022global]).
 - the vector stream uses Geographic Information System (GIS) layers that represent the main topographic features,
   defines spatial units, computes urban canopy parameters and uses them to classify
   spatial units into LCZ. For instance, the GeoClimate geospatial toolbox produces LCZ classifications
-  from OpenStreetMap or french BDTopo data [@bocher2021geoclimate].
+  from OpenStreetMap or french BDTopo data [@bocher2021geoclimate].
 The existence of several methods to produce LCZ classifications, or the use of a method
 with different input data, raises the need for a tool to quickly get:
@@ -118,7 +120,7 @@ maps, but it has two main drawbacks:
 - two totally random maps won't have a value of agreement of zero,
   as some pixel values may agree by chance,
 - only raster maps where pixels match perfectly (same size, not translated) can be treated, or
-  some pre-treatment are needed (like nearest neighbour interpolation for instance).
+  some pre-treatment are needed (like nearest neighbours interpolation for instance).
 
 To prevent the first drawback, @monserudComparingGlobalVegetation1992 proposed the use of Cohen's kappa
 coefficient of agreement for nominal scales [@cohenCoefficientAgreementNominal1960].
@@ -158,7 +160,7 @@ As far as we know, the scripts and the automation of the method are not publicly
 
 The need for automation of map comparison (both raster and vector),
 and specific features (like standard colors or legends
-for LCZs and sensitivity analaysis) justified the development of `lczexplore`.
+for LCZs, or sensitivity analysis according to confidence) justified the development of `lczexplore`.
 
 ## Features
 
@@ -244,8 +246,8 @@ One can then feed `compareLCZ` function these new groups, setting `repr="alter"
 
 ### Sensitivity analysis
 
-The Geoclimate algorithm adds a uniqueness value to the LCZ type it assigns to a spatial unit.
-It measures if another LCZ levels could have been assigned to this unit. Thus, it can be seen
+Some algorithms add a uniqueness value to the LCZ type it assigns to a spatial unit.
+It can measure if another LCZ level could have been assigned to this unit. Thus, it can be seen
 as a confidence value of the LCZ type.
 The `lczexplore ` package allows a sensitivity analysis according to this level of confidence,
 in order to answer the question:
@@ -270,10 +272,10 @@ One also needs to notice that on this example, most geometries didn't have a con
 This package focuses on LCZ maps comparison, but more often than not, people working on LCZ maps
 also describe their area of interest with other categorical indicators.
 The workflow of comparison of LCZ maps can be used for any pair of maps of categorical variables,
-under certain limitations:
-- there must not be more than 36 levels for the categorical variable to explore
-- the associated geometries must be (multi) polygons or easily converted to them
-  (typically, the package would not be suitable to compare road characterization),
+under certain limitations:
+- there must not be more than 36 levels for the categorical variable to explore,
+- the associated geometries must be (multi) polygons or easily converted to them,
+  (typically, the package would not be suitable to compare road characterization),
 - the geometries must be topographically valid (this is also true for LCZ).
 
 The `importQualVar` function allows the import of such variables on (multi-) polygons maps.
@@ -283,12 +285,12 @@ of the package (`showLCZ`, `compareLCZ`, `groupLCZ`...).
 # Coding implementation
 
 `lczexplore` is an R package, all its specific functions are coded in R language.
-It relies on state-of-the art packages:
-- geographical computation requires the **`sf`** package for vector data and the **`terra`** package for raster data,
+It relies on state-of-the art packages:
+- geographical computation requires the **`sf`** package for vector data and the **`terra`** package for raster data,
 - data management mainly requires the following packages: **`dplyr, tidyr, forcats, rlang`**
-  and **`methods`** packages,
-- graphical production uses **`ggplot2, grDevices, cowplot`** and **`RColorBrewer`**,
-- tests need the **`tinytest`** package.
+  and **`methods`** packages,
+- graphical production uses **`ggplot2, grDevices, cowplot`** and **`RColorBrewer`**,
+- tests need the **`tinytest`** package.
 
 Every step corresponds to an R function (see the workflow on figure 1 for the name of the main functions).
 
@@ -541,7 +543,7 @@ as long as you specify the expected levels the same way.
 
 The function `importQualVar` allows you to import maps of qualitative variables on (multi)polygons.
 For instance, classifications of urban tissue called
-Urban Typolgy by Random Forest [@bocherGeoprocessingFrameworkCompute2018] are available as examples in this package.
+Urban Typology by Random Forest [@bocherGeoprocessingFrameworkCompute2018] are available as examples in this package.
 After the import, the usual functions can be used, as shown in the following code: