-
Notifications
You must be signed in to change notification settings - Fork 2
/
code_issues.qmd
512 lines (288 loc) · 22.2 KB
/
code_issues.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
---
title: "Code Issues"
subtitle: "The Code Issues chapter provides recipes for addressing common challenges in writing functions, managing file and directory operations, and handling installation and compute resources."
format:
html:
css: style.css
---
# `T`/`F` Instead of `TRUE`/`FALSE`
## Problem
You are writing the abbreviated forms of `TRUE` and `FALSE`, `T` and `F` or you are using `T` or `F` as names.
## Solution
Change the abbreviation to the full words or use a different variable name.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please write TRUE and FALSE instead of T and F. Please don't use "T" or "F" as vector names.
:::
In R, `T` and `F` are variables being set to `TRUE` and `FALSE`, respectively. However, those variables can be redefined by the user since these are not reserved words as `TRUE` and `FALSE` are (see the [R Language Definition](https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Reserved-words) for more details). This can result in unexpected code behavior as users of your package might have variables named `T`/`F`.
### Examples
The first example shows that you cannot overwrite `TRUE`:
```{r, error=TRUE}
TRUE <- "Not TRUE anymore"
print(TRUE)
```
This throws an error as `TRUE` is a reserved word and the value of `TRUE` does not change.
`T` (and `F`) on the other hand can be set to a different value:
```{r}
T <- "Not TRUE anymore"
print(T)
```
To avoid any unexpected behaviors and inconsistencies, CRAN reviewers will ask you to write the reserved words, `TRUE` and `FALSE` instead of their abbreviated forms. For the same reason, `T` or `F` should not be used as variable names in your code, examples, tests or vignettes.
--------------------------------------------------------------------
# Setting a Specific Seed
## Problem
In your functions, you are setting a the random seed to a specific number which cannot be changed.
## Solution
Remove the code that sets the seed. If you want to set it specifically in your functions, users should be able to set no seed if they want.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please do not set a seed to a specific number within a function.
:::
Changing the seed is not necessarily forbidden. However, users should be able to control which seed to use. Therefore, avoid code that changes the seed without user consent (e.g.: `set.seed(123)`). A good solution for allowing to set a seed is to add an argument `seed` to your function and set it conditionally.
```{r, eval = FALSE}
function(... , seed = NULL){
if(!is.null(seed)){
set.seed(seed)
}
#the rest of your function can be written here.
}
```
This allows users to avoid setting a seed if they change the argument to `NULL`. Ideally, the argument is already set to `NULL` per default.
<!--# IDEA: Add a paragraph about messing with the .Random.seed...
Then referenced to the .GlobalEnv Issue. -->
::: {.callout-note}
In your examples, vignettes, demos and tests setting a seed is not only allowed but also recommended to ensure reproducible results.
:::
-----------------------------------------------------------
# Using `print()`/`cat()`
## Problem
You are using functions, like `print()` or `cat()`, to print unsuppressable output to the console.
## Solution
Change `print()`/`cat()` to `message()`, `warning()`, `stop()`, or wrap them in `if(verbose){}`.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
You write information messages to the console that cannot be easily suppressed.
It is more R like to generate objects that can be used to extract the information a user is interested in, and then print() that object. Instead of print()/cat() rather use message()/warning() or if(verbose)cat(..) (or maybe stop()) if you really have to write text to the console. (except for print, summary, interactive functions)
:::
Information messages, like loop counts or status updates, can clutter the console. While some users prefer this display, others appreciate less information on their console. The use of printing functions for console output is not forbidden on CRAN. However, this output must be suppressable by users.
:::{.callout-note}
Printing in special functions like print, summary, interactive functions or methods for generic functions is accepted.
:::
To allow users to suppress the console output CRAN recommends two different ways:
* exchanging `cat()`/`print()` with other generics
- [`message()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/message.html): for information messages and status updates
- [`warning()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/warning.html): for warning, will print a "Warning: " before the output
- [`stop()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/stop.html): for error messages, will print an "Error: " before the output and halt the execution
This allows to use functions like `suppressMessages()` to avoid unwanted output.
* using an additional function argument
- create one argument in your function to turn off the console output
- CRAN suggests using `verbose`, other names are accepted
This example code shows the use of a `verbose` argument to allow users to suppress printing
```{r, eval = FALSE}
foo <- function(..., verbose = TRUE){
# your code
if(verbose){
print("Whatever you want to say!")
}
# your code
}
```
Functions can print per default, like the example above, as long as the printing can be turned off (here, by setting `verbose = FALSE`).
:::{.callout-note}
`print()` and `cat()` are not the only functions which can write output onto the console. The issue described in the recipe, also applies to the use of other printing function like `writeLines()`.
If you are using loggers to document your functions' process, make sure that users can set their log level such that not messages are displayed.
:::
---------------------------------------------------------
# Change of Options, Graphical Parameters and Working Directory
## Problem
You are changing the `par()`, `options()` or `setwd()` in your functions, examples, demos or vignettes without resetting them.
## Solution
Reset the changed options, in your functions by using [`on.exit()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/on.exit.html) or in examples, demos and vignettes with an additional line of code after the example.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please make sure that you do not change the user's options, par or working directory. If you really have to do so within functions, please ensure with an *immediate* call of on.exit() that the settings are reset when the function is exited.
e.g.:
...
oldpar <- par(no.readonly = TRUE) # code line i
on.exit(par(oldpar)) # code line i + 1
...
par(mfrow=c(2,2)) # somewhere after
...
e.g.:
If you're not familiar with the function, please check ?on.exit. This function makes it possible to restore options before exiting a function even if the function breaks. Therefore it needs to be called immediately after the option change within a function.
Please always make sure to reset to user's options(), working directory or par() after you changed it in examples and vignettes and demos.
e.g.:
oldpar <- par(mfrow = c(1,2))
...
par(oldpar)
:::
Ideally, the user's options are not changed at all. If they really have to be altered, restoring the previous values of user options is mandatory for CRAN packages.
The reason for this rule is this line stated in the [CRAN Repository Policy](https://cran.r-project.org/web/packages/policies.html):
> The code and examples provided in a package should never do anything which might be regarded as malicious or anti-social.
Resetting options is therefore mainly a Quality-of-Life feature for users of your package.
There are different ways of resetting for functions, and examples, demos or vignettes which are recommended by CRAN.
Changing `par()`, `options()` or `setwd()` all invisibly return the previous values and therefore these can be stored in variables using the assignment operator `<-` and later be restored by calling the variable name as argument in the respective function.
**For functions**:
When changing options inside one of your package functions, you can use [`on.exit()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/on.exit.html) for restoring.
```{r, eval = FALSE}
foo <- function(x){
# par():
oldpar <- par(mfrow = c(2,2))
on.exit(par(oldpar))
# options():
oldop <- options(digits = 3)
on.exit(options(oldop))
# setwd():
oldwd <- setwd(".")
on.exit(setwd(oldwd))
# your code which requires a changed option
}
```
This will reset the `par()`, `options()` and `setwd()`. The use of [`on.exit()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/on.exit.html) makes it possible to restore options before exiting a function even if the function breaks. Therefore it needs to be called immediately after the option change within a function.
For more information, call [`?on.exit()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/on.exit.html) in your console.
**For demos, examples and vignettes**:
Since no function is exited when changing options in examples, [`on.exit()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/on.exit.html) cannot be used. CRAN recommends the following way for restoring options:
```{r, eval = FALSE}
oldpar <- par(mfrow = c(2,2))
# your code which requires a changed option
par(oldpar)
```
Here the code will only reset the options if the example runs without breaking. Therefore, try to keep the code between setting and resetting as concise as possible. Restoring the `options()` and `setwd()` can be done using the same principle as for `par()` shown above.
:::{.callout-tip}
If you need to change more than one option in the same function, example, vignette or demo, you can use `oldpar <- par(no.readonly = TRUE)` or `oldop <- options()` to reset all parameters at once. Saving the entire
Note, that for `par()` the `no.readonly` argument must be set to `TRUE` or else warnings will be produced.
:::
:::{.callout-note}
The issue described in the recipe, also applies to the use of other function which change some parameters persistently, like `Sys.setLanguage`.
:::
--------------------------
# Writing Files and Directories to the Home Filespace
## Problem
You are writing, either by default or within your examples, vignettes, or tests, to the user's home filespace, including the package directory and the current working directory (`getwd()`).
## Solution
Omit any default path in your functions and write to `tempdir()` in examples, vignettes or tests.
### Details
<!--adding this is a test for including the CRAN mail "template" for each recipe. This version is a small collapsible text box which I will add for (almost) every Issue. Maybe change the title of the boxes?-->
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please ensure that your functions do not write by default or in your examples/vignettes/tests in the user's home filespace (including the package directory and getwd()). This is not allowed by CRAN policies. Please omit any default path in writing functions. In your examples/vignettes/tests you can write to tempdir().
:::
The [CRAN Repository Policy](https://cran.r-project.org/web/packages/policies.html) states that:
> The code and examples provided in a package should never do anything which might be regarded as malicious or anti-social.
and gives as example:
> Packages should not write in the user’s home filespace (including clipboards), nor anywhere else on the file system apart from the R session’s temporary directory (or during installation in the location pointed to by TMPDIR: and such usage should be cleaned up). Installing into the system’s R installation (e.g., scripts to its bin directory) is not allowed.
Packages can write per default to `tempdir()`, or any other temporary/cache directory, or have no default path at all, as long as the default is not the user's home filespace.
<!-- tbd: add section about R_user_dir() -->
---------------------------------------
# Leaving Files in the Temporary Directory
## Problem
Some of your code that runs during ` R CMD check ` creates files in the temporary directory specified by the environment variable `TMPDIR` and doesn't remove them afterwards. As a result, ` R CMD check ` gives your package a NOTE complaining about the "detritus in the temp directory".
## Solution
If you use `tempfile()`/`tempdir()` as the default destination for functions called in your examples or tests, `unlink()` the resulting files as part of the test. In the `\examples{}` section of your help files, clean-up actions can be wrapped in `\dontshow{}` for aesthetic reasons. If you launch a browser, 'Calibre', 'Python' code or other software, and it creates temporary files, find out where they are and remove them once the code is done with them.
### Details
::: {.callout-note title="CRAN NOTE" icon=false}
Check: for detritus in the temp directory, Result: NOTE\
  Found the following files/directories:\
   ‘this\_is\_detritus382e569a7712’
:::
The code responsible for the detritus may include your own tests and examples creating files under `tempdir()`, child processes, or 'Python' code launched using 'reticulate'. For example, R packages that use the 'tensorflow' 'Python' package [find and remove temporary files created by its auto-gradient feature](https://github.com/cran/vetiver/blob/35b24768cf0e84fab96610e001bba377dc777953/tests/testthat/setup.R#L13).
A package that uses the 'testthat' test suite will benefit from [the self-cleaning functions `withr::local_tempfile()` and `withr::local_tempdir()`](https://r-pkgs.org/testing-design.html#sec-tests-files-where-write) to work with temporary files. Even without 'testthat', the 'withr' package has no external strong dependencies and can be used independently.
---------------------------------------
# Writing to the `.GlobalEnv`
## Problem
You are writing per default to the global environment, `.GlobalEnv`, the user's workspace.
## Solution
Omit any default writing to the global environment.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please do not modify the global environment (e.g. by using <<-) in your functions. This is not allowed by the CRAN policies.
:::
The `.GlobalEnv` is the main workspace of users. It can also be accessed by `globalenv()`. Writing to the global environment is forbidden for CRAN packages.
Sometimes package maintainers use the operator `<<-`. This operator not only evaluates the expression in the environment it is called in, checks parent environments for an existing definition of the variable. If such a variable is found then its value is redefined, otherwise assignment takes place in the `.GlobalEnv`.
To avoid writing to the global environment, the variable must be defined in a parent environment.
```{r eval=FALSE}
foo <- function(){
# defines the variable in the foo()-function environment
var <- NULL
foo1 <- function(){
# redefines var but only in the foo() environment as it is the parent of foo1()
var <<- "redefined"
}
# calls the foo1() function to redefine var
foo1()
return(var)
}
```
Part of the `.GlobalEnv` is the `.Random.seed` which should not be changed at all.
Excepted from this rule are 'shiny' packages which build interactive web apps. They sometimes need to modify the `.GlobalEnv`.
:::{.callout-note}
Calls of `rm(list = ls())` to remove variables of the current environment, should not be used in examples, vignettes or demos. In functions `rm(list = ls())` can be used, as the active environment is then the function environment instead of the global environment.
:::
------------------------------------------------------------------------
# Calling `installed.packages()`
## Problem
You are calling `installed.packages()` to check if a package is installed.
## Solution
Instead of using `installed.packages()`, use `requireNamespace("pkg")` or `require("pkg")` to find out if packages are available.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon="false" collapse="true"}
You are using `installed.packages()` in your code. As mentioned in the notes of `installed.packages()` help page, this can be very slow. Therefore do not use `installed.packages()`.
:::
`installed.packages()` outputs a matrix with details of all packages which are installed in the specified libraries. This can be used to check if a specific package is installed. As mentioned in the help file of the [`installed.packages()` function](https://stat.ethz.ch/R-manual/R-devel/library/utils/html/installed.packages.html), it can be very slow under certain circumstances.
> This needs to read several files per installed package, which will be slow on Windows and on some network-mounted file systems. It will be slow when thousands of packages are installed, ...
Therefore, do not use `installed.packages()` in CRAN packages.
The help file also lists different solutions and the respective, alternative functions.
> ... so do not use it to find out if a named package is installed (use `find.package` or `system.file`) nor to find out if a package is usable (call `requireNamespace` or `require` and check the return value) nor to find details of a small number of packages (use `packageDescription`).
Ideally, use `requireNamespace("pkg")` or `require("pkg")`, both return `FALSE` if a package isn't available, and throw an error conditionally. For more details on package installations in your code see this [recipe](#installing-software).
------------------------------------------------------------------------
# Setting `options(warn = -1)`
## Problem
You are setting `options(warn = -1)`.
## Solution
Consider using `suppressWarnings()` instead of `options(warn = -1)` if you absolutely need to suppress warnings.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon="false" collapse="true"}
You are setting `options(warn=-1)` in your function. This is not allowed. Please rather use `suppressWarnings()` if really needed.
:::
CRAN doesn't allow negative `warn` options. This setting will turn off all warning messages. Even if the settings are correctly restored, as explained in the [Change of Options recipe](#change-of-options-graphical-parameters-and-working-directory), the submission will be rejected.
CRAN recommends using `suppressWarnings()`, which disables warnings only for the specific expression it's applied to, rather than globally.
------------------------------------------------------------------------
# Installing Software {#installing-software}
## Problem
You are installing software or packages in your functions, examples, tests or vignettes.
## Solution
Create special functions for the purpose of installing software and don't install it in examples, tests, or vignettes.
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon="false" collapse="true"}
Please do not install packages in your functions, examples or vignette. This can make the functions, examples and cran-check very slow.
:::
Packages should usually not be installed within functions, especially since dependencies should already be listed in the DESCRIPTION. For external software this is typically the same. However, if the purpose of your package is to connect to specific APIs or provides easier installation for some programs, installing software or packages is allowed on CRAN.
To ensure shorter check times, those functions should not be called within tests, vignettes or examples. The name of function which install software should ideally indicate that, for example:
- `devtools::install_github()`
- `reticulate::install_python()`
The corresponding help file should also state that software will be installed.
::: callout-note
Even if it is made clear that installations will happen, the function shouldn't write to the user's home file space, as mentioned in this [recipe](#writing-files-and-directories-to-the-home-filespace).
:::
-----------------------------------------
# Using more than 2 Cores
## Problem
You are using more than 2 CPU cores in your examples, vignettes, tests or during installation.
## Solution
Make sure that you set the maximum number of cores used to 2 in examples, vignettes or tests.
During installation, many modern build systems use the `-j` flag.
<!-- This was recommended by 'aitap' in a discussion on GitHub -->
### Details
::: {.callout-note title="CRAN Review Communication" appearance="simple" icon=false collapse=true}
Please ensure that you do not use more than 2 cores in your examples, vignettes, etc.
:::
CRAN checks run on big, multi-core computers, running many `R CMD check` processes in parallel. Every process is allowed to use a maximum of 2 CPU cores to exercise parallel code in examples and unit tests. When using more than 2 cores, your package will get a NOTE from CRAN checks, saying that it took more CPU time than elapsed time.
::: {.callout-note title="CRAN NOTE" icon=false}
Check: whether package can be installed, Result: NOTE\
  Installation took CPU time 6.9 times elapsed time\
\
Check: examples, Result: NOTE\
  Examples with CPU time > 2.5 times elapsed time
:::
Automatic tests will generate a NOTE if more than 2 cores are used in examples, vignettes, tests, or during installation. It’s best to provide an option for setting the number of cores in any function, with the default ideally set to fewer than 2 cores.
Something that ran as part of the installation process might have started more than two child processes (or threads) at the same time. The solution for this is specific to the build system your package is using. Many modern build systems however, can use the `-j` flag which originated from `GNU Make` (for more information, see their [manual](https://devdocs.io/gnu_make/options-summary#-j%20[jobs]){target="_blank"}).
<!-- The target="_blank" opens the link in a new tab as this is a link to a non R related resource and thus might not interest the user too much-->