Commit

update spelling
ThierryO committed Aug 19, 2024
1 parent 5c03000 commit b6f2e1f
Showing 9 changed files with 48 additions and 18 deletions.
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -13,7 +13,7 @@ or similar - or if just relates to an issue make sure to mention
it like "#4" -->

## Example
-<!--- if introducing a new feature or changing behavior of existing
+<!--- if introducing a new feature or changing behaviour of existing
methods/functions, include an example if possible to do in brief form -->

<!--- Did you remember to include tests? Unless you're just changing
2 changes: 1 addition & 1 deletion R/clean_data_path.R
@@ -3,7 +3,7 @@
#' file extensions
#' @inheritParams write_vc
#' @param normalize Normalize the path? Defaults to TRUE
-#' @return A named vector with "raw_file" and "meta_file", refering to the
+#' @return A named vector with "raw_file" and "meta_file", referring to the
#' `".tsv"` and `".yml"` files.
#' @noRd
#' @family internal
2 changes: 1 addition & 1 deletion R/meta.R
@@ -206,7 +206,7 @@ meta.Date <- function(x, optimize = TRUE, ...) {
#' plus an additional `..generic` element. `..generic` is a reserved name for
#' the metadata and not allowed as column name in a `data.frame`.
#'
-#' \code{\link{write_vc}} uses this function to prepare a dataframe for storage.
+#' `write_vc()` uses this function to prepare a dataframe for storage.
#' Existing metadata is passed through the optional `old` argument. This
#' argument intended for internal use.
#' @rdname meta
6 changes: 4 additions & 2 deletions README.md
@@ -56,9 +56,11 @@ See `vignette("workflow", package = "git2rdata")`.
Use this to check whether an existing analysis is obsolete due to new data.
This allows to not rerun up to date analyses, saving resources.

-## Talk About `git2rdata` at useR!2019 in Toulouse, France
+## Talk About `git2rdata` at
+useR!2019<!-- spell-check: ignore -->
+in Toulouse, France

-<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/sbRPmakBFqo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/sbRPmakBFqo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><!-- spell-check: ignore -->

## Installation

26 changes: 19 additions & 7 deletions checklist.yml
@@ -3,10 +3,22 @@ package: yes
allowed:
warnings: []
notes: []
citation_roles:
- aut
- cre
keywords:
- R package
- reproducible research
- version control
required:
- CITATION
- DESCRIPTION
- R CMD check
- checklist
- codemeta
- documentation
- filename conventions
- folder conventions
- license
- lintr
- repository secret
- spelling
spelling:
default: en-GB
ignore:
- .github/ISSUE_TEMPLATE/feature_request.md
- LICENSE.md
- cran-comments.md
7 changes: 7 additions & 0 deletions inst/en_gb.dic
@@ -0,0 +1,7 @@
+Bitbucket
+Gitlab
+codecov
+kiB
+rOpenSci
+rdata
+regex
2 changes: 1 addition & 1 deletion vignettes/efficiency.Rmd
@@ -142,7 +142,7 @@ This vignette compares storage and retrieval of data by `git2rdata` with other s
We consider `write.table()` and `read.table()` for data stored in a plain text format.
`saveRDS()` and `readRDS()` use a compressed binary format.

-To get some meaningful results, we will use the `nassCDS` dataset from the [DAAG](https://www.rdocumentation.org/packages/DAAG/versions/1.22/topics/nassCDS) package.
+To get some meaningful results, we will use the `nassCDS` dataset from the [DAAG](https://www.rdocumentation.org/packages/DAAG/versions/1.22/topics/nassCDS) package. <!-- spell-check: ignore -->
We'll avoid the dependency on the package by directly downloading the data.

```{r download_data, eval = system.file("efficiency", "airbag.rds", package = "git2rdata") == ""}
15 changes: 11 additions & 4 deletions vignettes/plain_text.Rmd
@@ -36,7 +36,10 @@ These functions determine factor levels based on the observed levels in the plai
Hence factor levels without observations will disappear.
The order of the factor levels is also determined by the available levels in the plain text file, which can be different from the original order.

-The `write_vc()` and `read_vc()` functions from `git2rdata` keep track of the class of each variable and, in case of a factor, also of the factor levels and their order. Hence this function pair preserves the information content of the dataframe. The `vc` suffix stands for **v**ersion **c**ontrol as these functions use their full capacity in combination with a version control system.
+The `write_vc()` and `read_vc()` functions from `git2rdata` keep track of the class of each variable and, in case of a factor, also of the factor levels and their order.
+Hence this function pair preserves the information content of the dataframe. The `vc` suffix stands for
+**v**ersion **c**ontrol<!-- spell-check: ignore -->
+as these functions use their full capacity in combination with a version control system.

## Efficiency Relative to Storage and Time

@@ -61,7 +64,9 @@ Store and return timestamps as UTC.
- Store a `Date` as an integer to the data.
Store the class and the origin in the metadata.

-Storing the factors, POSIXct and Date as their index, makes them less user readable. The user can turn off this optimization when user readability is more important than file size.
+Storing the factors,
+POSIXct <!-- spell-check: ignore -->
+and Date as their index, makes them less user readable. The user can turn off this optimization when user readability is more important than file size.

### Optimized for Version Control

@@ -135,11 +140,13 @@ print_file("first_test.yml", path)
Adding `optimize = FALSE` to `write_vc()` will keep the raw data in a human readable format.
The metadata file is slightly different.
The most obvious is the `optimize: no` tag and the different hash.
-Another difference is the metadata for POSIXct and Date classes.
+Another difference is the metadata for
+POSIXct <!-- spell-check: ignore -->
+and Date classes.
They will no longer have an origin tag but a format tag.

Another important difference is that we store the data file as comma separated values instead of tab separated values.
-We noticed that the csv file format is more easily recognised by a larger audience as a data file.
+We noticed that the `csv` file format is more easily recognised by a larger audience as a data file.


```{r write_verbose}
4 changes: 3 additions & 1 deletion vignettes/version_control.Rmd
@@ -62,7 +62,9 @@ This implies that two observations switching place does not alter the informatio
Nor does switching two variables.

Version control systems like [git](https://git-scm.com/), [subversion](https://subversion.apache.org/) or [mercurial](https://www.mercurial-scm.org/) focus on accurately keeping track of _any_ change in the files.
-Two observations switching place in a plain text file _is_ a change, although the information content^[_sensu_ `git2rdata`] doesn't change.
+Two observations switching place in a plain text file _is_ a change, although the information content^[
+_sensu_ <!-- spell-check: ignore -->
+`git2rdata`] doesn't change.
`git2rdata` helps the user to prepare the plain text files in such a way that any change in the version history is an actual change in the information content.

## Sorting Observations
