Merge remote-tracking branch 'github/release'
Mark Klik committed Apr 10, 2019
2 parents ade4e32 + 73debf0 commit 8834ea7
Showing 56 changed files with 653 additions and 2,404 deletions.
10 changes: 3 additions & 7 deletions .travis.yml
@@ -20,8 +20,7 @@ before_install:
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install llvm &&
export PATH="/usr/local/opt/llvm/bin:$PATH" &&
export LDFLAGS="-L/usr/local/opt/llvm/lib" &&
export CPPFLAGS="-I/usr/local/opt/llvm/include" &&
export PKG_CXXFLAGS="-O3 -Wall -pedantic"; fi
export CPPFLAGS="-I/usr/local/opt/llvm/include"; fi

r_packages:
- covr
@@ -31,12 +30,9 @@ r_packages:
- testthat
- data.table

addons:
apt:
update: true

after_success:
- Rscript -e 'library(covr); codecov(quiet = FALSE)'
- test $TRAVIS_OS_NAME == "linux" &&
travis_wait Rscript -e 'library(covr); codecov(quiet = FALSE)'

env:
global:
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -5,8 +5,8 @@ Description: Multithreaded serialization of compressed data frames using the
'fst' format. The 'fst' format allows for random access of stored data and
compression with the LZ4 and ZSTD compressors created by Yann Collet. The ZSTD
compression library is owned by Facebook Inc.
Version: 0.8.10
Date: 2018-12-13
Version: 0.9.0
Date: 2019-04-02
Authors@R: c(
person("Mark", "Klik", email = "markklik@gmail.com", role = c("aut", "cre", "cph")),
person("Yann", "Collet", role = c("ctb", "cph"),
24 changes: 23 additions & 1 deletion NEWS.md
@@ -1,4 +1,26 @@

# fst 0.9.0 (April 2, 2019)

Version 0.9.0 of the `fst` package addresses the request from CRAN maintainers to fix issues identified by rchk. These issues result from PROTECT / UNPROTECT pairs called in the constructor / destructor pairs of C++ classes. rchk (rightfully) warns about those because it can't determine from the code if pairs are properly matched. With this submission the relevant SEXP classes are protected by containing them in SEXP classes that are already PROTECTED, which allows for removal of the PROTECT / UNPROTECT pairs in question.

As of `fst` version 0.9.0, support for fst files generated with `fst` package versions lower than 0.8.0 has been deprecated. This significantly reduces the (C++) code base and prepares `fst` for future code changes.

## Library updates

* Library `fstlib` updated to version 0.1.1

## Enhancements

* Method `setnrofthreads` returns invisible result to avoid printing unwanted output (thanks @renkun-ken for the pull request)

## Bugs solved

* Empty subsets can be selected using `fst::fst` (thanks @renkun-ken for reporting)

## Documentation

Various documentation issues have been fixed (thanks @ginberg and @renkun-ken for the pull requests).

# fst 0.8.10 (December 14, 2018)

Version 0.8.10 of the `fst` package is an intermediate release designed to update the incorporated C++ libraries
@@ -39,7 +61,7 @@ Version 0.8.6 of the `fst` package brings clearer printing of `fst_table` object

* User has more control over the number of threads used by fst. Option 'fst_threads' can now be used to initialize the number of threads when the package is first loaded (issue #132, thanks to @karldw for the pull request).

* Option 'fst_restore_after_fork' can be used to select the threading behaviour after a fork has ended. Like the `data.table` package, `fst` switches back to a single thread when a fork is detected (using OpenMP in a fork can lead to problems). Unlike `data.table`, the `fst` package restores the number of threads to its previous setting when the fork ends. If this leads to unexpected problems, the user can set the 'fst_restore_after_fork' option to FALSE to disable that.
* Option 'fst_restore_after_fork' can be used to select the threading behavior after a fork has ended. Like the `data.table` package, `fst` switches back to a single thread when a fork is detected (using OpenMP in a fork can lead to problems). Unlike `data.table`, the `fst` package restores the number of threads to its previous setting when the fork ends. If this leads to unexpected problems, the user can set the 'fst_restore_after_fork' option to FALSE to disable that.

## Bugs solved

8 changes: 4 additions & 4 deletions R/RcppExports.R
@@ -9,12 +9,12 @@ fststore <- function(fileName, table, compression, uniformEncoding) {
.Call(`_fst_fststore`, fileName, table, compression, uniformEncoding)
}

fstmetadata <- function(fileName, oldFormat) {
.Call(`_fst_fstmetadata`, fileName, oldFormat)
fstmetadata <- function(fileName) {
.Call(`_fst_fstmetadata`, fileName)
}

fstretrieve <- function(fileName, columnSelection, startRow, endRow, oldFormat) {
.Call(`_fst_fstretrieve`, fileName, columnSelection, startRow, endRow, oldFormat)
fstretrieve <- function(fileName, columnSelection, startRow, endRow) {
.Call(`_fst_fstretrieve`, fileName, columnSelection, startRow, endRow)
}

fsthasher <- function(rawVec, seed, blockHash) {
23 changes: 14 additions & 9 deletions R/fst.R
@@ -80,7 +80,8 @@ write_fst <- function(x, path, compress = 50, uniform_encoding = TRUE) {
#' Method for checking basic properties of the dataset stored in \code{path}.
#'
#' @param path path to fst file
#' @param old_format use TRUE to read fst files generated with a fst package version lower than v0.8.0
#' @param old_format must be FALSE, the old fst file format is deprecated and can only be read and
#' converted with fst package versions 0.8.0 to 0.8.10.
#' @return Returns a list with meta information on the stored dataset in \code{path}.
#' Has class \code{fstmetadata}.
#' @examples
@@ -97,13 +98,15 @@ write_fst <- function(x, path, compress = 50, uniform_encoding = TRUE) {
#' metadata_fst("dataset.fst")
#' @export
metadata_fst <- function(path, old_format = FALSE) {
if (!is.logical(old_format)) {
stop("A logical value is expected for parameter 'old_format'.")

if (old_format != FALSE) {
stop("Parameter old_format is deprecated, fst files written with fst package version",
" lower than 0.8.0 should be read (and rewritten) using fst package versions <= 0.8.10.")
}

full_path <- normalizePath(path, mustWork = FALSE)

metadata <- fstmetadata(full_path, old_format)
metadata <- fstmetadata(full_path)

if (inherits(metadata, "fst_error")) {
stop(metadata)
@@ -150,13 +153,14 @@ print.fstmetadata <- function(x, ...) {

#' @rdname write_fst
#'
#' @param columns Column names to read. The default is to read all all columns.
#' @param columns Column names to read. The default is to read all columns.
#' @param from Read data starting from this row number.
#' @param to Read data up until this row number. The default is to read to the last row of the stored dataset.
#' @param as.data.table If TRUE, the result will be returned as a \code{data.table} object. Any keys set on
#' dataset \code{x} before writing will be retained. This allows for storage of sorted datasets. This option
#' requires \code{data.table} package to be installed.
#' @param old_format use TRUE to read fst files generated with a fst package version lower than v0.8.0
#' @param old_format must be FALSE, the old fst file format is deprecated and can only be read and
#' converted with fst package versions 0.8.0 to 0.8.10.
#'
#' @export
read_fst <- function(path, columns = NULL, from = 1, to = NULL, as.data.table = FALSE, old_format = FALSE) {
@@ -182,11 +186,12 @@ read_fst <- function(path, columns = NULL, from = 1, to = NULL, as.data.table =
to <- as.integer(to)
}

if (!is.logical(old_format)) {
stop("A logical value is expected for parameter 'old_format'.")
if (old_format != FALSE) {
stop("Parameter old_format is deprecated, fst files written with fst package version",
" lower than 0.8.0 should be read (and rewritten) using fst package versions <= 0.8.10.")
}

res <- fstretrieve(fileName, columns, from, to, old_format)
res <- fstretrieve(fileName, columns, from, to)

if (inherits(res, "fst_error")) {
stop(res)
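
Note on the `old_format` guard in `metadata_fst` and `read_fst`: the comparison `old_format != FALSE` assumes a single, non-missing value. With `NA` or a zero-length argument, `if` itself raises an error before the deprecation message is reached. A minimal sketch of a stricter check follows; `check_old_format()` is a hypothetical helper for illustration and is not part of the `fst` package.

```r
# Hypothetical helper (not part of fst): accept only a literal FALSE,
# so NA, NULL, TRUE and non-logical values all reach the same message.
check_old_format <- function(old_format = FALSE) {
  if (!identical(old_format, FALSE)) {
    stop("Parameter old_format is deprecated. fst files written with fst package ",
         "versions lower than 0.8.0 should be read (and rewritten) using fst ",
         "package versions 0.8.0 to 0.8.10.", call. = FALSE)
  }
  invisible(TRUE)
}

check_old_format(FALSE)  # passes silently

# NA would make `if (old_format != FALSE)` fail with a generic error;
# here it produces the deprecation message instead.
na_message <- tryCatch(check_old_format(NA), error = conditionMessage)
```

Whether the package should be this strict is a design choice; the sketch only shows one way to keep the error message consistent for every unsupported value.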
98 changes: 70 additions & 28 deletions R/fst_table.R
@@ -63,6 +63,12 @@
#' }
fst <- function(path, old_format = FALSE) {

# old format is deprecated as of v0.9.0
if (old_format != FALSE) {
stop("Parameter old_format is deprecated, fst files written with fst package version",
" lower than 0.8.0 should be read (and rewritten) using fst package versions <= 0.8.10.")
}

# wrap in a list so that additional elements can be added if required
ft <- list(
meta = metadata_fst(path, old_format),
@@ -341,48 +347,76 @@ as.list.fst_table <- function(x, ...) {
}


# drop to lower dimension when drop = TRUE
return_drop <- function(x, drop) {

if (!drop | ncol(x) > 1) return(x)

x[[1]]
}


#' @export
`[.fst_table` <- function(x, i, j, drop = FALSE) {
if (drop) {
warning("drop ignored", call. = FALSE)
`[.fst_table` <- function(x, i, j, drop) {

# check for old_format in case an 'old' fst_table object was deserialized
if (.subset2(x, "old_format") != FALSE) {
stop("fst files written with fst package version",
" lower than 0.8.0 should be read (and rewritten) using fst package versions <= 0.8.10.")
}

meta_info <- .subset2(x, "meta")

# when only i is present, we do a column subsetting
# no additional arguments provided

if (missing(i) && missing(j)) {
return(read_fst(meta_info$path, old_format = .subset2(x, "old_format")))

# never drop as with data.frame
return(read_fst(meta_info$path))
}


if (nargs() <= 2) {

# return full table
# result is never dropped with 2 arguments

if (missing(i)) {
# we have a j
# we have a named argument j
j <- .column_indexes_fst(meta_info, j)
return(read_fst(meta_info$path, j, old_format = .subset2(x, "old_format")))
return(read_fst(meta_info$path, j))
}

# i is interpreted as j
j <- .column_indexes_fst(meta_info, i)
return(read_fst(meta_info$path, j, old_format = .subset2(x, "old_format")))
return(read_fst(meta_info$path, j))
}

# drop dimension if single column selected and drop != FALSE
drop_dim <- FALSE

if (!missing(j) && length(j) == 1) {

if (!(!missing(drop) && drop == FALSE)) {
drop_dim <- TRUE
}
}

# return all rows
# special case where i is interpreted as j: select all rows, never drop

# special case where i is interpreted as j: select all rows
if (nargs() == 3 && !missing(drop) && !missing(i)) {
j <- .column_indexes_fst(meta_info, i)
return(read_fst(meta_info$path, j, old_format = .subset2(x, "old_format")))
return(read_fst(meta_info$path, j))
}

# i and j not reversed

# full columns
if (missing(i)) {
j <- .column_indexes_fst(meta_info, j)
return(read_fst(meta_info$path, j, old_format = .subset2(x, "old_format")))
x <- read_fst(meta_info$path, j)

if (!drop_dim) return(x)
return(x[[1]])
}
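
Note on the `drop_dim` rule introduced in this hunk: the result is reduced to a vector only when a single column `j` is selected and `drop` is not explicitly `FALSE`. The sketch below restates that rule as a standalone function and compares it with base `data.frame` subsetting. `drops_dimension()` is a hypothetical name used for illustration; it does not exist in `fst`.

```r
# Hypothetical helper restating the drop_dim decision from the diff.
drops_dimension <- function(j, drop) {
  # no column selection, or more than one column: never drop
  if (missing(j) || length(j) != 1) return(FALSE)
  # single column: drop unless the caller explicitly passed drop = FALSE
  missing(drop) || drop != FALSE
}

drops_dimension("a")                # TRUE
drops_dimension("a", drop = FALSE)  # FALSE
drops_dimension(c("a", "b"))        # FALSE

# Base data.frame subsetting with both i and j follows the same convention.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
is.data.frame(df[1:2, "a"])                # FALSE, a plain vector is returned
is.data.frame(df[1:2, "a", drop = FALSE])  # TRUE
```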


@@ -397,29 +431,37 @@ as.list.fst_table <- function(x, ...) {

# cast to integer and determine row range
i <- as.integer(i)
min_row <- min(i)
max_row <- max(i)

# boundary check
if (min_row < 0) {
stop("Row selection out of range")
}
# empty row selection
if (length(i) == 0) {
min_row <- 1
max_row <- 1
} else {
min_row <- min(i)
max_row <- max(i)

if (max_row > meta_info$nrOfRows) {
stop("Row selection out of range")
# boundary check
if (min_row < 0) {
stop("Row selection out of range")
}

if (max_row > meta_info$nrOfRows) {
stop("Row selection out of range")
}
}

# column subset

# select all columns
if (missing(j)) {
fst_data <- read_fst(meta_info$path, from = min_row, to = max_row, old_format = .subset2(x, "old_format"))

return(fst_data[1 + i - min_row, ])
fst_data <- read_fst(meta_info$path, from = min_row, to = max_row)
x <- fst_data[1 + i - min_row, ] # row selection, no dropping
} else {
j <- .column_indexes_fst(meta_info, j)
fst_data <- read_fst(meta_info$path, j, from = min_row, to = max_row)
x <- fst_data[1 + i - min_row, , drop = FALSE] # row selection, no dropping
}

j <- .column_indexes_fst(meta_info, j)
fst_data <- read_fst(meta_info$path, j, from = min_row, to = max_row, old_format = .subset2(x, "old_format"))

fst_data[1 + i - min_row, ]
if (!drop_dim) return(x)
return(x[[1]])
}
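
Note on the row selection above: the method reads one contiguous range `min_row:max_row` and then re-indexes inside that range with `1 + i - min_row`, which is why an empty `i` needs a fallback range. The sketch below reproduces the idea on an in-memory data frame. `read_window()` is a hypothetical stand-in for `read_fst(path, from = , to = )`, `select_rows()` is an illustrative name, and the sketch assumes positive row indices only (it rejects anything below 1, which is stricter than the `min_row < 0` check in the diff).

```r
# Hypothetical stand-in for read_fst(path, from = , to = ):
# returns a contiguous range of rows.
read_window <- function(df, from, to) df[from:to, , drop = FALSE]

select_rows <- function(df, i) {
  i <- as.integer(i)

  if (length(i) == 0) {
    # empty selection: read a single row so column types are preserved
    min_row <- 1L
    max_row <- 1L
  } else {
    min_row <- min(i)
    max_row <- max(i)
    if (min_row < 1 || max_row > nrow(df)) stop("Row selection out of range")
  }

  window <- read_window(df, min_row, max_row)

  # shift the requested indices so they are relative to the window
  window[1L + i - min_row, , drop = FALSE]
}

df <- data.frame(id = 1:10, value = letters[1:10])

picked <- select_rows(df, c(5, 3))     # rows 5 and 3, in that order
empty  <- select_rows(df, integer(0))  # zero rows, same columns
```

Reading one range and re-indexing keeps the file access sequential, at the cost of loading rows between `min_row` and `max_row` that are not requested.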
4 changes: 2 additions & 2 deletions R/openmp.R
@@ -30,7 +30,7 @@
#' specific requirements. As a default, \code{fst} uses a number of threads equal to the number of
#' logical cores in the system.
#'
#' The number of threads can also be set with \code{option(fst_threads = N)}.
#' The number of threads can also be set with \code{options(fst_threads = N)}.
#' NOTE: This option is only read when the package's namespace is first loaded, with commands like
#' \code{library}, \code{require}, or \code{::}. If you have already used one of these, you
#' must use \code{threads_fst} to set the number of threads.
@@ -62,5 +62,5 @@ threads_fst <- function(nr_of_threads = NULL, reset_after_fork = NULL) {
return(getnrofthreads())
}

setnrofthreads(nr_of_threads)
invisible(setnrofthreads(nr_of_threads))
}
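
Note on the `invisible()` change: wrapping the return value stops it from being auto-printed at the console while keeping it available for assignment. A minimal sketch of the pattern follows; `set_threads()` and the `.state` environment are hypothetical and only imitate a setter that returns the previous value, they are not the `fst` implementation.

```r
# Hypothetical setter, used only to illustrate invisible().
.state <- new.env()
.state$threads <- 4L

set_threads <- function(n) {
  previous <- .state$threads
  .state$threads <- as.integer(n)
  invisible(previous)  # returned, but not printed at top level
}

# withVisible() reports whether a top-level call would have printed its value.
res <- withVisible(set_threads(2))
res$visible  # FALSE
res$value    # 4, the previous setting

old <- set_threads(8)  # the value can still be captured explicitly
```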
2 changes: 1 addition & 1 deletion README.Rmd
@@ -17,7 +17,7 @@ knitr::opts_chunk$set(
<img src="logo.png" align="right" />

[![Linux/OSX Build Status](https://travis-ci.org/fstpackage/fst.svg?branch=develop)](https://travis-ci.org/fstpackage/fst)
[![WIndows Build status](https://ci.appveyor.com/api/projects/status/6g6kp8onpb26jhnm/branch/develop?svg=true)](https://ci.appveyor.com/project/fstpackage/fst/branch/develop)
[![Windows Build status](https://ci.appveyor.com/api/projects/status/6g6kp8onpb26jhnm/branch/develop?svg=true)](https://ci.appveyor.com/project/fstpackage/fst/branch/develop)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/fst)](https://cran.r-project.org/package=fst)
[![codecov](https://codecov.io/gh/fstpackage/fst/branch/develop/graph/badge.svg)](https://codecov.io/gh/fstpackage/fst)
2 changes: 1 addition & 1 deletion README.md
@@ -5,7 +5,7 @@

[![Linux/OSX Build
Status](https://travis-ci.org/fstpackage/fst.svg?branch=develop)](https://travis-ci.org/fstpackage/fst)
[![WIndows Build
[![Windows Build
status](https://ci.appveyor.com/api/projects/status/6g6kp8onpb26jhnm/branch/develop?svg=true)](https://ci.appveyor.com/project/fstpackage/fst/branch/develop)
[![License: AGPL
v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
7 changes: 4 additions & 3 deletions cran-checklist.md
@@ -10,22 +10,23 @@
- AppVeyor (Windows Server)
- latest R dev version on Windows
* Build packages with dependencies on fst
* Start release branch from develop
* Merge develop branch into release branch
* Bump version to even value in DESCRIPTION and check package startup message
* Update README.Rmd and verify generated README.md on Github (release)
* Update cran_comments.md
* Update NEWS.md and make sure to remove '(in development)' in the version title
and update the version number
* Credit all GitHub contributions in NEWS.md
* Build docs folder using pkgdown::build_site()
* Merge branch release into master
* Submit to CRAN

* Commit the fstpackage.github.io repository with the latest docs

# After releasing to CRAN

* Merge branch master into release
* Go to the repository release page and create a new release with tag version vx.y.z.
Copy and paste the contents of the relevant NEWS.md section into the release notes.
* Add '(in development)' to version title in NEWS.md and update to odd version number
* Bump version to odd value and check package startup message
* Check package startup message
* Merge release branch into develop