- The experimental `chop_spikes()` and `dissect()` functions give common values of `x` their own singleton intervals.
- On Unicode platforms, infinity will be represented as ∞ in breaks. Set `options(santoku.infinity = "Inf")` to use the old behaviour.
- Singleton breaks are not labelled specially by default in `chop_quantiles(..., raw = FALSE)`. This means that e.g. if the 10th and 20th percentiles are both the same number, the label will still be `[10%, 20%]`.
- When multiple quantiles are the same, santoku warns and returns the leftmost quantile interval. Previously it merged the intervals, creating labels that might differ from what the user asked for.
- `chop_quantiles()` gains a `recalc_probs` argument. `recalc_probs = TRUE` recalculates probabilities using `ecdf(x)`, which may give more accurate interval labels.
- `single = NULL` has been documented explicitly in `lbl_*` functions.
- Bugfix: `brk_manual()` no longer warns if `close_end = TRUE` (the default).

- santoku is now considered stable.
- `chop_quantiles()` and `brk_quantiles()` gain a new `weights` argument, letting you chop by weighted quantiles using `Hmisc::wtd.quantile()`.
- `brk_quantiles()` may now return singleton breaks, producing more accurate results when `x` has duplicate elements.
- Some deprecated functions have been removed, and the `raw` argument to `lbl_*` functions now always gives a deprecation warning.

- List arguments to `fmt` in `lbl_*` functions will be taken as arguments to `base::format`. This gives more flexibility in formatting, e.g., `units` breaks.
- `chop_n()` gains a `tail` argument, to deal with a last interval containing fewer than `n` elements. Set `tail = "merge"` to merge it with the previous interval. This guarantees that all intervals contain at least `n` elements.
- `chop_equally()` may return fewer than `groups` groups when there are duplicate elements. We now warn when this happens.
- Bugfix: `chop_n()` could return intervals with fewer than `n` elements when there were duplicate elements. The new algorithm avoids this, but may be slower in this case.
- `endpoint_labels()` methods gain an unused `...` argument to satisfy R CMD CHECK.
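
The `tail` and list-valued `fmt` entries above can be tried out as follows. This is a minimal sketch, assuming a santoku release that includes these changes is installed; the input vectors are made up for illustration.

```r
library(santoku)

x <- 1:10

# Default behaviour: the last interval may hold fewer than n elements.
table(chop_n(x, 4))

# tail = "merge" folds a short last interval into the previous one,
# so every interval holds at least n = 4 elements.
merged <- chop_n(x, 4, tail = "merge")
table(merged)

# A list passed to fmt is forwarded to base::format();
# nsmall = 1 asks for at least one decimal place in each endpoint.
chop(c(0.5, 1.5, 2.5), 1:2, labels = lbl_intervals(fmt = list(nsmall = 1)))
```
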

There are important changes to `close_end`.

- `close_end` is now `TRUE` by default in `chop()` and `fillet()`. In previous versions:

  ```r
  chop(1:2, 1:2)
  ## [1] [1, 2) {2}
  ## Levels: [1, 2) {2}
  ```

  Whereas now:

  ```r
  chop(1:2, 1:2)
  ## [1] [1, 2] [1, 2]
  ## Levels: [1, 2]
  ```

- `close_end` is now always applied after `extend`. For example, in previous versions:

  ```r
  chop(1:4, 2:3, close_end = TRUE)
  ## [1] [1, 2) [2, 3] [2, 3] (3, 4]
  ## Levels: [1, 2) [2, 3] (3, 4]
  ```

  Whereas now:

  ```r
  chop(1:4, 2:3, close_end = TRUE)
  ## [1] [1, 2) [2, 3) [3, 4] [3, 4]
  ## Levels: [1, 2) [2, 3) [3, 4]
  ```

  We changed this behaviour to be more in line with user expectations.

- If `breaks` has names, they will be used as labels:

  ```r
  chop(1:5, c(Low = 1, Mid = 2, High = 4))
  ## [1] Low  Mid  Mid  High High
  ## Levels: Low Mid High
  ```

  Names can also be used for labels in `probs` in `chop_quantiles()` and `proportions` in `chop_proportions()`.

- There is a new `raw` parameter to `chop()`. This replaces the parameter `raw` in `lbl_*` functions, which is now soft-deprecated.

- `lbl_manual()` is deprecated. Just use a vector argument to `labels` instead.

- A `labels` argument to `chop_quantiles()` now needs to be explicitly named.

I expect these to be the last important breaking changes before we release version 1.0 and mark the package as "stable". If they cause problems for you, please file an issue.
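
For code that relied on the old interface, the migration might look like the sketch below. It assumes a santoku release that includes these changes; the data and label names are invented for illustration.

```r
library(santoku)

# Instead of the deprecated lbl_manual(), pass a plain character vector to `labels`.
sizes <- chop(1:5, c(2, 4), labels = c("low", "mid", "high"))
sizes

# `labels` must be named explicitly in chop_quantiles();
# raw = TRUE is set on the chop function itself rather than inside lbl_*().
chop_quantiles(1:10, c(0.25, 0.75), labels = lbl_dash(), raw = TRUE)
```
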

- New `chop_fn()`, `brk_fn()` and `tab_fn()` chop using an arbitrary function.
- Added section on non-standard objects to vignette.
- `lbl_endpoint()` has been renamed to `lbl_endpoints()`. The old version will trigger a deprecation warning. `lbl_endpoints()` gains `first`, `last` and `single` arguments like other labelling functions.
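
As a rough illustration of `chop_fn()`, the sketch below uses `median()` as the break-generating function. It assumes a santoku release that includes `chop_fn()` and `lbl_endpoints()`; the data are made up.

```r
library(santoku)

x <- c(1, 2, 3, 4, 100)

# chop_fn() calls the supplied function on x and uses the result as breaks.
# Here the single break is the median of x, which is 3.
halves <- chop_fn(x, median)
halves

# Any function that returns a vector of breaks can be used.
chop_fn(x, function(v) c(mean(v) - sd(v), mean(v) + sd(v)))

# lbl_endpoints() is the new name for lbl_endpoint().
chop(x, c(2, 4), labels = lbl_endpoints())
```
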

- New `chop_pretty()`, `brk_pretty()` and `tab_pretty()` functions use `base::pretty()` to calculate attractive breakpoints. Thanks @davidhodge931.
- New `chop_proportions()`, `brk_proportions()` and `tab_proportions()` functions chop `x` into proportions of its range.
- `chop_equally()` now uses `lbl_intervals(raw = TRUE)` by default, bringing it into line with `chop_evenly()`, `chop_width()` and `chop_n()`.
- New `lbl_midpoints()` function labels breaks by their midpoints.
- `lbl_discrete()` gains a `single` argument.
- You can now chop `ts`, `xts::xts` and `zoo::zoo` objects.
- `chop()` is more forgiving when mixing different types, e.g.:
  - `Date` objects with `POSIXct` breaks, and vice versa
  - `bit64::integer64` and `double`s
- Bugfix: `lbl_discrete()` sometimes had ugly label formatting.
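
The new chopping and labelling functions in this list can be sketched as follows, assuming a santoku release that includes them; the vector `x` is made up for illustration.

```r
library(santoku)

x <- 0:10

# Breaks chosen by base::pretty().
chop_pretty(x)

# Breaks at 20% and 80% of the range of x.
props <- chop_proportions(x, c(0.2, 0.8))
table(props)

# Label each interval by its midpoint.
chop(x, c(2, 8), labels = lbl_midpoints())
```
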

- In labelling functions, `first` and `last` arguments are now passed to `glue::glue()`. Variables `l` and `r` represent the left and right endpoints of the intervals.
- `chop_mean_sd()` now takes a vector `sds` of standard deviations, rather than a single maximum number `sd` of standard deviations. Write e.g. `chop_mean_sd(sds = 1:3)` rather than `chop_mean_sd(sd = 3)`. The `sd` argument is deprecated.
- The `groups` argument to `chop_evenly()`, deprecated in 0.4.0, has been removed.
- `brk_left()` and `brk_right()`, deprecated in 0.4.0, have been removed.
- `knife()`, deprecated in 0.4.0, has been removed.
- `lbl_format()`, questioning since 0.4.0, has been removed.
- Arguments of `lbl_dash()` and `lbl_intervals()` have been reordered for consistency with other labelling functions.
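
The glue-based `first` and `last` arguments and the `sds` argument might be used like this. This is a minimal sketch assuming a santoku release that includes these changes; the label templates are only examples.

```r
library(santoku)

x <- 1:10

# `first` and `last` are glue templates: {l} and {r} stand for the
# left and right endpoints of the interval being labelled.
ages <- chop(x, c(3, 7), labels = lbl_dash(first = "< {r}", last = "{l}+"))
levels(ages)

# chop_mean_sd() takes a vector of standard deviations via `sds`.
chop_mean_sd(x, sds = 1:2)
```
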

- You can now chop many more types, including `units` from the `units` package, `difftime` objects, `package_version` objects, etc.
  - Character vectors will be chopped by lexicographic order, with an optional warning.
  - If you have problems chopping a vector type, file a bug report.
- The `{glue}` package has become a hard dependency. It is used in many places to format labels.
- There is a new `lbl_glue()` function using the `{glue}` package. Thanks to @dpprdan.
- You can now set `labels = NULL` to return integer codes.
- Arguments `first`, `last` and `single` can be used in `lbl_intervals()` and `lbl_dash()`, to override the first and last interval labels, or to label singleton intervals.
- `lbl_dash()` and `lbl_discrete()` use Unicode em-dash where possible.
- `brk_default()` throws an error if breaks are not sorted.
- Bugfix: `tab()` and friends no longer display an `x` as the variable name.
- Bugfix: `lbl_endpoint()` was erroring for some types of breaks.
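
A short sketch of integer codes, `lbl_glue()` and the `single` argument follows. It assumes a recent santoku release; the data and templates are illustrative only.

```r
library(santoku)

x <- 1:5

# labels = NULL returns integer codes instead of a factor.
codes <- chop(x, c(2, 4), labels = NULL)
codes

# lbl_glue() builds labels from a glue template with {l} and {r}.
chop(x, c(2, 4), labels = lbl_glue("{l} to {r}"))

# Repeating a break creates a singleton interval; `single` sets its label.
chop(x, c(3, 3), labels = lbl_intervals(single = "exactly {l}"))
```
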

- New arguments `first` and `last` in `lbl_dash()` and `lbl_discrete()` allow you to override the first and last interval labels.
- Fixes for CRAN.

- Negative numbers can be used in `chop_width()`.
  - This sets `left = FALSE` by default.
  - Also works for negative time intervals.
- Bugfix: `chop(1:4, 1)` was erroring.

The new version has some interface changes. These are based on user experience, and are designed to make using `chop()` more intuitive and predictable.

- `chop()` has two new arguments, `left` and `close_end`.
  - Using `left = FALSE` is simpler and more intuitive than wrapping breaks in `brk_right()`.
  - `brk_left()` and `brk_right()` have been kept for now, but cannot be used to wrap other break functions.
  - Using `close_end` is simpler than passing `close_end` into `brk_left()` or `brk_right()` (which no longer accept this argument directly).
  - `left = TRUE` by default, except for non-numeric objects in `chop_quantiles()` and `chop_equally()`, where `left = FALSE` works better.
- `close_end` is now `FALSE` by default.
  - This prevents user surprises when e.g. `chop(3, 1:3)` puts `3` into a different category than `chop(3, 1:4)`.
  - `close_end` is `TRUE` by default for `chop_quantiles()`, `chop_n()` and similar functions. This ensures that e.g. `chop_quantiles(x, c(0, 1/3, 2/3, 1))` does what you would expect.
- The `groups` argument to `chop_evenly()` has been renamed from `groups` to `intervals`. This should make it easier to remember the difference between `chop_evenly()` and `chop_equally()`. (Chop evenly into `n` equal-width intervals, or chop equally into `n` equal-sized groups.)
- `knife()` has been deprecated to keep the interface slim and focused. Use `purrr::partial()` instead.
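
The effect of `left` can be seen by chopping the same data both ways. This sketch assumes a recent santoku release, in which `close_end` is applied after breaks are extended to cover the data; `close_end = TRUE` is passed explicitly so the result does not depend on the default.

```r
library(santoku)

x <- 1:4

# left = TRUE (the default): intervals are closed on the left,
# so 2 starts a new interval.
left_closed <- chop(x, 2:3, close_end = TRUE)

# left = FALSE: intervals are closed on the right,
# so 2 ends the first interval.
right_closed <- chop(x, 2:3, left = FALSE, close_end = TRUE)

data.frame(x, left_closed, right_closed)
```
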

- Date and datetime (`POSIXct`) objects can now be chopped.
  - `chop_width()` accepts `difftime`, `lubridate::period` or `lubridate::duration` objects
  - all other `chop_` functions work as well.
- Many labelling functions have a new `fmt` argument. This can be a string interpreted by `sprintf()` or `format()`, or a 1-argument formatting function for break endpoints, e.g. `scales::label_percent()`.
- Experimental: `lbl_discrete()` for discrete data such as integers or (most) dates.
- There is a new `lbl_endpoint()` function for labelling intervals solely by their left or right endpoint.
- `brk_mean_sd()` now accepts non-integer positive numbers.
- Add `brk_equally()` for symmetry with `chop_equally()`.
- Minor tweaks to `chop_deciles()`.
- Bugfix: `lbl_format()` wasn't accepting numeric formats, even when `raw = TRUE`. Thanks to Sharla Gelfand.
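
Date chopping and the `fmt` argument might be combined as in the sketch below. It assumes a recent santoku release; the dates and numbers are invented, and only base R classes (`Date`, `difftime`) are used, so lubridate is not required.

```r
library(santoku)

# Two weeks of daily dates, chopped into 7-day intervals.
dates <- as.Date("2024-01-01") + 0:13
weeks <- chop_width(dates, as.difftime(7, units = "days"))
table(weeks)

# fmt as a sprintf() format string: endpoints shown with two decimals.
chop(c(0.1234, 0.5678, 0.9), 0.5, labels = lbl_intervals(fmt = "%.2f"))
```
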

- First CRAN release.

- Changed `kut()` to `kiru()`. `kiru()` is an alternative spelling for `chop()`, for use when the tidyr package is loaded.
- `lbl_sequence()` has become `lbl_manual()`.
- `lbl_letters()` and friends have been replaced by `lbl_seq()`:
  - to replace `lbl_letters()` use `lbl_seq()`
  - to replace `lbl_LETTERS()` use `lbl_seq("A")`
  - to replace `lbl_roman()` use `lbl_seq("i")`
  - to replace `lbl_ROMAN()` use `lbl_seq("I")`
  - to replace `lbl_numerals()` use `lbl_seq("1")`
  - for more complex formatting use e.g. `lbl_seq("A:")`, `lbl_seq("(i)")`
- Added a `NEWS.md` file to track changes to the package.
- Default labels when `extend = NULL` have changed, from `[-Inf, ...` and `..., Inf]` to `[min(x), ...` and `..., max(x)]`.
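
The `lbl_seq()` replacements listed above can be checked with a quick sketch like this one, assuming santoku is installed; the data are arbitrary.

```r
library(santoku)

x <- 1:10

# lbl_seq("A") labels intervals A, B, C, ... (formerly lbl_LETTERS()).
upper <- chop(x, c(3, 7), labels = lbl_seq("A"))
levels(upper)

# Characters around the start symbol are kept as decoration.
roman <- chop(x, c(3, 7), labels = lbl_seq("(i)"))
levels(roman)
```
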