Commit

v0.2.3
gagolews committed Oct 13, 2022
1 parent c48b161 commit c0c06a9
Showing 241 changed files with 32,575 additions and 3,485 deletions.
2 changes: 1 addition & 1 deletion .Rbuildignore
@@ -3,7 +3,7 @@
^\.Rproj\.user$
^.*\.kdev4$
^\.kdev4
-^devel
+^.devel
kate-swp$
^README
LICENSE
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file added .devel/sphinx/_build/doctrees/environment.pickle
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/index.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/news.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/chartr.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/grepl.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/gsub.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/nchar.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/paste.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/sort.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/strrep.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/substr.doctree
Binary file not shown.
Binary file added .devel/sphinx/_build/doctrees/rapi/trimws.doctree
Binary file not shown.
4 changes: 4 additions & 0 deletions .devel/sphinx/_build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 90245c3729073b508ce8579ed96408b3
tags: 645f666f9bcd5a90fca523b33c5a78b7
130 changes: 130 additions & 0 deletions .devel/sphinx/_build/html/_sources/index.rst.txt
@@ -0,0 +1,130 @@
stringx: Drop-in replacements for base R string functions powered by stringi
============================================================================

.. epigraph::

    English is the native language for only 5% of the World population.
    Also, only 17% of us can understand this text. Moreover, the Latin alphabet
    is the main one for merely 36% of the total. The early computer era,
    now a very long time ago, was dominated by the US. Due to the proliferation
    of the internet, smartphones, social media, and other technologies
    and communication platforms, this is no longer the case.

    This package replaces base R string functions with ones that fully
    support the Unicode standards related to natural language
    and date-time processing.
    Thanks to `ICU <https://icu.unicode.org/>`_
    (International Components for Unicode) and
    `stringi <https://stringi.gagolewski.com/>`_,
    they are fast, reliable, and portable across different platforms.


`R <https://www.r-project.org/>`_'s ambitions go far beyond being merely the
"free software environment for statistical computing and graphics".
It has proven effective in developing whole data analysis pipelines:
from gathering information through the discovery of knowledge to
the communication of results.

**Modern data science is no longer just about number crunching.**
Text is a rich source of new knowledge — from natural language
processing to bioinformatics. It also gives powerful
means to represent or transfer unstructured data.

**stringx brings R string processing abilities into the 21st century.**
It replaces functions like ``paste()``, ``grep()``, ``tolower()``,
``strptime()``, and ``sprintf()`` with ones that:

* support a wide range of languages and scripts and
  fully conform to `Unicode <https://www.unicode.org/>`_ standards
  (see also `this video <https://www.youtube.com/watch?v=-n2nlPHEMG8>`_),
* work in the same way on every platform,
* fix some long-standing inconsistencies in the base R functions
  (related to vectorisation, handling of missing values,
  preservation of attributes, order of arguments, interoperability
  with other procedures, etc.;
  they are all thoroughly documented in this online manual,
  happy reading! 🤓),
* work better with the forward-pipe operators (``|>`` or ``magrittr::%>%``).

Also, a few new, useful operations are introduced.

.. code-block:: r

    install.packages("stringx") # install from CRAN
    suppressMessages(library("stringx"))
    c("ACTGCT", "42", "stringx \U0001f970") |> grepv2("\\p{EMOJI_PRESENTATION}")
    ## [1] "stringx 🥰"
    toupper("gro\u00DF") # replaces base::toupper()
    ## [1] "GROSS"
    l <- c("e", "e\u00b2", "\u03c0", "\u03c0\u00b2", "\U0001f602\U0001f603")
    r <- c(exp(1), exp(2), pi, pi^2, NaN)
    cat(sprintf("%8s=%+.3f", l, r), sep="\n") # replaces base::sprintf()
    ## e=+2.718
    ## e²=+7.389
    ## π=+3.142
    ## π²=+9.870
    ## 😂😃= NaN

.. COMMENT
    but we do not aim to fix the whole nam.ING_meSS
    99% compatible (cannot be 100% as they use a different regex engine,
    for example, and some inconsistencies are quite obvious and can be a push
    for a change in the right direction)
    * collator - portable (locales), Unicode-correct (normalisation)
    * date/time - portable (locales)
    * iconv - portable
    * regex - Unicode-correct, portable
    * speed
    TODO: mention https://unicode-org.github.io/icu/userguide/icu/posix.html

**stringx** is a set of wrappers around
`stringi <https://stringi.gagolewski.com/>`_ — a mature
`R <https://www.r-project.org/>`_ package for
fast, consistent, convenient, and portable string/text/natural language
processing in any locale that relies on
`ICU – International Components for Unicode <https://icu.unicode.org/>`_.

*stringx*'s source code is hosted on
`GitHub <https://github.com/gagolews/stringx>`_. Its official releases
are available on `CRAN <https://cran.r-project.org/package=stringx>`_.
It is distributed under the terms of the GNU General Public License,
either Version 2 or Version 3; see
`license <https://raw.githubusercontent.com/gagolews/stringx/master/LICENSE>`_.


.. toctree::
    :maxdepth: 2
    :caption: stringx
    :hidden:

    About <self>
    Author <https://www.gagolewski.com/>


.. toctree::
    :maxdepth: 2
    :caption: Reference Manual
    :glob:

    rapi/*

.. rapi.md

.. toctree::
    :maxdepth: 1
    :caption: Other

    Source Code (GitHub) <https://github.com/gagolews/stringx>
    Bug Tracker and Feature Suggestions <https://github.com/gagolews/stringx/issues>
    CRAN Entry <https://cran.r-project.org/package=stringx>
    news.md

.. COMMENT
    .. |downloads1| image:: https://cranlogs.r-pkg.org/badges/grand-total/stringx
    .. |downloads2| image:: https://cranlogs.r-pkg.org/badges/last-month/stringx
96 changes: 96 additions & 0 deletions .devel/sphinx/_build/html/_sources/news.md.txt
@@ -0,0 +1,96 @@
# What Is New in *stringx*

> Note that the date-time processing functions in *stringx* are a work
> in progress. Feature requests/comments/remarks are welcome,
> see https://github.com/gagolews/stringx/issues.



## 0.2.3 (2022-10-13)

* [BUGFIX] Fixed failing checks/tests.


## 0.2.2 (2021-09-03)

* [DOCUMENTATION] ICU Project site has been moved to <https://icu.unicode.org/>.


## 0.2.1 (2021-08-27)

* [BACKWARD INCOMPATIBILITY, BUGFIX] #7: Dates without times are now always
treated as being at midnight in the local (default) time zone.

* [BACKWARD INCOMPATIBILITY] Date-time functions now yield objects
of class `POSIXxt`, which extends `POSIXct` (and allows for custom
formatting etc.).

* [BACKWARD INCOMPATIBILITY, BUGFIX] #7: `strftime` uses the `tzone` attribute
by default.

* [NEW FEATURE] Added functions: `as.POSIXxt`, `is.POSIXxt`,
`Sys.time`, `ISOdatetime`, `ISOdate`, `Ops.POSIXxt`,
`c.POSIXxt`, `rep.POSIXxt`, `seq.POSIXxt`.


## 0.1.3 (2021-08-05)

* [BUGFIX] #4: Fixed failing check with ICU 55.

* [BUGFIX] #5: Fixed failing check under POSIX/C locale.


## 0.1.2 (2021-07-27)

* First [CRAN](https://cran.r-project.org/package=stringx) release.


## 0.1.1 (2021-07-15)

* [GENERAL] [On-line manual](https://stringx.gagolewski.com) is now available.

* [GENERAL] Using [*realtest*](https://realtest.gagolewski.com)
for documenting base R behaviour, unit testing, and desired outcomes.

* [NEW FEATURE] Added constants: `letters_greek`, `digits_hex`, etc.

* [NEW FEATURE] Added functions and operators:
`strcat`, `%x+%`, `%x*%`,
`chartr2`, `strtrans`,
`printf`,
`xtfrm2`,
`strftime`, `strptime`,
`strcoll`, `%x==%`, `%x!=%`, `%x<%`, `%x<=%`, `%x>%`, `%x>=%`,
`substrl`, `substrl<-`,
`sub2`, `gsub2`,
`grepl2`, `grepv2`, `grepv2<-`,
`regexpr2`, `gregexpr2`,
`regexec2`, `gregexec2`,
`gsubstrl`, `gsubstrl<-`,
`gsubstr`, `gsubstr<-`,
`regextr2`, `regextr2<-`,
`gregextr2`, `gregextr2<-`.

* [NEW FEATURE] Rewritten functions:
`paste`, `paste0`,
`strrep`,
`chartr`, `tolower`, `toupper`, `casefold`,
`sprintf`,
`strftime`, `strptime`,
`nchar`, `nzchar`,
`strtrim`,
`trimws`,
`startsWith`, `endsWith`,
`sort`,
`strwrap`,
`substr`, `substring`, `substr<-`, `substring<-`,
`strsplit`,
`sub`, `gsub`,
`grep`, `grepl`,
`regexpr`, `gregexpr`,
`regexec`, `gregexec`.


## 0.0.0 (2021-05-07)

* The *stringx* project has been started.
File renamed without changes.
80 changes: 80 additions & 0 deletions .devel/sphinx/_build/html/_sources/rapi/ISOdatetime.md.txt
@@ -0,0 +1,80 @@
# ISOdatetime: Construct Date-time Objects

## Description

`ISOdate` and `ISOdatetime` construct date-time objects from numeric representations. `Sys.time` returns the current time.

## Usage

``` r
ISOdatetime(
year,
month,
day,
hour,
min,
sec,
tz = "",
lenient = FALSE,
locale = NULL
)

ISOdate(
year,
month,
day,
hour = 0L,
min = 0L,
sec = 0L,
tz = "",
lenient = FALSE,
locale = NULL
)

Sys.time()
```

## Arguments

| | |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `year, month, day, hour, min, sec` | numeric vectors |
| `tz` | `NULL` or `''` for the default time zone (see [`stri_timezone_get`](https://stringi.gagolewski.com/rapi/stri_timezone_set.html)) or a single string with a timezone identifier, see [`stri_timezone_list`](https://stringi.gagolewski.com/rapi/stri_timezone_list.html) |
| `lenient` | single logical value; should date/time parsing be lenient? |
| `locale` | `NULL` or `''` for the default locale (see [`stri_locale_get`](https://stringi.gagolewski.com/rapi/stri_locale_set.html)) or a single string with a locale identifier, see [`stri_locale_list`](https://stringi.gagolewski.com/rapi/stri_locale_list.html) |

## Value

These functions return an object of class `POSIXxt`, which extends [`POSIXct`](https://stat.ethz.ch/R-manual/R-devel/library/base/help/POSIXct.html); see [`strptime`](strptime.md).

You might wish to consider calling [`as.Date`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/as.Date.html) on the result yielded by `ISOdate`.

No attributes are preserved (because there are too many of them).

## Differences from Base R

Replacements for base [`ISOdatetime`](https://stat.ethz.ch/R-manual/R-devel/library/base/help/ISOdatetime.html) and [`ISOdate`](https://stat.ethz.ch/R-manual/R-devel/library/base/help/ISOdate.html) implemented with [`stri_datetime_create`](https://stringi.gagolewski.com/rapi/stri_datetime_create.html).

- `ISOdate` does not treat dates as being at midnight by default **\[fixed here\]**
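
The base R default can be checked directly. The sketch below uses only base R (no <span class="pkg">stringx</span> call) and relies on `base::ISOdate` filling in `hour = 12` and `tz = "GMT"` when they are omitted; the variable names are illustrative.

```r
# Base R only: base::ISOdate() uses hour = 12 and tz = "GMT" when omitted.
x <- base::ISOdate(1970, 1, 1)

# Time of day of the result, rendered in GMT: noon rather than midnight.
noon <- format(x, "%H:%M:%S", tz = "GMT")
print(noon)

# Seconds elapsed since 1970-01-01T00:00:00 GMT: half a day.
secs <- as.numeric(x)
print(secs)
```

With the replacement documented here, `hour`, `min`, and `sec` default to `0L` instead, as the Usage section shows.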

## Author(s)

[Marek Gagolewski](https://www.gagolewski.com/)

## See Also

The official online manual of <span class="pkg">stringx</span> at <https://stringx.gagolewski.com/>

Related function(s): [`strptime`](strptime.md)

## Examples




```r
ISOdate(1970, 1, 1)
## [1] "1970-01-01T00:00:00+1000"
ISOdatetime(1970, 1, 1, 12, 0, 0)
## [1] "1970-01-01T12:00:00+1000"
```
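
The `+1000` offsets printed above presumably reflect the default time zone of the machine on which the manual was built, so the output will differ elsewhere. A minimal sketch, assuming <span class="pkg">stringx</span> is installed and that passing `tz = "UTC"` pins the time zone as the Arguments table describes (the `requireNamespace` guard and the variable names are illustrative only):

```r
# Illustrative only: skip quietly if stringx is not available.
if (requireNamespace("stringx", quietly = TRUE)) {
    # Midnight and noon on the same day, both pinned to UTC.
    x <- stringx::ISOdate(1970, 1, 1, tz = "UTC")
    y <- stringx::ISOdatetime(1970, 1, 1, 12, 0, 0, tz = "UTC")
    print(x)
    print(y)

    # Gap between the two time points, in seconds.
    gap <- as.numeric(y) - as.numeric(x)
    print(gap)
}
```

Fixing `tz` explicitly keeps such examples reproducible across machines with different default time zones.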
25 changes: 25 additions & 0 deletions .devel/sphinx/_build/html/_sources/rapi/about_stringx.md.txt
@@ -0,0 +1,25 @@
# about_stringx: Drop-in Replacements for Base String Functions Powered by Stringi

## Description

<span class="pkg">stringx</span> reimplements the built-in R string processing functions based on <span class="pkg">stringi</span> -- a mature R package for fast, correct, consistent, and convenient text manipulation. Thanks to the <span class="pkg">ICU</span> library, we obtain predictable results on every platform, in each locale, and under any native character encoding.

**Keywords**: R, text processing, character strings, internationalisation, localisation, ICU, ICU4C, i18n, l10n, Unicode

**License**: GNU General Public License version 2 or later

## Author(s)

[Marek Gagolewski](https://www.gagolewski.com/)

## References

*<span class="pkg">stringi</span> Package homepage*, <https://stringi.gagolewski.com/>

*ICU -- International Components for Unicode*, <https://icu.unicode.org/>

*The Unicode Consortium*, <https://home.unicode.org/>

## See Also

The official online manual of <span class="pkg">stringx</span> at <https://stringx.gagolewski.com/>