`sbo` provides utilities for building and evaluating text predictors based on Stupid Back-off N-gram models in R. It includes functions such as:
- `kgram_freqs()`: Extract k-gram frequency tables from a text corpus.
- `sbo_predictor()`: Train a next-word predictor via Stupid Back-off.
- `eval_sbo_predictor()`: Test text predictions against an independent corpus.
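A typical workflow combines these steps. If you want to experiment with several predictors over the same corpus, you can extract the k-gram frequency tables once with `kgram_freqs()` and then train a predictor from them; the following is a minimal sketch, assuming the bundled `sbo::twitter_train` and `sbo::twitter_dict` example datasets used in the example further below:

```r
library(sbo)
# Extract 1- to 3-gram frequency tables from the training corpus
freqs <- kgram_freqs(sbo::twitter_train,
                     N = 3,
                     dict = sbo::twitter_dict,
                     .preprocess = sbo::preprocess,
                     EOS = ".?!:;"
                     )
# Train a Stupid Back-off predictor from the precomputed frequencies
p <- sbo_predictor(freqs)
```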
You can install the latest release of `sbo` from CRAN:

```r
install.packages("sbo")
```
You can install the development version of `sbo` from GitHub:

```r
# install.packages("devtools")
devtools::install_github("vgherard/sbo")
```
This example shows how to build a text predictor with `sbo`:

```r
library(sbo)
p <- sbo_predictor(sbo::twitter_train, # 50k tweets, example dataset
                   N = 3, # Train a 3-gram model
                   dict = sbo::twitter_dict, # Top 1k words appearing in corpus
                   .preprocess = sbo::preprocess, # Preprocessing transformation
                   EOS = ".?!:;" # End-Of-Sentence characters
                   )
```
The object `p` can now be used to generate predictive text as follows:

```r
predict(p, "i love") # a character vector
#> [1] "you" "it"  "my"
predict(p, "you love") # another character vector
#> [1] "<EOS>" "me"    "the"
predict(p,
        c("i love", "you love", "she loves", "we love", "you love", "they love")
        ) # a character matrix
#>      [,1]    [,2]  [,3]
#> [1,] "you"   "it"  "my"
#> [2,] "<EOS>" "me"  "the"
#> [3,] "you"   "my"  "me"
#> [4,] "you"   "our" "it"
#> [5,] "<EOS>" "me"  "the"
#> [6,] "to"    "you" "and"
```
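Out-of-sample performance can be estimated with `eval_sbo_predictor()`, which tests next-word predictions against an independent corpus. Here is a minimal sketch, assuming the bundled `sbo::twitter_test` dataset and a logical `correct` column in the returned table (as described in the package documentation):

```r
# Evaluation samples test N-grams at random, so fix the seed for reproducibility
set.seed(840)
evaluation <- eval_sbo_predictor(p, test = sbo::twitter_test)
# Fraction of test cases where the true word was among the top predictions
mean(evaluation$correct)
```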
For more general-purpose utilities for working with N-gram models, you can also check out my package `{kgrams}`.
For help, see the `sbo` website.