add refs to DESCRIPTION, correct Luce ref
hturner committed Dec 7, 2017
1 parent 52c0ba8 commit 2e83f02
Showing 35 changed files with 416 additions and 466 deletions.
19 changes: 10 additions & 9 deletions DESCRIPTION
@@ -8,15 +8,16 @@ Authors@R: c(person("Heather", "Turner", email =
"Firth", role = "aut"), person("Jacob", "van Etten", role = "ctb"))
URL: https://hturner.github.io/PlackettLuce/
BugReports: https://github.com/hturner/PlackettLuce/issues
Description: Functions to prepare rankings data and fit the Plackett-Luce model.
The standard Plackett-Luce model is generalized to accommodate ties of any
order in the ranking. Partial rankings, in which only a subset of items are
ranked in each ranking, are also accommodated in the implementation.
Disconnected/weakly connected networks implied by the rankings are handled
by adding pseudo-rankings with a hypothetical item. Methods are provided to
estimate standard errors or quasi-standard errors for inference as well as
to fit Plackett-Luce trees. See the package website or vignette for full
details.
Description: Functions to prepare rankings data and fit the Plackett-Luce model
jointly attributed to Plackett (1975) <doi:10.2307/2346567> and Luce
(1959, ISBN:0486441369). The standard Plackett-Luce model is generalized
to accommodate ties of any order in the ranking. Partial rankings, in which
only a subset of items are ranked in each ranking, are also accommodated in
the implementation. Disconnected/weakly connected networks implied by the
rankings are handled by adding pseudo-rankings with a hypothetical item.
Methods are provided to estimate standard errors or quasi-standard errors
for inference as well as to fit Plackett-Luce trees. See the package website
or vignette for full details.
License: GPL-3
Encoding: UTF-8
LazyData: true
154 changes: 114 additions & 40 deletions README.md
@@ -1,33 +1,50 @@

PlackettLuce
============
# PlackettLuce

[![Travis-CI Build Status](https://travis-ci.org/hturner/PlackettLuce.svg?branch=master)](https://travis-ci.org/hturner/PlackettLuce) [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/hturner/PlackettLuce?branch=master&svg=true)](https://ci.appveyor.com/project/hturner/PlackettLuce) [![Coverage Status](https://img.shields.io/codecov/c/github/hturner/PlackettLuce/master.svg)](https://codecov.io/github/hturner/PlackettLuce?branch=master)
[![Travis-CI Build
Status](https://travis-ci.org/hturner/PlackettLuce.svg?branch=master)](https://travis-ci.org/hturner/PlackettLuce)
[![AppVeyor Build
Status](https://ci.appveyor.com/api/projects/status/github/hturner/PlackettLuce?branch=master&svg=true)](https://ci.appveyor.com/project/hturner/PlackettLuce)
[![Coverage
Status](https://img.shields.io/codecov/c/github/hturner/PlackettLuce/master.svg)](https://codecov.io/github/hturner/PlackettLuce?branch=master)

Package website: <https://hturner.github.io/PlackettLuce/>.

Overview
--------
## Overview

The **PlackettLuce** package implements a generalization of the model jointly attributed to Plackett (1975) and Luce (1959) for modelling rankings data. Examples of rankings data might be the finishing order of competitors in a race, or the preference of consumers over a set of competing products.
The **PlackettLuce** package implements a generalization of the model
jointly attributed to Plackett (1975) and Luce (1959) for modelling
rankings data. Examples of rankings data might be the finishing order of
competitors in a race, or the preference of consumers over a set of
competing products.

The output of the model is an estimated **worth** for each item that appears in the rankings. The parameters are generally presented on the log scale for inference.
The output of the model is an estimated **worth** for each item that
appears in the rankings. The parameters are generally presented on the
log scale for inference.

The implementation of the Plackett-Luce model in **PlackettLuce**:

- Accommodates ties (of any order) in the rankings, e.g. bananas ≻ {apples, oranges} ≻ pears.
- Accommodates sub-rankings, e.g. pears ≻ apples, when the full set of items is {apples, bananas, oranges, pears}.
- Handles disconnected or weakly connected networks implied by the rankings, e.g. where one item always loses as in figure below. This is achieved by adding pseudo-rankings with a hypothetical or ghost item.
- Accommodates ties (of any order) in the rankings, e.g. bananas
\(\succ\) {apples, oranges} \(\succ\) pears.
- Accommodates sub-rankings, e.g. pears \(\succ\) apples, when the
full set of items is {apples, bananas, oranges, pears}.
- Handles disconnected or weakly connected networks implied by the
rankings, e.g. where one item always loses as in figure below. This
is achieved by adding pseudo-rankings with a hypothetical or ghost
item.
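
As a rough sketch of how the two example rankings in the list above might be encoded, the snippet below builds a matrix of ranks with one column per item. It assumes the convention, as I understand it from the package documentation, that tied items share a rank and unranked items are coded as 0. The object names `items` and `M` are illustrative only.

```r
# Items in the full set, used as column names
items <- c("apples", "bananas", "oranges", "pears")

# Row 1: bananas > {apples, oranges} > pears (apples and oranges tied)
# Row 2: pears > apples (bananas and oranges not ranked, coded as 0)
M <- matrix(c(2, 1, 2, 3,
              2, 0, 0, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, items))
M

# If the package is installed, the matrix can be converted to a
# "rankings" object; this step is skipped otherwise.
if (requireNamespace("PlackettLuce", quietly = TRUE)) {
  print(PlackettLuce::as.rankings(M))
}
```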

![](man/figures/always-loses-1.png) </br>
![](man/figures/always-loses-1.png)<!-- --> </br>

In addition the package provides methods for

- Obtaining quasi-standard errors, that don't depend on the constraints applied to the worth parameters for identifiability.
- Fitting Plackett-Luce trees, i.e. a tree that partitions the rankings by covariate values, such as consumer attributes or racing conditions, identifying subgroups with different sets of worth parameters for the items.
- Obtaining quasi-standard errors, that don’t depend on the
constraints applied to the worth parameters for identifiability.
- Fitting Plackett-Luce trees, i.e. a tree that partitions the
rankings by covariate values, such as consumer attributes or racing
conditions, identifying subgroups with different sets of worth
parameters for the items.

Installation
------------
## Installation

The package may be installed from GitHub via

@@ -36,12 +53,19 @@ The package may be installed from GitHub via
devtools::install_github("hturner/PlackettLuce")
```

Usage
-----
## Usage

The [Netflix Prize](http://www.netflixprize.com/) was a competition devised by Netflix to improve the accuracy of its recommendation system. To facilitate this they released ratings about movies from the users of the system that have been transformed to preference data and are available from [PrefLib](http://www.preflib.org/data/election/netflix/). Each data set comprises rankings of a set of 3 or 4 movies selected at random. Here we consider rankings for just one set of movies to illustrate the functionality of **PlackettLuce**.
The [Netflix Prize](http://www.netflixprize.com/) was a competition
devised by Netflix to improve the accuracy of its recommendation system.
To facilitate this they released ratings about movies from the users of
the system that have been transformed to preference data and are
available from [PrefLib](http://www.preflib.org/data/election/netflix/).
Each data set comprises rankings of a set of 3 or 4 movies selected at
random. Here we consider rankings for just one set of movies to
illustrate the functionality of **PlackettLuce**.

The data can be read in using the `read.soc` function in **PlackettLuce**
The data can be read in using the `read.soc` function in
**PlackettLuce**

``` r
library(PlackettLuce)
@@ -54,9 +78,17 @@ head(netflix, 2)
## 1 68 2 1 4 3
## 2 53 1 2 4 3

Each row corresponds to a unique ordering of the four movies in this data set. The number of Netflix users that assigned that ordering is given in the first column, followed by the four movies in preference order. So for example, 68 users ranked movie 2 first, followed by movie 1, then movie 4 and finally movie 3.
Each row corresponds to a unique ordering of the four movies in this
data set. The number of Netflix users that assigned that ordering is
given in the first column, followed by the four movies in preference
order. So for example, 68 users ranked movie 2 first, followed by movie
1, then movie 4 and finally movie 3.

`PlackettLuce`, the model-fitting function in **PlackettLuce** requires that the data are provided in the form of *rankings* rather than *orderings*, i.e. the rankings are expressed by giving the rank for each item, rather than ordering the items. We can create a `"rankings"` object from a set of orderings as follows
`PlackettLuce`, the model-fitting function in **PlackettLuce** requires
that the data are provided in the form of *rankings* rather than
*orderings*, i.e. the rankings are expressed by giving the rank for each
item, rather than ordering the items. We can create a `"rankings"`
object from a set of orderings as follows

``` r
R <- as.rankings(netflix[,-1], input = "ordering")
@@ -69,9 +101,19 @@ R[1:3, as.rankings = FALSE]
## 2 1 2 4 3
## 3 2 1 3 4

Note that `read.soc` saved the names of the movies in the `"item"` attribute of `netflix`, so we have used these to label the items. Subsetting the rankings object `R` with `as.rankings = FALSE`, returns the underlying matrix of rankings corresponding to the subset. So for example, in the first ranking the second movie (Beverly Hills Cop) is ranked number 1, followed by the first movie (Mean Girls) with rank 2, followed by the fourth movie (Mission: Impossible II) and finally the third movie (The Mummy Returns), giving the same ordering as in the original data.
Note that `read.soc` saved the names of the movies in the `"item"`
attribute of `netflix`, so we have used these to label the items.
Subsetting the rankings object `R` with `as.rankings = FALSE`, returns
the underlying matrix of rankings corresponding to the subset. So for
example, in the first ranking the second movie (Beverly Hills Cop) is
ranked number 1, followed by the first movie (Mean Girls) with rank 2,
followed by the fourth movie (Mission: Impossible II) and finally the
third movie (The Mummy Returns), giving the same ordering as in the
original data.
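
The relationship between an ordering and a ranking can be checked by hand in base R. The sketch below assumes a complete ordering with no ties, in which case the ranking is simply the inverse permutation of the ordering; `as.rankings` is what handles the general case, and the helper name `ordering_to_ranking` is made up for illustration.

```r
# Convert a complete, untied ordering (item indices in preference order)
# to a ranking (the rank of each item, in item order).
ordering_to_ranking <- function(ordering) {
  # order() returns, for each rank position, where that value sits;
  # applied to a permutation it gives the inverse permutation.
  order(ordering)
}

# First row of the data: movie 2, then movie 1, then movie 4, then movie 3
ordering_to_ranking(c(2, 1, 4, 3))

# An ordering whose ranking differs from it: item 3, then item 1, then item 2
ordering_to_ranking(c(3, 1, 2))
```

For the first row the ordering and the ranking happen to coincide, because swapping movies 1 and 2 and movies 3 and 4 is its own inverse.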

Various methods are provided for `"rankings"` objects, in particular if we subset the rankings without `as.rankings = FALSE`, the result is again a `"rankings"` object and the corresponding print method is used:
Various methods are provided for `"rankings"` objects, in particular if
we subset the rankings without `as.rankings = FALSE`, the result is
again a `"rankings"` object and the corresponding print method is used:

``` r
R[1:3]
@@ -95,7 +137,9 @@ print(R[1:3], width = 60)
## 3
## "Beverly Hills Cop > Mean Girls > The Mummy Returns > Mis ..."

The rankings can now be passed to `PlackettLuce` to fit the Plackett-Luce model. The counts of each ranking provided in the downloaded data are used as weights when fitting the model.
The rankings can now be passed to `PlackettLuce` to fit the
Plackett-Luce model. The counts of each ranking provided in the
downloaded data are used as weights when fitting the model.

``` r
mod <- PlackettLuce(R, weights = netflix$n)
@@ -107,9 +151,13 @@ coef(mod, log = FALSE)
## Mission: Impossible II
## 0.1498342

Calling `coef` with `log = FALSE` gives the worth parameters, constrained to sum to one. These parameters represent the probability that each movie is ranked first.
Calling `coef` with `log = FALSE` gives the worth parameters,
constrained to sum to one. These parameters represent the probability
that each movie is ranked first.
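
To make the interpretation of the worths concrete, here is a minimal base R sketch of the standard Plackett-Luce probability for a complete ordering without ties. The worth values and item labels are hypothetical, chosen only for illustration; they are not the fitted values from the model above, and `pl_prob` is not a function from the package.

```r
# Hypothetical worths for four items, summing to one (not fitted values)
worth <- c(A = 0.4, B = 0.3, C = 0.2, D = 0.1)

# Probability of a complete ordering under the standard Plackett-Luce
# model: at each stage the next item is chosen with probability
# proportional to its worth among the items not yet placed.
pl_prob <- function(ordering, worth) {
  w <- unname(worth[ordering])
  remaining <- rev(cumsum(rev(w)))  # total worth still available at each stage
  prod(w / remaining)
}

# Probability of the ordering B > A > D > C
pl_prob(c("B", "A", "D", "C"), worth)

# First-choice probabilities: each worth divided by the total worth
worth / sum(worth)
```

Because the worths sum to one, the first factor in the product is just the worth of the item placed first, which is the sense in which a worth is the probability of being ranked first.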

For inference these parameters are converted to the log scale, by default setting the first parameter to zero so that the standard errors are estimable:
For inference these parameters are converted to the log scale, by
default setting the first parameter to zero so that the standard errors
are estimable:

``` r
summary(mod)
@@ -130,34 +178,60 @@ summary(mod)
## AIC: 3499.5
## Number of iterations: 5

In this way, Mean Girls is treated as the reference movie, the positive parameter for Beverly Hills Cop shows this was more popular among the users, while the negative parameters for the other two movies show these were less popular.
In this way, Mean Girls is treated as the reference movie, the positive
parameter for Beverly Hills Cop shows this was more popular among the
users, while the negative parameters for the other two movies show these
were less popular.
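
The link between the worths and the log-scale parameters with a reference item can be sketched in base R. The worths below are hypothetical values for illustration, not output from the fitted model, and the first item plays the role of the reference.

```r
# Hypothetical worths, constrained to sum to one (not fitted values)
worth <- c(A = 0.4, B = 0.3, C = 0.2, D = 0.1)

# Log-worths relative to the first item, so the reference is exactly zero;
# positive values mean more preferred than the reference, negative less.
log_worth <- log(worth) - log(worth[[1]])
log_worth

# Back-transform: exponentiate and rescale to sum to one
exp(log_worth) / sum(exp(log_worth))
```

The back-transformed values match the original worths, so the choice of reference changes how the parameters are presented but not the fitted preferences.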

Comparisons between different pairs of movies can be made visually by plotting the log-worth parameters with comparison intervals based on quasi standard errors.
Comparisons between different pairs of movies can be made visually by
plotting the log-worth parameters with comparison intervals based on
quasi standard errors.

``` r
qv <- qvcalc(mod)
plot(qv)
```

![](man/figures/qv-1.png)
![](man/figures/qv-1.png)<!-- -->

If the intervals overlap there is no significant difference. So we can see that Beverly Hills Cop is significantly more popular than the other three movies, Mean Girls is significantly more popular than The Mummy Returns or Mission: Impossible II, but there was no significant difference in users' preference for these last two movies.
If the intervals overlap there is no significant difference. So we can
see that Beverly Hills Cop is significantly more popular than the other
three movies, Mean Girls is significantly more popular than The Mummy
Returns or Mission: Impossible II, but there was no significant
difference in users’ preference for these last two movies.

Going Further
-------------
## Going Further

The full functionality of **PlackettLuce** is illustrated in the package vignette, along with details of the model used in the package and a comparison to other packages. The vignette can be found on the [package website](https://hturner.github.io/PlackettLuce/) or from within R once the package has been installed, e.g. via
The full functionality of **PlackettLuce** is illustrated in the package
vignette, along with details of the model used in the package and a
comparison to other packages. The vignette can be found on the [package
website](https://hturner.github.io/PlackettLuce/) or from within R once
the package has been installed, e.g. via

vignette("Overview", package = "PlackettLuce")

Code of Conduct
---------------
## Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
Please note that this project is released with a [Contributor Code of
Conduct](CONDUCT.md). By participating in this project you agree to
abide by its terms.

References
----------
## References

Luce, R. Duncan. 1959. *Individual Choice Behavior: A Theoretical Analysis*. doi:[10.2307/2282347](https://doi.org/10.2307/2282347).
<div id="refs" class="references">

Plackett, Robert L. 1975. “The Analysis of Permutations.” *Appl. Statist* 24 (2): 193–202. doi:[10.2307/2346567](https://doi.org/10.2307/2346567).
<div id="ref-Luce1959">

Luce, R. Duncan. 1959. *Individual Choice Behavior: A Theoretical
Analysis*. New York: Wiley.

</div>

<div id="ref-Plackett1975">

Plackett, Robert L. 1975. “The Analysis of Permutations.” *Appl.
Statist* 24 (2):193–202. <https://doi.org/10.2307/2346567>.

</div>

</div>
21 changes: 9 additions & 12 deletions cran-comments.md
@@ -1,5 +1,11 @@
## Resubmission

As requested the DESCRIPTION now contains a reference for the Plackett-Luce model.

## Test environments

(Re-tested with the same results)

* Local
- Ubuntu 14.04, R 3.4.3
- Ubuntu 14.04, R Under development (unstable) (2017-12-04 r73829)
@@ -12,17 +18,8 @@

On Mac/Ubuntu, R CMD check only returns note that this is a new submission.

On Windows, R CMD check returns the same note in the log, however in the
terminal there is the following warning:
On Windows, R CMD check returns the same note in the log, however in the terminal there is the following warning:

* checking CRAN incoming feasibility ...Warning: running command '"pandoc
" "C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce\README.md
" -s --email-obfuscation=references --self-contained -o "C:\Users\hturner
\AppData\Local\Temp\RtmpyQZw3Z\READMEcf025d05abf.html"' had status 99
* checking CRAN incoming feasibility ...Warning: running command '"pandoc" "C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce\README.md" -s --email-obfuscation=references --self-contained -o "C:\Users\hturner\AppData\Local\Temp\RtmpyQZw3Z\READMEcf025d05abf.html"' had status 99

The warning seems to be due to pandoc not finding image files in `man/figures`.
I could successfully run the pandoc command by either changing directory to
"C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce" and calling
pandoc on "README.md", or adding `--resource-path
"C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce` to the call shown
in the warning.
The warning seems to be due to pandoc not finding image files in `man/figures`. I could successfully run the pandoc command by either changing directory to "C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce" and calling pandoc on "README.md", or adding `--resource-path "C:\Users\hturner\PlackettLuce.Rcheck\00_pkg_src\PlackettLuce` to the call shown in the warning.