Scott's pi coefficient
The pi coefficient is a chance-adjusted index of the reliability of categorical measurements. It estimates chance agreement using a distribution-based approach: it behaves as if the raters shared a "quota" for each category that they work together to meet.

Scott's (1955) original formulation applies only to data from two raters and nominal categories. Fleiss (1971) extended it to multiple raters, and Gwet (2014) generalized it to multiple raters, any weighting scheme, and missing data. Several other reliability indices are equivalent to Scott's original formulation, including Siegel and Castellan's (1988) revised kappa coefficient and Byrt, Bishop, and Carlin's (1993) bias-adjusted kappa coefficient.
- `FAST_PI` calculates pi using the simplified formulas
- `FULL_PI` calculates pi using the generalized formulas
Use these formulas with two raters and two (dichotomous) categories:

$$\pi = \frac{p_o - p_c}{1 - p_c}$$

$$p_o = \frac{a + d}{n}$$

$$p_c = \left(\frac{n_{A1} + n_{B1}}{2n}\right)^2 + \left(\frac{n_{A2} + n_{B2}}{2n}\right)^2$$

where:
- $a$ is the number of items both raters assigned to the first category
- $d$ is the number of items both raters assigned to the second category
- $n$ is the total number of items
- $n_{A1}$ is the number of items rater A assigned to category 1
- $n_{A2}$ is the number of items rater A assigned to category 2
- $n_{B1}$ is the number of items rater B assigned to category 1
- $n_{B2}$ is the number of items rater B assigned to category 2
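The simplified formulas above can be sketched as follows (a Python illustration, not part of the MATLAB toolbox; the argument names follow the definitions above):

```python
def scotts_pi_dichotomous(a, d, n_a1, n_a2, n_b1, n_b2):
    """Scott's pi for two raters and two categories.

    a: items both raters assigned to category 1
    d: items both raters assigned to category 2
    n_a1, n_a2: rater A's totals for categories 1 and 2
    n_b1, n_b2: rater B's totals for categories 1 and 2
    """
    n = n_a1 + n_a2  # total number of items
    p_o = (a + d) / n  # percent observed agreement
    # Chance agreement uses the average of the two raters' category proportions
    p_c = ((n_a1 + n_b1) / (2 * n)) ** 2 + ((n_a2 + n_b2) / (2 * n)) ** 2
    return (p_o - p_c) / (1 - p_c)
```

For example, with 100 items where both raters chose category 1 on 40 items, both chose category 2 on 30, and they disagreed on 30, observed agreement is 0.70, chance agreement is 0.505, and pi is about 0.39.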
Use these formulas with multiple raters, multiple categories, and any weighting scheme:

$$\pi = \frac{p_o - p_c}{1 - p_c}$$

$$p_o = \frac{1}{n'} \sum_{i=1}^{n'} \sum_{k=1}^{q} \frac{r_{ik}\,(r^\star_{ik} - 1)}{r_i\,(r_i - 1)}$$

$$r^\star_{ik} = \sum_{l=1}^{q} w_{kl}\, r_{il}$$

$$p_c = \sum_{k=1}^{q} \sum_{l=1}^{q} w_{kl}\, \pi_k \pi_l, \qquad \pi_k = \frac{1}{n} \sum_{i=1}^{n} \frac{r_{ik}}{r_i}$$

where:
- $q$ is the total number of categories
- $w_{kl}$ is the weight associated with two raters assigning an item to categories $k$ and $l$
- $r_{ik}$ is the number of raters that assigned item $i$ to category $k$
- $n'$ is the number of items that were coded by two or more raters
- $r^\star_{ik}$ is the weighted number of raters that assigned item $i$ to category $k$
- $r_i$ is the number of raters that assigned item $i$ to any category
- $n$ is the total number of items
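The generalized formulas can likewise be sketched in Python (an illustration only, assuming every item is rated by at least one rater; identity weights recover the nominal-scale case):

```python
import numpy as np

def scotts_pi_general(codes, q, weights=None):
    """Generalized Scott's pi for multiple raters, weights, and missing data.

    codes: (items x raters) array of category indices 0..q-1 (np.nan = missing)
    q: total number of categories
    weights: (q x q) agreement weight matrix; identity = nominal categories
    """
    codes = np.asarray(codes, dtype=float)
    n = codes.shape[0]  # total number of items
    if weights is None:
        weights = np.eye(q)
    # r[i, k]: number of raters assigning item i to category k
    r = np.stack([np.sum(codes == k, axis=1) for k in range(q)], axis=1).astype(float)
    r_i = r.sum(axis=1)        # raters who assigned item i to any category
    r_star = r @ weights.T     # weighted counts: r*_ik = sum_l w_kl * r_il
    # Observed agreement, averaged over items coded by 2+ raters
    coded = r_i >= 2
    n_prime = coded.sum()
    p_o = np.sum(r[coded] * (r_star[coded] - 1)
                 / (r_i[coded] * (r_i[coded] - 1))[:, None]) / n_prime
    # Chance agreement from the average category proportions
    pi_k = np.sum(r / r_i[:, None], axis=0) / n
    p_c = pi_k @ weights @ pi_k
    return (p_o - p_c) / (1 - p_c)
```

With two raters, two categories, and identity weights, this reduces to the same value as the simplified formulas.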
- Scott, W. A. (1955). Reliability of content analysis: The case of nominal scaling. Public Opinion Quarterly, 19(3), 321–325.
- Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378–382.
- Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences. New York, NY: McGraw-Hill.
- Byrt, T., Bishop, J., & Carlin, J. B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology, 46, 423–429.
- Gwet, K. L. (2014). Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (4th ed.). Gaithersburg, MD: Advanced Analytics.