Assessing the inter-rater agreement between observers, in the case of ordinal variables, is an important issue in both the statistical theory and biomedical applications. Typically, this problem has been dealt with the use of Cohen's weighted kappa, which is a modification of the original kappa statistic, proposed for nominal variables in the case of two observers. Fleiss (1971) put forth a generalization of kappa in the case of multiple observers, but both Cohen's and Fleiss' kappa could have a paradoxical behavior, which may lead to a difficult interpretation of their magnitude. In this paper, a modification of Fleiss' kappa, not affected by paradoxes, is proposed, and subsequently generalized to the case of ordinal variables. Monte Carlo simulations are used both to testing statistical hypotheses and to calculating percentile and bootstrap-t confidence intervals based on this statistic. The normal asymptotic distribution of the proposed statistic is demonstrated. Our results are applied to the classical Holmquist et al.'s (1967) dataset on the classification, by multiple observers, of carcinoma in situ of the uterine cervix. Finally, we generalize the use of s∗ to a bivariate case.
Assessing the inter-rater agreement for ordinal data through weighted indexes
MARASINI, DONATA;RIPAMONTI, ENRICO
2016-01-01
Abstract
Assessing the inter-rater agreement between observers, in the case of ordinal variables, is an important issue in both the statistical theory and biomedical applications. Typically, this problem has been dealt with the use of Cohen's weighted kappa, which is a modification of the original kappa statistic, proposed for nominal variables in the case of two observers. Fleiss (1971) put forth a generalization of kappa in the case of multiple observers, but both Cohen's and Fleiss' kappa could have a paradoxical behavior, which may lead to a difficult interpretation of their magnitude. In this paper, a modification of Fleiss' kappa, not affected by paradoxes, is proposed, and subsequently generalized to the case of ordinal variables. Monte Carlo simulations are used both to testing statistical hypotheses and to calculating percentile and bootstrap-t confidence intervals based on this statistic. The normal asymptotic distribution of the proposed statistic is demonstrated. Our results are applied to the classical Holmquist et al.'s (1967) dataset on the classification, by multiple observers, of carcinoma in situ of the uterine cervix. Finally, we generalize the use of s∗ to a bivariate case.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.