Suppose that two raters (rater A and rater B) each assign physical attractiveness scores (0 = not at all attractive to 10 = extremely attractive) to a set of 7 facial photographs. Pearson r is a common index of inter-rater reliability or agreement on quantitative ratings. A correlation of +1 would indicate a perfect positive linear relationship between the two raters' scores, and therefore perfect rank order agreement, while an r of 0 would indicate no agreement about judgments of relative attractiveness. Note that r reflects consistency rather than identical scores: one rater could give uniformly higher ratings than the other and r could still equal +1. Generally, values of r from .8 to .9 are considered desirable when reliability is assessed. The attractiveness ratings are as follows:
a. Compute the Pearson correlation between the Rater A/Rater B attractiveness ratings. What is the obtained r value?
b. Is your obtained r statistically significant? (Unless otherwise specified, use α = .05, two-tailed, for all significance tests.)
c. Are the rater A and rater B scores “reliable”? Is there good or poor agreement between raters?
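
The computations asked for in parts a and b can be sketched in a few lines of Python. The ratings in this sketch are hypothetical values invented purely for illustration; they are not the exercise data, so substitute the actual ratings before drawing any conclusions. The function names `pearson_r` and `t_for_r` are likewise illustrative. The sketch computes r from deviation scores, converts it to t = r√(N − 2) / √(1 − r²) with N − 2 = 5 degrees of freedom, and compares |t| with 2.571, the two-tailed critical value of t for α = .05 and df = 5.

```python
import math

# HYPOTHETICAL ratings, invented for illustration only.
# Replace these with the actual Rater A / Rater B scores from the exercise.
rater_a = [3, 5, 6, 7, 4, 8, 9]
rater_b = [4, 5, 7, 6, 3, 8, 9]


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sum of squared deviations
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a rater's scores have no variance")
    return sp / math.sqrt(ss_x * ss_y)


def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical t for alpha = .05 with df = 5 (N = 7 photographs)
T_CRIT_DF5 = 2.571

r = pearson_r(rater_a, rater_b)
t = t_for_r(r, len(rater_a))
significant = abs(t) > T_CRIT_DF5

print(f"r = {r:.3f}, t({len(rater_a) - 2}) = {t:.3f}, significant: {significant}")
```

For part c, the obtained r can then be compared with the .8 to .9 range described above. With only 7 photographs, r must exceed roughly .75 in absolute value to reach significance, so a statistically significant r and an r high enough to indicate good reliability are closely related questions here, though they are not the same question.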

