Abstract

We present an experiment comparing the performance of 20 novice accessibility evaluators carrying out Web Content Accessibility Guidelines 2.0 conformance reviews individually with their performance when working in teams of two. Participants first carried out an individual assessment of a web page. They were then randomly paired and asked to revise their initial assessments and to produce a joint assessment of the same page. Results indicate significant differences in sensitivity (inversely related to false negatives: +8%) and agreement (measured in terms of the majority view: +10%). Members of each pair showed strong agreement on the evaluation results, both with each other and with the group outcome. Other measures of validity and reliability were not significantly affected by group work. A practical implication of these findings is that when reducing the false-negative rate matters, employing a pair of evaluators is more effective than having individuals carry out the assessment. Future research could explore whether similar results hold for groups larger than two, and what the effect is of mixing people with different accessibility backgrounds.

RESEARCH HIGHLIGHTS

  • When novice accessibility evaluators work in groups, their ability to identify the true problems (sensitivity) increases by 8%.

  • Likewise, the reliability of group evaluations, measured as majority-view agreement, increases by 10%.

  • Individual or group evaluations can be considered equivalent methods with respect to false positives (if differences of up to 8% in correctness are tolerated).

  • Individual or group evaluations can be considered equivalent methods with respect to overall effectiveness (if differences of up to 11% in F-measure are tolerated); see the sketch below for how these metrics relate.
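
For readers who want to connect the percentages above to concrete definitions, the following is a minimal Python sketch of how sensitivity, correctness, F-measure and majority-view agreement could be computed from boolean conformance verdicts. It is an illustration under assumed conventions (per-checkpoint True/False judgments, ties resolved toward "violation"); the function names and data are hypothetical, not the authors' actual instrumentation.

```python
# Minimal sketch (not the paper's instrumentation): plausible definitions of
# the metrics named in the highlights, over WCAG checkpoint judgments.

def confusion_counts(reported, truth):
    """Tally true/false positives and false negatives over shared checkpoints.

    Both arguments map checkpoint ids to True (violation) or False (pass).
    """
    tp = sum(1 for c in truth if truth[c] and reported.get(c, False))
    fp = sum(1 for c in truth if not truth[c] and reported.get(c, False))
    fn = sum(1 for c in truth if truth[c] and not reported.get(c, False))
    return tp, fp, fn

def sensitivity(tp, fn):
    """Inversely related to false negatives: fewer missed problems, higher score."""
    return tp / (tp + fn) if tp + fn else 0.0

def correctness(tp, fp):
    """Precision-style measure, inversely related to false positives."""
    return tp / (tp + fp) if tp + fp else 0.0

def f_measure(tp, fp, fn):
    """Harmonic mean of correctness and sensitivity (overall effectiveness)."""
    p, r = correctness(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0

def majority_agreement(verdicts):
    """Average share of evaluators matching the majority view per checkpoint.

    `verdicts[c]` is a list of booleans, one per evaluator; ties are
    (arbitrarily, as an assumption here) resolved toward "violation".
    """
    shares = []
    for votes in verdicts.values():
        majority = 2 * sum(votes) >= len(votes)
        shares.append(sum(v == majority for v in votes) / len(votes))
    return sum(shares) / len(shares)

if __name__ == "__main__":
    # Illustrative data: one evaluator's report against a reference set.
    truth = {"1.1.1": True, "1.4.3": True, "2.4.4": False}
    reported = {"1.1.1": True, "1.4.3": False, "2.4.4": False}
    tp, fp, fn = confusion_counts(reported, truth)
    print(sensitivity(tp, fn))    # 0.5  (one of two true problems found)
    print(f_measure(tp, fp, fn))  # ~0.667
```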
