Abstract

Objectives: Effective health communication is often hindered by a “vocabulary gap” between language familiar to consumers and jargon used in medical practice and research. To present health information to consumers in a comprehensible fashion, we need a mechanism to quantify how likely a health term is to be understood by typical members of the lay public. Prior research has relied on syllable counts, easy-word lists, and frequency counts, all of which have significant limitations.

Design: In this article, we present a new method that predicts consumer familiarity using contextual information. The method was applied to a large query log data set and validated using results from two previously conducted consumer surveys.

Measurements: We measured the correlation between the survey results and each of four predictors: the context-based prediction, syllable count, frequency count, and log normalized frequency count.
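As an illustration of this kind of comparison, the following is a minimal sketch and not the study's own code. It assumes a Pearson correlation (the abstract does not name the coefficient used) and assumes that "log normalized" means log(count + 1) scaled by the largest log value. The function names and all numbers are hypothetical, chosen only to show the computation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log_normalized(counts):
    """Assumed definition: log(count + 1) divided by the largest log value."""
    logs = [math.log(c + 1) for c in counts]
    top = max(logs)
    if top == 0:
        raise ValueError("all counts are zero")
    return [v / top for v in logs]


# Hypothetical toy data: one entry per term (not taken from the study).
survey_score = [0.95, 0.80, 0.40, 0.15]   # share of respondents who knew the term
frequency = [12000, 3400, 150, 9]         # raw query-log counts
syllables = [2, 3, 5, 7]

predictors = {
    "syllable count": syllables,
    "frequency count": frequency,
    "log normalized frequency": log_normalized(frequency),
}
for name, values in predictors.items():
    print(f"{name}: r = {pearson(survey_score, values):.3f}")
```

Each predictor is correlated against the same survey scores, so the coefficients can be compared directly. A context-based prediction would be added to `predictors` as one more list of per-term scores.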

Results: The correlation coefficient between the context-based prediction and the survey results was 0.773 (p < 0.001), which was higher than the correlation coefficients between the survey results and the syllable count, frequency count, and log normalized frequency count (p ≤ 0.012).

Conclusions: The context-based approach provides a good alternative to the existing term familiarity assessment methods.
