Abstract
The novice–expert ratio method (NEM) pinpoints user interface design problems by identifying the steps in a task that have a high ratio of novice to expert completion time. This study tested the construct validity of NEM's ratio measure against common alternatives. Data were collected from 337 participants who separately performed 10 word-completion tasks on a cellular phone interface. The logarithm, ratio, Cohen's d, and Hedges's ĝ measures had similar construct validity, but Hedges's ĝ provided the most accurate measure of effect size. All these measures correlated more strongly with self-reported interface usability and interface knowledge when applied to the number of actions required to complete a task than when applied to task completion time. A weighted average of the action-based and time-based measures had the highest correlation. The relatively high correlation between self-reported interface usability and a weighted Hedges's ĝ measure, as compared to the correlations found in the literature, indicates the usefulness of the weighted Hedges's ĝ measure in identifying usability problems.
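As a rough illustration of the four measures compared above, the sketch below computes them for one task step from two samples of completion times (or action counts). It uses the standard textbook definitions: a ratio of means, its natural logarithm, Cohen's d with a pooled standard deviation, and Hedges's small-sample correction applied to d. The function name `effect_measures`, the dictionary keys, and the sample data are hypothetical, and the exact formulas and weighting used in the study may differ.

```python
import math
from statistics import mean, variance


def effect_measures(novice, expert):
    """Compare novice and expert samples for one task step.

    Hypothetical helper: the definitions below are the common textbook
    ones and are assumed, not taken from the study itself.
    """
    n1, n2 = len(novice), len(expert)
    m1, m2 = mean(novice), mean(expert)

    # Ratio of mean novice value to mean expert value.
    ratio = m1 / m2

    # Natural logarithm of that ratio.
    log_ratio = math.log(ratio)

    # Cohen's d: mean difference divided by the pooled standard deviation.
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(novice) + (n2 - 1) * variance(expert)) / df
    )
    cohens_d = (m1 - m2) / pooled_sd

    # Hedges's correction (common approximation) shrinks d for small samples.
    correction = 1 - 3 / (4 * df - 1)
    hedges_g = cohens_d * correction

    return {
        "ratio": ratio,
        "log_ratio": log_ratio,
        "cohens_d": cohens_d,
        "hedges_g": hedges_g,
    }


if __name__ == "__main__":
    # Made-up completion times in seconds, for illustration only.
    novice_times = [12.0, 15.0, 11.0, 14.0]
    expert_times = [6.0, 7.0, 5.0, 6.0]
    for name, value in effect_measures(novice_times, expert_times).items():
        print(f"{name}: {value:.3f}")
```

Unlike the ratio and its logarithm, the two standardized measures take within-group variability into account, and the correction factor matters mainly when the groups are small.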
Acknowledgments
We thank Davide Bolchini, Anthony Faiola, and Josette Jones for their comments on an earlier version of this article.
Notes
1. Self-reported ratings also tend to be easier to interpret than usability measures based on user expectations (McGee, Rich, & Dumas, 2004).
2. The test-level correlation between satisfaction and effectiveness or satisfaction and efficiency is typically low to medium when measured in the manner described in this study: only once for each participant after the completion of all performance testing. However, the task-level correlation between satisfaction and effectiveness or satisfaction and efficiency is much higher when measured after each task (Sauro & Lewis, 2009).