Abstract
This paper reports on a study assessing the consistency of usability testing across organizations. Nine independent organizations evaluated the usability of the same website, Microsoft Hotmail. The results document wide differences in the selection and application of methodology, the resources applied, and the problems reported. The organizations reported 310 different usability problems. Only two problems were reported by six or more organizations, while 232 problems (75%) were uniquely reported, that is, each was reported by only one team. Some of the unique findings were classified as serious. Even the tasks used by most or all teams produced very different results: around 70% of the findings for each of these tasks were unique. Our main conclusion is that the simple assumption that we are all doing the same thing and getting the same results in a usability test is plainly wrong.
Acknowledgements
Thanks to Meeta Arcuri and Rob Aseron of MSN Hotmail for allowing us to use the Hotmail website for the test. Thanks also to Nigel Bevan of Serco Usability Services, Erika Kindlund of Intraspect Software (now working for Intuit), Anker Helms Jørgensen of the IT University of Copenhagen, Joseph S. Dumas of Oracle Corp., and an anonymous CHI2001 reviewer for insightful comments on early drafts of this paper.