
Usability: Lessons Learned … and Yet to Be Learned

REFERENCES

  • Abelson, R. P. (1995). Statistics as principled argument. Hillsdale, NJ: Erlbaum.
  • Agarwal, A., & Meyer, A. (2009). Beyond usability: Evaluating emotional response as an integral part of the user experience. In Proceedings of CHI 2009 Extended Abstracts on Human Factors in Computing Systems (pp. 2919–2930). Boston, MA: Association for Computing Machinery.
  • Agresti, A., & Coull, B. (1998). Approximate is better than ‘exact’ for interval estimation of binomial proportions. The American Statistician, 52, 119–126.
  • Al-Awar, J., Chapanis, A., & Ford, R. (1981). Tutorials for the first-time computer user. IEEE Transactions on Professional Communication, 24, 30–37.
  • Albert, B., Tullis, T., & Tedesco, D. (2010). Beyond the usability lab. Burlington, MA: Morgan Kaufmann.
  • Alonso-Ríos, D., Vázquez-Garcia, A., Mosqueira-Rey, E., & Moret-Bonillo, V. (2010). Usability: A critical analysis and a taxonomy. International Journal of Human-Computer Interaction, 26, 53–74.
  • American National Standards Institute. (2001). Common industry format for usability test reports (ANSI-NCITS 354-2001). Washington, DC: Author.
  • Baecker, R. M. (2008). Themes in the early history of HCI—Some unanswered questions. Interactions, 15(2), 22–27.
  • Bailey, G. (1993). Iterative methodology and designer training in human–computer interface design. In INTERCHI ’93 Conference Proceedings (pp. 198–205). New York, NY: Association for Computing Machinery.
  • Bailey, R. W., Allan, R. W., & Raiello, P. (1992). Usability testing vs. heuristic evaluation: A head-to-head comparison. In Proceedings of the Human Factors and Ergonomics Society 36th Annual Meeting (pp. 409–413). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Bangor, A., Kortum, P. T., & Miller, J. T. (2008). An empirical evaluation of the System Usability Scale. International Journal of Human–Computer Interaction, 24, 574–594.
  • Barnette, J. J. (2000). Effects of stem and Likert response option reversals on survey internal consistency: If you feel the need, there is a better alternative to using those negatively worded stems. Educational and Psychological Measurement, 60, 361–370.
  • Barnum, C. M. (2010). Usability testing essentials: Ready, set … test! Burlington, MA: Morgan Kaufmann.
  • Benedek, J., & Miner, T. (2002). Measuring desirability: New methods for evaluating desirability in a usability lab setting. In Proceedings of the Usability Professionals’ Association. Orlando, FL: Usability Professionals Association. Retrieved June 10, 2014, from www.microsoft.com/usability/uepostings/desirabilitytoolkit.doc
  • Bennett, J. L. (1979). The commercial impact of usability in interactive systems. Infotech State of the Art Report: Man/Computer Communications, 2, 289–297.
  • Berry, D. C., & Broadbent, D. E. (1990). The role of instruction and verbalization in improving performance on complex search tasks. Behaviour & Information Technology, 9, 175–190.
  • Bevan, N. (2009). Extending quality in use to provide a framework for usability measurement. In M. Kurosu (Ed.), Human centered design, HCII 2009 (pp. 13–22). Heidelberg, Germany: Springer-Verlag.
  • Bevan, N., Kirakowski, J., & Maissel, J. (1991). What is usability? In H. J. Bullinger (Ed.), Human Aspects in Computing, Design and Use of Interactive Systems and Work with Terminals, Proceedings of the 4th International Conference on Human–Computer Interaction (pp. 651–655). Stuttgart, Germany: Elsevier Science.
  • Bias, R. G., & Mayhew, D. J. (1994). Cost-justifying usability. Boston, MA: Academic.
  • Blažica, B., & Lewis, J. R. (2014). A Slovene translation of the System Usability Scale: The SUS-SI. International Journal of Human–Computer Interaction. In Press.
  • Boren, T., & Ramey, J. (2000). Thinking aloud: Reconciling theory and practice. IEEE Transactions on Professional Communication, 43, 261–278.
  • Borsci, S., Federici, S., & Lauriola, M. (2009). On the dimensionality of the system usability scale: A test of alternative measurement models. Cognitive Processing, 10, 193–197.
  • Borsci, S., Londei, A., & Federici, S. (2011). The Bootstrap Discovery Behaviour (BDB): A new outlook on usability evaluation. Cognitive Processing, 12, 23–31.
  • Borsci, S., Macredie, R. D., Barnett, J., Martin, J., Kuljis, J., & Young, T. (2013). Reviewing and extending the five-user assumption: A grounded procedure for interaction evaluation. ACM Transactions on Computer-Human Interaction, 20, 29:01–29:23.
  • Bowers, V., & Snyder, H. (1990). Concurrent versus retrospective verbal protocols for comparing window usability. In Proceedings of the Human Factors Society 34th Annual Meeting (pp. 1270–1274). Santa Monica, CA: Human Factors Society.
  • Bradley, J. V. (1976). Probability; decision; statistics. Englewood Cliffs, NJ: Prentice-Hall.
  • Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152.
  • Briand, L. C., El Emam, K., Freimut, B. G., & Laitenberger, O. (2000). A comprehensive evaluation of capture-recapture models for estimating software defect content. IEEE Transactions on Software Engineering, 26, 518–540.
  • Brooke, J. (1996). SUS—A “quick and dirty” usability scale. In P. W. Jordan (Ed.), Usability evaluation in industry (pp. 189–194). London, UK: Taylor & Francis.
  • Brooke, J. (2013). SUS: A retrospective. Journal of Usability Studies, 8(2), 29–40.
  • Campbell, I. (2007). Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine, 26, 3661–3675.
  • Capra, M. G. (2007). Comparing usability problem identification and description by practitioners and students. In Proceedings of the Human Factors and Ergonomics Society 51st Annual Meeting (pp. 474–478). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. London, UK: Erlbaum.
  • Caulton, D. A. (2001). Relaxing the homogeneity assumption in usability testing. Behaviour & Information Technology, 20, 1–7.
  • Cavallin, H., Martin, W. M., & Heylighen, A. (2007). How relative absolute can be: SUMI and the impact of the nature of the task in measuring perceived software usability. Artificial Intelligence and Society, 22, 227–235.
  • Chapanis, A. (1981). Evaluating ease of use. Unpublished manuscript prepared for IBM, Boca Raton, FL. (Available from J. R. Lewis).
  • Chernick, M. R. (2008). Bootstrap methods: A guide for practitioners and researchers. Hoboken, NJ: Wiley.
  • Chin, J. P., Diehl, V. A., & Norman, K. L. (1988). Development of an instrument measuring user satisfaction of the human–computer interface. In Proceedings of CHI 1988 (pp. 213–218). Washington, DC: Association for Computing Machinery.
  • Clemmensen, T., Hertzum, M., Hornbæk, K., Shi, Q., & Yammiyavar, P. (2009). Cultural cognition in usability evaluation. Interacting with Computers, 21, 212–220.
  • Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45, 1304–1312.
  • Cowles, M. (1989). Statistics in psychology: An historical perspective. Hillsdale, NJ: Erlbaum.
  • Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13, 319–339.
  • Davis, F. D., & Venkatesh, V. (1996). A critical assessment of potential measurement biases in the Technology Acceptance Model: Three experiments. International Journal of Human-Computer Studies, 45, 19–45.
  • Dumas, J. S. (2003). User-based evaluations. In J. A. Jacko & A. Sears (Eds.), The human–computer interaction handbook (pp. 1093–1117). Mahwah, NJ: Erlbaum.
  • Dumas, J. S. (2007). The great leap forward: The birth of the usability profession (1988–1993). Journal of Usability Studies, 2, 54–60.
  • Dumas, J., & Redish, J. C. (1999). A practical guide to usability testing. Portland, OR: Intellect.
  • Ennis, D. M., & Bi, J. (1998). The beta-binomial model: Accounting for inter-trial variation in replicated difference and preference tests. Journal of Sensory Studies, 13, 389–412.
  • Erdinç, O., & Lewis, J. R. (2013). Psychometric evaluation of the T-CSUQ: The Turkish version of the Computer System Usability Questionnaire. International Journal of Human-Computer Interaction, 29, 319–326.
  • Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215–251.
  • Evers, V., & Day, D. (1997). The role of culture in interface acceptance. In Proceedings of Interact 1997 (pp. 260–267). Sydney, Australia: Chapman and Hall.
  • Finstad, K. (2006). The System Usability Scale and non-native English speakers. Journal of Usability Studies, 1, 185–188.
  • Finstad, K. (2010). The usability metric for user experience. Interacting with Computers, 22, 323–327.
  • Finstad, K. (2013). Response to commentaries on “The Usability Metric for User Experience”. Interacting with Computers, 25, 327–330.
  • Frandsen-Thorlacius, O., Hornbæk, K., Hertzum, M., & Clemmensen, T. (2009). Non-universal usability? A survey of how usability is understood by Chinese and Danish users. In Proceedings of CHI 2009 (pp. 41–50). Boston, MA: Association for Computing Machinery.
  • Gaito, J. (1980). Measurement scales and statistics: Resurgence of an old misconception. Psychological Bulletin, 87, 564–567.
  • Gould, J. D. (1988). How to design usable systems. In M. Helander (Ed.), Handbook of human–computer interaction (pp. 757–789). Amsterdam, the Netherlands: North-Holland.
  • Gould, J. D., & Boies, S. J. (1983). Human factors challenges in creating a principal support office system: The Speech Filing System approach. ACM Transactions on Information Systems, 1, 273–298.
  • Gould, J. D., Boies, S. J., Levy, S., Richards, J. T., & Schoonard, J. (1987). The 1984 Olympic Message System: A test of behavioral principles of system design. Communications of the ACM, 30, 758–769.
  • Gould, J. D., & Lewis, C. (1984). Designing for usability: Key principles and what designers think (IBM Tech. Report No. RC-10317). Yorktown Heights, NY: International Business Machines Corporation.
  • Gray, W. D., & Salzman, M. C. (1998). Damaged merchandise? A review of experiments that compare usability evaluation methods. Human–Computer Interaction, 13, 203–261.
  • Grier, R. A., Bangor, A., Kortum, P., & Peres, S. C. (2013). The System Usability Scale: Beyond standard usability testing. In Proceedings of the Human Factors and Ergonomics Society (pp. 187–191). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Grove, J. W. (1989). In defence of science: Science, technology, and politics in modern society. Toronto, Canada: University of Toronto Press.
  • Harris, R. J. (1985). A primer of multivariate statistics. Orlando, FL: Academic Press.
  • Hassenzahl, M. (2000). Prioritizing usability problems: Data driven and judgment driven severity estimates. Behaviour & Information Technology, 19, 29–42.
  • Hassenzahl, M. (2001). The effect of perceived hedonic quality on product appealingness. International Journal of Human-Computer Interaction, 13, 481–499.
  • Hassenzahl, M. (2004). The interplay of beauty, goodness, and usability in interactive products. Human-Computer Interaction, 19, 319–349.
  • Hertzum, M. (2006). Problem prioritization in usability evaluation: From severity assessments to impact on design. International Journal of Human-Computer Interaction, 21, 125–146.
  • Hertzum, M. (2010). Images of usability. International Journal of Human-Computer Interaction, 26, 567–600.
  • Hertzum, M., Clemmensen, T., Hornbæk, K., Kumar, J., Shi, Q., & Yammiyavar, P. (2007). Usability constructs: A cross-cultural study of how users and developers experience their use of information systems. In Proceedings of HCI International 2007 (pp. 317–326). Beijing, China: Springer-Verlag.
  • Hertzum, M., Hansen, K. D., & Andersen, H. H. K. (2009). Scrutinising usability evaluation: Does thinking aloud affect behaviour and mental workload? Behaviour & Information Technology, 28, 165–181.
  • Høegh, R. T., & Jensen, J. J. (2008). A case study of three software projects: Can software developers anticipate the usability problems in their software? Behaviour & Information Technology, 27, 307–312.
  • Høegh, R. T., Nielsen, C. M., Overgaard, M., Pedersen, M. B., & Stage, J. (2006). The impact of usability reports and user test observations on developers’ understanding of usability data: An exploratory study. International Journal of Human-Computer Interaction, 21, 173–196.
  • Hornbæk, K. (2006). Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies, 64, 79–102.
  • Hornbæk, K. (2010). Dogmas in the assessment of usability evaluation methods. Behaviour & Information Technology, 29, 97–111.
  • Hornbæk, K., & Law, E. L. (2007). Meta-analysis of correlations among usability measures. In Proceedings of CHI 2007 (pp. 617–626). San Jose, CA: Association for Computing Machinery.
  • Howard, T. W. (2008). Unexpected complexity in a traditional usability study. Journal of Usability Studies, 3, 189–205.
  • Howard, T., & Howard, W. (2009). Unexpected complexity in user testing of information products. In Proceedings of the Professional Communication Conference (pp. 1–5). Waikiki, HI: Institute of Electrical and Electronics Engineers.
  • Hwang, W., & Salvendy, G. (2010). Number of people required for usability evaluation: The 10±2 rule. Communications of the ACM, 53, 130–133.
  • International Organization for Standardization. (1998). Ergonomic requirements for office work with visual display terminals (VDTs), Part 11, Guidance on usability (ISO 9241-11:1998(E)). Geneva, Switzerland: Author.
  • Kanis, H. (2011). Estimating the number of usability problems. Applied Ergonomics, 42, 337–347.
  • Karat, C. (1997). Cost-justifying usability engineering in the software life cycle. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human–computer interaction (2nd ed., pp. 767–778). Amsterdam, the Netherlands: Elsevier.
  • Kelley, J. F. (1984). An iterative design methodology for user-friendly natural language office information applications. ACM Transactions on Information Systems, 2, 26–41.
  • Kennedy, P. J. (1982). Development and testing of the operator training package for a small computer system. In Proceedings of the Human Factors Society 26th Annual Meeting (pp. 715–717). Santa Monica, CA: Human Factors Society.
  • Kessner, M., Wood, J., Dillon, R. F., & West, R. L. (2001). On the reliability of usability testing. In J. Jacko & A. Sears (Eds.), Conference on Human Factors in Computing Systems: CHI 2001 Extended Abstracts (pp. 97–98). Seattle, WA: Association for Computing Machinery.
  • Kirakowski, J. (1996). The Software Usability Measurement Inventory: Background and usage. In P. Jordan, B. Thomas, & B. Weerdmeester (Eds.), Usability evaluation in industry (pp. 169–178). London, UK: Taylor & Francis.
  • Kirakowski, J., & Corbett, M. (1993). SUMI: The Software Usability Measurement Inventory. British Journal of Educational Technology, 24, 210–212.
  • Kirakowski, J., & Dillon, A. (1988). The Computer User Satisfaction Inventory (CUSI): Manual and scoring key. Cork, Ireland: Human Factors Research Group, University College of Cork.
  • Kortum, P. T., & Bangor, A. (2013). Usability ratings for everyday products measured with the System Usability Scale. International Journal of Human-Computer Interaction, 29, 67–76.
  • Krahmer, E., & Ummelen, N. (2004). Thinking about thinking aloud: A comparison of two verbal protocols for usability testing. IEEE Transactions on Professional Communication, 47, 105–117.
  • LaLomia, M. J., & Sidowski, J. B. (1990). Measurements of computer satisfaction, literacy, and aptitudes: A review. International Journal of Human–Computer Interaction, 2, 231–253.
  • Landauer, T. K. (1997). Behavioral research methods in human–computer interaction. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human–computer interaction (2nd ed., pp. 203–227). Amsterdam, the Netherlands: Elsevier.
  • Larson, R. C. (2008). Service science: At the intersection of management, social, and engineering sciences. IBM Systems Journal, 47, 41–51.
  • Lazar, J., Feng, J. H., & Hochheiser, H. (2010). Research methods in human-computer interaction. Chichester, UK: Wiley.
  • Lewis, J. R. (1982). Testing small system customer set-up. In Proceedings of the Human Factors Society 26th Annual Meeting (pp. 718–720). Santa Monica, CA: Human Factors Society.
  • Lewis, J. R. (1990a). Psychometric evaluation of a post-study system usability questionnaire: The PSSUQ (Tech. Rep. No. 54.535). Boca Raton, FL: International Business Machines Corp.
  • Lewis, J. R. (1990b). Psychometric evaluation of an after-scenario questionnaire for computer usability studies: The ASQ (Tech. Rep. No. 54.541). Boca Raton, FL: International Business Machines Corp.
  • Lewis, J. R. (1992). Psychometric evaluation of the Post-Study System Usability Questionnaire: The PSSUQ. In Proceedings of the Human Factors Society 36th Annual Meeting (pp. 1259–1263). Santa Monica, CA: Human Factors Society.
  • Lewis, J. R. (1993). Multipoint scales: Mean and median differences and observed significance levels. International Journal of Human-Computer Interaction, 5, 383–392.
  • Lewis, J. R. (1994). Sample sizes for usability studies: Additional considerations. Human Factors, 36, 368–378.
  • Lewis, J. R. (1995). IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction, 7, 57–78.
  • Lewis, J. R. (1996). Reaping the benefits of modern usability evaluation: The Simon story. In G. Salvendy & A. Ozok (Eds.), Advances in Applied Ergonomics: Proceedings of the 1st International Conference on Applied Ergonomics—ICAE ’96 (pp. 752–757). Istanbul, Turkey: USA Publishing.
  • Lewis, J. R. (1999). Tradeoffs in the design of the IBM computer usability satisfaction questionnaires. In Proceedings of HCI International 1999 (pp. 1023–1027). Mahwah, NJ: Erlbaum.
  • Lewis, J. R. (2001). Evaluation of procedures for adjusting problem-discovery rates estimated from small samples. International Journal of Human–Computer Interaction, 13, 445–479.
  • Lewis, J. R. (2002). Psychometric evaluation of the PSSUQ using data from five years of usability studies. International Journal of Human–Computer Interaction, 14, 463–488.
  • Lewis, J. R. (2006). Sample sizes for usability tests: Mostly math, not magic. Interactions, 13(6), 29–33. (See corrected formula in Interactions, 14(1), 4)
  • Lewis, J. R. (2011a). Human factors engineering. In P. A. LaPlante (Ed.), Encyclopedia of software engineering (pp. 383–394). New York, NY: Taylor & Francis.
  • Lewis, J. R. (2011b). Practical speech user interface design. Boca Raton, FL: Taylor & Francis.
  • Lewis, J. R. (2012). Usability testing. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (4th ed., pp. 1267–1312). New York, NY: Wiley.
  • Lewis, J. R. (2013). Critical review of “The Usability Metric for User Experience”. Interacting with Computers, 25, 320–324.
  • Lewis, J. R., Henry, S. C., & Mack, R. L. (1990). Integrated office software benchmarks: A case study. In Proceedings of the 3rd IFIP Conference on Human-Computer Interaction, INTERACT ’90 (pp. 337–343). Cambridge, UK: Elsevier Science.
  • Lewis, J. R., & Sauro, J. (2009). The factor structure of the system usability scale. In M. Kurosu (Ed.), Human centered design (pp. 94–103). Heidelberg, Germany: Springer-Verlag.
  • Lewis, J. R., Utesch, B. S., & Maher, D. E. (2013). UMUX-LITE—When there’s no time for the SUS. In Proceedings of CHI 2013 (pp. 2099–2102). Paris, France: Association for Computing Machinery.
  • Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–66.
  • Lindgaard, G., & Kirakowski, J. (2013). Introduction to the special issue: The tricky landscape of developing rating scales in HCI. Interacting with Computers, 25, 271–277.
  • Lord, F. M. (1953). On the statistical treatment of football numbers. American Psychologist, 8, 750–751.
  • Lord, F. M. (1954). Further comment on “football numbers.” American Psychologist, 9, 264–265.
  • Lottridge, D., Chignell, M., & Jovicic, A. (2011). Affective design: Understanding, evaluating, and designing for human emotion. Reviews of Human Factors and Ergonomics, 7, 197–237.
  • Lund, A. (1998). USE Questionnaire Resource Page. Retrieved from http://usesurvey.com.
  • Lund, A. (2001). Measuring usability with the USE questionnaire. Usability and User Experience Newsletter of the STC Usability SIG, 8(2), 1–4.
  • Lusch, R. F., Vargo, S. L., & O’Brien, M. (2007). Competing through service: Insights from service-dominant logic. Journal of Retailing, 83, 5–18.
  • Lusch, R. F., Vargo, S. L., & Wessels, G. (2008). Toward a conceptual foundation for service science: Contributions from service-dominant logic. IBM Systems Journal, 47, 5–14.
  • MacDonald, S., Edwards, H. M., & Zhao, T. (2012). Exploring think-alouds in usability testing: An international survey. IEEE Transactions on Professional Communication, 55, 2–19.
  • MacDonald, S., McGarry, K., & Willis, L. M. (2013). Thinking-aloud about web navigation: The relationship between think-aloud instructions, task difficulty and performance. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 2037–2041). Santa Monica, CA: Human Factors and Ergonomics Society.
  • MacKenzie, I. S. (2014). Human-computer interaction: An empirical research perspective. Waltham, MA: Morgan Kaufmann.
  • Marcus, A. (2007). Global/intercultural user-interface design. In J. Jacko & A. Sears (Eds.), Handbook of human-computer interaction (3rd ed., pp. 355–380). New York, NY: Erlbaum.
  • Marshall, C., Brendan, M., & Prail, A. (1990). Usability of Product X: Lessons from a real product. Behaviour & Information Technology, 9, 243–253.
  • McSweeney, R. (1992). SUMI: A psychometric approach to software evaluation (Unpublished master’s thesis). Cork, Ireland: University College of Cork.
  • Michell, J. (1986). Measurement scales and statistics: A clash of paradigms. Psychological Bulletin, 100, 398–407.
  • Molich, R., Bevan, N., Curson, I., Butler, S., Kindlund, E., Miller, D., & Kirakowski, J. (1998). Comparative evaluation of usability tests. In Usability Professionals Association Annual Conference Proceedings (pp. 189–200). Washington, DC: Usability Professionals Association.
  • Molich, R., & Dumas, J. S. (2008). Comparative usability evaluation (CUE-4). Behaviour & Information Technology, 27, 263–281.
  • Molich, R., Ede, M. R., Kaasgaard, K., & Karyukin, B. (2004). Comparative usability evaluation. Behaviour & Information Technology, 23, 65–74.
  • Molich, R., Jeffries, R., & Dumas, J. S. (2007). Making usability recommendations useful and usable. Journal of Usability Studies, 2, 162–179.
  • Molich, R., Kirakowski, J., Sauro, J., & Tullis, T. (2009). Comparative usability task measurement workshop (CUE-8). Workshop conducted at the UPA 2009 Conference in Portland, OR.
  • Nielsen, J. (2000). Why you only need to test with 5 users. Alertbox. Retrieved from http://www.useit.com/alertbox/20000319.html
  • Nielsen, J., & Landauer, T. K. (1993). A mathematical model of the finding of usability problems. In Proceedings of INTERCHI’93 (pp. 206–213). Amsterdam, the Netherlands: Association for Computing Machinery.
  • Nielsen, J., & Mack, R. L. (1994). Usability inspection methods. New York, NY: Wiley.
  • Nielsen, J., & Molich, R. (1990). Heuristic evaluation of user interfaces. In Proceedings of CHI ’90 (pp. 249–256). New York, NY: Association for Computing Machinery.
  • Nørgaard, M., & Hornbæk, K. (2009). Exploring the value of usability feedback formats. International Journal of Human-Computer Interaction, 25, 49–74.
  • Nunnally, J. C. (1978). Psychometric theory. New York, NY: McGraw-Hill.
  • Ohnemus, K. R., & Biers, D. W. (1993). Retrospective versus thinking aloud in usability testing. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting (pp. 1127–1131). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Olmsted-Hawala, E. L., Murphy, E., Hawala, S., & Ashenfelter, K. T. (2010). Think-aloud protocols: A comparison of three think-aloud protocols for use in testing data-dissemination web sites for usability. In Proceedings of CHI 2010 (pp. 2381–2390). Atlanta, GA: Association for Computing Machinery.
  • Perfetti, C., & Landesman, L. (2001). Eight is not enough. Retrieved from http://www.uie.com/articles/eight_is_not_enough/
  • Pilotte, W. J., & Gable, R. K. (1990). The impact of positive and negative item stems on the validity of a computer anxiety scale. Educational and Psychological Measurement, 50, 603–610.
  • Pitkänen, O., Virtanen, P., & Kemppinen, J. (2008). Legal research topics in user-centric services. IBM Systems Journal, 47, 143–152.
  • Redish, J. (2007). Expanding usability testing to evaluate complex systems. Journal of Usability Studies, 2, 102–111.
  • Rubin, J. (1994). Handbook of usability testing: How to plan, design, and conduct effective tests. New York, NY: Wiley.
  • Rubin, J., & Chisnell, D. (2008). Handbook of usability testing: How to plan, design, and conduct effective tests (2nd ed.). New York, NY: Wiley.
  • Sabra, A. I. (2003). Ibn al-Haytham. Harvard Magazine, 106, 54–55.
  • Sauro, J. (2010a). A practical guide to measuring usability. Denver, CO: CreateSpace.
  • Sauro, J. (2010b). That’s the worst website ever! Effects of extreme survey items. Retrieved from http://www.measuringusability.com/blog/extreme-items.php.
  • Sauro, J. (2011). A practical guide to the System Usability Scale (SUS): Background, benchmarks & best practices. Denver, CO: Measuring Usability.
  • Sauro, J., & Lewis, J. R. (2005). Estimating completion rates from small samples using binomial confidence intervals: Comparisons and recommendations. In Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting (pp. 2100–2104). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Sauro, J., & Lewis, J. R. (2009). Correlations among prototypical usability metrics: Evidence for the construct of usability. In Proceedings of CHI 2009 (pp. 1609–1618). Boston, MA: Association for Computing Machinery.
  • Sauro, J., & Lewis, J. R. (2011). When designing usability questionnaires, does it hurt to be positive? In Proceedings of CHI 2011 (pp. 2215–2223). Vancouver, Canada: Association for Computing Machinery.
  • Sauro, J., & Lewis, J. R. (2012). Quantifying the user experience: Practical statistics for user research. Burlington, MA: Morgan Kaufmann.
  • Schmettow, M. (2008). Heterogeneity in the usability evaluation process. In Proceedings of the 22nd British HCI Group Annual Conference on HCI 2008: People and Computers XXII: Culture, Creativity, Interaction - Volume 1 (pp. 89–98). Liverpool, UK: Association for Computing Machinery.
  • Schmettow, M. (2009). Controlling the usability evaluation process under varying defect visibility. In Proceedings of the 2009 British Computer Society Conference on Human-Computer Interaction (pp. 188–197). Cambridge, UK: Association for Computing Machinery.
  • Schmettow, M. (2012). Sample size in usability studies. Communications of the ACM, 55(4), 64–70.
  • Schmitt, N., & Stults, D. (1985). Factors defined by negatively keyed items: The result of careless respondents? Applied Psychological Measurement, 9, 367–373.
  • Scholten, A. Z., & Borsboom, D. (2009). A reanalysis of Lord’s statistical treatment of football numbers. Journal of Mathematical Psychology, 53, 69–75.
  • Schriesheim, C. A., & Hill, K. D. (1981). Controlling acquiescence response bias by item reversals: the effect on questionnaire validity. Educational and Psychological Measurement, 41, 1101–1114.
  • Seffah, A., Donyaee, M., Kline, R. B., & Padda, H. K. (2006). Usability measurement and metrics: A consolidated model. Software Quality Journal, 14, 159–178.
  • Shackel, B. (1990). Human factors and usability. In J. Preece & L. Keller (Eds.), Human–computer interaction: Selected readings (pp. 27–41). Hemel Hempstead, England: Prentice Hall International.
  • Shneiderman, B., & Plaisant, C. (2010). Designing the user interface: Strategies for effective human-computer interaction (5th ed.). Reading, MA: Addison-Wesley.
  • Smith, D. C., Irby, C., Kimball, R., Verplank, B., & Harslem, E. (1982). Designing the Star user interface. Byte, 7, 242–282.
  • Snyder, K. M., Happ, A. J., Malcus, L., Paap, K. R., & Lewis, J. R. (1985). Using cognitive models to create menus. In Proceedings of the Human Factors Society 29th Annual Meeting (pp. 655–658). Baltimore, MD: Human Factors Society.
  • Spector, P., Van Katwyk, P., Brannick, M., & Chen, P. (1997). When two factors don’t reflect two constructs: How item characteristics can produce artifactual factors. Journal of Management, 23, 659–677.
  • Spencer, R. (2000). The streamlined cognitive walkthrough method: Working around social constraints encountered in a software development company. In Proceedings of CHI 2000 (pp. 353–359). New York, NY: Association for Computing Machinery.
  • Spohrer, J., & Maglio, P. P. (2008). The emergence of service science: Toward systematic service innovations to accelerate co-creation of value. Production and Operations Management, 17, 238–246.
  • Spool, J., & Schroeder, W. (2001). Testing websites: Five users is nowhere near enough. In CHI 2001 extended abstracts (pp. 285–286). New York, NY: Association for Computing Machinery.
  • Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
  • Stewart, T. J., & Frye, A. W. (2004). Investigating the use of negatively-phrased survey items in medical education settings: Common wisdom or common mistake? Academic Medicine, 79 (Suppl. 10), S1–S3.
  • Stigler, S. M. (1999). Statistics on the table: The history of statistical concepts and methods. Cambridge, MA: Harvard University Press.
  • Theofanos, M., & Quesenbery, W. (2005). Towards the design of effective formative test reports. Journal of Usability Studies, 1(1), 27–45.
  • Thimbleby, H. (2007). User-centered methods are insufficient for safety critical systems. In A. Holzinger (Ed.), Proceedings of USAB 2007 (pp. 1–20). Heidelberg, Germany: Springer-Verlag.
  • Thurstone, L. L. (1928). Attitudes can be measured. American Journal of Sociology, 33, 529–554.
  • Townsend, J. T., & Ashby, F. G. (1984). Measurement scales and statistics: The misconception misconceived. Psychological Bulletin, 96, 394–401.
  • Travis, D. (2008). Measuring satisfaction: Beyond the usability questionnaire. Retrieved from http://www.userfocus.co.uk/articles/satisfaction.html.
  • Tullis, T. S. (1985). Designing a menu-based interface to an operating system. In Proceedings of CHI 1985 (pp. 79–84). San Francisco, CA: Association for Computing Machinery.
  • Tullis, T. S., & Albert, W. (2008). Measuring the user experience: Collecting, analyzing, and presenting usability data. Waltham, MA: Morgan Kaufmann.
  • Tullis, T. S., & Albert, W. (2013). Measuring the user experience: Collecting, analyzing, and presenting usability data (2nd ed.). Waltham, MA: Morgan Kaufmann.
  • Tullis, T. S., & Stetson, J. N. (2004). A comparison of questionnaires for assessing website usability. Paper presented at the Usability Professionals Association Annual Conference. Minneapolis, MN: Usability Professionals Association.
  • van de Vijver, F. J. R., & Leung, K. (2001). Personality in cultural context: Methodological issues. Journal of Personality, 69, 1007–1031.
  • van den Haak, M. J., & de Jong, M. D. T. (2003). Exploring two methods of usability testing: Concurrent versus retrospective think-aloud protocols. In Proceedings of the International Professional Communication Conference, IPCC 2003 (pp. 285–287). Orlando, FL: Institute of Electrical and Electronics Engineers.
  • Velleman, P. F., & Wilkinson, L. (1993). Nominal, ordinal, interval, and ratio typologies are misleading. The American Statistician, 47, 65–72.
  • Virzi, R. A. (1990). Streamlining the design process: Running fewer subjects. In Proceedings of the Human Factors Society 34th Annual Meeting (pp. 291–294). Santa Monica, CA: Human Factors Society.
  • Virzi, R. A. (1992). Refining the test phase of usability evaluation: How many subjects is enough? Human Factors, 34, 457–468.
  • Virzi, R. A., Sorce, J. F., & Herbert, L. B. (1993). A comparison of three usability evaluation methods: Heuristic, think-aloud, and performance testing. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting (pp. 309–313). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Vredenburg, K., Mao, J. Y., Smith, P. W., & Carey, T. (2002). A survey of user centered design practice. In Proceedings of CHI 2002 (pp. 471–478). Minneapolis, MN: Association for Computing Machinery.
  • Wharton, C., Rieman, J., Lewis, C., & Polson, P. (1994). The cognitive walkthrough method: A practitioner’s guide. In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods (pp. 105–140). New York, NY: Wiley.
  • Whiteside, J., Bennett, J., & Holtzblatt, K. (1988). Usability engineering: Our experience and evolution. In M. Helander (Ed.), Handbook of human–computer interaction (pp. 791–817). Amsterdam, the Netherlands: North-Holland.
  • Wildman, D. (1995). Getting the most from paired-user testing. Interactions, 2(3), 21–27.
  • Williams, G. (1983). The Lisa computer system. Byte, 8(2), 33–50.
  • Winter, S., Wagner, S., & Deissenboeck, F. (2008). A comprehensive model of usability. In Engineering Interactive Systems (pp. 106–122). Heidelberg, Germany: International Federation for Information Processing.
  • Wixon, D. (2003). Evaluating usability methods: Why the current literature fails the practitioner. Interactions, 10(4), 28–34.
  • Wong, N., Rindfleisch, A., & Burroughs, J. (2003). Do reverse-worded items confound measures in cross-cultural consumer research? The case of the material values scale. Journal of Consumer Research, 30, 72–91.
  • Woolrych, A., & Cockton, G. (2001). Why and when five test users aren’t enough. In J. Vanderdonckt, A. Blandford, & A. Derycke (Eds.), Proceedings of IHM–HCI 2001 Conference, Vol. 2 (pp. 105–108). Toulouse, France: Cépaduès Éditions.
  • Wright, R. B., & Converse, S. A. (1992). Method bias and concurrent verbal protocol in software usability testing. In Proceedings of the Human Factors and Ergonomics Society 36th Annual Meeting (pp. 1220–1224). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Xue, M., & Harker, P. T. (2002). Customer efficiency: Concept and its impact on e-business management. Journal of Service Research, 4, 253–267.
