Abstract
The recent surge in the generation, storage, and availability of textual data presents researchers with the challenge of developing suitable methods for their analysis. Latent Semantic Analysis (LSA), pioneered by researchers in psychology, information retrieval, and bibliometrics, belongs to a family of methodological approaches that address this gap by describing the semantic content of textual data as a set of vectors. LSA involves a matrix operation called singular value decomposition, an extension of principal component analysis. The latent semantic dimensions that LSA generates are used in one of two ways. If the researcher's primary interest lies in understanding the thematic structure of the textual data, the dimensions are interpreted. If the interest lies in converting raw text into numerical data as a precursor to subsequent analysis, the dimensions are used for clustering, categorization, and predictive modeling. This paper reviews five methodological issues that need to be addressed by a researcher embarking on LSA. We examine the dilemmas, present the choices, and discuss the considerations under which good methodological decisions are made. We illustrate these issues with the help of four small studies involving the analysis of abstracts of papers published in the European Journal of Information Systems.
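As a rough illustration of the core operation described above, the following sketch builds a term-by-document count matrix and applies a truncated singular value decomposition to obtain vectors for terms and documents. It is a minimal example under stated assumptions, not the procedure used in the paper's studies: the function name `lsa`, the parameter `k`, the whitespace tokenizer, and the four toy documents are all hypothetical, and no term weighting or stop-word filtering is applied.

```python
import numpy as np

def lsa(docs, k=2):
    """Sketch of LSA: term-by-document counts followed by a rank-k SVD.

    Assumptions (not from the paper): whitespace tokenization, raw counts,
    no stop-word removal, no term weighting.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-document matrix: rows are terms, columns are documents.
    X = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            X[index[t], j] += 1

    # Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Keep only the k largest latent semantic dimensions.
    k = min(k, len(s))
    term_vectors = U[:, :k] * s[:k]      # one row per term
    doc_vectors = Vt[:k, :].T * s[:k]    # one row per document
    return vocab, X, term_vectors, doc_vectors, s[:k]

# Hypothetical toy corpus, for illustration only.
docs = [
    "information systems research methods",
    "text mining of research abstracts",
    "semantic analysis of text data",
    "information systems and data analysis",
]
vocab, X, term_vecs, doc_vecs, sing = lsa(docs, k=2)
print(doc_vecs.shape)  # one k-dimensional vector per document
```

The rows of `doc_vecs` can then be inspected to interpret the dimensions, or passed to a clustering or classification routine, which corresponds to the two uses of the latent dimensions mentioned in the abstract.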
Additional information
Notes on contributors
Nicholas Evangelopoulos
Nicholas Evangelopoulos is an associate professor of Decision Sciences at the University of North Texas and a Fellow of the Texas Center for Digital Knowledge. His research interests include Statistics and Text Mining. His publications include articles appearing in MIS Quarterly, Communications in Statistics, and Computational Statistics & Data Analysis.
Xiaoni Zhang
Xiaoni Zhang is an associate professor of Business Informatics at Northern Kentucky University. She received her Ph.D. in Business Computer Information Systems from the University of North Texas in 2001. Her publications appear in IEEE Transactions on Engineering Management, Communications of the ACM, International Conference on Information Systems, and Information & Management.
Victor R. Prybutok
Victor R. Prybutok is a Regents Professor in the Information Technology and Decision Sciences Department in the College of Business and the Associate Dean of the Toulouse Graduate School at the University of North Texas. Dr. Prybutok is an ASQ certified quality engineer, certified quality auditor, and certified quality manager. Dr. Prybutok has authored over 90 journal articles and more than 70 conference presentations.