ABSTRACT
Information centers are increasingly confronted with the challenges of shifting information environments. The development of a digital information society requires libraries to devise strategies to capture, describe, and provide access to digital documents in addition to physical formats. Nowhere is this more apparent than in the field of government information. With a public access mandate and a distribution model permanently destabilized by low-barrier Web publishing technologies, libraries that provide access to government information face more challenges than ever. This article examines the possibility of using topic modeling to increase access to the growing number of poorly described digital texts distributed to libraries and archives. It provides a basic overview of topic modeling and its potential applications in libraries, describes some popular tools and potential workflows, and illustrates how the author tested one such workflow.
About the author
Jonathan O. Cain, Librarian for Data Initiatives & Planning, Public Policy and Management, University of Oregon Libraries, works with subject specialists and technologists to promote awareness of services related to data access, analysis, collection management, and other digital initiatives. In addition, Jonathan provides consultative services and instruction to clients in disciplines including planning, public policy, and management.