General Session

Do We Have It? A Comparative Analysis of Library Journal Holdings and Works Referenced in Faculty Publications

Abstract

Libraries often wonder whether their collections can support their faculty’s research. Large citation databases, such as Scopus or Web of Science, can help automate such a review. This session will present findings on the extent to which faculty were able to complete their journal publications without using interlibrary loan. Using data gathered from Scopus, the range and dates of journals referenced by faculty at my home institution will be compared with holdings from my library’s electronic resources knowledge base to determine the level of overlap. Results will be presented by journal discipline to highlight any variation among the science, social science, and arts and humanities divisions. Special emphasis will be given to using these data to inform decisions about purchasing journal backfile collections.

INTRODUCTION AND PROJECT GOALS

Starting in 2009, American University Library (AU Library) began subscribing to comprehensive journal packages from many major academic publishers (Elsevier, Sage, Taylor & Francis, etc.). At about the same time, the university’s Provost announced a push to improve the scholarly impact of AU faculty. This marked a change from the university’s balanced scholar/teacher model to one with a more research-intensive expectation. The idea for this project came from trying to connect the library’s acquisitions decisions to the Provost’s research initiative. There were three questions I sought to answer:

  • Are our faculty using materials we subscribe to or are they relying on interlibrary loan, the Internet, or other sources to complete their research?

  • What are the date ranges of the resources being used?

  • What is the breakdown of resource type? Specifically, are the journals primarily from publisher packages, aggregated content providers, or single title subscriptions?

METHODOLOGY

To answer the research questions, I used Scopus to harvest journal articles published by AU faculty between 2009 and 2013, along with their associated references, grouped by the Scopus journal subject categories. While this could be achieved using the Web of Science, I found that Scopus was more comprehensive in its coverage, with approximately 1,400 articles in Scopus as opposed to 990 in Web of Science. Once I had a spreadsheet with all of the publications for each discipline, I cleaned up the data and compared the referenced sources against AU’s knowledge base, 360 Link. Specifically, I checked whether the library had the title in its holdings and how the library subscribed to the source.
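The holdings check described above can be illustrated with a minimal Python sketch. It is not the workflow used in the project, which relied on Excel, Open Refine, and 360 Link. The journal title, provider labels, coverage years, and the function names `normalize_title` and `providers_covering` are all hypothetical and exist only for illustration.

```python
import re

def normalize_title(title: str) -> str:
    """Reduce a journal title to a comparable form (assumed cleaning rules)."""
    t = title.lower().replace("&", " and ")
    t = re.sub(r"[^a-z0-9 ]+", " ", t)   # drop punctuation
    t = re.sub(r"\s+", " ", t).strip()   # collapse runs of whitespace
    if t.startswith("the "):
        t = t[4:]                        # ignore a leading article
    return t

# Hypothetical holdings: normalized title -> (provider, first_year, last_year).
# A last_year of None means coverage runs to the present.
HOLDINGS = {
    normalize_title("Journal of Example Studies"): [
        ("Publisher package", 1997, None),
        ("Aggregator", 1940, 2010),
    ],
}

def providers_covering(title, year, holdings=HOLDINGS):
    """List the providers whose coverage includes the referenced year."""
    entries = holdings.get(normalize_title(title), [])
    return [
        provider
        for provider, first, last in entries
        if first <= year and (last is None or year <= last)
    ]
```

An empty list means the reference falls outside the library's holdings, either because the title is not held or because the cited year is not covered. Keeping the coverage years alongside the provider is what would let the same lookup speak to backfile decisions.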

Data Gathering Process

A significant portion of this presentation discussed my process for gathering and cleaning data, which I will summarize. There are two ways to do a mass export in Scopus. One involves listing all of the references used in a selection of publications. While this approach would have worked for this project, it has two major downsides: the references are disassociated from the articles that cited them, and the number of references is capped at 8,000. If you want to examine individual authors or look at specific articles, this method will not work.

The other way to mass export in Scopus is to export to a comma-separated value (CSV) file from the main result page. This has the advantage of not being capped, and it produces a spreadsheet where each faculty publication and the works it references are on one row. However, all references for each publication are contained within one cell. To break apart the references without breaking Excel, I first used the Text to Columns feature (delimiting on the “(” symbol) on the column containing the references so that each column had a date and then the journal title. This worked well because the citations followed an American Psychological Association (APA)-type citation format, so it was not too difficult to isolate the journal year and title. I then used a macro to automate the removal of the extraneous parts of each reference, transfer each reference from columns into rows, and keep them associated with the publication in which they were cited (see Figure 1 for the macro code). While the macro worked well for journals, it tended to splice books and other monographic content, so some cleaning of the data was needed. Finally, once I had a list of journals referenced, I needed to reconcile it with the knowledge base. Unfortunately, journal titles are written in different ways in Scopus, and this required me to clean and reconcile them with the knowledge base using OpenRefine.

Figure 1. Macro code for parsing data in Excel.
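For readers who prefer a scripting language, the reshaping step can also be sketched in Python. This is not the Excel macro shown in Figure 1. It assumes that references within a cell are separated by "; " and that each reference carries a four-digit year in parentheses immediately before the source title, as in the APA-type format described above. The function name `explode_references` and the sample data are hypothetical.

```python
import re

# A four-digit year in parentheses, then the source title up to the next comma.
YEAR_AND_SOURCE = re.compile(r"\((\d{4})\)\s*([^,;]+)")

def explode_references(rows):
    """Turn (article_id, references_cell) pairs into one row per reference.

    Yields (article_id, year, source_title), so each reference stays
    associated with the publication that cited it.
    """
    for article_id, cell in rows:
        for reference in cell.split("; "):
            match = YEAR_AND_SOURCE.search(reference)
            if match:  # references without a parenthesized year are skipped
                yield article_id, int(match.group(1)), match.group(2).strip()

# Hypothetical export row: one publication with two references in one cell.
sample = [(
    "A1",
    "Smith, J., Some title (2005) Journal of Examples, 12 (3), pp. 45-67; "
    "Lee, K., Other title (1999) Example Review, 4, pp. 1-9",
)]
parsed = list(explode_references(sample))
```

As with the macro, a pattern like this is tuned to journal references. Books and other monographic content would likely be mis-split, and journal titles that contain a comma would be truncated, so manual cleaning would still be needed.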


RESEARCH FINDINGS

Once the Scopus journal titles were reconciled with the titles found in our knowledge base (for titles we owned), I was able to run my analysis. The first finding was that of the 81,684 references used by AU faculty, approximately 59,000 were from journals. Of those journal references, 37,102 were to titles for which the library has active subscriptions. This means that the library subscribes to about 72% of the serials faculty used in their publications.

Data by Discipline

Across all of the subject areas, use clustered heavily around about 10–20 titles, followed by a very long “tail” of titles that faculty cited only once or a few times (see Figure 2).

Figure 2. Sample of journals referenced by AU faculty in agricultural sciences.
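One way to quantify this clustering is to compute the share of all references that falls on the most-cited titles. The sketch below is illustrative only: the function name `head_share` and the sample titles are hypothetical, and it assumes the input is a flat list with one journal title per reference.

```python
from collections import Counter

def head_share(titles, top_n=10):
    """Fraction of all references that go to the top_n most-cited titles."""
    counts = Counter(titles)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    head = sum(count for _, count in counts.most_common(top_n))
    return head / total

# Hypothetical data: one heavily cited title and a tail of rarely cited ones.
sample_titles = ["Journal A"] * 6 + ["Journal B"] * 2 + ["Journal C", "Journal D"]
share_of_top_title = head_share(sample_titles, top_n=1)
```

A value close to 1 for a small `top_n` would indicate that a handful of titles carries most of the use, while a low value points to a longer tail.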


Some subject areas, like Social Sciences, showed a strong correlation between the journals faculty published in and the journals they referenced, whereas others, like Medicine, did not. Some cross-disciplinary effects were also noticed in the faculty publications and references that were likely due to a regional effect. For example, Business had Public Administration Review as the top faculty publication and reference source. This could be explained by AU’s location in Washington, DC, the government being the largest employer in the area, or easy access to research material related to the government in the city.

Data by Access Provider

Examining the data by provider produced some interesting results. First, aggregators like EBSCO and ProQuest provide a significant amount of access to serials titles. Of course, one cannot be sure whether the faculty used one provider over another or some other access method. However, it does show that if researchers relied solely on aggregators for access, they could meet almost one quarter of their research needs (see Table 1).

Table 1. Count by provider of journals referenced by AU faculty

Another interesting finding was the amount of Open Access (OA) content that faculty were utilizing in their publications. Given all of the issues surrounding OA, AU Library debated whether to include these titles in its knowledge base. The data provides evidence that we made the right decision in adding the titles.

NEXT STEPS

Looking ahead, the next level of analysis for this data is to look at specific faculty and their engagement with references. I also plan to look at articles deemed high impact and examine the citations to see if any patterns emerge.

QUESTIONS AND ANSWERS

Most of the questions from the audience related to the extraction process for the citations. However, there were some questions about methodology. In response, I tried to make the caveats clear: I cannot be sure that the faculty who wrote these articles used any materials from AU’s library. Furthermore, just because a faculty member is listed as an author, I cannot be certain that he or she was involved in the research or writing of the article in any meaningful way. In short, the research shows the potential for support rather than proven support.

Additional information

Notes on contributors

Michael Anton Matos

Michael Anton Matos is Business Librarian and Adjunct Professor of Information Technology, American University, Washington, DC.