Program Sessions

Teaching Wild Horses to Sing: Managing the Deluge of Electronic Serials

Pages 99-104 | Published online: 08 Apr 2013

Abstract

When the sheer volume of incoming electronic serials threatened to overwhelm us in the University Libraries at Virginia Tech, we embraced the opportunity to examine our entire e-serials management system and our options for utilizing services provided by vendors. This resulted in the formation of a collaborative task force composed of people from serials management and cataloging. We describe our process of implementing a vendor-supplied service, how the MARC [machine-readable cataloging] record service changed the way we manage serials cataloging and holdings records for electronic journals, and how some simple scripting in Python helped us overcome significant obstacles.

When the sheer volume of incoming electronic serials threatened to overwhelm the University Libraries at Virginia Tech, we embraced the opportunity to examine our entire e-serials management system and our options for utilizing services provided by vendors. A collaborative task force composed of members from both the Serials and Cataloging teams evaluated the various vendor-supplied services that provide ready-made bibliographic records. The task force also studied processes for implementing a MARC [machine-readable cataloging] record service (MRS).

The libraries at Virginia Tech began acquiring remotely accessed electronic journals in the 1990s, with a few titles acquired even earlier. Cataloging began in earnest in 1998. At that time, most of the electronic titles duplicated titles the library owned in print, so the libraries decided to go with the single-record approach: creating one composite record for all the manifestations of a resource.

Then came online-only journal titles and batch loading, beginning with the NetLibrary™ collection of e-books. The volume of e-journal titles increased to the point that available cataloging resources proved insufficient to handle the influx. This growth continues exponentially, and traditional cataloging alone cannot keep pace. The pressing and persistent need to provide catalog access as soon as titles became available compelled us to seek outside assistance from vendors.

Multiple systems provide access to electronic resources, and all of them require maintenance. At Virginia Tech, these systems include:

A–Z Lists

E-journals database

Online catalog

Summon (discovery platform)

Institutional repository (VTworks)

In March 2010, the Cataloging team was charged with investigating automation solutions and the use of vendor-supplied services, and Althea Aschmann submitted a preliminary report to the Director for Technical Services. Our first research period ran from September 2010 to January 2011, with our Senior Serials Cataloger producing the final report and recommendations in mid-January 2011. The task force was then formed, and a second period of research followed from January to March 2011.

The task force interviewed people in ten other academic libraries about their use of a vendor-supplied MRS. We found three libraries that had been using an MRS for e-journals for a year or longer and talked to them in detail about their processes.

Our research revealed three major players in the provision of these types of services, plus a few smaller companies. The major players were Serials Solutions 360 MARC Update Service™; Ex Libris MARC it!™, used in conjunction with its SFX™ products; and EBSCO MARC™. We chose our vendor because we were already using its other products, so this alternative would not present compatibility issues.

Implementation began in September 2011. The task force concluded that the single-record approach was no longer workable with vendor supplied records because of control number matching issues in our integrated library system (ILS). To take advantage of automated holdings updates and coverage loads, there needed to be one bibliographic record and one holdings record for each online title. Splitting composite records apart into separate online and hard copy records was part of the implementation process.

Throughout the entire process of implementing the MRS, the overarching goal was to balance the quality of the bibliographic records with ease of ongoing maintenance. In order to address this, three working groups were formed. The first one was the Crucial Metadata Standards Working Group, which primarily consisted of catalogers. Its charge was to determine the importance of various metadata elements in bibliographic records, especially identifying those that are essential for access to and retrieval of resources. The second group was the Workflow Group, which identified and tested the processes and procedures needed for successful loading of bibliographic records and holdings in the ILS. The Workflow Group primarily consisted of serials management personnel, with a cataloger liaison. This group was instrumental to the implementation of the MRS and was also tasked with the evaluation of the processes on an ongoing basis. The Priorities Working Group consisted of collection development, serials management, and cataloging personnel. This group's focus was to determine the priority order for loading e-journal packages and to ensure that those packages had only a small window of downtime.

The actual implementation of the MRS was split into three phases. Phase 1 dealt with the “low-hanging fruit”—those 6,000 brief records that were already in the catalog and could be immediately improved by matching and loading the bibliographic records from the MRS. These “brief bib” records, which had been created by the electronic resource management (ERM) system native to the ILS, usually consisted of a title in the 245 field and occasionally an International Standard Serial Number (ISSN) in the 022 field.
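As an illustration of how such records might be flagged for overlay, the sketch below uses pymarc to scan a local MARC export and pick out records that carry little more than a 245 title and an optional 022 ISSN. The file name and the fields used to approximate "briefness" are assumptions for this example, not our actual load tables.

# A minimal sketch, assuming pymarc and a hypothetical export file named
# catalog_export.mrc. "Brief" is approximated here as a record with a 245
# title but no publication (260/264), description (300), or subject (6XX) fields.
from pymarc import MARCReader

brief = []
with open("catalog_export.mrc", "rb") as fh:
    for record in MARCReader(fh):
        tags = {field.tag for field in record.get_fields()}
        has_detail = tags & {"260", "264", "300"} or any(t.startswith("6") for t in tags)
        if "245" in tags and not has_detail:
            titles = record.get_fields("245")[0].get_subfields("a")
            issns = [sf for f in record.get_fields("022") for sf in f.get_subfields("a")]
            brief.append((titles[0] if titles else "", issns[0] if issns else None))

print(f"{len(brief)} brief records identified as candidates for MRS overlay")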

Phase 2 dealt with the 11,000 composite bibliographic records already present in the catalog, each representing multiple formats. The purpose of Phase 2 was to provide a separate bibliographic record for each format of a serial title. While the single-record approach was still philosophically desirable, it created a conflict with the MRS workflow because of complications with the ERM system built into the ILS. For the loading processes to work properly, each online serial title needed its own bibliographic record with the MRS control number in the 001 field. This would allow the ERM system to locate and link e-journal holdings to the correct e-journal record while preserving the print record's link to WorldCat. An effective solution was to make a duplicate copy of the existing bibliographic record, remove irrelevant copy-specific fields, and separate the holdings by format (online versus non-online). Since the existing bibliographic record had been sufficient for the library's needs until this point, we assumed it would still be sufficient for representing the online title. Other important considerations were the authority work that had already been done on these records and the fact that their call numbers were already being used for collections analysis by other library departments.
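A rough illustration of that duplication step, assuming pymarc: the composite record is copied, a few copy-specific and format-specific fields are dropped (the tags below are illustrative choices, with the 533 reproduction note drawn from the complications described in the next paragraph), and the MRS control number is written to the 001 so the ERM system can match on it.

# A minimal sketch of splitting a composite record, assuming pymarc.
# The mrs_control_number value and the list of removed fields are
# assumptions for illustration, not the library's actual profile.
from copy import deepcopy
from pymarc import Field

def make_online_record(composite_record, mrs_control_number):
    online = deepcopy(composite_record)
    # Drop identifiers and format-specific notes that belong to the print
    # manifestation; the original record keeps its link to WorldCat.
    for tag in ("001", "003", "035", "533"):
        for field in online.get_fields(tag):
            online.remove_field(field)
    # The MRS control number in the 001 lets automated holdings and
    # coverage loads find this record.
    online.add_ordered_field(Field(tag="001", data=mrs_control_number))
    return online

Format-specific notes and access points in other fields would still need the manual review described below.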

Naturally, there were some complications in this process. Format-specific notes in the 533 fields needed to be removed. General notes with format-specific references in other 5XX fields needed to be edited. There were even format-specific access points that needed to be adjusted. All of these issues were addressed as they were discovered.

Phase 3 involved everything else that had not been dealt with in phases 1 and 2. The MRS control number had to be added to the bibliographic records for the remaining online serials. This was a labor-intensive process performed by the Serials team, including both full-time staff members and part-time students. Scripting solutions using Python were developed and employed, saving significant amounts of time and manual work. Progress was made collection-by-collection, with the guidance of Collection Management personnel through the Priorities Working Group.

Throughout all three phases of this implementation, quality control was an ongoing concern. With the guidelines established by the Crucial Metadata Standards Working Group, a serials cataloger was continuously “triaging” various problems along the spectrum of minor to major. Obviously, major problems needed to be addressed, while minor problems could be tolerated. Three examples of major problems were typos in the title of a serial, a missing 780 field for an earlier title, and titles that became online-only for which the MRS provided a closed-out print record.

Given the cooperative culture of cataloging, we took a few intentional steps designed to perpetuate our tradition of collaboration. The customer services personnel of the vendor were informed of any major problem found in MRS-supplied records. Other problems were discovered in records that had already been authenticated through the Cooperative Online Serials (CONSER) program. Because Virginia Tech is not a CONSER member, our initial step was to fix the record in the local catalog and make a note of it in a spreadsheet. We hope to correct these records in a way that will benefit other institutions as well.

During implementation of the MRS for our e-serial records we faced two major types of problems: philosophical problems concerning the handling and quality of the records, and labor-intensive work that grew out of merging two formerly disparate systems. Through compromise we were able to solve the philosophical problems arising from the apparently opposing goals of meeting our metadata quality assurance standards and keeping workflow management simple. Despite a tense debate, everyone involved was willing to work together to develop a set of implementation procedures that balanced efficiency and speed with good quality work.

Ironically, solving these philosophical problems was by far the most difficult part of the implementation process; in comparison, using a Python script to automate the labor-intensive work caused by problems arising from the merging of a new collection of records into our ILS was relatively easy.

The balance between preserving good quality metadata records in our catalog and a simple management workflow lay at the core of our implementation process. Although we found that many libraries automatically load e-serial records supplied by their vendors into their ILS, we preferred to pick and choose which records we would load into our catalog in order to avoid duplication of records and to keep the quality metadata produced by our catalogers over the last fifty years. Rather than automating the entire process of loading and removing records, we instead chose to automate the process of selecting which records to include in the catalog, and which records to reject.

The first step in our implementation process was identifying all the records in the initial MRS set that were already in our catalog—about 40,000 records. Rather than manually checking every record in our catalog against the initial MRS set, a simple Python script was created to run through all the MARC records in our local catalog and the MARC records from the service. When a match was found, based on ISSN-L and title (subfields a, n, and p, with all non-alphabetic characters removed and the text changed to lowercase), the script would output an almost exact copy of the locally cataloged record—the only difference being the inclusion of the unique identifier of the matching vendor-supplied record. These records were then loaded en masse into our catalog, overlaying the original records. Because the vendor-supplied records have a different identification scheme than our local records, we are able to easily distinguish between the two.
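The matching logic might look something like the sketch below, which assumes pymarc and two placeholder file names (mrs_records.mrc and catalog_export.mrc). The ISSN-L is read from 022 $l, the title key is built from 245 $a, $n, and $p as described above, and the vendor identifier replaces the 001 of the output record, in line with the load profile described earlier; the actual script and field choices may have differed.

# A minimal sketch of the record-matching step, assuming pymarc.
# File names are placeholders; the match key combines ISSN-L and a
# normalized title, as described in the text.
import re
from pymarc import MARCReader, MARCWriter, Field

def normalized_title(record):
    fields = record.get_fields("245")
    if not fields:
        return ""
    parts = fields[0].get_subfields("a", "n", "p")
    # Lowercase and strip every non-alphabetic character.
    return re.sub(r"[^a-z]", "", " ".join(parts).lower())

def issn_l(record):
    for field in record.get_fields("022"):
        values = field.get_subfields("l")  # 022 $l carries the ISSN-L
        if values:
            return values[0]
    return None

def match_key(record):
    return (issn_l(record), normalized_title(record))

# Index the vendor-supplied records by match key.
vendor_ids = {}
with open("mrs_records.mrc", "rb") as fh:
    for record in MARCReader(fh):
        key = match_key(record)
        controls = record.get_fields("001")
        if all(key) and controls:
            vendor_ids[key] = controls[0].data

# Walk the local export and write out matched records carrying the
# vendor identifier, ready to overlay the originals.
with open("catalog_export.mrc", "rb") as local, open("overlay.mrc", "wb") as out:
    writer = MARCWriter(out)
    for record in MARCReader(local):
        key = match_key(record)
        vendor_id = vendor_ids.get(key) if all(key) else None
        if vendor_id:
            for field in record.get_fields("001"):
                record.remove_field(field)
            record.add_ordered_field(Field(tag="001", data=vendor_id))
            writer.write(record)
    writer.close()

Requiring both the ISSN-L and the normalized title to agree keeps false matches down at the cost of missing some records, which is consistent with the manual matching work described next.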

The simple Python script was able to accomplish over 80% of the record matching in a matter of minutes (plus the time it took to write and modify the script itself), saving us weeks, if not months, of work. Because less than 20% of the records remained to be manually matched, we were able to dedicate a large number of our staff resources to this task, and we accomplished the remaining matching work in under two weeks.

From this implementation we learned that a combination of rudimentary programming (also known as scripting) and manual processing works well for many of the large-scale data manipulation projects in e-resource management. We have successfully applied this combined solution to other large-scale tasks, including adding ISSN-Ls to the e-serial records in our catalog and tracking titles that should be subscribed to in our third-party knowledge bases.
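For the ISSN-L task, one plausible approach, sketched below, is to look each record's ISSN up in an ISSN-to-ISSN-L mapping table and write the linking ISSN into 022 $l. The file names and the tab-delimited table layout are assumptions for illustration, not a description of our actual workflow.

# A minimal sketch of adding ISSN-Ls to e-serial records, assuming pymarc
# and a tab-delimited ISSN-to-ISSN-L lookup table (file names and layout
# are assumptions for this example).
import csv
from pymarc import MARCReader, MARCWriter

issn_to_l = {}
with open("issn_to_issnl.txt", encoding="utf-8") as fh:
    for row in csv.reader(fh, delimiter="\t"):
        if len(row) >= 2 and not row[0].lower().startswith("issn"):  # skip a header row
            issn_to_l[row[0]] = row[1]

with open("eserials.mrc", "rb") as fh, open("eserials_issnl.mrc", "wb") as out:
    writer = MARCWriter(out)
    for record in MARCReader(fh):
        for field in record.get_fields("022"):
            for issn in field.get_subfields("a"):
                linking = issn_to_l.get(issn)
                if linking and linking not in field.get_subfields("l"):
                    field.add_subfield("l", linking)  # 022 $l = linking ISSN (ISSN-L)
        writer.write(record)
    writer.close()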

The implementation of the MRS affected one other system outside of our ILS: our discovery layer. In 2009 we made the decision not to include records for open-access and free content, such as the Directory of Open Access Journals and other freely accessible collections, in our ILS. This decision was due to a combination of extremely dynamic title inclusion and the poor quality of the metadata found in OCLC. However, the combination of an MRS and the ability to maintain a second subset of bibliographic records within our discovery layer has created a sustainable vehicle for the inclusion of these records. Twice a month, a Python script runs over the MARC file of new and deleted titles provided by our vendor and creates files of open-access and freely available MARC records that can be automatically uploaded to our discovery layer. This has enhanced users' ability to find and retrieve titles that are not in our local online catalog but are accessible in our discovery platform, without compromising the integrity of the metadata within our ILS.
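The twice-monthly filtering step could look roughly like the sketch below. It assumes pymarc and that freely available collections can be recognized by a collection label somewhere in the vendor record, here an 856 $z public note; the file names, the field checked, and the collection labels are all assumptions, since the vendor's record profile determines where that information actually lives.

# A minimal sketch of filtering the vendor's new-titles file for
# open-access and freely available content, assuming pymarc. The field
# checked (856 $z) and the collection labels are illustrative assumptions.
from pymarc import MARCReader, MARCWriter

FREE_LABELS = ("directory of open access journals", "open access", "freely accessible")

def is_freely_available(record):
    for field in record.get_fields("856"):
        for note in field.get_subfields("z"):
            if any(label in note.lower() for label in FREE_LABELS):
                return True
    return False

with open("mrs_new_titles.mrc", "rb") as fh, open("discovery_upload.mrc", "wb") as out:
    writer = MARCWriter(out)
    for record in MARCReader(fh):
        if is_freely_available(record):
            writer.write(record)
    writer.close()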

There was a wide range of questions from the audience following the presentation. It was apparent that there is significant interest in this topic and most libraries are grappling with issues caused by the deluge.

Our conclusion is that it is possible to manage the large volume of electronic journals and to manipulate the large quantity of bibliographic data with speed and efficiency, while still maintaining the integrity of the cataloging metadata in our local system. By combining a vendor-supplied MRS with programming skills to create scripts that enhance record quality and workflow efficiency, we are able to manage the deluge of electronic resources in a sustainable fashion.
