General Session

Finding the Target: An Experiment in Benchmarking

Abstract

In an effort to maximize a dwindling materials budget, the University of North Texas Libraries have been investigating new methods of evaluating purchasing decisions. One such investigation was into the possibility of using cost data to establish internal benchmarks for acceptable price points. The hope was to create benchmarks for specific subjects and formats to aid staff in evaluating new resources, assessing existing resources, and negotiating pricing with vendors. This article will walk through the steps taken to create these benchmarks, the efforts taken to implement them, and conclusions regarding their feasibility as an ongoing tool for collection development.

The University of North Texas (UNT), which aspires to become a Tier 1 research institution, has an enrollment of 36,164 students. Library materials acquisition is handled by the Collection Development department, which consists of six librarians, three support staff, two graduate library assistants, and three student assistants. The Monograph Acquisitions unit handles most single-title purchases, while the Serials and Electronic Resources unit is responsible for the purchase and upkeep of continuing resources and most electronic resources. The UNT Libraries’ materials budget has decreased by 30% over the past 4 years.

With the reality of annual budget cuts, making informed decisions in the acquisition of new materials is a necessity. As a result, the Collection Development department has been forced to explore new and creative methods of evaluating purchasing decisions. While analysis of usage statistics has proved valuable in reviewing past purchases, it is not as useful for evaluating new purchases that lack such data. We considered other sorts of data that could assist in purchasing decisions, and decided one option was to explore price data. In particular, we decided to use price data to establish benchmarks for new materials acquisitions of both monographs and continuing resources.

INTERNAL DATA COLLECTION

Our first step was determining the extent of the data we needed to collect. Our initial desire was to collect several years of price data, beginning with the first year of budget cuts. We were also interested in exploring the possibility of establishing subject specific benchmarks, since pricing can vary widely from one subject area to another. We initially chose three subjects as test cases because we were unsure how long it would take to compile the data. We selected chemistry, history, and psychology because we felt they were representative of the broader subjects of science, humanities, and social science. We also wanted to collect format data in case we were able to discern trends for resources purchased in online versus print format types.

For our monographic purchases, we used the Create Lists function of Sierra, our integrated library system (ILS), to compile a list of all firm orders that had been placed between the beginning of fiscal year 2012 and the end of fiscal year 2014, giving us three years of data. We then exported the following data points from Sierra to Excel for each title:

  • ID number (ILS order number)

  • Format (online, print, etc.)

  • Price

While this covered the basic analysis points, it did not address our desire to focus on specific subject areas. In the past, we would have used our subject-specific fund code structure to assign subjects to each title based on which fund paid for the material. However, due to recent changes in our fund code structure, all fund codes from before FY2014 were deleted from the system. This meant there was no subject-specific data available from Sierra for any titles purchased prior to September 1, 2014. Thankfully, our Collection Assessment Librarian, Karen Harker, had recently devised a collection mapping Access database which assigns subject headings based on Library of Congress (LC) call numbers, so we exported the information again, this time adding the Call Number field to the export. One benefit of this method is that each call number can be mapped to multiple subject areas, providing a more inclusive, multidisciplinary breakdown of purchases. In other words, while our fund code structure would restrict subject assignment to the department which ordered the material, the collection map would recognize that a specific LC call number could be applicable to multiple disciplines.
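The one-to-many idea behind the collection map can be illustrated with a minimal Python sketch. The mapping table, the function name, and the sample call numbers below are hypothetical and greatly simplified: the actual collection map is an Access database that matches on full LC call numbers, whereas this sketch looks only at the leading class letters.

```python
import re
from typing import Dict, List

# Hypothetical, simplified mapping from LC class letters to subject areas.
# This is NOT the contents of the actual collection map.
ILLUSTRATIVE_MAP: Dict[str, List[str]] = {
    "QD": ["Chemistry"],
    "QP": ["Chemistry", "Psychology"],  # one class feeding two disciplines
    "BF": ["Psychology"],
    "E": ["History"],
}


def subjects_for(call_number: str, subject_map: Dict[str, List[str]]) -> List[str]:
    """Return every subject area mapped to the call number's class letters."""
    match = re.match(r"\s*([A-Z]{1,3})", call_number.upper())
    if match is None:
        return []  # no recognizable LC class letters
    return list(subject_map.get(match.group(1), []))


# Hypothetical call numbers, used only to show the lookup.
print(subjects_for("QP356 .K65 2013", ILLUSTRATIVE_MAP))
print(subjects_for("E185 .F8 2012", ILLUSTRATIVE_MAP))
```

Under this approach a single title can count toward more than one subject, which is the behavior a fund code assignment cannot provide.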

For our journal titles, we initially planned to pull historical price data from Sierra as well. However, upon further reflection we decided to focus only on active subscriptions. With this criterion we were able to utilize data we had already collected for our annual budget recommendations in progress (BRIP) spreadsheet. One benefit of using the BRIP was that it did include former fund codes for the journals from before our fund restructuring. The collection map was only used for monographs and not for journals because the map requires full LC call numbers to properly perform its matching, and our cataloging practices do not include assigning full LC call numbers to journals or databases. From the BRIP we pulled the following data:

  • ID Number (ILS Order Number)

  • Subject area (ILS fund code)

  • Format (online, print, etc.)

  • Type (journal, database, package, etc.)

  • Price

EXTERNAL DATA COLLECTION

Due to concerns that our internal data might not present a complete picture of the monographic and continuing resource universe, we decided to pull external data for comparison. To keep this comparison manageable, we only collected external data for our three preselected subject areas of chemistry, history, and psychology.

For monographic data, we utilized our subscription to Bowker’s Books in Print. Using the Advanced Search feature, we performed Sears Subject searches based on our three subject areas. We also limited the Audience field to Scholarly & Professional, College Audience, and Adult Education, and a Date Range of 2012–2014. When we attempted to export the data, we discovered that Bowker’s has a limit of one hundred records which can be exported at a time, requiring us to do multiple exports to retrieve all the data we needed.

Once we had imported the data into Excel, we discovered that the price data was formatted in a way that made it difficult for us to utilize initially. A large number of items from major publishers had no pricing information listed, only instructions to contact the publisher for price quotes, making those titles unusable for our purposes. In addition, even titles which did have pricing information did not format that information in a strictly numeric value. Instead, a typical price field contained text along the lines of the following: $ 83.00(USD) Retail Price(John Wiley & Sons, Incorporated)

In order to convert the pricing data into numeric values, we pasted the information into a text document, used find and replace actions to transform the common text (USD) that typically followed the desired numeric value into commas, and then saved as a comma separated values (CSV) file, which we then opened in Excel. The newly delimited pricing information contained all of the costs as numeric values in one column, which we could cut and paste into our original file.
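The same extraction could also be scripted rather than done by find and replace. The following is a minimal Python sketch, not the method we used; the function name and pattern are illustrative, and it assumes every usable price field begins with a dollar amount followed by the text (USD), as in the example above.

```python
import re
from typing import Optional

# Matches a dollar amount followed by "(USD)", e.g. "$ 83.00(USD)".
PRICE_PATTERN = re.compile(r"\$\s*([\d,]+(?:\.\d+)?)\s*\(USD\)")


def parse_usd_price(field: str) -> Optional[float]:
    """Return the first USD amount in a price field, or None if absent."""
    match = PRICE_PATTERN.search(field)
    if match is None:
        return None  # e.g. entries that only say to contact the publisher
    return float(match.group(1).replace(",", ""))


sample = "$ 83.00(USD) Retail Price(John Wiley & Sons, Incorporated)"
print(parse_usd_price(sample))
```

Fields without a numeric price return None, so those titles can be filtered out before any analysis.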

We performed similar actions for our journal titles, this time pulling the data from Ulrichsweb. We used the Advanced Search option to find our preselected subjects under Subject (Keyword), limiting the subscription type to Active and the content type to Academic/Scholarly. The journal process included the same pitfalls as the monographic process, only to a greater extent. In the case of the journals, not only did we have download limits and a lack of value-only price fields, but the vast majority of the fields included multiple prices, often in multiple currencies. Here is an example of data pulled from a single title’s Price field: EUR 78.00 subscription per year domestic individuals 2014 effective | EUR 94.00 subscription per year foreign individuals 2014 effective | EUR 102.00 subscription per year domestic institutions 2014 effective | EUR 114.00 subscription per year foreign institutions 2014 effective.

The inclusion of multiple price points in a field, each with its own descriptive criteria and occasionally different currencies, made this a much more difficult set of data to normalize. We once again used a text file to transform certain phrases such as “subscription per year” into text delimiters for reopening back in Excel.
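As an illustration of what this normalization involves, a multi-price field can be split into its separate options with a short Python sketch. This is not the process we followed; the class and function names are illustrative, and the sketch assumes each option is separated by a vertical bar and begins with a three-letter currency code followed by an amount, as in the example above.

```python
import re
from typing import List, NamedTuple


class PriceOption(NamedTuple):
    currency: str
    amount: float
    description: str


# A three-letter currency code, an amount, then the descriptive text.
OPTION_PATTERN = re.compile(r"^([A-Z]{3})\s+([\d,]+(?:\.\d+)?)\s+(.*)$")

SAMPLE_FIELD = (
    "EUR 78.00 subscription per year domestic individuals 2014 effective | "
    "EUR 94.00 subscription per year foreign individuals 2014 effective | "
    "EUR 102.00 subscription per year domestic institutions 2014 effective | "
    "EUR 114.00 subscription per year foreign institutions 2014 effective"
)


def parse_price_field(field: str) -> List[PriceOption]:
    """Split a price field on '|' and parse each option that fits the pattern."""
    options = []
    for chunk in field.split("|"):
        match = OPTION_PATTERN.match(chunk.strip().rstrip("."))
        if match:
            currency, amount, description = match.groups()
            options.append(
                PriceOption(currency, float(amount.replace(",", "")), description)
            )
    return options


for option in parse_price_field(SAMPLE_FIELD):
    print(option.currency, option.amount, option.description)
```

Options that do not fit the assumed pattern are skipped silently, so a real data set would still need a manual check of any title that yields no parsed options.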

We then began the arduous process of determining which of the multiple options associated with each title would be closest to the option our institution would purchase. Our preference is for Online Only via IP recognition when possible, so we selected whichever of the options seemed closest to that. If a value in U.S. dollars was available, that is the option we selected. If there was no U.S. dollar option, we determined the current exchange rates for the indicated currency to U.S. dollars, and used an Excel formula to perform the conversion for us. In cases with multiple foreign currencies listed, we selected the currency that had an exchange rate closest to a one to one ratio. Once we had normalized data for each title, we then pasted the values back into the original spreadsheet and began the data analysis.
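The currency rules described above can be expressed as a minimal Python sketch. It covers only the currency selection and conversion; choosing the option closest to online-only access was a manual judgment and is not modeled here. The function name is illustrative, and the exchange rates are placeholder values for demonstration, not the rates we used.

```python
from typing import Dict, List, Tuple

# Placeholder rates (U.S. dollars per one unit of currency); NOT real market data.
ASSUMED_USD_RATES: Dict[str, float] = {"EUR": 1.25, "GBP": 1.60, "JPY": 0.009}


def price_in_usd(options: List[Tuple[str, float]], rates: Dict[str, float]) -> float:
    """Pick one (currency, amount) option and express it in U.S. dollars."""
    # Rule 1: take a U.S. dollar price when one is listed.
    for currency, amount in options:
        if currency == "USD":
            return amount
    # Rule 2: otherwise use the currency whose rate is closest to one to one.
    convertible = [(c, a) for c, a in options if c in rates]
    if not convertible:
        raise ValueError("no option in a currency with a known exchange rate")
    currency, amount = min(convertible, key=lambda opt: abs(rates[opt[0]] - 1.0))
    return round(amount * rates[currency], 2)


print(price_in_usd([("EUR", 102.0), ("USD", 130.0)], ASSUMED_USD_RATES))
print(price_in_usd([("JPY", 10000.0), ("EUR", 100.0)], ASSUMED_USD_RATES))
```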

DATA ANALYSIS

Once we had the data in hand, we began experimenting with different ways of establishing potential benchmarks. We first looked at monographs, since it was a much larger data set of 14,461 individual titles. Prices ranged from $1.12 to $5,000. We attempted to use the average of all prices to set a benchmark, but quickly determined that the value generated by this method ($64.04) was so low that a large number of our purchases (5,045) would be above that benchmark. In practice, this benchmark would have required our acquisitions staff to send approximately 32 items for further review each week, a volume that would have been burdensome for our small acquisitions staff.
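The weekly figure follows from spreading the flagged titles over the three fiscal years of data. A minimal Python sketch of that arithmetic, assuming 52 working weeks per year (the function name is illustrative):

```python
def weekly_review_load(flagged_items: int, years: int, weeks_per_year: int = 52) -> float:
    """Average number of flagged items per week over the sample period."""
    return flagged_items / (years * weeks_per_year)


# 5,045 titles above the $64.04 average, spread over three fiscal years.
print(round(weekly_review_load(5045, 3), 1))
```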

We eventually decided to use percentile ranking as the basis of our benchmark decisions. Percentiles measure a value’s place in relationship to an entire range of values; the higher a value’s percentile rank, the more values it is higher than or equal to. For example, if a value of $319 is ranked at the 75th percentile, that means that 75% of the values in the range have a value that is less than or equal to that amount. Our department has utilized percentile ranking to help us make decisions on budget cuts for the last two years, so we were familiar with the general principle. To generate the percentile information, we used Excel’s PERCENTRANK.INC formula.
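For readers working outside Excel, the calculation can be approximated in Python. This is a sketch under stated assumptions rather than a reimplementation of the Excel function: it handles only values that actually appear in the data, computes the rank as the share of the other values that fall below the given value, and does not reproduce Excel's interpolation for values between data points or its default rounding. The function name is illustrative.

```python
from typing import Sequence


def percent_rank_inc(values: Sequence[float], x: float) -> float:
    """Approximate percent rank (0 to 1 inclusive) of x within values."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    if x not in values:
        raise ValueError("this sketch only ranks values present in the data")
    below = sum(1 for v in values if v < x)  # values strictly smaller than x
    return below / (len(values) - 1)


prices = [10.0, 20.0, 30.0, 40.0, 50.0]  # illustrative prices, not our data
print(percent_rank_inc(prices, 30.0))
```

In this scheme the lowest price ranks at 0 and the highest at 1, so a title at the 75th percentile has a rank of 0.75.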

However, we once again determined that most of the percentile ranks that would be typical breakpoints, such as the 50th or 75th percentile, still left a large number of titles for review. Even going as far as the 90th percentile left a large number of titles. After much experimenting, we eventually settled on the idea of summing the values in the top 10th percentile and dividing that sum by the number of values, using the resulting average as our benchmark. Utilizing this method left a reasonable number of titles which would need to be reviewed, but not so many that it would be an overwhelming burden on the acquisitions staff. We also decided to flip the paradigm and examine the bottom 10th percentile as well, with the thought that items under a certain cost might be candidates for standing orders. This would save time for the ordering staff in the Monograph Acquisitions unit. Once these methods had been decided on, we applied them to our continuing resource analysis as well.
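A minimal Python sketch of this benchmark calculation follows. It assumes that "the top 10th percentile" means the values whose percent rank is 0.90 or higher, and "the bottom 10th percentile" those whose rank is 0.10 or lower, with the rank computed as the share of other values falling below each price. The function name and the sample prices are illustrative.

```python
from bisect import bisect_left
from statistics import mean
from typing import List, Sequence


def decile_benchmark(prices: Sequence[float], top: bool = True) -> float:
    """Average of the prices in the top (or bottom) tenth by percent rank."""
    ordered = sorted(prices)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two prices")
    selected: List[float] = []
    for price in ordered:
        # Share of the other values that are strictly below this price.
        rank = bisect_left(ordered, price) / (n - 1)
        if (top and rank >= 0.9) or (not top and rank <= 0.1):
            selected.append(price)
    if not selected:
        raise ValueError("no prices fall in the requested decile")
    return mean(selected)


illustrative_prices = list(range(1, 12))  # 1 through 11, not our data
print(decile_benchmark(illustrative_prices))             # top decile average
print(decile_benchmark(illustrative_prices, top=False))  # bottom decile average
```

Because the benchmark is an average of the most expensive titles rather than the 90th percentile cutoff itself, it sits above that cutoff and flags fewer titles for review.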

ANALYSIS SUMMARY

For our overall monographic analysis, the sum of the top 10th percentile method led to a general benchmark of $210.49. Out of our sample set, we had 256 items priced higher than that, which averages out to almost eight items a month that acquisitions staff would have needed to submit for further review. We then applied this method to our three preselected subjects, and determined how many items on average would require further review from each set using not only the general benchmark, but also benchmarks generated from the smaller subject-specific subset of internal data and from the larger Bowker data set. As can be seen in Table 1, many more Chemistry books would have been sent for review using the general benchmark than would have been using the subject-specific benchmarks. For History and Psychology it is the internal benchmark that would flag the most material for review, followed by the general benchmark, with the external benchmark again registering the smallest number of reviewable items.

Table 1 Subject benchmark comparisons—monographs

For our continuing resource analysis, we ran into the problem of having multiple “Big Deals” and packages, which caused the pricing to be skewed. Using the type data from the BRIP, we winnowed our set down to only individually ordered journals, excluding “Big Deals,” packages, databases, and so on. This provided us with 646 titles ranging in cost from $11.26 to $11,168.66. For the journals, we calculated a general benchmark of $1,888.62, and determined that eighteen of our journals cost more than this price point. Because we were only looking at the current year’s active subscriptions for a very narrow slice of our continuing resources, it was difficult to determine how useful this benchmark would be for general journal purchases. As with the monographs, we also looked at the three subject areas and compared them to three different benchmarks (see Table 2). With this smaller dataset, the number of titles that would need review varied much less by benchmark type.

Table 2 Subject benchmark comparisons—journals
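The winnowing and flagging steps for continuing resources can be sketched in Python as follows. The record layout, field names, order numbers, and prices are hypothetical; only the $1,888.62 benchmark comes from our analysis.

```python
from typing import Dict, List

GENERAL_JOURNAL_BENCHMARK = 1888.62  # general journal benchmark from our analysis

# Hypothetical records standing in for rows of the BRIP spreadsheet.
records: List[Dict] = [
    {"id": "o1", "type": "journal", "price": 250.00},
    {"id": "o2", "type": "package", "price": 45000.00},
    {"id": "o3", "type": "journal", "price": 2400.00},
    {"id": "o4", "type": "database", "price": 12000.00},
]


def individual_journals(rows: List[Dict]) -> List[Dict]:
    """Keep only individually ordered journals, dropping packages and databases."""
    return [row for row in rows if row["type"] == "journal"]


def flagged_for_review(rows: List[Dict], benchmark: float) -> List[str]:
    """Return the IDs of rows priced above the benchmark."""
    return [row["id"] for row in rows if row["price"] > benchmark]


journals = individual_journals(records)
print(flagged_for_review(journals, GENERAL_JOURNAL_BENCHMARK))
```

Filtering by type before computing or applying a benchmark keeps a handful of very expensive packages from skewing the result.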

CONCLUSIONS

Although our analysis confirmed that our current acquisitions benchmark of $200 for firm orders requiring approval is in line with our internal data, most of our efforts showed us the futility of using these methods on an ongoing basis. Our internal data is simply not sufficient to provide a large enough data set, and is already partially skewed by our own decisions.

Obtaining additional data in order to address the insufficiency is problematic. First, gaining a global perspective on pricing is difficult because the available price data is incomplete. In many cases prices are not readily available, and quotes must instead be obtained from the publisher. Second, the number of subjects to analyze in order to have sufficient data is difficult to determine. Finally, compiling the data for three subjects was extremely time-consuming, and the task of obtaining the same data for all subjects covered would be overwhelming. Based on analysis of the sample and the amount of effort needed to compile data, we determined that pursuing this as an ongoing project was not feasible.

Additional information

Notes on contributors

Todd Enoch

Todd Enoch is Head of Serials and Electronic Resources, University of North Texas, Denton, Texas.

Mark Henley

Mark Henley is Contracts Librarian, University of North Texas, Denton, Texas.