Editorial

Impact Factors and research quality

Pages 13-15 | Published online: 09 Jul 2009

The majority of academic departments in the UK are currently working feverishly to collate the data required for the Research Assessment Exercise (RAE), which takes place in 2008. This exercise occurs every few years (the last one was in 2001) and requires expert ‘panels’ to rate the ‘quality’ of each department or group of researchers on three aspects of academic life: research output, research environment, and esteem. The importance of this exercise is that the results, expressed on an ascending rating scale (1, 2, 3b, 3a, 4, 5, 5*), determine the income paid to the research entity by the higher education funding councils (HEFC) for England, Scotland, Northern Ireland, and Wales. The funding councils provide fundamentally important financial support for universities and paid out more than £1 billion for academic research following the 2001 RAE, in which over 48 000 researchers were evaluated in 2598 research group submissions from 173 institutions. Departments with ratings of less than 5 are in serious danger of dramatic reductions in funding. Thus most universities and departments work hard to ensure that the highest grade is awarded. Strategies to do this involve ensuring that the most research-active staff, especially new appointees who have been wooed with the promise of facilities and remuneration, are in post on the census date, and that measures of quality have been carefully reviewed and thoroughly adhered to.

Depending on the panel that carries out the appraisal, the three assessment characteristics (output, environment, and esteem) are given different degrees of importance. The Epidemiology and Public Health panel, to which my research group will be submitted, rates the research output as 75% of the total grade, the environment as 20%, and esteem as 5%. Some of my colleagues will be assessed by an Engineering panel, which will rate the output as 50% and the environment and esteem as 25% each. There is not always a perfect match between the panel title and the research focus of any particular group. So, whilst the research group that I lead is entitled ‘Human Development and Ageing’, no equivalent assessment panel exists and we are submitting to the nearest match. This imperfect match is true of many research groups and is viewed as a weakness in the system that may be deleterious to some, particularly interdisciplinary groups.
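Purely as an illustration of how such weightings might combine the three components into an overall grade, one could imagine a simple weighted average. The aggregation rule, the function name, and the example component scores below are assumptions for illustration only, not the panels' published method.

```python
# Illustrative sketch only: combines three RAE component scores using the
# panel weightings quoted above. A simple weighted average is assumed here;
# the actual panel scoring rules are not specified in this form.

def weighted_rae_score(output, environment, esteem, weights=(0.75, 0.20, 0.05)):
    """Combine component scores (on a common scale) using panel weights."""
    w_out, w_env, w_est = weights
    return output * w_out + environment * w_env + esteem * w_est

# Epidemiology and Public Health weighting (75/20/5):
print(weighted_rae_score(4.0, 3.5, 3.0))                        # 3.85
# Engineering weighting (50/25/25):
print(weighted_rae_score(4.0, 3.5, 3.0, (0.50, 0.25, 0.25)))    # 3.625
```

The point of the sketch is simply that, under the Epidemiology and Public Health weighting, the output score dominates the result far more than under the Engineering weighting.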

Measures of the quality of the research environment are not too difficult to quantify. They include the number of graduate students supervised by each academic and how many finish their doctoral studies within the recommended 3 years. Within this environment, research income is equally easy to monitor, although it is the amount spent within the assessment period rather than the amount raised that is important. (Obtaining the £10 million mega-grant just prior to the deadline does not impress the assessment panel if you only have time to spend £10 000.) Research structure, staffing policy, and research strategy are less easy to quantify, but most of us can recognize good as opposed to bad practices in these areas and can put together a reasonable case for a quality rating.

Esteem is a little harder to quantify, especially when some panels ask for only four indicators of esteem whilst others allow one to boast liberally. Measures of scientific leadership such as Editorial Board membership, an active role in the executive committees of national and international scholarly societies, guest and eponymous lectures, and keynote addresses are all thought estimable.

So far, so good. Clearly one can describe one's research environment and the esteem within which one is held by one's peers, but now comes the tricky bit—what is ‘quality’ in terms of research output? This, let us not forget, is the most important aspect of one's research to be assessed. No matter how good one's environment and esteem, poor output will reduce one's rating of ‘excellent’ down to ‘must try harder’.

It is not surprising, given the financial importance of this exercise to UK university departments and research organizations, that almost all turned to bibliometrics as the objective measure of the quality of output. Those seemingly ubiquitous numbers that provide ‘impact’, ‘half life’, and ‘immediacy’ indices appear to dominate discussions within editorial boards and publishing houses intent upon presiding over the most prestigious journals that most forcefully influence international research. Contrary to popular belief, the science of ‘bibliometrics’ is not new (even though my MS Word UK dictionary does not recognize it!) and did not start with Eugene Garfield and his Institute of Scientific Information (ISI) in Philadelphia in the 1960s. Bibliometrics was conceived in the 1920s with seminal work by Hulme (1923), Lotka (1926) and Gross and Gross (1927) and then lay dormant until ‘Science since Babylon’ and ‘Little Science, Big Science’ by Derek John de Solla Price in 1961 and 1963, respectively. He advocated the use of the Science Citation Index that had originally been developed by Garfield at the ISI in 1955. Using the database at the ISI, the Impact Factor (IF) became the tool for the quantitative analysis of supposed quality in scientific publication. The advent of computers and access to large bibliographic databases (Web of Science, Medline, etc.) enhanced the use of bibliometrics to gauge the impact of one's own work and that of one's peers, collaborators, and competitors in the game of science. Bibliometrics is now an accepted and sophisticated science with its own research practitioners, or bibliometricians. It also has its target consumers in the form of research scientists, research managers, and those involved in research policy.

The IF is calculated from the number of citations in 1 year of the articles published in the journal in the previous 2 years. The number of citations is divided by the total number of published articles to arrive at the IF. For the Annals of Human Biology that statistic last year was 111/112, resulting in an IF of 0.99: so close to the imaginary hurdle of 1.00 as to make little or no practical difference, and yet one more citation would have meant a great deal to publishers, editors, and authors. It is little wonder that the IF receives much criticism because of its abuse by management and policy makers. Richard Monastersky (2005), for instance, describes it as ‘an unyielding yardstick for hiring, tenure, and grants’. Of the 5968 science journals with IFs assessed and listed by the ISI in 2004, over half had an IF of less than 1.0. Its use to assess any one scientist is shown to be a fallacy by the fact that it does not measure the impact of any single paper, or any single author, but of the average paper in a journal. For example, a quarter of the articles in Nature in 2004 drew 89% of the citations to that journal, so the vast majority of the articles received far fewer than the average of 32 citations reflected in the 2005 IF of 37.5. It seems that for the last 15 years we have been obsessed by IFs and the need to submit to journals with the highest IF within one's field of interest. However much the criticism of IFs continues, most scientists have little or no doubt that IFs will continue to be used by management and policy makers.
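The calculation itself is simple enough to sketch. The snippet below merely restates the formula described above, using the Annals of Human Biology figures as the worked example; the function and variable names are illustrative, not part of any ISI specification.

```python
# Minimal sketch of the Impact Factor formula described above:
# IF for year Y = citations received in Y by items published in Y-1 and Y-2,
# divided by the number of citable items published in Y-1 and Y-2.

def impact_factor(citations_to_prev_two_years, citable_items_prev_two_years):
    """Return the journal Impact Factor for a single year."""
    return citations_to_prev_two_years / citable_items_prev_two_years

# Annals of Human Biology example: 111 citations to 112 articles
print(round(impact_factor(111, 112), 2))  # 0.99
```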

How refreshing, therefore, to find that the review panels for the RAE in 2008 will not be using bibliometrics to assess the quality of the research output. Instead they will evaluate ‘… criteria including the originality, scientific rigour, contribution to knowledge and conceptual framework of the field, as well as the challenge and logistical difficulty posed by the work.’ Different forms of original research will be entertained, including ‘… systematic reviews, meta-analyses, the analysis and interpretation of secondary data and sample collections, qualitative research …’ etc., etc. (RAE 2008, 2006). It seems that our concern over journal impact and the desire of many institutions to insist that academics only submit to ‘high IF’ journals has been misplaced—or has it?

How are originality, rigour, etc. to be assessed by the panels? By ‘peer review based on professional judgement’, according to the documentation (RAE 2008, 2006). Lead assessors will be identified within each panel, and two sub-panel members will independently evaluate each output and submit their ‘quality level scores’ (QLS) to the lead member. We do not know how those scores will be arrived at, and call me cynical if you will, but I will be very surprised if the QLS does not correlate highly with journal IFs.

References

  • Gross PLK, Gross EM. College libraries and chemical education. Science 1927; 66: 385–389
  • Hulme EW. Statistical bibliography in relation to the growth of modern civilization. Grafton, London 1923
  • Lotka AJ. The frequency distribution of scientific productivity. J Washington Acad Sci 1926; 16: 317–323
  • Monastersky R. The number that's devouring science. The Chronicle of Higher Education 2005; 52: A12
  • Price DJ de Solla. Science since Babylon. Yale University Press, New Haven, CT 1961
  • Price DJ de Solla. Little science, big science. Columbia University Press, New York 1963
  • RAE 2008, 2006. Panel criteria and working methods. January, 2006
