Original Articles

Why the Provenance of Data Matters: Assessing Fitness for Purpose for Environmental Data

Pages 23-36 | Published online: 23 Jan 2013

Abstract

While fitness for purpose is the principle universally accepted among scientists as the correct approach to obtaining data of appropriate quality, many scientists or end-users of data are not in a position to specify exactly what quality of data is required for a specific analysis. Agencies that collect environmental observations provide data "as is," offering no guarantee or warranty concerning the accuracy of the information contained in the data; in particular, no warranty, either expressed or implied, is made regarding the condition of the product or its fitness for any particular purpose. While the increasing implementation of ISO 9002 will benefit users in the future, the reality is that many existing databases contain data that were not gathered with present standards and protocols, or with the same methods over the period of record. Usually, long-term records will contain observations that have been made with several different observation techniques, sometimes at several locations, and frequently with a progression of quality assurance and workup techniques, and these changes may not be well documented. While it is important that hydrometric and climate services focus on capturing data that are fit for their intended purpose, the burden of assessing actual suitability for use lies entirely with the user. Some general principles for assessing fitness for purpose are proposed.


Introduction

Hydrology and climatology are based on providing accurate and reliable information about water and climate, since water and climate may be both an asset and a threat to individuals, communities, industries, economies and societies. Observation networks generally provide both operational and strategic information (Marsh, Citation2002). Providing the essential data to meet these needs is constrained by the reality of monitoring conditions that range from moderately challenging to absolutely perverse, physically and socially, particularly in Canada. The needs for these data and the constraints result in many innovative and creative solutions to the problem of how data can be collected. Frequently, long-term hydroclimatological records are composed of segments of operational data. Unfortunately, this also results in hydrologic and climatic records that contain inhomogeneities and discontinuities.

In the past decade, environmental data have increasingly been accessed from internet sources. While the providers of such data have invested heavily in making data available on the internet, they have largely eliminated access to professional advice and knowledge regarding the suitability of data and guidance regarding their use, and often have failed to provide access to metadata. During the same period, there has been an increased zeal in the broader community, such as farm weather networks and community watershed groups, towards data sharing. This broadened access to data has certainly increased the frequency with which data are used; however, the user communities are not acting responsibly if they fail to assess the fitness for purpose of the data they choose to use. Further, the users are not well served when they do not have access to proper guidance in that respect, or to adequate metadata. In addition, many users do not understand metadata, or how to use it effectively.

Wang and Strong (Citation1996) argue that poor quality data can have social and economic impacts. Many data providers focus on precision and accuracy; however, users need to have much broader perspectives. Data must: (1) have intrinsic quality; (2) be considered in the context of the task; (3) be representative; and finally, (4) be accessible. High-quality data should be intrinsically good, contextually appropriate, clearly representative and traceable. Accessibility should never be given prominence over the first three properties.

While most users are aware that they need data series that are accurate, relevant, interpretable, and available, many are insufficiently aware of the consequences of changes in methods, locations, and exposures even to ask the correct questions should they have access to metadata. The users have two basic needs: data that are of sufficient precision and accuracy, and data that are consistent over time. Clark and Whitfield (Citation1992) recommend the term bias be used instead of accuracy to reflect systematic errors, and precision be used to reflect random errors. The agencies that provide data could provide a value-added component to the data they share by converting their professional knowledge to a customer service. These agencies should not simply classify their data as "fit," as they have insufficient information about the intended purpose of the user; however, they could provide information and perhaps some guidance that would support the user's assessment of fitness. This might include access to records of methods with precision and accuracy, changes in instrument locations, etc., and tools that support user assessments.

These are not new issues; Hudson et al. (Citation1999) reported on the 1995 IAHS Workshop on Quality Assurance in Hydrologic Measurements, linking hydrometric programs with ISO 9002 and developments in rating curve theory. Hudson et al. (Citation1999) argue that quality assurance approaches can and should be applied to observation, data processing, archiving and dissemination of environmental data. While agencies have certainly moved to address the quality assurance aspects through programs such as ISO-9000, there remains an almost universal lack of application of the quality assurance (QA) concept of fitness for purpose in the design and operation of hydrometeorological networks and services.

At present we are faced with two realities. First, we have a need to be able to detect and assess anticipated changes in climate, hydrology, land cover, and patterns of water utilization, often in combination. These changes can be quite subtle, and detection methods are becoming increasingly sensitive. Second, there is an imperative to maximize the utility of monitoring data, for these purposes, from operational networks which have evolved rapidly in the past decades. Maximizing the utility of these data means not simply increasing the number of times the data are used, but rather ensuring that the data are used wisely. The reality of network and operational evolution is that operational networks actually serve two different purposes: short-term real-time and long-term retrospective. While the needs for observations appear at first glance to be quite similar, they are fundamentally different; short-term real-time observations can generally greatly benefit from methodological and operational improvements, while these improvements can be detrimental to the quality of long-term records. The detrimental aspects can only be addressed, and perhaps overcome, by the understanding and use of metadata, a broader understanding of the role of standards, and a higher level of caution when undertaking analysis. The objective of this paper is to review the role of fitness for purpose for environmental data, primarily hydrologic and climatic data, to provide guidance regarding how they can be assessed, and to reinforce the need to create better synergy between information users and those responsible for data acquisition and provision.

Assessing Fitness for Purpose

While their focus was on geospatial data, Vasseur et al. (Citation2003) outlined an ontological approach to fitness for purpose. Their model is more useful if placed in a conceptual framework that can be applied to most types of environmental data, and is described by the following steps:

1.

Frame a research question and a working hypothesis in order to create a conceptual model of the problem. This model includes the components required by the user, the relations between the components, and a definition of the quality of the data needed (Clark and Whitfield, Citation1993). More precisely, users have to consider two things: the suitability of the record for the research question, and its fitness in terms of data quality.

2.

Search for and select sources of potentially suitable data listed in catalogues and databases, hopefully documented by metadata which give their specifications and/or their internal quality characteristics at the dataset level, which can be used to assess the suitability and fitness of the dataset.

3.

Analyze the extent to which these datasets match or miss the suitability and fitness expected by the user. Some guidelines for this assessment are given later in the paper; however, the key is to match the needs of the assessment method with the attributes of the dataset. For matched elements: carry on to a direct translation [step 5]; for missing elements, seek an alternate solution [step 4].

4.

Reformulate the conceptual model of the problem, i.e., relax either the suitability or the quality level desired by the user, or make explicit knowledge and complementary hypotheses with the aim of replacing missing portions. This reformulation introduces a chain of inference that affects the quality and may compromise the testability of the hypothesis.

5.

Translate the corresponding (matched) part into a query of the database.

6.

Query the actual databases with the query provided by step 5. The data obtained in this manner are then evaluated with respect to whether the data are suitable for testing the hypothesis and fit with respect to data quality. If the dataset is not suitable, there is the possibility to return to step 4, or step 2 or 3, to broaden a search; if the dataset is suitable, then the analysis may proceed.

7.

The user makes the final decision to accept or reject the result, considering carefully the qualitative and quantitative measures of quality.
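The steps above can be sketched in code. The following is a minimal illustration, not an implementation of Vasseur et al.'s ontological model: the catalogue structure, the field names (`years`, `missing_frac`), and the specific criteria are hypothetical, standing in for whatever suitability (e.g., record length) and fitness (e.g., completeness) requirements a real study would define in step 1.

```python
# A minimal sketch of the select/assess/relax loop (steps 2-4 and 6);
# metadata fields and thresholds are illustrative assumptions.

def select_datasets(catalogue, min_years, max_missing_frac):
    """Steps 2-3: screen candidate datasets against the user's
    suitability (record length) and fitness (completeness) criteria."""
    accepted, rejected = [], []
    for meta in catalogue:
        suitable = meta["years"] >= min_years            # suitability check
        fit = meta["missing_frac"] <= max_missing_frac   # fitness check
        (accepted if suitable and fit else rejected).append(meta["id"])
    return accepted, rejected

def relax_and_retry(catalogue, min_years, max_missing_frac, relax_factor=0.5):
    """Step 4: if nothing matches, relax the suitability requirement
    (here, by shortening the required record length) and search again."""
    accepted, _ = select_datasets(catalogue, min_years, max_missing_frac)
    if not accepted:
        accepted, _ = select_datasets(
            catalogue, min_years * relax_factor, max_missing_frac)
    return accepted

catalogue = [
    {"id": "08MF005", "years": 95, "missing_frac": 0.02},
    {"id": "08NM116", "years": 12, "missing_frac": 0.30},
]
print(relax_and_retry(catalogue, min_years=30, max_missing_frac=0.05))
```

Note that the relaxation in step 4 is itself a quality decision: each relaxed criterion should be recorded, since it weakens the inference chain described above.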

This model is useful as it focuses the user on the dual issues: the problem to be resolved, and the quality of data that will be required. For example, the suitability might be related to having a long streamflow time series, while fitness for purpose of river flow data primarily reflects their accuracy but is strongly affected by other factors. These include: the proportion of missing or anomalous data (Marsh, Citation2002), particularly since these types of data tend to cluster during periods of hydrologic extremes; heteroscedasticity (Whitfield and Hendrata, Citation2006); and the need for procedures to estimate missing flows (Marsh, Citation2002).
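Two of the completeness indicators mentioned here, the overall proportion of missing data and the tendency of gaps to cluster, can be computed directly. The sketch below is an illustrative assumption about how a user might screen a daily record (with `None` marking missing days); it is not a method from the cited papers.

```python
def completeness(series):
    """Return the fraction of missing values and the longest consecutive
    gap in a record where None marks a missing observation. A long gap
    with a modest overall missing fraction signals clustered gaps, which
    often coincide with hydrologic extremes."""
    n_missing = sum(v is None for v in series)
    longest = run = 0
    for v in series:
        run = run + 1 if v is None else 0
        longest = max(longest, run)
    return n_missing / len(series), longest
```

A record with 5% missing data spread evenly may be fit for computing annual means, while the same 5% concentrated in a single flood season may not be fit for flood-frequency work.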

Quality Assurance

All environmental monitoring programs require investment in formalized quality assurance. Clark and Whitfield (Citation1993) offered a practical model that integrates quality assurance into all aspects of environmental monitoring. Their model ensures that all aspects of the monitoring program are compatible with the project goals and the agencies involved, and focuses efforts on maintaining a high degree of credibility. MacDonald et al. (Citation2009) report on designing monitoring programs that provide the information needed to make informed decisions on the management of aquatic ecosystems. Their process includes eight steps that, together, enable water managers to define the goals of the data-collection program, design a monitoring program that directly supports these goals, and interpret the resultant data in a manner that facilitates effective management of the human activities that affect water resources. While this process is broadly applicable to designing a monitoring program, it was explicitly developed to support assessment of the status and trends in water-quality conditions and management decisions based on such information; that is, collecting data that are fit for purpose. Environmental monitoring, particularly water monitoring, is expensive and time consuming, and a systematic approach to designing the monitoring program and how the data so obtained will be used to address the core problem contributes to making the program both effective and efficient (Clark et al., Citation2010). From a fitness for purpose perspective, systematic planning results in clear processes that identify the problems to be resolved and the quality of the data required to address such problems.

Despite such efforts, assessing the fitness for purpose of most environmental data remains a challenge. In hydrology, Dymond and Christian (Citation1982) performed an error analysis on hydrologic rating curves that showed how three types of errors influence the uncertainty of a single discharge estimate. These are the rating curve error, the error of measurement of water level, and an error caused by ignoring all physical parameters that affect discharge. Pelletier (Citation1988) reviewed the existing literature on the uncertainties in the determination of a single discharge measurement, which includes those associated with the cross sectional area and the mean velocity in space and in time, and with the current meter. Pelletier identified a need for additional research into current meter performance in small streams, at low velocities, and under ice conditions. Stations with insensitive control are shown to have a 10% flow error for a 5 mm stage change (Marsh, Citation2002).
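The sensitivity of discharge to stage error at an insensitive control can be illustrated with a standard power-law rating curve, Q = C(h - h0)^b. The parameter values below are hypothetical, chosen only to reproduce the order of magnitude Marsh (Citation2002) describes; real ratings are fitted to field measurements.

```python
def rating_discharge(stage, C, h0, b):
    """Power-law stage-discharge rating: Q = C * (h - h0)**b,
    where h0 is the stage of zero flow (cease-to-flow)."""
    return C * (stage - h0) ** b

def flow_error_pct(stage, dstage, C, h0, b):
    """Relative discharge error (%) caused by a stage error of dstage."""
    q = rating_discharge(stage, C, h0, b)
    q_perturbed = rating_discharge(stage + dstage, C, h0, b)
    return 100.0 * (q_perturbed - q) / q

# Illustrative insensitive control: flow only 0.1 m above cease-to-flow
# with exponent b = 2; a 5 mm stage error then shifts discharge ~10%.
print(flow_error_pct(stage=0.20, dstage=0.005, C=10.0, h0=0.10, b=2.0))
```

The relative error scales roughly as b·Δh/(h - h0), so the same 5 mm error at a stage well above the control produces a far smaller discharge error; this is why sensitivity of the control matters as much as instrument precision.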

In water chemistry, the choices of both the analytical procedure and the sampling strategy can be treated as a decision theory problem (Fearn et al., Citation2002), in which sampling and analysis costs are balanced against end-user losses. The Fearn et al. (Citation2002) approach to these choices is suggested to be an appropriate way to quantify the concept of fitness for purpose. Thompson (Citation2000) and Ramsey and Thompson (Citation2007) argued that in addition to the need for QA in analytical environmental chemistry there is a need for QA in environmental sampling. Most of the QA processes can be adapted to sampling analogues for trueness, bias, accuracy, precision, uncertainty, traceability, fitness for purpose, reference materials, etc. Fitness for purpose is the property of data produced by a process that enables a user of the data to make technically correct decisions for a stated purpose. More broadly, test measurements support the interpretations based upon them, without compromising the correctness of the decision. Ramsey and Thompson (Citation2007) suggest that while sampling is a dominant contributor to the uncertainty of a measurement result, it remains largely ignored.

In climatology, Metcalfe et al. (Citation1997) report on the errors in the measurement of precipitation in the Canadian observing network. Systematic errors, including wetting loss, wind-induced error, and trace precipitation were quantified, and a method of adjusting archived data was described. These systematic errors can exceed 7%, and using monthly correction factors may not account for all the biases. Tokay et al. (Citation2010) compared rain gauge measurements from a variety of observing networks to pairs or triplicates of research rain gauges at sites which already had multiple gauges. They report that the weighing gauges from the United States National Weather Service and Automated Surface Observing System (ASOS) demonstrated high performance, while those from the Automated Weather Observing System performed poorly. Similarly, the ASOS tipping-bucket and cooperative-observer gauges were reliable for monthly rainfall, but not always for daily observations.

The accuracy, currency, relevance and ease of use of electronic information resources can be measured to provide an indication of the resource's product quality (Klobas, Citation1995). The link between product quality and electronic information resource use is, however, relatively weak. This is because product quality is only one of several influences on use. Use is better explained as a function of fitness for purpose: the extent to which the information resource is of appropriate quality for the situation in which it is to be used. Potential users' perceptions of fitness for purpose are formed by convenience and, most significantly, the extent to which potential users believe using the resource will benefit them.

In hydrology, there is a heavy responsibility on system engineers to ensure the fitness for purpose of the river data available to users (Marsh, Citation2002). This requires critical assessment of all the steps involved in data-acquisition-decision making: the recognition of the strategic and operational values of the data, and a continuing dialogue with users to ensure changing requirements are being addressed (Marsh, Citation2002). Further, the longer-term impacts of these methodological changes need to be part of that discussion.

Changes in Observation

Karl et al. (Citation1995) suggest that virtually every monitoring system requires better data quality, continuity, and homogeneity if we are to conclusively answer questions that are of interest to scientists and decision-makers. Long-term meteorological data were collected originally for weather forecasting and not to describe the current climate. Long-term climate monitoring requires different strategies. They identify many homogeneity issues with meteorological observations being substituted for climate observations. Similarly, McCulloch (Citation2007) observed that as recently as 1964, most available hydrological instruments had been designed in the 19th Century. Since that time there have been many changes to measurement technologies and to data workup procedures as hydrological agencies have shifted to automated observations and record production.

Sherwood (Citation2007) argues that all instrumental climate records are affected by instrumentation changes and variations in sampling over time. Conrad and Pollack (Citation1950) define a homogeneous climatic time series as one where all variation is due only to weather and climate. While others have sought to detect these inhomogeneities, little attention has been paid to these series following adjustment. Simple homogenization techniques remove both the apparent artifacts and also some of the real signal. This worsens when change point timing is not known. Sherwood (Citation2007) suggests that error-free detection of the change points is not realistic; rather, success should be measured by the integrity of the climate signals. Strangeways (Citation2008) reports that technology now exists that lets the user assess if the Global Climate Observing System (GCOS) principles are being met for individual stations; however, the GCOS station positions are only known to one minute of arc resolution (about 1 km). To be able to support such assessment, the location needs to be known to the nearest second of arc, or to four decimal places in decimal degrees. Similarly, Sahin and Cigizoglu (Citation2010) assessed six meteorological variables for 232 stations for the period 1974-2000. The inhomogeneities were mostly caused by non-natural effects such as relocation. Because of topography, changes in location (even for small distances) have a non-random effect; similarly, exposure has a measurable effect. Trewin (Citation2010) examined the practices of observing temperature on land. He identifies the most common inhomogeneities as changes in instrumentation, local site conditions, site relocations, and changes in observing practices. These changes each have the potential to have impacts on temperature records similar to, or greater than, the observed century-scale warming trend.

Thompson and Fearn (Citation1996) argue that few analytical scientists or end-users of data are in a position to specify exactly what quality of data is required for a specific task. They define fitness for purpose as the property of the data that enables a user of the data to make technically correct decisions for a stated purpose. Fitness for purpose demands accuracy in analysis that is sufficient but no greater than necessary; further, scientific requirements may be constrained by financial considerations. Fitness for purpose depends on uncertainty in measurement plus adequate quality assurance (valid methods, reference materials, proficiency tests, and accreditation).

Beaulieu et al. (Citation2008) compared eight tests to detect unknown inhomogeneities in climatic data. None of the methods were efficient for all types of inhomogeneities, but some perform substantially better than others. Better techniques include: bivariate test, Jaruskova's method, and standard normal homogeneity test. Poor performers were: Student sequential test and two-phase regression. They designed an optimal procedure that takes advantage of the strengths of the best performing tests. Such assessment tools need to be widely available to the user community.
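As an example of the kind of assessment tool that could be made widely available, the standard normal homogeneity test named above reduces to a short computation. The sketch below implements the basic single-shift test statistic on a standardized series; it is a simplified illustration (the decision step, comparing T0 against tabulated critical values that depend on series length, is omitted, and a reference-series comparison would normally be used).

```python
from statistics import mean, pstdev

def snht(series):
    """Standard normal homogeneity test (single shift): returns the test
    statistic T0 and the most likely change point k (the index of the
    last value before the shift). Assumes a non-constant series."""
    n = len(series)
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]   # standardized anomalies
    best_t, best_k = 0.0, 0
    for k in range(1, n):
        z1 = sum(z[:k]) / k          # mean anomaly before candidate break
        z2 = sum(z[k:]) / (n - k)    # mean anomaly after candidate break
        t = k * z1 * z1 + (n - k) * z2 * z2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k
```

In practice T0 would then be compared with a critical value for the series length; a large T0 flags a likely inhomogeneity such as a station relocation, which is exactly the kind of non-climatic signal Sherwood (Citation2007) cautions can be confounded with real change.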

There are a large number of practical issues involved when assessing the impacts of changes in observational programs. While the availability of metadata is one key aspect of these changes, we do have to consider if metadata availability is the entire answer. Are users easily able to access and interpret metadata? Do they have sufficient tools and skills to be able to make a determination of the effect of observational changes? Is newer always better? Standards that clearly define when a change should be recorded in metadata are needed. For example, knowing that the location from which observations are collected is very important for all environmental data, what criteria are used to assess when station identification (name, number, etc.) should be changed? Similarly, what criteria are used to assess when a variable attribute should be changed? For streamflow, are discharges derived from stage measurements from a single daily observation of a staff gauge different from those collected by today's electronic loggers? What constitutes an important change in an analytical chemistry method?

We should take a conservative approach and recognize that it is a simple matter for the user to combine records that might be considered different and do subsequent testing for fitness for purpose. That would, in the author's opinion, be better than providing long records collected from different locations using different methods without sufficient indication of observational changes. At minimum, there should be a recognized standard, since each such change could have a detectable impact, and potentially be confused with an environmental change, or vice-versa.

Uncertainty

Uncertainty is a reality of all measurements and may include random and bias elements. In a simple statistical form, one expects 95% of all repeated measures to be within 2 standard deviations of the mean; however, this condition seldom holds for environmental data, as very few time series contain duplicate observations and the observations are frequently not normally distributed. There are two common methods of assessing uncertainty. In the first, the actual uncertainty is assessed through repeated measures. A second approach involves the calculation of the standard deviation through a process that identifies and quantifies sources of uncertainty.
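The two approaches can be contrasted in a few lines. The sketch below is a simplified illustration: the empirical route takes the standard deviation of replicate measurements directly, while the modelling route combines independently quantified uncertainty components in quadrature (which assumes the components are independent).

```python
from math import sqrt
from statistics import stdev

def empirical_uncertainty(replicates):
    """Empirical approach: sample standard deviation of repeated
    measurements of the same quantity."""
    return stdev(replicates)

def propagated_uncertainty(component_sds):
    """Modelling approach: identify and quantify each source of
    uncertainty, then combine the standard deviations in quadrature
    (root-sum-of-squares), assuming independent components."""
    return sqrt(sum(s * s for s in component_sds))
```

The modelling route is the only option when replicates do not exist, which, as noted above, is the usual situation for environmental time series; its weakness is that any unidentified source of uncertainty is silently omitted from the budget.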

Agumya and Hunter (Citation1998) suggest that the impact of uncertainty depends upon whether the uncertainty is acceptable or not. Assessments of uncertainty can be standards-based (compared to accepted standards) or risk-based (cost function). Lyn et al. (Citation2007a) separate the uncertainty arising from the sampling process, where larger sample sizes were needed to achieve results that were fit for purpose. Lyn et al. (Citation2007b) compare two methods for estimating measurement uncertainty. The modelling method involves identification, quantification, and summation as variances of sources of uncertainty. The empirical approach involves replicated measurements from duplicate sampling or from inter-laboratory trials.

Modelling and Analysis

Models

Crout et al. (Citation2009) identify some general components of best modelling practice that are relevant to fitness for purpose of model output: (1) definition of purpose; (2) model evaluation, however that is defined; and (3) transparency of the model and its outputs. They indicated some areas where current work seeks to move the process of model evaluation forward from a simple measure of performance (even a complex measure of performance) to an assessment of how performance relates to the model assumptions and formulation. Such developments are probably important; however, they are academic if the community at large is not routinely as engaged with model evaluation as it is with primary model development.

Chatfield (Citation1995) suggests that model uncertainty is a fact of life and is likely to be more serious than other sources of uncertainty. This is a particular problem when the same data are used to formulate the model, and to improve the model on an iterative basis. It is important to ensure awareness of the problems and address the issues, even though there is no simple fix. Ewen and Parkin (Citation1996) suggest a direct method for testing the fitness for purpose of catchment simulation models. They recommend a form of blind testing; other papers demonstrate the application of the method. Bathurst et al. (Citation2004) examine the blind validation of a catchment modelling system. Output uncertainty bounds were determined based upon uncertainty of model parameter values. By representing the water balance within reasonable bounds, and by also reproducing event-scale responses, the fitness of the modelling system is demonstrated. Belyavin and Cain (Citation2009) suggest that one of the principal reasons that continuous process models are not validated is the lack of a formal methodology. They suggest having an appropriate strategy to recognize that model validation means fitness for purpose. They echo George Box's comment that "all models are wrong, but some models are useful" (Box and Draper, Citation1987). Zou and Lung (Citation2004) present a robust approach to calibrating water quality models for water quality management using sparse field data. Their calibration procedure adopts genetic algorithms to inversely solve the governing equations, along with an alternating fitness method to maintain solution diversity. Significantly higher diversity is observed in the solutions obtained by the alternating fitness method rather than by the standard process. A sensitivity assessment of the parameters of the model is also an effective way of addressing uncertainty (Crout et al., Citation2009).

Simulation models are increasingly used in environmental science and output from these must also be fit for purpose. Presently, there are few procedures and fewer standards for assessing uncertainty and fitness for purpose of the output from models. Beven and Freer (Citation2001) suggest one approach to assessing model uncertainty, termed generalized likelihood uncertainty estimation (GLUE). While there are other approaches possible, a move towards greater emphasis on model validation and evaluation continues to be required.
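The core of the GLUE idea can be sketched briefly: sample many parameter sets, score each simulation against observations with a chosen likelihood measure, and retain only the "behavioural" sets that exceed a threshold. The code below is a minimal toy illustration, not Beven and Freer's implementation; the choice of Nash-Sutcliffe efficiency as the likelihood measure and the 0.5 threshold are illustrative assumptions (both choices are, in GLUE, explicitly subjective).

```python
import random

def glue(model, observed, param_sampler, n_samples=500, threshold=0.5):
    """Minimal GLUE sketch: Monte Carlo sample parameter sets, score each
    simulation with a likelihood measure (here Nash-Sutcliffe efficiency,
    NSE), and keep the 'behavioural' sets whose score exceeds threshold."""
    obs_mean = sum(observed) / len(observed)
    denom = sum((o - obs_mean) ** 2 for o in observed)
    behavioural = []
    for _ in range(n_samples):
        params = param_sampler()
        sim = model(params)
        nse = 1.0 - sum((s - o) ** 2 for s, o in zip(sim, observed)) / denom
        if nse > threshold:
            behavioural.append((nse, params))
    return behavioural
```

The spread of the retained parameter sets, rather than a single calibrated optimum, then provides the uncertainty bounds on model output; a narrow behavioural set suggests the data constrain the model well, a wide one suggests equifinality.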

Analysis and Presentation

Fitness for purpose also applies to the design of studies and assessments, and particularly in reporting results from these types of studies. There are many shortcomings with respect to how these are dealt with when reporting results of studies and analysis. While the most egregious cases hopefully never make it through the review process, there are aspects that we should reflect upon.

Burns et al. (Citation2005) argue that standardization of data analysis approaches minimizes potential uncertainty in results. One solution they propose is an objective statistical test for comparing calibration curves; while a second is the widespread use of ISO guidelines to accepting or rejecting outlying values.

Frequently, authors do not define their criteria for either suitability or fitness. One example where these were clearly stated is Ouarda and Shu (Citation2009). They state that, "to ensure the quality of the low-flow study," catchments selected from the 190 hydrometric stations managed by the Ministry of the Environment of the Province of Quebec (MENV) should meet the following criteria (Ouarda et al., Citation2005):

1.

A historical flow record of at least 10 years is required.

2.

The gauged catchment should present a natural flow regime.

3.

The historical data at the gauging stations should pass the Kendall test of stationarity (Kendall, Citation1975) and the nonparametric independence test by Wald and Wolfowitz (Citation1943).
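The stationarity criterion above can be checked with the familiar Mann-Kendall form of Kendall's test. The sketch below is a simplified version (ties are ignored in the variance term, and the Wald-Wolfowitz independence test is not shown); it computes the S statistic and its normal approximation Z, with |Z| > 1.96 indicating a trend at the 5% level.

```python
from math import sqrt

def mann_kendall(series):
    """Kendall's S statistic and its normal approximation Z for a trend
    (non-stationarity) test. Simplified: ties are ignored in the variance."""
    n = len(series)
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1) for j in range(i + 1, n)
    )
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var)       # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    return s, z
```

A station whose annual low-flow series yields |Z| > 1.96 would fail criterion 3 and be excluded, illustrating how a suitability screen translates into a concrete, reproducible computation.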

Authors and their editors should move from identifying their source of data to statements that are more pertinent to establishing both suitability and fitness.

Similarly, the confidence with which we can interpret data depends on their quality and their credibility. Pullin and Knight (Citation2009) argue that to increase effectiveness, predictive power, and resource allocation efficiency, good data are needed. Those data require sufficient credibility in terms of fitness for purpose. Critical appraisal of methodological quality is a key skill to improving retrospective analysis and prospective planning of monitoring (Pullin and Knight, Citation2009). Kang and Lansey (Citation2010) present a series of statistical methods to identify bad data, identify their locations, and to correct the data values using a linear measurement model that relates state variables to field measurements. While their subject was flows in pipes, the principles are relevant to environmental data. The Analytical Methods Committee (Royal Society of Chemistry) (Citation2002) provides a simple graphical method for assessing and controlling repeatability with a moderate number of duplicated analytical results. The Thompson-Howarth Chart allows for the fact that precision varies with concentration, a behaviour common to most types of environmental data. The differences between duplicates, based upon the normal distribution, should be within bounds defined by fitness for purpose. Assessments of many types of environmental data would benefit from similar methods.
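The duplicate-control idea behind the Thompson-Howarth chart can be sketched numerically. The following is a simplified illustration, not the committee's published procedure: it assumes a fitness-for-purpose target precision that grows linearly with concentration, sigma(c) = s0 + k*c (the parameters s0 and k are hypothetical), and checks whether the absolute difference of a duplicate pair falls within the corresponding 95% limit.

```python
def duplicate_within_limit(x1, x2, s0, k, z=1.96):
    """Simplified Thompson-Howarth-style check for one duplicate pair.
    The target precision sigma(c) = s0 + k*c grows with concentration c
    (mean of the pair); the difference of two measurements has standard
    deviation sigma*sqrt(2), so the 95% control limit is z*sqrt(2)*sigma."""
    c = 0.5 * (x1 + x2)
    sigma = s0 + k * c
    return abs(x1 - x2) <= z * (2 ** 0.5) * sigma
```

Plotting |x1 - x2| against the pair mean for many duplicates, with this limit as a control line, reproduces the chart's essential logic: repeated exceedances indicate that analytical precision is not fit for the stated purpose at those concentrations.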

Finally, most environmental data suffer from missing observations, missing periods, and other forms of incompleteness. Authors should be expected to communicate clearly these limitations for their data set. One recent example is Lespinas et al. (Citation2010) where they include a figure that demonstrates the temporal coverage of discharge records; clearly separating periods with complete station records from periods of the time series that were reconstructed and periods where data were missing. Similar methods could be used to illustrate when backwater conditions existed, or where the data were modified in some manner.

Of particular concern in many studies is the lack of a truly independent data source, which compromises statistical analysis. Clarke (2010) discusses four ways in which statistical methods are misused: (1) the same data set is used both to generate a hypothesis and to test it; (2) failure to use an appropriate significance level when a number of hypotheses are tested; (3) failure to account for spatial correlation between variables, whether explanatory or response variables; and (4) exaggerated importance given to statistical tests of significance. These present interesting challenges for environmental studies, and each must be considered carefully in the context of each study, as the ways of addressing each issue are complex. For example, spatial and temporal correlation between variables may need to be accounted for by reducing significance levels, or it may add power to the analysis as a covariate; the approach one uses depends entirely on the context of the specific study.
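
Clarke's second point, the significance level used when many hypotheses are tested, has simple standard remedies. As a minimal sketch (Bonferroni is one common correction, chosen here for illustration; Clarke does not prescribe a specific method), the per-test threshold is tightened by the number of tests so the family-wise error rate stays near the nominal alpha:

```python
def bonferroni(p_values, alpha=0.05):
    """Return, for each p-value, whether it remains significant after
    a Bonferroni correction for the number of hypotheses tested.
    The per-test threshold is alpha / n, keeping the family-wise
    error rate at approximately alpha."""
    n = len(p_values)
    threshold = alpha / n
    return [p <= threshold for p in p_values]
```

With three tests, a raw p-value of 0.04 that would pass a single-test threshold of 0.05 fails the corrected threshold of 0.0167, illustrating why testing many station records at a fixed 0.05 level exaggerates the number of "significant" results.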

Providers and users of environmental data should endeavour to develop common procedures for reporting on data fitness; often such reporting is entirely absent, and where it exists it tends to focus on the suitability of the data for assessment rather than on fitness for purpose. Reviewers and editors should insist that authors demonstrate both when reporting data, and ensure that aspects known to compromise fitness for purpose, including changes of location, observation methods, and periods where data have been infilled, are clearly reported.

Data Rescue and Data Reconstruction

Another area that compromises fitness for purpose is how one addresses missing data. There are a variety of methods, and their use depends upon: (1) the duration of the missing data; (2) the season of the year; (3) the climatic region; and (4) the availability and characteristics of the records. Infilling methods fall into two general categories: time series methods and regression methods. These may exploit statistical relationships within a station, with other variables, or with other stations, or use modelling approaches. These methods continue to increase in sophistication; methods such as feed-forward back-propagating artificial neural networks and chaotic methods are found in the recent literature.
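
The simplest of the regression approaches can be sketched as follows: gaps in a target record are estimated from a complete neighbouring record via an ordinary least-squares fit on the overlapping observations. This is a generic illustration of the technique, not a procedure from any of the cited studies, and the subsequent discussion explains why such infilling must itself be validated.

```python
def regression_infill(target, reference):
    """Fill gaps (None) in `target` by ordinary least-squares
    regression against a complete `reference` series, fitted on
    the time steps where both records exist.  A minimal sketch of
    one common infilling approach; real applications must validate
    the fit and the properties of the reconstructed series."""
    pairs = [(r, t) for t, r in zip(target, reference) if t is not None]
    n = len(pairs)
    mx = sum(r for r, _ in pairs) / n
    my = sum(t for _, t in pairs) / n
    sxy = sum((r - mx) * (t - my) for r, t in pairs)
    sxx = sum((r - mx) ** 2 for r, _ in pairs)
    b = sxy / sxx            # slope of target on reference
    a = my - b * mx          # intercept
    return [t if t is not None else a + b * r
            for t, r in zip(target, reference)]
```

Because the filled values lie exactly on the regression line, they carry none of the scatter of real observations, which is one source of the unintended smoothing discussed below.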

However, at present there are no standard procedures for addressing missing data, and there is a lack of common infilling and validation methods. Frequently, the methods used for infilling or reconstruction are left unspecified, Lespinas et al. (2010) being an exception. Studies have shown that infilling frequently results in unintended smoothing and reduced variability. Some infilling methods may only be suitable for specific hydrologic types and under certain conditions (Whitfield and Spence, 2011). Environmental data are known to be heteroscedastic (their variance changes, for example with magnitude or season), and this property may not be reproduced in data which have been infilled. Similarly, infilling may not faithfully reproduce extremes. Thus, one has to consider whether all the points in a reconstructed time series have the same precision and accuracy. Although infilling techniques are commonly used to deal with gaps in environmental records, little or no attention has been given to assessing the validity of the methods; infilling frequently meets suitability concerns but not fitness for purpose. One would expect that in a properly reconstructed or infilled record the confidence limits would be consistent, and that the Hurst coefficient (Hurst, 1951) and AR(1) properties of the time series would be preserved and unaltered by the infilling process; perhaps goodness-of-fit methods would be appropriate for selecting between alternate reconstruction methods.
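
One of the diagnostics suggested above, preservation of the AR(1) structure, reduces to comparing the lag-1 autocorrelation of a record before and after infilling. A minimal sketch of that statistic (the function and its use as an infilling diagnostic are an illustration of the idea, not a prescribed procedure):

```python
def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series: the covariance of
    adjacent values divided by the variance.  Comparing this value
    for an original record and its infilled version is one simple
    check that the persistence (AR(1)) structure was preserved."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den
```

The same before-and-after comparison could be applied to the variance or to estimates of the Hurst coefficient; a material shift in any of these suggests the infilling has altered the statistical character of the record.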

Other Thoughts

It is important to recognize that sometimes human observations are simply poorly done. It would be impossible to create an exhaustive list, but human nature suggests that erroneous data points occur on occasion; examples might include observers not applying thermometer corrections correctly, recording a 2-minute average wind based upon a 2-second sample, or assuming that there has been no change in the past hour and repeating the previous hour's observation. While rigorous quality assurance and training seek to eliminate these errors, detecting instances can be difficult.

Fitness for purpose is not only an issue for data collection, but also for data management and data analysis. Freedland and Carney (1992) discuss the impacts of data management and accountability on fitness for purpose; they identify problems associated with data quality control, documentation, and data retention in ever-evolving computer systems. They focus on the deficiencies that arise when computer systems are operated in a manner that makes it difficult to verify the integrity of data or to reproduce statistical analyses; tracking the state of a dataset and the results of an analysis is increasingly difficult when live databases are involved. Freedland and Carney (1992) argue that researchers may also be held accountable for such inadvertent deficiencies in data management.

Shliklomanov and Lammers (2009) suggest that hydroclimatological analysis is limited by direct human impacts on rivers. Using reconstructions and naturalized hydrographs, they show that the effects of dams are smaller for annual series than for seasonal and monthly series, where the impacts of reservoirs overwhelm climatic signals. Shliklomanov and Lammers (2009) argue that there is a need for more methods to naturalize or reconstruct streamflow, and for the identification of a global network of river sites suitable for climate change analysis. Such methods need to include techniques for validation and uncertainty analysis of naturalized or reconstructed records.

Establishing Guidelines for Fitness for Purpose

While there are no universally accepted procedures for assessing fitness for purpose, the following are offered as basic guidelines:

1. Is there a clear statement of the purpose for which the data were collected? Were the data collected with any thought of long-term requirements, or are they a collection of observations made for short-term applications?

2. Is there adequate documentation of the complete location and operation history? When did instruments, observing practices, locations, sampling rates, etc. change during the collection of this record? When changes were made, is there documentation that explains why? It would be fair for such documentation to explain, for example, that the previous station location was inadequate and the station was moved to a better place. This needs to respect the realities of hydrometry, climate observation, and other environmental data collection, where methods change over time. This is particularly important since we can increasingly detect these subtle changes, and correct attribution is important. Documented changes should include: location, both absolute and relative; exposure; instrument changes (Karl et al., 1995); time of day of observations (Karl et al., 1995); microclimate (Karl et al., 1995); automated vs. human observers (Milewska and Hogg, 2002); electronic vs. physical instruments (Quayle et al., 1991); and observation units (Zhang et al., 2005).

3. Have changes in the data workup that took place during the collection of this record been documented? Have the steps used to convert raw observations into data records (for example, converting water levels into discharges, or sensor voltages into readings) remained consistent over time? This should also include time-step conversions, since data processing might reduce information at the reporting scale (for example, if 5-minute data are reduced to a daily average, the procedure might not preserve any sub-daily signals).

4. Is there access to supporting data and documents, such as rating curves, meter calibrations, measurement validations, and maintenance records? When these are available, they should be sufficient to support confidence in the data.

5. Are you able to verify the actual observations in the record? When methods change, is there any observational overlap? Is there information about intercomparison of methods? In hydrometry, these should form part of the rating curve record and be identifiable in the dataset.

6. Are you able to verify the record against records of similar processes? There are methods that allow comparison of a record of interest against other data series (plots of cumulative deviations, double mass curves, time series plots with labelled or flagged data points, etc.); however, these comparisons should be made only with appropriate stations. Stations used for comparison should reflect similar processes; in many cases, particularly in hydrology, nearby stations may reflect different processes.

7. Have the landscape and regional land use changed over the period of record? Changes in land use, including watershed changes or channel modification, should be considered as potential mechanisms affecting records.

8. Are the data management systems being used sufficient to protect the integrity of the data? Have information management practices been sufficient to ensure the long-term value of the investment in data creation and collection? Examples of areas that cause problems include changes in observation units and in significant digits; cases exist where English-to-metric conversions were made with an ensuing reduction in preserved digits, reducing the precision of the original observations. Does the data management system support the ability of users to understand the purpose and intention of the dataset, particularly how to use the data and assess their fitness for a particular use?

9. Do the data comply with interchange and access formats that support sharing of data and metadata? Increasingly, platforms are available that allow users to extract data from many sources simultaneously. Users of such services should be aware that not all of the information that exists will be accessed by such services, and such protocols should take adequate care to include metadata.

10. Are the tools adequate to present the metadata with the data and to support assessment of fitness for purpose? In many cases, tools exist that could support the user's assessment of fitness for purpose, but they are not always used. This is an area where users need to be more aware of the tools that exist and how to use them. At the same time, the developers of these tools need to tailor them to the sophistication of the users. While many excellent R packages exist (see the Comprehensive R Archive Network, http://cran.r-project.org), users who rely only on basic tools may not be adequately served.
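
Several of the comparisons in guideline 6 reduce to simple cumulative arithmetic. As an illustrative sketch (the function and its interface are my own, not from any cited work), a double mass curve pairs the running totals of a record of interest and a comparison record; when these points are plotted, a break in slope suggests an inhomogeneity in one of the two records, provided the comparison station reflects similar processes:

```python
def double_mass(record, comparison):
    """Return (cumulative comparison, cumulative record) pairs for a
    double mass curve.  A persistent change in the slope of these
    points indicates a possible inhomogeneity; the method says
    nothing about which record changed, or why."""
    cum_r, cum_c, points = 0.0, 0.0, []
    for r, c in zip(record, comparison):
        cum_r += r
        cum_c += c
        points.append((cum_c, cum_r))
    return points
```

The same cumulative machinery underlies plots of cumulative deviations; in both cases the diagnostic is only as good as the process similarity of the stations being compared.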

Acknowledgements

Over the past number of years many colleagues have been supportive and encouraging whenever I have brought up my concerns regarding fitness for purpose. I greatly appreciate being able to acknowledge their contributions of information, ideas and their great willingness to discuss and challenge my ideas; many thanks to Stu Hamilton, Patricia Wong, Dave Hutchinson, Malcolm Clark, Ian Okabe, Gerry Whitley, and Alain Pietroniro.

References

  • Agumya, A., and G. J. Hunter. 1998. Fitness for use: Reducing the impact of geographic information uncertainty. In Urban and Regional Information Systems Association (URISA) 1998 annual conference proceedings. Charlotte, NC, July 1998, pp. 245–255.
  • Analytical Methods Committee, Royal Society of Chemistry. 2002. A simple fitness-for-purpose control chart based on duplicate results obtained from routine test materials. Analytical Methods Committee 9: 1–2.
  • Bathurst, J. C., J. Ewen, G. Parkin, P. E. O'Connell, and J. D. Cooper. 2004. Validation of catchment models for predicting land-use and climate change impacts. 3. Blind validation for internal and outlet responses. Journal of Hydrology 287: 74–94.
  • Beaulieu, C., O. Seidou, T. B. M. J. Ouarda, X. Zhang, G. Boulet, and A. Yagouti. 2008. Intercomparison of homogenization techniques for precipitation data. Water Resources Research 44: W02425. doi:10.1029/2006WR005615.
  • Belyavin, A., and B. Cain. 2009. An example of validating models of continuous processes. In Proceedings of the 18th conference on behavior representation in modelling and simulation. Sundance, UT, March 31–April 2, 2009, pp. 113–121.
  • Beven, K., and J. Freer. 2001. Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. Journal of Hydrology 249: 11–29.
  • Box, G. E. P., and N. R. Draper. 1987. Empirical model-building and response surfaces. New York: Wiley, 688 pp., p. 424.
  • Burns, M. J., G. J. Nixon, C. A. Foy, and N. Harris. 2005. Standardisation of data from real-time quantitative PCR methods: Evaluation of outliers and comparison of calibration curves. BMC Biotechnology 5: 31. doi:10.1186/1472-6750-5-31.
  • Chatfield, C. 1995. Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society Series A (Statistics in Society) 158(3): 419–466.
  • Clark, M. J. R., and P. H. Whitfield. 1992. Conflicting perspectives about detection limits and about the censoring of environmental data. Water Resources Bulletin 30: 1063–1079.
  • Clark, M. J. R., and P. H. Whitfield. 1993. A practical model integrating quality assurance into environmental monitoring. Water Resources Bulletin 29: 119–130.
  • Clark, M. J. R., D. D. MacDonald, P. H. Whitfield, and M. P. Wong. 2010. A framework for designing water quality monitoring studies based on experience in Canada: II. Characterization of problems and data-quality objectives. Trends in Analytical Chemistry 29(5): 385–398.
  • Clarke, R. T. 2010. On the (mis)use of statistical methods in hydro-climatological research. Hydrological Sciences Journal 55(2): 139–144.
  • Conrad, V., and L. W. Pollack. 1950. Methods in climatology, 2nd edition. Cambridge, MA: Harvard University Press, 459 pp.
  • Crout, N., T. Kokkonen, A. J. Jakeman, J. P. Norton, R. Anderson, H. Assaf, B. W. F. Croke, N. Gaber, J. Gibbons, D. Holzworth, J. Mysiak, J. Reichl, R. Seppelt, T. Wagener, and P. H. Whitfield. 2009. Chap. 2: Good modelling practice. In Environmental modelling, software and decision support, ed. A. J. Jakeman, A. A. Voinov, A. E. Rizzoli, and S. H. Chen, 15–31. Amsterdam: Elsevier.
  • Dymond, J. R., and R. Christian. 1982. Accuracy of discharge determined from a rating curve. Hydrological Sciences Journal 27: 493–504.
  • Ewen, J., and G. Parkin. 1996. Validation of catchment models for predicting land-use and climate change impacts. 1. Method. Journal of Hydrology 175: 583–594.
  • Fearn, T., S. A. Fisher, M. Thompson, and S. L. R. Ellison. 2002. A decision theory approach to fitness for purpose in analytical measurement. Analyst 127: 818–824.
  • Freedland, K. E., and R. M. Carney. 1992. Data management and accountability in behavioral and biomedical research. American Psychologist 47(5): 640–645.
  • Hudson, H. R., D. A. McMillan, and C. P. Pearson. 1999. Quality assurance in hydrological measurement. Hydrological Sciences Journal 44(5): 825–834.
  • Hurst, H. 1951. Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers 116: 770–799.
  • Kang, D., and K. Lansey. 2010. Filtering bad measurement data for water distribution system demand estimation. Journal of Water Resources Planning and Management 136(4): 512–517.
  • Karl, T. R., V. E. Derr, D. R. Easterling, C. K. Folland, D. J. Hofman, S. Levitus, N. Nicholls, D. E. Parker, and G. W. Withee. 1995. Critical issues for long-term climate monitoring. Climatic Change 31: 185–221.
  • Kendall, M. G. 1975. Rank correlation methods. London, UK: Griffin, 202 pp.
  • Klobas, J. E. 1995. Beyond information quality: Fitness for purpose and electronic information resources use. Journal of Information Science 21: 95–114.
  • Lespinas, F., W. Ludwig, and S. Heussner. 2010. Impact of recent climate change on the hydrology of coastal Mediterranean rivers in Southern France. Climatic Change 99: 425–456.
  • Lyn, J. A., I. M. Palestra, M. H. Ramsey, A. P. Damant, and R. Wood. 2007a. Modifying uncertainty from sampling to achieve fitness for purpose: A case study on nitrate in lettuce. Accreditation and Quality Assurance 12: 67–74.
  • Lyn, J. A., M. H. Ramsey, A. P. Damant, and R. Wood. 2007b. Empirical versus modelling approaches to the estimation of measurement uncertainty caused by primary sampling. Analyst 132: 1231–1237.
  • MacDonald, D. D., M. J. R. Clark, P. H. Whitfield, and M. P. Wong. 2009. A framework for designing water quality monitoring studies based on experience in Canada: Part 1 – Theory and framework. Trends in Analytical Chemistry 28: 204–213.
  • McCulloch, J. S. G. 2007. All our yesterdays: A hydrological retrospective. Hydrology and Earth System Sciences 11: 3–11.
  • Marsh, T. J. 2002. Capitalising on river flow data to meet changing national needs: A UK perspective. Flow Measurement and Instrumentation 13: 291–298.
  • Metcalfe, J. R., B. Routledge, and K. Devine. 1997. Rainfall measurement in Canada: Changing observational methods and archive adjustment procedures. Journal of Climate 10: 92–101.
  • Milewska, E. J., and W. D. Hogg. 2002. Continuity of climatological observations with automation: Temperature and precipitation amounts from AWOS (automated weather observing system). Atmosphere-Ocean 40(3): 333–359.
  • Ouarda, T. B. M. J., and C. Shu. 2009. Regional low-flow frequency analysis using single and ensemble neural networks. Water Resources Research 45: W11428. doi:10.1029/2008WR007196.
  • Ouarda, T. B. M. J., V. Jourdain, N. Gignac, H. Gingras, H. Herrera, and B. Bobée. 2005. Development of a hydrological model for the regional estimation of low-flows in the province of Quebec (in French). Eau, Terre, et Environ., Institut national de la recherche scientifique, Res. Rep. R-684-f1. Sainte-Foy, QC: Institut national de la recherche scientifique, 174 pp.
  • Pelletier, P. M. 1988. Uncertainties in the single determination of river discharge: A literature review. Canadian Journal of Civil Engineering 15: 834–850.
  • Pullin, A. S., and T. M. Knight. 2009. Data credibility: A perspective from systematic reviews in environmental management. New Directions for Evaluation 122: 65–74.
  • Quayle, R. G., D. R. Easterling, T. R. Karl, and P. Y. Hughes. 1991. Effects of recent thermometer changes in the Cooperative Station Network. Bulletin of the American Meteorological Society 72(11): 1718–1724.
  • Ramsey, M. H., and M. Thompson. 2007. Uncertainty from sampling, in the context of fitness for purpose. Accreditation and Quality Assurance 12: 503–513.
  • Sahin, S., and H. K. Cigizoglu. 2010. Homogeneity analysis of Turkish meteorological data set. Hydrological Processes 24: 981–992.
  • Sherwood, S. C. 2007. Simultaneous detection of climate change and observing biases in a network with incomplete sampling. Journal of Climate 20: 4047–4062.
  • Shliklomanov, A. I., and R. B. Lammers. 2009. Record Russian river discharge in 2007 and the limits of analysis. Environmental Research Letters 4: 045015. doi:10.1088/1748-9326/4/4/045015.
  • Strangeways, I. 2008. Using Google Earth to evaluate GCOS weather station sites. Weather 64(1): 4–8.
  • Thompson, M. 2000. Recent trends in inter-laboratory precision at ppb and sub-ppb concentrations in relation to fitness for purpose criteria in proficiency testing. Analyst 125: 385–386.
  • Thompson, M., and T. Fearn. 1996. What exactly is fitness for purpose in analytical measurement? Analyst 121: 275–278.
  • Tokay, A., P. G. Bashor, and V. L. McDowell. 2010. Comparison of rain gauge measurements in the mid-Atlantic region. Journal of Hydrometeorology 11: 553–565.
  • Trewin, B. 2010. Exposure, instrumentation, and observing practice effects on land temperature measurements. Wiley Interdisciplinary Reviews: Climate Change 1: 490–506.
  • Vasseur, B., R. Devillers, and R. Jeansoulin. 2003. Ontological approach of the fitness of use of geospatial datasets. In Proceedings of the 6th Association of Geographic Information Laboratories for Europe (AGILE) conference. Lyon, France, April 24–26, 2003, pp. 1–8.
  • Wald, A., and J. Wolfowitz. 1943. An exact test for randomness in the non-parametric case based on serial correlation. The Annals of Mathematical Statistics: 378–388.
  • Wang, R. Y., and D. M. Strong. 1996. Beyond accuracy: What data quality means to data consumers. Journal of Management Information Systems 12(4): 5–34.
  • Whitfield, P. H., and M. Hendrata. 2006. Assessing detectability of changes in low flows in future climates from stage discharge measurements. Canadian Water Resources Journal 31: 1–12.
  • Whitfield, P. H., and C. Spence. 2011. Estimates of Canadian Pacific coast runoff from observed streamflow data. Journal of Hydrology 410: 141–149.
  • Zhang, X., G. Hegerl, F. W. Zwiers, and J. Kenyon. 2005. Avoiding inhomogeneity in percentile-based indices of temperature extremes. Journal of Climate 18: 1641–1650.
  • Zou, R., and W. S. Lung. 2004. Robust water quality model calibration using an alternating fitness genetic algorithm. Journal of Water Resources Planning and Management 130(6): 471–479.
