Analytical procedures for weekly hydrological droughts: a case of Canadian rivers

T. C. Sharma & U. S. Panu
Pages 79-92 | Received 01 Nov 2008, Accepted 27 Jul 2009, Published online: 11 Mar 2010

Abstract

Two entities of importance in hydrological droughts, viz. the longest duration, L_T, and the largest magnitude, M_T (in standardized terms), over a desired time period T (which could also correspond to a specific return period), have been analysed for weekly flow sequences of Canadian rivers. Analysis has been carried out in terms of week-by-week standardized values of flow sequences, designated as SHI (standardized hydrological index). The SHI sequence is truncated at the median level for identification and evaluation of the expected values of the above random variables, E(L_T) and E(M_T). SHI sequences tended to be strongly autocorrelated and are modelled as autoregressive order-1, order-2 or autoregressive moving average order-1,1. The drought model built on the theorem of extremes of random numbers of random variables was found to be less satisfactory for the prediction of E(L_T) and E(M_T) on a weekly basis. However, the model has worked well on a monthly (weakly Markovian) and an annual (random) basis. An alternative procedure based on a second-order Markov chain model provided satisfactory prediction of E(L_T). Parameters such as the mean, standard deviation (or coefficient of variation), and lag-1 serial correlation of the original weekly flow sequences (obeying a gamma probability distribution function) were used to estimate the simple and first-order drought probabilities through closed-form equations. Second-order probabilities have been estimated from the original flow sequences as well as the SHI sequences, using a counting method. E(M_T) can be predicted as the product of drought intensity (which obeys the truncated normal distribution) and E(L_T) (which is based on a mixture of first- and second-order Markov chains).

Citation Sharma, T. C. & Panu, U. S. (2010) Analytical procedures for weekly hydrological droughts: a case of Canadian rivers. Hydrol. Sci. J. 55(1), 79–92.

Procédures d'analyse de sécheresses hydrologiques hebdomadaires: le cas de rivières Canadiennes

Résumé Deux caractéristiques importantes des sécheresses hydrologiques, à savoir la plus longue durée, LT , et la plus grande magnitude, MT (en termes standardisés) sur une durée souhaitée (qui pourrait aussi correspondre à une période de retour spécifique) T, ont été analysées pour des séquences hebdomadaires de débits de rivières au Canada. L'analyse a été faite en termes de séquences hebdomadaires des valeurs standardisées des débits, appelées IHS (indices hydrologiques standardisés). La séquence d'IHS est tronquée au niveau de la médiane pour l'identification et l'évaluation des espérances des caractéristiques précédentes considérées comme variables aléatoires, E(LT ) et E(MT ). Les séquences d'IHS ont tendance à être fortement auto-corrélées et sont représentées par des modèles auto-régressifs d'ordre 1, d'ordre 2 ou avec moyenne mobile d'ordre 1,1. Le modèle de sécheresse basé sur le théorème des valeurs extrêmes de variables aléatoires s'est révélé être moins satisfaisant pour la prévision de E(LT ) et de E(MT ) sur une base hebdomadaire. Cependant le modèle a bien fonctionné sur des bases mensuelle (faiblement Markovienne) et annuelle (aléatoire). Une procédure alternative basée sur une chaîne de Markov du deuxième ordre a donné des prévisions de E(LT ) satisfaisantes. Des paramètres comme la moyenne, l'écart type (ou coefficient de variation) et l'auto-corrélation de décalage 1 des séquences originales de débits hebdomadaires (suivant une loi de probabilité Gamma) ont été utilisés pour estimer les probabilités simples et du premier ordre des sécheresses via des équations finies. Les probabilités du deuxième ordre ont été estimées sur la base des séquences originales de débits ainsi que des séquences d'IHS en utilisant une méthode de comptage. E(MT ) peut être prévue comme le produit de l'intensité de la sécheresse (qui suit la distribution normale tronquée) et de E(LT ) (qui est basée sur un mélange de chaînes de Markov de premier et de deuxième ordres).

INTRODUCTION

Analysis of hydrological droughts on a weekly basis offers useful information for scheduling planting and watering activities of cereal crops (growing period 12–15 weeks) and vegetable crops (growing period 3–10 weeks), water rationing during water shortage periods, and other exigency-related planning of water supply and distribution on a short-term basis. For example, consider a river equipped with a reservoir to provide a town with drinking water supplies and to provide the cereal and vegetable crops with supplemental irrigation water. During a drought period, it is desirable to know the likelihood of the drought continuing until the onset of rains. It is known a priori that delayed rains during the upcoming season are normally scanty, and thus for drought mitigation purposes an adequate irrigation scheduling plan for cereal crops and a water rationing plan for town dwellers are required. Under such circumstances, drought statistics (duration and magnitude) formulated on an annual or a monthly basis offer limited information, as these statistics cannot be scaled down in a direct multiplicative manner. For example, if a 100-year drought persists for 4 years, it does not mean that all 208 (= 4 × 52) weeks will be under the grip of drought. A much smaller number of weeks is expected to be affected by persistent drought, a number which can be obtained through drought analysis of weekly flow sequences, as developed in this paper.

Droughts can be identified using several drought indices such as the Standardized Precipitation Index (SPI) or the Palmer Drought Severity Index (PDSI) (Hayes, 2002). The SPI operates on the simple principle of standardization and normalization of precipitation values at the time period chosen for analysis, such as a monthly, bimonthly, or 3-, 6-, 12-, 24- or 48-month basis (McKee et al., 1993, 1995; Edwards & McKee, 1997; Guttman, 1999; Sirdas & Sen, 2003). For example, a monthly SPI series is obtained using month-by-month standardization (subtraction of monthly means and division by monthly standard deviations) of precipitation values and then normalizing them. Such a series can be truncated at the mean level, and the resultant periods below the truncation level are indicative of drought conditions. Note that a non-normalized series truncated at the median level is equivalent to a normalized series truncated at the mean level. During a drought period the sum of absolute SPIs below the chosen truncation level is termed the magnitude (Hayes, 2002; Sirdas & Sen, 2003). It is noted that the above sum (i.e. the magnitude) in the hydrological drought context has been referred to as “severity” since the early days of drought research (Yevjevich, 1967) and until recently (Sharma & Panu, 2008). Likewise, the term “intensity” (ratio of severity to duration) has been used interchangeably with magnitude (Dracup et al., 1980; Panu & Sharma, 2002). Use of the term magnitude in the hydrological drought context seems to be gaining momentum (Timilsena et al., 2007; Lopez-Moreno et al., 2009). In view of the changing trend in the nomenclature of drought parameters, the term magnitude shall be used in place of severity, and the ratio of magnitude to duration shall be termed drought intensity in this study.

Identification of drought events depends on selection of the truncation level. Thus, in view of the adopted practice for the analysis of SPI, the median value of the flow sequence will be used as the desired truncation level in this study. Since this paper deals with streamflow sequences rather than precipitation sequences, the Standardized Hydrological Index, SHI, is defined virtually parallel to SPI. However, it must be noted that the SHI is a standardized sequence (mean 0 and variance 1) but not a normalized one. In the case of SHI, drought probabilities change in tandem with the probability distribution function (pdf) of the data, which is reflected in changed behaviour of the drought duration and magnitude. The data are not transformed, so their original form remains undistorted and the pdf is preserved, which is a desirable feature. On an annual basis, the treatment of three common pdfs of SHI, viz. normal, gamma, and lognormal distributions, for drought analysis is well documented in Sharma (2000). The theorem of extremes of random numbers of random variables (Todorovic & Woolhiser, 1975) has been found to be a powerful tool for the prediction of the hydrological drought parameters of SHI series on annual and monthly streamflow sequences of Canadian rivers (Panu & Sharma, 2009; Sharma & Panu, 2008). For the prediction of dry and wet spells on a daily basis, the concepts of Markov chains were found to be promising (Chin, 1976, 1977).

Deficits below the chosen truncation level, recognized as drought events, form the runs. Consequently, two entities are associated with each drought event: the duration, L, of the negative run, hereafter referred to as the drought duration; and the corresponding magnitude, D, equal to the cumulative deficit run-sum, hereafter referred to as the drought magnitude. As the drought analysis is carried out in the standardized domain of the flow sequences, the drought magnitude, D, is termed the standardized magnitude, M. It can be deduced that the drought magnitude D = σM (Yevjevich, 1967), where σ is the standard deviation of the sequence under consideration. This paper attempts the prediction of the average drought duration, E(L_T), over a time period T (which could also correspond to a specific return period), and the corresponding standardized magnitude, E(M_T) (E stands for expected value), on a weekly basis using the flow data of Canadian rivers and involving the conceptual framework of SHI. The main objective of this study is to evaluate, on a weekly basis, the applicability of those drought prediction models that have been built on the premise of the theorem of extremes of random numbers of random variables and are reported in the literature to perform well on annual and monthly bases, and, furthermore, in the case of an anomaly, to adapt alternative models based on the concepts of Markov chains.

FLOW DATA ACQUISITION AND PRELIMINARY ANALYSIS

The data for the analysis are natural (i.e. unregulated) and uninterrupted flow records of 18 rivers across Canada (Fig. 1) and are listed in Table 1. The data for these stations and the corresponding periods were used to study hydrological droughts on monthly and annual bases (Sharma & Panu, 2008; Panu & Sharma, 2009). These studies were conducted using the existing concepts and notions of stochastic theory for evaluating the drought characteristics in the highly seasonal flow regimes of Canadian rivers. Encouraged by the success of concepts incorporated within models for predicting the behaviour of the annual or monthly hydrological droughts, it was considered logical to use the same database for the characterization of droughts on a weekly basis. The major considerations for the selection of stations and the period of data were the same as reported earlier by Sharma & Panu (2008); for example, the existence of natural flow regimes with continuous records, unaffected by human intervention, and with a minimal need for data infilling. Additional statistical analyses involving trend, homogeneity and consistency were used to further screen the data sets. Thus, daily flow data for these 18 rivers were extracted from the Canadian Hydrologic Data Base (HYDAT; Environment Canada, 2005). The selected rivers are representative of a wide range of drainage basins (37 km² to 32 400 km²) and a period of the historical database (1919–2005) which required virtually no data infilling. Daily flows were transformed to weekly flows such that each of the first 51 weeks would be composed of 7 days while the 52nd week would contain the remainder of the days. That is, the last week of the year would comprise 8 days, or 9 days in a leap year.
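The daily-to-weekly conversion described above can be sketched as follows. This is only an illustration: the function name `daily_to_weekly` is hypothetical, and the sketch assumes that a weekly flow is the mean of its daily flows, which the text does not state explicitly.

```python
def daily_to_weekly(daily_flows):
    """Collapse one year of daily flows into 52 weekly values.

    Weeks 1-51 each cover 7 days; week 52 takes the remaining
    8 days (9 in a leap year), as described in the text.
    ASSUMPTION: the weekly value is the mean of the daily values.
    """
    if len(daily_flows) not in (365, 366):
        raise ValueError("expected 365 or 366 daily values")
    weekly = []
    for week in range(52):
        start = 7 * week
        # The last week absorbs all leftover days of the year.
        end = start + 7 if week < 51 else len(daily_flows)
        chunk = daily_flows[start:end]
        weekly.append(sum(chunk) / len(chunk))
    return weekly
```

Applied year by year, this yields the N × 52 array of weekly flows on which the subsequent standardization operates.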

Fig. 1 Spatial location across Canada of all hydrometric stations used in the analysis (source: Environment Canada).


Table 1  Summary of statistical parameters of flow sequences (weekly) of rivers across Canada

Identification of probability distribution and dependence structure of weekly flow sequences

The analysis of drought parameters using probability theory generally begins with the identification of the probability distribution function (pdf) of the drought variable (weekly flows) and its dependence structure. For identifying the pdf of the weekly flows, the product moment and L-moment relationships (Vogel & Fennessey, 1993; Hosking & Wallis, 1997) were used. For each river, values of statistics such as the mean (μ), standard deviation (σ) or coefficient of variation (cv) and skewness (γ) were computed (Table 1), and the necessary plots were drawn in terms of product moments and L-moments. The analysis revealed that a two-parameter gamma pdf fits the weekly flow data reasonably well.

Once a suitable pdf fitting the weekly flow sequences is obtained, the underlying dependence structures of these flow sequences can be investigated. That is, if x_{y,t} is the flow in the t-th week (t = 1, 2, 3, …, 52) of the y-th year (1 ≤ y ≤ N), then the standardized series u_{y,t} = (x_{y,t} − μ_t)/σ_t, with μ_t and σ_t respectively being the mean and standard deviation of the t-th week, is rendered weakly stationary with mean 0 and variance 1. Since the u_{y,t} series is assumed stationary, it can be designated as e_i with i ranging from 1 to n = N × 52. It is recognized that e_i is a standardized series but not a normalized one, since its generic source x_{y,t} is generally non-normal for weekly flow sequences and, as stated earlier, in the present case follows the gamma distribution. The e_i (or alternately SHI) sequence was subjected to autocorrelation analysis to uncover the presence of any Markovian or other higher-order persistence. Such analysis demonstrated that the e_i sequence resembles either an order-1 autoregressive (AR-1; e_i = Φ_1 e_{i−1} + a_i), an order-2 autoregressive (AR-2; e_i = Φ_1 e_{i−1} + Φ_2 e_{i−2} + a_i), or an autoregressive moving average order-1,1 (ARMA(1,1); e_i − Φ_1 e_{i−1} = a_i − θ_1 a_{i−1}) process, with some notable exceptions in data sets from northern Ontario rivers. In the above relationships, the Φs are autoregressive parameters, θ_1 is the moving average parameter in the ARMA model, and a_i is a random component with mean zero and constant variance σ_a². The models were identified and their parameters estimated using the procedure due to Box & Jenkins (1976), wherein the randomness of the a_i component was confirmed by the Portmanteau statistic involving the first 25 autocorrelations. It should be noted that the concepts of time series analysis advanced by Box & Jenkins (1976) do not require the data to follow a normal pdf. However, it is preferable that the data be normally distributed for ease in the statistical interpretation of the parameters, and for diagnostic checking of residuals to ascertain the adequacy of the model. Rivers in northern Ontario appear to generally show persistence beyond AR-2 in the e_i sequences. Such persistence is attributed to significant storage effects caused by the presence of a large number of lakes and wetlands in watersheds of this region. Rivers passing through or originating from lakes would have significant storage-driven carryover effects, causing flows to lag by multiple weeks. In a nutshell, as a first approximation it is prudent to regard the memory component in the weekly flows as extending over lags of up to 2 weeks, and hence AR-1 and AR-2 could be considered to model the SHI or e_i sequences.
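A minimal sketch of the week-by-week standardization and of a lag-1 autocorrelation estimate is given below. The function names are hypothetical, and two details are assumptions not stated in the text: the weekly standard deviation is taken as the population form (`pstdev`), and the autocorrelation uses the usual sample estimator.

```python
from statistics import mean, pstdev


def standardize_weekly(flows):
    """flows: list of N years, each a list of 52 weekly flows.

    Returns the flattened standardized (SHI) series e_i, i = 1..52N,
    where each value is (x - mu_t) / sigma_t for its own week t.
    ASSUMPTION: sigma_t is the population standard deviation and is
    non-zero for every week.
    """
    mu = [mean([year[t] for year in flows]) for t in range(52)]
    sigma = [pstdev([year[t] for year in flows]) for t in range(52)]
    return [(year[t] - mu[t]) / sigma[t] for year in flows for t in range(52)]


def lag1_autocorr(e):
    """Sample lag-1 autocorrelation of a series (rho_1 for the SHI series)."""
    m = mean(e)
    num = sum((e[i] - m) * (e[i + 1] - m) for i in range(len(e) - 1))
    den = sum((v - m) ** 2 for v in e)
    return num / den
```

The same `lag1_autocorr` applied to the original (unstandardized) weekly flows would give the quantity denoted ρ in the text, as opposed to ρ_1 for the standardized series.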

Table 2  Summary of observed and predicted weekly values of E(L_T) with q = 0.5, gamma pdf of weekly flows

ASSESSING THE PREDICTION OF WEEKLY E(L_T)

The theorem of extremes of random numbers of random variables, coupled with the theory of runs, is well documented in the hydrological literature (Todorovic & Woolhiser, 1975) and has been successfully used for the analysis and prediction of hydrological droughts on an annual basis (Sen, 1980a; Guven, 1983; Sharma, 1998, 2000). The theorem also provides satisfactory estimates of E(L_T) and E(M_T) on a monthly (Sharma & Panu, 2008) and an annual (Panu & Sharma, 2009) basis in Canadian environments. One important feature of this theorem is that E(L_T) can be predicted first and then transformed into E(M_T) through the relationship E(M_T) = I × E(L_T) (Sharma & Panu, 2008). Therefore, it is logical to apply the above theorem to the weekly flow sequences, because the monthly and weekly flow sequences can be regarded as nearly similar, both being periodic stochastic, and through the standardization procedure both sequences are transformed into stationary e_i (i.e. SHI) series. The probabilistic relationships for the estimation of E(L_T) and the associated parameters emanating from the theorem and the theory of runs were developed by Sen (1980a) and are briefly presented as follows:

(1)
(2)
where q is the probability of any week being a drought week and q_q is the conditional probability of the present week being a drought week given that the previous week was also a drought week. The value of q depends on the pdf and the coefficient of variation, cv, of the flow sequence, and on the truncation level, e_0. The value of q_q depends on the pdf of the flow sequence, the lag-1 autocorrelation (ρ_1 of the standardized series e_i, or ρ of the original weekly flow series), and e_0.

The procedure for determining q entails the use of the standard normal probability integral, in which the integration is carried out up to the truncation level, e_0. That is, if a standardized time series e_i is normally distributed, then the probability q = P(e ≤ e_0) = P(z ≤ z_0) can be evaluated as:

q = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_0} \exp(-z^2/2) \, dz    (3)
where z is a standardized normal variate with mean 0 and variance 1 (a notation common in statistical texts). The value of q can also be obtained directly from standard normal probability tables.
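For a normally distributed series, the value that would otherwise be read from probability tables can be computed directly. The sketch below uses the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist


def drought_probability_normal(e0):
    """q = P(z <= e0) for a standard normal variate z (mean 0, variance 1)."""
    return NormalDist().cdf(e0)
```

For instance, a truncation level of e_0 = 0 gives q = 0.5, while e_0 = −0.27 gives q ≈ 0.39.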

For the gamma pdf, which represents a mild skew in the data, the normalized standard variate can be obtained using the following equation (Sharma, 1998, 2000):

(4)

Succinctly, once the pdf of the weekly flow series is identified (say the gamma pdf), its cv and ρ are estimated from the original weekly flow sequence. The series is then standardized week by week and the resulting series (e_i series) is truncated at a level such that the probability q = 0.5. The value of e_0 is computed by setting z_g = 0 in equation (4). For example, for the Fraser River (station BC08KB001), z_g = 0 and cv = 0.89 yield e_0 = −0.27. In other words, the median level of truncation in the e_i series of the gamma-distributed weekly flows of the Fraser River corresponds to e_0 = −0.27. A week with e_i less than or equal to −0.27 shall be recognized as a drought week. If the flows had obeyed a normal pdf, e_0 would be equal to 0.0, obtained by setting q = 0.5 in equation (3).
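The computation of e_0 can be illustrated with a short sketch. The exact form of equation (4) is not reproduced in this text, so the code below assumes a Wilson-Hilferty type cube-root normalization with skewness γ = 2cv (the skewness of a two-parameter gamma pdf); this is an assumption, although it does reproduce the quoted value for the Fraser River. The function name is hypothetical.

```python
def gamma_truncation_level(cv, z_g=0.0):
    """Standardized truncation level e0 for a gamma-distributed flow.

    ASSUMPTION: equation (4) is taken to be the Wilson-Hilferty form
        z_g = (6/g) * ((g*e/2 + 1)**(1/3) - 1) + g/6,  with g = 2*cv,
    which is inverted here to give e for a chosen z_g
    (z_g = 0 corresponds to the median, i.e. q = 0.5).
    """
    g = 2.0 * cv          # skewness of a two-parameter gamma pdf
    k = g / 6.0
    return (2.0 / g) * ((1.0 + k * (z_g - k)) ** 3 - 1.0)
```

With cv = 0.89 and z_g = 0 this gives e_0 ≈ −0.27, in line with the Fraser River example.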

The conditional probability, q_q or p_p, is related to ρ_1 or ρ through the following relationships reported by Sen (1976, 1977):

(5)
(6)
where p is the simple probability of any week being wet, and p_p is the conditional probability of a wet week given that the previous week was also wet. Similar notations apply for q and q_q. It is noted that q_q = q and p_p = p for independent or random streamflow sequences, or when drought spells display characteristics of random occurrences. As an illustration, for the Fraser River with the gamma pdf, cv = 0.87, e_0 = −0.27, ρ = 0.92 and q = 0.5 yield q_q = 0.87 through the numerical integration of equation (5). Truncation levels e_0 corresponding to the median level were determined by setting z_g = 0 in equation (4) for all rivers and are summarized in Table 2. The values of e_i above the truncation level were designated as 1 (w, i.e. wet) and those below as 0 (d, i.e. deficit or drought), so that one obtains a series of 1s and 0s. The longest run of 0s corresponds to the longest drought duration, which was determined by a counting method (Sen, 1980b) and is also equivalent to the observed value of E(L_T). Likewise, the highest sum of negative values (in a negative run) represents the observed E(M_T), which was computed by summing the negative values (e_i − e_0) in absolute terms. The sample size n = 52N weeks, where N is the number of years, represents a time period of T = n weeks. To estimate the conditional probabilities in equations (5) and (6), both values of the lag-1 autocorrelation (ρ as well as ρ_1) were used in an attempt to improve the prediction of E(L_T) by matching it to its observed counterpart.
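The extraction of the observed longest duration and largest magnitude from a standardized series can be sketched as below. The function name is hypothetical, and the magnitude is taken here as the run-sum of deficits measured from the truncation level, |e_i − e_0|, which is an interpretation of the description above rather than a confirmed detail.

```python
def observed_drought_extremes(e, e0):
    """Return (longest run length, largest run magnitude) of drought weeks.

    A week is a drought week when e_i <= e0. The two maxima are tracked
    separately, so they need not come from the same drought spell.
    ASSUMPTION: magnitude is the sum of (e0 - e_i) over a run.
    """
    longest = 0
    largest = 0.0
    run_len = 0
    run_mag = 0.0
    for value in e:
        if value <= e0:
            run_len += 1
            run_mag += e0 - value
            longest = max(longest, run_len)
            largest = max(largest, run_mag)
        else:
            # A wet week ends the current drought spell.
            run_len = 0
            run_mag = 0.0
    return longest, largest
```

In the small example used in the test, the longest spell (3 weeks) and the largest magnitude (0.96) come from different spells, which is why the two quantities are tracked independently.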

Concluding remarks on the prediction results of weekly E(L_T)

At the median level of truncation in all rivers (except for the Fraser River, BC08KB001), with either the value of ρ or ρ_1 and regardless of the memory structure embedded in the weekly flow sequences, the values of E(L_T) are generally underpredicted, as summarized in Table 2. Since flow sequences of approximately half of the rivers were found to follow AR-2 or ARMA(1,1), such a mismatch is understandable in view of the inherent complexity of the memory in weekly flow sequences. However, it is difficult to explain the discrepancy (i.e. underprediction) for the rivers having AR-1 type memory in weekly sequences. It is worth noting that drought parameters obtained based on AR-1 type memory for monthly flow sequences turned out to be satisfactory (Sharma & Panu, 2008). Because there is a strong commonality of memory behaviour of AR-1 type between the monthly and weekly flows, a reasonable correspondence between predicted and observed values of E(L_T) would be expected for weekly flows in such rivers. The only difference between the AR-1 structure of the stochastic component of the weekly and monthly sequences is the strength of the persistence (i.e. the parameter Φ_1 being >0.45 in weekly series against about 0.20 in monthly flow series). This difference in the degree of persistence can be regarded as the major cause of the discrepancy. Stated another way, the model structure based on the theorem of extremes of random numbers of random variables is strictly applicable to random (annual flow) or weakly Markovian (monthly flow) drought variables. In a nutshell, an unsatisfactory response for weekly flow sequences, even those obeying the Markovian type (AR-1) persistence (nearly half of the rivers), is intriguing to say the least, and warrants further investigation and analysis, as presented in the following sections.

PROPOSED MARKOV CHAIN MODEL

Based on the concept of Markov chains (Chin, 1976, 1977), the following methodology is proposed for the estimation of drought duration E(L_T). The relationships based on earlier research efforts (Sharma, 2000; Sharma & Panu, 2008; Panu & Sharma, 2009) have been utilized for estimation of drought magnitude E(M_T).

Estimation of drought duration, E(L_T), in terms of simple and conditional probabilities

From the preceding analysis, each e_i series of the analysed rivers has been found to follow either an order-1 or order-2 type of autoregressive process. An alternative approach based on Markov chains is also investigated and presented below.

When the e_i series is cut off at a truncation level e_0, the values above the truncation level are positive, or in the wet (w) state, and those below are negative, or in the drought (d) state. So the e_i series can be transformed into a sequence of discrete states in terms of w and d (for example: wwwddddwwwwdd). One can define the following notation for probabilities, designated by P( ): P(d) = q (simple probability), i.e. the probability of any week being a drought week at a given truncation level; P(d│d) = q_q (conditional probability), i.e. the probability of any week being a drought week given that the previous week was also a drought week (first-order persistence); P(d│d,d) = q_qq (conditional probability), i.e. the probability of any week being a drought week given that the previous two weeks were also drought weeks (second-order persistence). The same applies to the wet state, i.e. p = P(w); p_p = P(w│w); and p_pp = P(w│w,w). Likewise, p_qq = P(w│d,d); q_p = P(d│w); and q_qp = P(d│d,w). The simple probabilities, and the conditional probabilities sharing the same condition, sum to 1; for example, p + q = 1, p_p + q_p = 1 and p_qq + q_qq = 1.

Consider a sequence wwwwdwwddwdddww---d (which contains k letters w and/or d, with a minimum value of k = 3); to designate the occurrence of a wet or a dry week, one needs respectively to express it as dwd or wdw. The probability distribution of lengths of drought and wet spells can be described by the well-known geometric distribution (Mathier et al., 1992). Invoking the above geometric law in a second-order Markov chain, the following equations can be derived (Chin, 1976, 1977) for L = 1, 2, 3, …, j.

To obtain the probability of just one drought week (L = 1; i.e. wdw), the week should be followed and preceded by at least a wet week; for example, refer to the first occurrence of d in the above sequence:

(7)

Likewise, to obtain the probability of a spell of two drought weeks (L = 2; i.e. wddw), such a drought spell must be preceded and followed by at least a wet week. For example, referring to the drought spell containing two successive ds in the sequence above:

(8)

Similarly, regarding the probability of a spell of three drought weeks (L = 3; i.e. wdddw), such a drought spell must be preceded and followed by at least a wet week. For example, referring to the drought spell containing three successive ds in the above sequence:

(9)

In general, the probability of a drought spell of L = 2, 3, 4, …, j is:

(10)

For a Markov chain of the first order, P(d│d,w) = P(d│d), P(d│d,d) = P(d│d) and P(w│d,d) = P(w│d), so that equation (10) reduces to:

(11)

Using a counting or enumeration method, sufficiently accurate estimates of p, q_p, q_qp, q_qq and p_q can be obtained from the historical data with sample size n > 1000 (Chin, 1977). It should be noted that more rigorous and elaborate equations for the determination of P(L = j), j = 2, 3, 4, …, have also been developed by Sen (1990) involving the concepts of tree diagrams and conditional and recursive probabilities, which may provide another route to involving Markov chains in the analysis.
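Read literally, the verbal descriptions of the spells wdw, wddw and wdddw suggest product forms of the following kind. This is a hedged reconstruction from the wording alone, not a reproduction of the published equations (7)–(11); the symbol p_qp = P(w│d,w) = 1 − q_qp is notation assumed here by analogy with p_qq.

```latex
\begin{align*}
P(L=1) &= P(w)\,P(d\mid w)\,P(w\mid d,w) = p\,q_p\,p_{qp},\\
P(L=2) &= P(w)\,P(d\mid w)\,P(d\mid d,w)\,P(w\mid d,d) = p\,q_p\,q_{qp}\,p_{qq},\\
P(L=3) &= p\,q_p\,q_{qp}\,q_{qq}\,p_{qq},\\
P(L=j) &= p\,q_p\,q_{qp}\,q_{qq}^{\,j-2}\,p_{qq}, \qquad j = 2, 3, 4, \dots\\
\intertext{and, for a first-order chain with $q_{qp} = q_{qq} = q_q$ and $p_{qq} = p_q$,}
P(L=j) &= p\,q_p\,q_q^{\,j-1}\,p_q .
\end{align*}
```

Under this reading the spell lengths for j ≥ 2 decay geometrically with ratio q_qq, which is consistent with the geometric law invoked in the text.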

The exceedance probability for a T-week period is 1/T. Therefore, if the drought duration, L (taken as a dummy variable), for a T-week period is greater than or equal to j weeks, then for a second-order Markov chain, equation (10) can be expressed as follows:

(12)
where j has a value such that the area to the right of it in the geometric probability curve is 1/T. Equation (12) can be summed as follows:
(13)

Note that the term (1 − q_qq) cancels out with p_qq in the numerator while summing, since p_qq = 1 − q_qq; a solution of equation (13) for j (knowing that p = 1 − q) becomes:

(14)

The value of j in equation (14) corresponds to the T-week period and, therefore, it can be construed as an estimate of the expected value of drought length and can be expressed as E(L_T). It is apparent from equation (14) that only four parameters are needed for a second-order Markov chain process. Parameters p, q_p and q_q can be estimated from the information on cv, ρ and e_0, using a counting method and the closed-form equations (5) and (6). However, no such closed-form equations can be found for the estimation of the second-order parameters, viz. q_qp and q_qq; hence they are estimated by a counting method described below.
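The solution step can be sketched numerically. The code assumes that the general spell probability has the geometric form p · q_p · q_qp · q_qq^(j−2) · p_qq for j ≥ 2 (an assumption read from the verbal description, since the published equations are not reproduced here). Summing that form from j to infinity and equating the result to 1/T gives p · q_p · q_qp · q_qq^(j−2) = 1/T, which is then solved for j. The function name and the example values are illustrative only.

```python
import math


def expected_longest_duration(T, q, q_p, q_qp, q_qq):
    """Estimate E(L_T) in weeks for a second-order Markov chain.

    ASSUMPTION: P(L >= j) = p * q_p * q_qp * q_qq**(j - 2), with p = 1 - q,
    is set equal to the exceedance probability 1/T and solved for j.
    """
    p = 1.0 - q
    return 2.0 - math.log(T * p * q_p * q_qp) / math.log(q_qq)
```

As an illustration only, T = 2600 weeks (50 years), q = 0.5, q_p = 0.25, q_qp = 0.74 and q_qq = 0.83 give a value between 31 and 32 weeks. Because q_qq enters through ln(q_qq) in the denominator, small changes in it shift the result considerably.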

A counting method for estimation of simple and conditional probabilities

The parameters q, q_p, q_qq and q_qp can be estimated by a counting method (Sen, 1980b, 1990) using the sequence of 1s and 0s obtained by truncating the e_i series at the desired truncation level e_0 (such that the q value is equal to 0.5). For convenience in calculations, the letters d and w are respectively replaced by 0 and 1. Count the occurrences of zeros as follows: single zeros, i.e. “0” (n_1, the total number of zeros); two consecutive zeros, i.e. “00” (n_2); and three consecutive zeros, i.e. “000” (n_3). With a total sample size of n, q = n_1/n, q_q = n_2/n_1, and q_qq = n_3/n_2. Since there are only two numerals, 1 and 0, the number of 1s is (n − n_1). The number of pairs “11” occurring in succession (i.e. 1 precedes 1) is counted (say n_4). Therefore, p_p = n_4/(n − n_1) and hence q_p = 1 − p_p. Similarly, the numbers of patterns “01” (0 precedes 1, say n_5) and “001” (00 precedes 1, say n_6) are counted and used to estimate the value of q_qp as n_6/n_5. By following the above method, the second-order parameters (q_qq and q_qp) were also estimated from the original flow series truncated at the median level. Thus, there are two estimates for each of the parameters q_qq and q_qp for subsequent analyses.
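The counting scheme above can be sketched as follows. Function names are hypothetical; patterns are counted with overlap (so “0000” contains three “00” pairs), which is an assumption about how the counts n_2 to n_6 are taken.

```python
def count_overlapping(seq, pattern):
    """Number of (possibly overlapping) occurrences of pattern in seq."""
    k = len(pattern)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern)


def markov_probabilities(states):
    """Estimate drought probabilities from a string of '0' (dry) and '1' (wet).

    Follows the counts n1..n6 described in the text. No guard is included
    for zero denominators, so very short series may raise ZeroDivisionError.
    """
    n = len(states)
    n1 = count_overlapping(states, "0")
    n2 = count_overlapping(states, "00")
    n3 = count_overlapping(states, "000")
    n4 = count_overlapping(states, "11")
    n5 = count_overlapping(states, "01")
    n6 = count_overlapping(states, "001")
    p_p = n4 / (n - n1)
    return {
        "q": n1 / n,
        "q_q": n2 / n1,
        "q_qq": n3 / n2,
        "p_p": p_p,
        "q_p": 1.0 - p_p,
        "q_qp": n6 / n5,
    }
```

For the short illustrative sequence wwwddddwwwwdd, i.e. "1110000111100", this gives q = 6/13, q_q = 4/6 and q_qq = 2/4; a series this short is far below the n > 1000 suggested for reliable estimates.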

DISCUSSION OF RESULTS OF THE MARKOV CHAIN MODEL

Comparative analysis of predicted and observed drought duration, E(L_T)

For each river, the truncation level, e_0, corresponding to the probability q = 0.5 (truncation at the median level) was computed as shown in Table 3. Using ρ_1 and equations (5) and (6), the first-order conditional probabilities q_q and q_p for each river were computed. Parameters q_qq and q_qp were estimated by a counting method using the series e_i. For each river, the values of E(L_T) summarized in Table 3 were predicted from equation (14) using parameters q_qq and q_qp.

Table 3  Summary of observed and predicted weekly values of E(L_T) with q = 0.5, gamma pdf of weekly flows and second-order Markov chain model

Corresponding to ρ_1, it is apparent from column 5 in Table 3 that almost all values are underpredicted. To reduce the underprediction, ρ_1 was replaced by ρ, which marginally improved the prediction of E(L_T), but underprediction still persists (column 6, Table 3). To further reduce the persistent underprediction of L_T-pr (weeks), parameters q_qq and q_qp were estimated from the original flow sequence by truncating it at the median level. In other words, all parameters were then estimated from the original flow sequences. This new set of parameters resulted in overprediction of E(L_T), as shown in column 7 of Table 3. It is evident that a balance can be struck if appropriate values of the parameters are sought. The most sensitive of these parameters is q_qq; a slight change in its value causes a dramatic change in E(L_T). Therefore, a rational estimate of parameters q_qq and q_qp can be taken as the average of the values obtained from the above two routes (i.e. based on the standardized as well as the original sequences). Since ρ (based on the original data) proved to be a better estimator of the first-order conditional probabilities (q_q and q_p), it was retained in the final analysis. It was noted that the value of q_q ranged from 0.68 to 0.87 (mean = 0.75). Likewise, q_qq ranged from 0.78 to 0.90 (mean = 0.83) and q_qp ranged from 0.66 to 0.83 (mean = 0.74). For prediction purposes, a number of sub-samples were drawn from the available original data set for each river to increase the number of cases for assessing the adequacy of fit. For illustration, such a set of sub-samples is presented for two rivers in Table 4. In this manner, 62 sub-samples were formulated from the 18 rivers in question, resulting in a large number of points that are plotted in Fig. 2.

Table 4  An illustration for computing the observed and predicted weekly values of E(LT) at q = 0.5, and using the second-order Markov chain model

Fig. 2 Comparison of predicted and observed drought durations, E(LT).

The quality of fit in Fig. 2 was assessed through the coefficient of efficiency (COE) statistic and the mean error, as reported in earlier papers by Sharma & Panu (2008) and Panu & Sharma (2009). The COE for the scatter of points (predicted and observed values of E(LT)) about the 1:1 line in Fig. 2 was about 75% and the mean error of prediction was within –1.5%. This plot therefore indicates a reasonable ability of the second-order Markov chain model to simulate and predict the T-week drought durations at the median level of truncation in the SHI sequences. A point to be noted about the second-order parameter qqq is that it was greater than qq by about 2% to 20%, with a mean of 10%. In the absence of any other estimate of qqq, one can assume its value to be 1.10 qq. Further, the parameter qqp is far less influential on the prediction behaviour, and a conservative estimate of it can therefore be taken as equal to qq.
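The two goodness-of-fit measures can be illustrated as follows. The definitions are not spelled out in this section, so the sketch assumes that COE takes the usual Nash-Sutcliffe form and that the mean error is the average relative deviation of predicted from observed values, expressed in percent; both function names are hypothetical.

```python
def coefficient_of_efficiency(observed, predicted):
    """COE = 1 - SSE / SST (assumed Nash-Sutcliffe form), as a fraction."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def mean_error_percent(observed, predicted):
    """Average of (predicted - observed) / observed, in percent.

    Under this assumed definition, positive values indicate
    overprediction and negative values underprediction.
    """
    ratios = [(p - o) / o for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

A COE of 1 corresponds to all points lying on the 1:1 line; multiplying the returned fraction by 100 gives the percentage form quoted in the text.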

Comparison of predicted and observed standardized drought magnitude, E(MT )

As stated earlier, the drought magnitudes were computed from the standardized series and are consequently denoted as standardized drought magnitudes, E(MT), which were estimated from the following linkage relationship (Sharma, 1998, 2000):

(15)
where | | denotes the absolute value and I is the drought intensity, which can be estimated from the following relationship (Sen, 1977):
(16)
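The displayed forms of Equations (15) and (16) are not reproduced in this text. A form consistent with the relationship E(MT) = I E(LT) stated in the Conclusions, and with the numerical values of I quoted in the text (0.71 at e0 = –0.27 and 0.69 at e0 = –0.32), is sketched below. The placement of the absolute-value bars and the sign convention for I are assumptions, not the published typesetting.

```latex
\begin{align}
E(M_T) &= \lvert I \rvert \, E(L_T) \tag{15} \\
I &= -\frac{\exp\!\left(-0.5\, e_0^{2}\right)}{q \sqrt{2\pi}} - e_0 \tag{16}
\end{align}
```

Here q is the simple drought probability at the truncation level e0 under the normal pdf. With this convention I is the mean of the truncated normal deficits measured from e0, so it is negative and enters Equation (15) through its absolute value.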

Note that Equation (16) is based on the truncated normal distribution of deficits. Furthermore, the equation is valid for dependent as well as independent hydrological sequences, as it needs only the simple probability q and the corresponding e0 to evaluate I. For the river identified as BC08KB001, the value of q for the normal pdf is 0.39 (column 4, Table 5) at the median level of truncation (i.e. e0 = –0.27), and the corresponding value of I from Equation (16) is 0.71. The predicted values of E(MT) (column 6, Table 5) were computed through Equation (15) by multiplying I, as calculated above, by LT-pr (column 5). These predicted values, designated MT-pr, were compared with the corresponding observed values, MT-ob (column 7, Table 5). The scatter of the points about the 1:1 line (Fig. 3) shows a significant overprediction (mean error ≈ 16% and COE ≈ 73%), which leaves ample scope for improvement. A review of Fig. 3 reveals that E(MT) values in the lower range are overpredicted by the model, whereas those in the higher range are predicted satisfactorily.
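The quoted intensity for BC08KB001 can be checked numerically. The sketch below assumes that Equation (16) has the usual truncated-normal form, in which the magnitude of I equals φ(e0)/Φ(e0) + e0, with φ and Φ the standard normal density and distribution function and q = Φ(e0); the function names are illustrative only.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def drought_intensity(e0):
    """Magnitude of the mean deficit below truncation level e0.

    Assumes standard normal deficits truncated at e0, so that the simple
    drought probability is q = Phi(e0).
    """
    q = normal_cdf(e0)
    return normal_pdf(e0) / q + e0


if __name__ == "__main__":
    for e0 in (-0.27, -0.32):
        print(e0, round(normal_cdf(e0), 2), round(drought_intensity(e0), 2))
```

For e0 = –0.27 this gives q ≈ 0.39 and I ≈ 0.71, in line with the values reported for BC08KB001.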

Table 5  An illustration of the table for computing observed and predicted E(MT) at the median level of truncation based on the truncated normal distribution of I and gamma pdf of E(LT)

Fig. 3 Comparison of predicted and observed magnitudes (LT, second-order Markov chain).

The higher values of E(MT) correspond to rivers located in northern Ontario, where flow sequences generally have memory content extending beyond the second order. For the other rivers, the memory is generally confined to the second order (). Thus the rivers can be divided into two groups. Group 1 comprises rivers with memory content up to the second order: BC08KB001 to QC02PL008, together with three rivers from northern Ontario, ON02DD013, ON02DD014 and ON02DD015 (). Group 2 comprises the remaining rivers from northern Ontario, ON02AB008, ON02BB003, ON02BF002, ON02CF007, ON02EA010 and ON04JC003 (), whose memory content extends beyond the second order. The E(MT) (= I E(LT)) values for rivers in Group 1 were predicted by calculating LT from the first-order Markov chain model. For rivers in Group 2, the predicted values of E(MT) were computed using E(LT) values based on the second-order Markov chain assumption. For the first-order Markov chain model, E(LT) can be computed using Equation (14) with qqq = qq and qqp = qq. The value of I remains unchanged, as it is a function of e0 and q only. The new set of predicted values of E(LT) is designated LT-pr-1 (column 8 in Table 5) and the corresponding standardized magnitudes MT-pr-1 (column 9). The values of MT-pr-1 were plotted (Fig. 4) against their observed counterparts (column 7 in Table 5). It is apparent from this plot that the prediction has improved significantly, as shown by COE ≈ 86% and a mean error of prediction of nearly zero (≈ –0.70%). Thus, it is reasonable to state that drought magnitudes on a weekly basis can be modelled adequately by using E(LT) values obtained from a mixture of the first- and second-order Markov chain models, depending upon the order of the memory content inherent in the flow sequences.

Fig. 4 Comparison of predicted and observed magnitudes (LT, first- and second-order Markov chain).

The present study suggests that drought magnitude, although more complex to model than duration, can be tracked by a mixture of Markovian models. In hydrological analysis, the mixing of Markovian models representing various states is not uncommon. For example, the Markov mixture model hypothesis was proposed by Jackson (1975) for generating synthetic streamflows with long droughts. Likewise, for modelling the differential persistence in high and low flows, a scheme of complex Markov models was successfully used by Bayazit (1982) and Bayazit & Bulu (1988).

Computation of E(LT ) and E(MT ) – an illustration of the proposed methodology

Consider the Torrent River (NF02YC001) with the following statistics: mean flow μ = 24.50 m³/s, ρ = 0.73, and σ = 17.15 m³/s. The value of σ was obtained as the average of the 52 weekly values, because each week in the drought sequence has a different σ and an averaged value is a suitable estimate (Sharma & Panu, 2008) for such a calculation. Based on the regional analysis of Canadian rivers, one can take the second-order conditional probability qqq to be 1.1 qq, as stated earlier, and likewise the second-order conditional probability qqp to be equal to qq. One needs to estimate the drought duration (weeks) and the magnitude (m³) for a time period of 100 years (T = 5200 weeks). Analysis of weekly and monthly flow sequences of Canadian rivers shows that they exhibit mild skew and fit the gamma pdf well. Annual flow sequences can be approximated by the normal law of probability.

For drought analysis, consider a weekly flow sequence whose persistence can be described by a second-order Markov chain. For such a sequence, one can proceed with the gamma pdf, the simple drought probability q = 0.5, and the truncation level at the median of the long-term flow sequence. Substituting the standardized normal variate zg = 0.0 into Equation (4) gives e0 = –0.32; that is, the median truncation level in the standardized domain equals –0.32. Likewise, using ρ = 0.73 in Equations (5) and (6), the values of qq (by numerical integration) and qp are obtained as 0.76 and 0.24, respectively. As described above, qqq and qqp can be taken as 0.84 (10% greater than qq) and 0.76 (the same as qq), respectively. Substituting q = 0.5, qp = 0.24, qqp = 0.76 and qqq = 0.84 into Equation (14) gives E(LT) ≈ 37 weeks for T = 5200 weeks. In other words, a 100-year drought on the Torrent River is expected to last for 37 weeks. Likewise, substituting the appropriate values into the above equations, a 50-year drought is expected to last for 33 weeks.
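The probability values used in this illustration can be cross-checked with a short calculation. Equations (5) and (6) are not reproduced here, so the sketch below swaps in a closed-form shortcut that holds only at median truncation of a bivariate normal pair (Sheppard's formula, qq = 1/2 + arcsin(ρ)/π) in place of the numerical integration used in the paper; qp then follows from the stationarity condition q = q·qq + (1 – q)·qp. The function names are hypothetical.

```python
import math


def first_order_probs_at_median(rho):
    """Return (qq, qp) for truncation at the median (q = 0.5).

    Uses Sheppard's formula for a standardized bivariate normal pair with
    lag-1 correlation rho; this is a shortcut valid only at the median,
    not the numerical integration described in the text.
    """
    q = 0.5
    qq = 0.5 + math.asin(rho) / math.pi
    qp = q * (1.0 - qq) / (1.0 - q)  # stationarity of the two-state chain
    return qq, qp


def regional_second_order_probs(qq):
    """Regional rule of thumb from the text: qqq = 1.1 qq and qqp = qq."""
    return 1.1 * qq, qq


if __name__ == "__main__":
    qq, qp = first_order_probs_at_median(0.73)
    qqq, qqp = regional_second_order_probs(qq)
    print(round(qq, 2), round(qp, 2), round(qqq, 2), round(qqp, 2))
```

For ρ = 0.73 the rounded results agree with the values of 0.76, 0.24, 0.84 and 0.76 used for the Torrent River.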

For calculating E(MT), one first needs to identify the order of persistence in the drought sequences. Since the Torrent River can be adequately modelled using the AR-2 model, E(MT) should be computed using E(LT) based on the first-order Markov chain. Putting appropriate values into Equation (14), one obtains E(LT) = 24.57 (≈ 25 weeks). Thus, E(MT) ≈ 0.69 × E(LT) ≈ 17.0 (note that I = 0.69 is based on the normal pdf at e0 = –0.32). The drought magnitude in volumetric units, E(DT), i.e. the total shortfall of water, is σ E(MT) = 17.15 × 17 × (7 × 24 × 3600) m³ = 1.76 × 10⁸ m³. Since σ has the unit m³/s, the multiplier (7 × 24 × 3600) is needed to convert a week into seconds. Likewise, using a first-order Markov chain, one obtains E(LT) ≈ 22 weeks for a 50-year drought, with a corresponding E(DT) of 1.55 × 10⁸ m³. If the estimates of E(MT) are instead based on the second-order Markov chain model, the corresponding E(DT) values turn out to be 2.70 × 10⁸ and 2.40 × 10⁸ m³ for the 100- and 50-year droughts, respectively. These values represent a substantial overprediction and are too conservative for design purposes. The drought magnitude calculations for rivers in northern Ontario, such as the Shekak River (ON04JC003), should be based on E(LT) from the second-order Markov chain, because the memory terms for this river are represented by the ARMA(1,1) model, which is equivalent to an AR process of order greater than 2 (Box & Jenkins, 1976).
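The arithmetic of the 100-year, first-order case can be reproduced as follows. This is only a sketch of the unit conversion described above; the function names are illustrative, and the inputs (I = 0.69, E(LT) = 24.57 weeks, σ = 17.15 m³/s) are taken from the text.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # converts a weekly deficit in m3/s to m3


def standardized_magnitude(intensity, duration_weeks):
    """E(MT) = I * E(LT), in standardized (dimensionless) units."""
    return intensity * duration_weeks


def volumetric_deficit(sigma_m3_per_s, magnitude):
    """E(DT) = sigma * E(MT) * seconds per week, in cubic metres."""
    return sigma_m3_per_s * magnitude * SECONDS_PER_WEEK


if __name__ == "__main__":
    m_t = standardized_magnitude(0.69, 24.57)   # about 17.0
    d_t = volumetric_deficit(17.15, round(m_t))  # rounded to 17, as in the text
    print(f"E(MT) = {m_t:.1f}, E(DT) = {d_t:.3g} m3")
```

The same two functions can be applied to any other pair of I and E(LT), for example the second-order estimates, to compare the resulting volumes.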

CONCLUSIONS

The analysis of drought duration and magnitude can best be accomplished using a second-order Markov chain on the weekly standardized flow sequences (ei), referred to as the standardized hydrological index (SHI). The term SHI is analogous to SPI (standardized precipitation index), commonly used in the context of meteorological droughts. A drought event is recognized when the value of the index falls below the long-term median value of the ei sequence. Results of the present analysis indicate that weekly river flow sequences in Canada tend to fit the gamma probability distribution function well. The ei sequence, while displaying the characteristics of a stationary stochastic process, tends to obey an AR-1, AR-2 or ARMA-1,1 process. The theorem of extremes of random numbers of random variables, which proved to be a reliable model for predicting T-year drought parameters (E(LT) and E(MT)) on an annual and monthly basis, is less satisfactory on a weekly basis. As an alternative, the second-order Markov chain model performed well for predicting T-year drought durations and drought magnitudes. The model, while simple in mathematical form, requires information on the simple, first-order and second-order conditional probabilities. The simple drought probability q and the first-order conditional probability qq were obtained from closed-form equations using cv and ρ of the weekly flow sequences. The second-order conditional probabilities, viz. qqq and qqp, were obtained from the original flow sequences as well as the SHI (ei) sequences. The averaged values of these parameters from both sequences were found to give the most reliable estimates of E(LT). The value of E(MT) was best estimated using the simple relationship E(MT) = I E(LT), with the drought intensity I dependent on the truncation level at the median and its corresponding probability for the normal pdf. The E(LT) estimates based on the first-order Markov chain performed better in this relationship for rivers with memory content up to the second order. Rivers with higher orders of memory content are better represented by E(LT) values based on the second-order Markov chain.

Acknowledgements

The partial financial support of the Natural Sciences and Engineering Research Council of Canada for this project is gratefully acknowledged.

REFERENCES

  • Bayazit, M. 1982. Three state Markov models for differential persistence. J. Hydrol., 55: 339–346.
  • Bayazit, M. and Bulu, A. 1988. Complex Markov models to simulate persistent streamflows. J. Hydrol., 103: 199–207.
  • Box, G. E. P. and Jenkins, G. M. 1976. Time Series Analysis, Forecasting and Control, San Francisco: Holden-Day.
  • Chin, E. 1976. A second order Markov Chain model for daily rainfall occurrences. Preprints, Conference on Hydrometeorology, American Meteorological Society. April 20–22 1976, Fort Worth, Texas, USA. pp. 104–109.
  • Chin, E. 1977. Modeling daily precipitation occurrence process with Markov chain. Water Resour. Res., 13(6): 949–956.
  • Dracup, J. A., Lee, K. S. and Paulson, E. G. Jr. 1980. On identification of droughts. Water Resour. Res., 16(2): 289–296.
  • Edwards, D. C. and McKee, T. B. 1997. Characteristics of the 20th century drought in the United States at multiple time scales. Atmospheric Science Paper no. 634, 1–30.
  • Environment Canada. 2005. HYDAT CD-ROM Version 96–1.04 and HYDAT CD-ROM User's Manual. Surface Water and Sediment Data, Water Survey of Canada.
  • Guttman, N. B. 1999. Accepting the standardized precipitation index: a calculation algorithm. J. Am. Water Resour. Assoc., 30(2): 311–322.
  • Guven, O. 1983. A simplified semi-empirical approach to probability of extreme hydrological droughts. Water Resour. Res., 19: 441–453.
  • Hayes, M. 2002. Drought indices. 9-page online document, http://www.drought.unl.edu/whatis/indices.pdf (accessed August 2002).
  • Hosking, J. R. M. and Wallis, J. R. 1997. Regional Frequency Analysis: An Approach Based on L-Moments, 191–209. Cambridge, UK: Cambridge University Press.
  • Jackson, B. B. 1975. Markov mixture models of drought lengths. Water Resour. Res., 11: 64–74.
  • Lopez-Moreno, J. I., Vicente-Serrano, S. M., Beguria, S., Garcia-Ruiz, J. M., Portela, M. M. and Almeida, A. B. 2009. Dam effects on drought magnitude and duration in a transboundary basin: the lower River Tagus, Spain and Portugal. Water Resour. Res., 45: W02405, doi:10.1029/2008WR007198.
  • Mathier, L., Perreault, L. and Bobee, B. 1992. The use of geometric and gamma-related distributions for frequency analysis of water deficit. Stochast. Hydrol. Hydraul., 6: 239–254.
  • McKee, T. B., Doesken, N. J. and Kleist, J. 1993. The relationship of drought frequency and duration to time scales. Preprints, 8th Conference on Applied Climatology. January 17–22 1993, Anaheim, California, USA. pp. 179–184.
  • McKee, T. B., Doesken, N. J. and Kleist, J. Drought monitoring with multiple time scales. Preprints, 9th Conference on Applied Climatology. Dallas, Texas, USA. pp. 233–236.
  • Panu, U. S. and Sharma, T. C. 2002. Challenges in drought research: some perspectives and future directions. Hydrol. Sci. J., 47(S): S19–S30.
  • Panu, U. S. and Sharma, T. C. 2009. An analysis of annual hydrologic droughts: a case of the northwest Ontario, Canada. Hydrol. Sci. J., 54: 29–42.
  • Sen, Z. 1976. Wet and dry periods of annual flow series. J. Hydraul. Engng Div. ASCE, 102(HY10): 1503–1514.
  • Sen, Z. 1977. Run sums of annual flow series. J. Hydrol., 35: 311–325.
  • Sen, Z. 1980a. Statistical analysis of hydrological critical droughts. J. Hydraul. Engng Div. ASCE, 106(HY1): 99–115.
  • Sen, Z. 1980b. Critical drought analysis of periodic-stochastic processes. J. Hydrol., 46: 251–263.
  • Sen, Z. 1990. Critical drought analysis by second order Markov Chain. J. Hydrol., 120: 183–202.
  • Sharma, T. C. 1998. An analysis of non-normal Markovian extremal droughts. Hydrol. Processes, 12: 597–611.
  • Sharma, T. C. 2000. Drought parameters in relation to truncation level. Hydrol. Processes, 14: 1279–1288.
  • Sharma, T. C. and Panu, U. S. 2008. Drought analysis of monthly hydrological sequences: a case study of Canadian rivers. Hydrol. Sci. J., 53(3): 503–518.
  • Sirdas, S. and Sen, Z. 2003. Spatio-temporal drought analysis in the Trakya region. Hydrol. Sci. J., 48(5): 809–820.
  • Timilsena, J., Piechota, T. C., Hidalgo, H. and Tootle, G. 2007. Five hundred years of hydrological drought in the upper Colorado river basin. J. Am. Water Resour. Assoc., 43(3): 798–812.
  • Todorovic, P. and Woolhiser, D. A. 1975. A stochastic model of n day precipitation. J. Appl. Met., 14: 225–137.
  • Vogel, R. M. and Fennessey, N. M. 1993. L-moment diagrams should replace product moment diagrams. Water Resour. Res., 29(6): 1745–1752.
  • Yevjevich, V. 1967. An objective approach to definitions and investigations of continental hydrologic drought. Hydrology Paper 23, Fort Collins, Colorado, USA: Colorado State Univ.
