An integrated simulation-based fuzzy regression-time series algorithm for electricity consumption estimation with non-stationary data: Journal of the Chinese Institute of Engineers: Vol 34, No 8

Abstract

This study presents an integrated fuzzy regression, computer simulation, and time series algorithm to estimate and predict electricity demand for seasonal and monthly changes in electricity consumption, especially in developing countries such as China and Iran with non-stationary data. Because it is difficult to model the uncertain behavior of energy consumption with conventional fuzzy regression or time series alone, the integrated algorithm could be an ideal method for such cases. Computer simulation is used to generate random variables for monthly electricity consumption, and the fuzzy regression is also run on the simulation output. A Granger–Newbold test is used to select the optimum model, which could be a time series, a fuzzy regression (with or without pre-processed data, PD), or a simulation-based fuzzy regression (with or without PD). The preferred time series model is selected from linear or nonlinear models. Finally, the preferred model among the fuzzy regression and time series models is selected by the Granger–Newbold test. Monthly electricity consumption in Iran from 1995 to 2005 is the basis of this study. The mean absolute percentage error estimates of a genetic algorithm, an artificial neural network, and a fuzzy inference system, compared with those of the proposed algorithm, show the appropriateness of the proposed algorithm. This is the first study that introduces an integrated simulation-based fuzzy regression-time series algorithm for electricity consumption estimation with an imprecise set of data.

Acknowledgements

The authors are grateful for the valuable comments and suggestions from the respected reviewers, which have enhanced the strength and significance of our paper.

This study was supported by a grant from the University of Tehran (Grant No. 8106013/1/06). The authors acknowledge the support provided by the University College of Engineering, University of Tehran, Iran.

Notes

1. Simulating the raw data (RD) means that we generate the data for the algorithm and the fuzzy regression from their distribution functions. The RD is first used to identify the best distribution for each month; the input data is then generated from these distributions as another alternative. Both the simulated (generated) data and the RD are used, to see which provides the better and more exact estimation. This is another unique feature of this study: previous studies use only the RD, whereas this study uses both raw and generated data. This is particularly important for ambiguous and uncertain data.
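As a rough illustration of the idea in this note, the sketch below fits a distribution to each calendar month's raw observations and then draws generated data from it. The normal distribution, the function names, and the sample numbers are assumptions made for illustration only; the study selects the best distribution for each month, which need not be normal.

```python
import random
import statistics

def fit_monthly_normals(raw_by_month):
    """Return {month: (mean, stdev)} fitted to the raw data of each month.

    Assumption: a normal distribution stands in for the best-fitting one.
    """
    return {
        month: (statistics.mean(values), statistics.stdev(values))
        for month, values in raw_by_month.items()
    }

def generate(params, n_years, seed=0):
    """Draw n_years simulated observations per month from the fitted normals."""
    rng = random.Random(seed)
    return {
        month: [rng.gauss(mu, sigma) for _ in range(n_years)]
        for month, (mu, sigma) in params.items()
    }

# Hypothetical raw consumption values for two months over three years.
raw_by_month = {1: [100.0, 104.0, 110.0], 2: [90.0, 95.0, 97.0]}
params = fit_monthly_normals(raw_by_month)
simulated = generate(params, n_years=5)
```

Both `raw_by_month` and `simulated` could then be passed to the same estimation step so that their errors can be compared.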

2. For our problem, however, the extrapolation ability of the ANN must be assessed; the test data is therefore chosen from the period closest to the last year.

3. However, in most heuristic methods, selecting input variables is experimental or based on the trial and error method (Zhang et al. 1998; Al-Saba et al. 1999; Zhang 2001; Tseng et al. 2002; Niska et al. 2004; Kim et al. 2004; Zhang et al. 2005; Palmer et al. 2006).

4. The autocorrelation function (ACF) of a random process describes the correlation between values of the process at different points in time.
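A minimal sketch of the sample ACF, using the usual estimator in which the lagged cross-products about the mean are divided by the total sum of squares. The function name and the toy series are assumptions for illustration.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    acf = []
    for s in range(max_lag + 1):
        # Cross-products between y_t and y_{t-s}, both centered on the mean.
        num = sum((y[t] - mean) * (y[t - s] - mean) for t in range(s, n))
        acf.append(num / denom)
    return acf

r = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```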

5. The membership function of a fuzzy set is a generalization of the indicator function in classical sets. In fuzzy logic, it represents the degree of truth as an extension of valuation. Degrees of truth are often confused with probabilities, although they are conceptually distinct, because fuzzy truth represents membership in vaguely defined sets, not likelihood of some event or condition.
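To make the contrast with an indicator function concrete, here is a small sketch of a triangular membership function. The triangular shape and the parameter names `left`, `peak`, and `right` are assumptions for illustration; the note itself does not fix a particular shape.

```python
def triangular_membership(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set (left < peak < right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# A classical indicator function would return only 0 or 1;
# the fuzzy version also returns intermediate degrees of truth.
degrees = [triangular_membership(x, 0.0, 5.0, 10.0) for x in (2.5, 5.0, 7.5, 12.0)]
```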

7. Partial ACF: in contrast, the partial autocorrelation between $y_t$ and $y_{t-s}$ eliminates the effects of the intervening values $y_{t-1}$ through $y_{t-s+1}$.
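One common way to obtain partial autocorrelations from autocorrelations is the Durbin–Levinson recursion, sketched below. This is a standard technique swapped in for illustration, not necessarily the procedure used in the study; the function name is an assumption.

```python
def pacf_from_acf(r, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion.

    r[k] is the autocorrelation at lag k, with r[0] == 1.
    """
    pacf = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.5: r_k = 0.5 ** k.
ar1_pacf = pacf_from_acf([0.5 ** k for k in range(4)], max_lag=3)
```

For an AR(1) process the partial autocorrelation is nonzero only at lag 1, which is what the example reproduces.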

8. By definition, an ARIMA model is covariance stationary if it has a finite and time-invariant mean and covariance.

9. The mean absolute percentage error (MAPE) is the most suitable measure of relative error here because the input data used for model estimation (PD and RD) have different scales (Azadeh et al. 2007 a,b,c).
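A short sketch of the MAPE computation, showing why it is convenient when inputs differ in scale: multiplying both series by a constant leaves the result unchanged. The function name and the numbers are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
error = mape(actual, predicted)

# Rescaling both series (for example, a change of units) does not change MAPE.
error_scaled = mape([a * 1000 for a in actual], [p * 1000 for p in predicted])
```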

10. Monthly consumption.

11. The subtraction operation reduces the 130 data points to 129 preprocessed data points.
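Assuming the subtraction in this note is a first difference (each value minus its predecessor), the loss of one observation can be seen directly. The function name and the placeholder series are assumptions.

```python
def first_difference(y):
    """Return y_t - y_{t-1} for t = 1 .. len(y) - 1."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

raw = [float(i * i) for i in range(130)]  # placeholder series of 130 points
preprocessed = first_difference(raw)
```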

12. This software is produced by LINDO corporation (www.lindo.com).

13. $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t + \beta_1 \varepsilon_{t-1} + \beta_{12} \varepsilon_{t-12}$.

14. $y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t + \beta_{12} \varepsilon_{t-12}$.

15. $a_1$ is the coefficient of $y_{t-1}$ in the ARIMA (p, q) model; $\beta_1$ is the coefficient of $\varepsilon_{t-1}$ in the ARIMA (p, q) model.
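To show how the terms of the model in Note 13 combine, the sketch below simulates a series with one autoregressive term and moving-average terms at lags 1 and 12. All coefficient values are hypothetical, the shocks are assumed Gaussian, and pre-sample values are set to zero for simplicity.

```python
import random

def simulate_model_13(n, a0, a1, b1, b12, sigma=1.0, seed=0):
    """Simulate y_t = a0 + a1*y_{t-1} + e_t + b1*e_{t-1} + b12*e_{t-12}."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n)]
    y = []
    for t in range(n):
        # Pre-sample values of y and the shocks are taken as zero.
        y_prev = y[t - 1] if t >= 1 else 0.0
        e1 = eps[t - 1] if t >= 1 else 0.0
        e12 = eps[t - 12] if t >= 12 else 0.0
        y.append(a0 + a1 * y_prev + eps[t] + b1 * e1 + b12 * e12)
    return y

series = simulate_model_13(n=130, a0=1.0, a1=0.5, b1=0.3, b12=0.4)
# With no shocks (sigma=0) only the deterministic recursion remains.
noiseless = simulate_model_13(n=3, a0=1.0, a1=0.5, b1=0.3, b12=0.4, sigma=0.0)
```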

16. The Q-statistic can be used to test whether a group of autocorrelations is significantly different from zero. Box and Pierce used the sample autocorrelations to form the Q-statistic. Let there be T observations labeled $y_1$ through $y_T$. We can let $\bar{y}$ and $r_s$ be estimates of $\mu$ and $\rho_s$, respectively, where:

$$\bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t, \qquad r_s = \frac{\sum_{t=s+1}^{T}(y_t-\bar{y})(y_{t-s}-\bar{y})}{\sum_{t=1}^{T}(y_t-\bar{y})^2}, \qquad Q = T\sum_{k=1}^{s} r_k^2.$$

Under the null hypothesis that all values of $r_k = 0$, Q is asymptotically $\chi^2$ distributed with s degrees of freedom. The intuition behind the use of this statistic is that high sample autocorrelations lead to large values of Q. Certainly, a white noise process (in which all autocorrelations should be zero) would have a Q value of zero. If the calculated value of Q exceeds the appropriate value in a $\chi^2$ table, the null hypothesis of no significant autocorrelations can be rejected.
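A minimal sketch of the Box–Pierce statistic in its standard form, Q = T times the sum of the first s squared sample autocorrelations. The function name and the toy series are assumptions for illustration.

```python
def box_pierce_q(y, s):
    """Box-Pierce Q-statistic over the first s sample autocorrelations of y."""
    T = len(y)
    mean = sum(y) / T
    denom = sum((v - mean) ** 2 for v in y)
    q = 0.0
    for k in range(1, s + 1):
        r_k = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, T)) / denom
        q += r_k ** 2
    return T * q

q_value = box_pierce_q([1.0, 2.0, 3.0, 4.0, 5.0], s=1)
```

The resulting value would be compared with the chi-square critical value for s degrees of freedom at the chosen significance level.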

17. The two most commonly used model selection criteria are the Akaike information criterion (AIC) and the Schwarz Bayesian criterion (SBC). These criteria are used to select the most appropriate model. They have the following formulae: AIC = T ln(sum of squared residuals) + 2n and SBC = T ln(sum of squared residuals) + n ln(T), where n = number of parameters estimated (p + q + possible constant term) and T = number of usable observations. Ideally, the AIC and SBC will be as small as possible (note that both can be negative). As the fit of the model improves, the AIC and SBC will approach $-\infty$. Model A is said to fit better than model B if the AIC (or SBC) for A is smaller than for B.
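The two formulae in this note translate directly into code. The function names and the residual sums of squares below are hypothetical values chosen only to show the comparison.

```python
import math

def aic(ssr, n_params, T):
    """AIC = T * ln(sum of squared residuals) + 2n."""
    return T * math.log(ssr) + 2 * n_params

def sbc(ssr, n_params, T):
    """SBC = T * ln(sum of squared residuals) + n * ln(T)."""
    return T * math.log(ssr) + n_params * math.log(T)

# Hypothetical models A and B fitted to T = 100 usable observations.
aic_a, aic_b = aic(0.5, 3, 100), aic(0.6, 2, 100)
sbc_a, sbc_b = sbc(0.5, 3, 100), sbc(0.6, 2, 100)
preferred_by_aic = "A" if aic_a < aic_b else "B"
```

Because ln(T) exceeds 2 once T is 8 or more, the SBC penalizes extra parameters more heavily than the AIC for any realistic sample size.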

18. Q (n) reports the Ljung–Box Q-statistic for the autocorrelation coefficients of the squared n residuals of the estimated model. Significance levels are in parentheses.

19. Azadeh and Tarverdian (2007) present an integrated algorithm for forecasting monthly electrical energy consumption based on GA, computer simulation, and design of experiments using stochastic procedures. The Duncan multiple range test (DMRT) method of paired comparison is used to select the optimum model, which could be time series, GA, or simulation-based GA. To show the applicability and superiority of that algorithm, the monthly electricity consumption in Iran from March 1994 to February 2005 (131 months) is used.

20. Azadeh et al. (2007c) illustrate an ANN approach based on a supervised multi-layer perceptron network for electrical consumption forecasting. To train the ANN, PD have been extracted using time series techniques. That study shows the advantage of the ANN methodology through analysis of variance (ANOVA). Monthly electricity consumption in Iran was collected to train and test the network.

21. Azadeh et al. (2008) present an integrated fuzzy system, data mining, and time series framework to estimate and predict electricity demand for seasonal and monthly changes in electricity consumption, especially in developing countries such as China and Iran with non-stationary data. ANOVA is used to select the preferred model from the fuzzy models and the time series model. To show the applicability and superiority of that framework, the monthly electricity consumption in Iran from March 1994 to February 2005 (131 months) is used.
