Levenberg-Marquardt Recurrent Networks for Long-Term Electricity Peak Load Forecasting

Increasing electricity demand in Java-Madura-Bali, Indonesia, must be addressed appropriately to avoid blackouts by producing accurate peak load forecasts. An econometric approach may not be sufficient to handle this problem due to its limitations in modelling the nonlinear interaction of the factors involved. To overcome this problem, Elman and Jordan Recurrent Neural Networks based on the Levenberg-Marquardt learning algorithm are proposed to forecast the annual peak load of the Java-Madura-Bali interconnection for 2009-2011. Actual historical regional data consisting of economic, electricity-statistics, and weather variables during 1995-2008 are applied as inputs. The network structure is first justified using true historical data of 1995-2005 to forecast the peak load of 2006-2008. Afterwards, peak load forecasting of 2009-2011 is conducted using actual historical data of 1995-2008. Overall, the proposed networks show better performance than the Levenberg-Marquardt feedforward network, double-log multiple regression, and the projection by PLN for 2006-2010.


Introduction
Increasing electricity demand in the Java-Madura-Bali (hereafter "JaMaLi") region shortly after the economic crisis led to a blackout in 2003. A less accurate demand projection in terms of peak load possibly contributed to the situation, besides power plant breakdowns and postponed system expansion. Therefore, PT PLN (the Indonesian State Electricity Company) has been mandated to prepare and follow the National Electricity Planning and Provision (RUPTL) on a 10-year basis, in accordance with the national general planning in the electricity sector.
Artificial neural networks (ANN), in particular the feedforward structure, have been widely proposed for electricity long-term peak load forecasting (LTPF) as a promising alternative to the traditional econometric method [1,2]. However, to the best knowledge of the authors, few studies of LTPF using RNN have been reported, as found in [3][4][5][6][7]. For the case of the JaMaLi interconnection, an LTPF has been done using a feedforward network for 2007-2025, taking into account 10 actual historical factors of 2001-2006 [2]. The resulting annual growth rate in the range of 6.4-7.1% is considered comparable to that obtained by PLN. However, the network's performance was not verified, as no comparison was made between the network's forecasting result and the actual peak load.
In this research, instead of using an econometric approach as PLN does, an RNN with the LM learning algorithm is proposed as a new approach to JaMaLi's LTPF problem, taking into account 11 actual historical and projected factors during the period 1995-2011. This paper is organized as follows: the proposed method is presented in the next section, followed by the research method used in this study. Results and discussion are presented in the subsequent section, and finally the conclusion is given.

Proposed Method
A literature review reveals that none of the proposed RNNs utilized the Levenberg-Marquardt learning algorithm (RNN-LM), which is confirmed to provide the most accurate results with the fastest and most effective training [8,9]. In addition, RNN-LM has the potential to overcome a drawback of the econometric method: to obtain reasonably accurate results with that method, a constant difference of the factors affecting load demand is an important requisite. Hence, problems may occur when the econometric method is used, since it is not well adapted to model the nonlinear interaction among the variables affecting load demand, such as economic indicators and social indices [9,10].
RNN-LM is also expected to overcome the barrier posed by the length of available data for network training and forecasting. In other words, RNN-LM shall be beneficial when the set of available data is limited and difficult to obtain beyond a certain extent.

Elman and Jordan Recurrent Neural Network
To handle the LTPF problem, an RNN is likely to be suitable due to its ability to capture the information pattern given by the load at time t to make a forecast for t + 1 [3]. The general structures of the Elman and Jordan networks are illustrated in Figure 1. Note that the dashed line coming out of the output layer represents the feedback connection belonging to the Jordan network.

Nguyen-Widrow initialization method
In this research, the weights of each layer junction and each layer's bias in the network structures are initialized using the Nguyen-Widrow method, introduced by Derrick Nguyen and Bernard Widrow [12]. Initial weights are distributed so that learning proceeds more effectively. The hidden-layer weights w_ij are first generated randomly in the range of -1 to 1, and the initial weight values are then expressed using the factor β as

w_ij = β · w_ij^rand / ||w_j^rand||,    (1)

where w_ij is the initial parameter of the training algorithm. For the output layer, the initial weights are randomly generated in the range of -0.5 to 0.5. β is the factor obtained from the following equation:

β = 0.7 · p^(1/n),    (2)

where n is the number of network inputs and p is the number of hidden neurons.
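The initialization described above can be sketched in Python as follows. This is a minimal illustration with NumPy, not the authors' code; the function name and the uniform bias range in [-β, β] are our own illustrative choices.

```python
import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, rng=None):
    """Nguyen-Widrow initialization for a hidden layer (sketch).

    Hidden weights are drawn uniformly in [-1, 1], then each neuron's
    weight vector is rescaled to length beta = 0.7 * n_hidden**(1/n_inputs).
    """
    rng = np.random.default_rng() if rng is None else rng
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    w = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    w = beta * w / norms                      # each row now has norm beta
    # Biases drawn uniformly in [-beta, beta] (a common convention, assumed here)
    b = rng.uniform(-beta, beta, size=n_hidden)
    return w, b
```

For the structure used in this study (11 inputs, 15 recurrent-layer neurons), `nguyen_widrow_init(11, 15)` would return a 15×11 weight matrix whose rows all have norm β.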

Levenberg-Marquardt learning algorithm
One reason for selecting a learning algorithm is to speed up convergence. The Levenberg-Marquardt (LM) algorithm is an approximation to Newton's method that accelerates training. The benefits of the LM algorithm over the variable learning rate and conjugate gradient methods were reported in [8]. The LM algorithm is developed from Newton's method, where minimization of a function V(x) with respect to the parameter vector x can be defined as in [13]:

Δx = -[∇²V(x)]^-1 ∇V(x),    (3)

where ∇²V(x) is the Hessian matrix and ∇V(x) is the gradient. When V(x) is a sum of squared errors,

V(x) = Σ_i e_i²(x),    (4)

it can be shown that the gradient and the Hessian matrix can be written as

∇V(x) = J^T(x) e(x),    (5)

∇²V(x) = J^T(x) J(x) + S(x),    (6)

where J(x) is the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases, and e is the vector of network errors. The Jacobian matrix and S(x) are defined as

J(x) = [∂e_i(x)/∂x_j],    (7)

S(x) = Σ_i e_i(x) ∇²e_i(x).    (8)

For the Gauss-Newton method, Equation (8) is assumed to be zero, thus Equation (3) becomes

Δx = -[J^T(x) J(x)]^-1 J^T(x) e(x),    (9)

and the parameters are updated as

x(k+1) = x(k) + Δx.    (10)

Finally, the LM modification to the Gauss-Newton method is given as

Δx = -[J^T(x) J(x) + μI]^-1 J^T(x) e(x).    (11)

The parameter μ is multiplied by a factor γ whenever a step would result in an increased V(x); when a step reduces V(x), μ is divided by γ:

μ ← μγ if V(x) increases, μ ← μ/γ if V(x) decreases.    (12)

If μ is small, the algorithm becomes Gauss-Newton.
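The LM iteration just described can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation: a finite-difference Jacobian is used for simplicity, whereas network training would obtain the Jacobian via backpropagation, and all parameter values (μ, γ, tolerances) are illustrative.

```python
import numpy as np

def levenberg_marquardt(residual, x0, mu=1e-3, gamma=10.0, tol=1e-10, max_iter=100):
    """Minimize V(x) = sum(e(x)**2) with the LM step of Equation (11) (sketch)."""
    x = np.asarray(x0, dtype=float)

    def jacobian(x, h=1e-6):
        # Forward-difference approximation of J(x) = [de_i/dx_j]
        e0 = residual(x)
        J = np.empty((e0.size, x.size))
        for j in range(x.size):
            xp = x.copy()
            xp[j] += h
            J[:, j] = (residual(xp) - e0) / h
        return J

    for _ in range(max_iter):
        e = residual(x)
        V = e @ e                                  # V(x), Equation (4)
        J = jacobian(x)
        g = J.T @ e                                # gradient direction, Equation (5)
        if np.linalg.norm(g) < tol:
            break
        # LM step: dx = -(J^T J + mu I)^-1 J^T e, Equation (11)
        dx = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -g)
        if residual(x + dx) @ residual(x + dx) < V:
            x = x + dx
            mu /= gamma                            # step reduced V: toward Gauss-Newton
        else:
            mu *= gamma                            # step increased V: toward gradient descent
    return x
```

Fitting a toy linear model, e.g. `levenberg_marquardt(lambda p: p[0]*t + p[1] - y, [0.0, 0.0])` with `y = 2t + 1`, recovers the parameters (2, 1).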

Research Method
The data involved in this research, the network structure development, the training algorithm, and the testing mechanism are presented in the following subsections.

Data and study period
Eleven regional factors, including economic, social, electricity-statistics, and weather variables thought to influence the power demand of JaMaLi, are applied as inputs to the network, encompassing annual historical and projected data assembled from the 7 provinces in Java and Bali, together with the annual historical peak load of JaMaLi as the network's output target. The input variables for the networks are: gross regional domestic product (GRDP) with adjusted deflator; population; number of households; total electricity energy consumption; total installed power contracted; electricity energy consumption in the residential, commercial, industrial, and public sectors; electrification ratio; and cooling degree days (CDD).
The data are selected based on a preliminary investigation through literature review and observation of data patterns and trends in relation to peak load changes in JaMaLi. Moreover, part of the selected data is typically used in the econometric approach applied by the utility for the JaMaLi interconnection. PLN data were taken from the true historical record for the period 1995-2008 and used as training input data for the proposed networks so that the networks could generate an appropriate pattern, whereas the input data for the 2009-2011 forecast are based on prediction results by PLN and other government institutions. In this research, the significance of the selected factors is checked before training the networks. Five major factors give significant influence, in order of their contribution: total electricity energy consumption, GRDP, electricity consumption in the residential sector, number of households, and total installed power contracted. Meanwhile, the other factors provide more or less equal contributions.
The complete time frame covers 17 years of data, consisting of 14 years (1995-2008) of historical data and 3 years (2009-2011) of forecasting data. The peak load officially projected by PLN is shown for comparison purposes.

Network structures
The networks of both the Elman and Jordan type encompass 2 layers with a distinct activation function in each layer. The number of neurons in the recurrent layer for both structures is set to 15, and the number of neurons in the output layer is set to 1. The number of hidden neurons is determined following the rule proposed by Jadid and Fairbairn [14], given in Equation (13), where N_hdn is the number of hidden neurons, N_trn is the number of training data, N_inp is the number of input neurons, and N_out is the number of output neurons.
The network structure, consisting of the number of neurons, weights, bias parameters, and activation functions, is presented in Table 1. The activation function 'logsig' is applied to produce the output of the first layer, since the output of the network should be a positive value. The input received by the layer using 'logsig' is within the range of -1 to 1 after the preprocessing scheme. For the output layer, 'purelin' is applied.
The mathematical relationship among the layers, considering the transfer functions in the proposed Elman and Jordan network structures, is illustrated in Figure 2. Note that the feedback connection represented by the dashed line is for the Jordan network. All calculated values obtained from Equations (1), (2), and (13), such as the initial weights, layer weights, and biases, are inserted into the structure depicted in Figure 2. For instance, the first layer of the Elman network has the relationship a1(k) = logsig(IW1,1·p + LW1,1·a1(k-1) + b1), whereas the output layer has a2(k) = purelin(LW2,1·a1(k) + b2). This relationship is applied until the output is found. Whenever the error target is not yet reached, computation continues using the LM algorithm given in Equations (3)-(12).
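The layer relationships above can be sketched as a forward pass in Python. This is an illustrative sketch (names and shapes are our own), assuming the two-layer Elman structure with a 'logsig' recurrent layer and a 'purelin' output layer; for the Jordan variant, the recurrent term would instead feed back the previous output a2(k-1).

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid, the 'logsig' transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def elman_forward(p_seq, IW, LW_rec, b1, LW_out, b2):
    """Forward pass of the two-layer Elman network (sketch).

    a1(k) = logsig(IW @ p(k) + LW_rec @ a1(k-1) + b1)
    a2(k) = LW_out @ a1(k) + b2        # 'purelin' output layer
    """
    a1 = np.zeros(b1.shape)            # context state initialized to zero
    outputs = []
    for p in p_seq:
        a1 = logsig(IW @ p + LW_rec @ a1 + b1)
        a2 = LW_out @ a1 + b2
        outputs.append(a2)
    return np.array(outputs)
```

With the shapes used in this study (11 inputs, 15 recurrent neurons, 1 output), a sequence of K annual input vectors yields K scalar peak load outputs.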
The network structure consists mainly of four layers: input layer, hidden layer, context layer, and output layer. However, the overall framework of the RNN encompasses a two-layer network structure, with the context layer accommodated through a delay connection for the respective layer. In the case of the Elman network, these are identified as the first and second layers, with a feedback connection from the first layer's output to the first layer's input. This framework is used in this study.

Training algorithm and testing mechanism
At each step, input vectors are presented to the network and an error is generated. The error is then backpropagated to find the gradient of the error with respect to each weight and bias. This approximate gradient is then used to update the weights with the chosen learning function. With the LM learning algorithm, a complete training algorithm for both the proposed Elman and Jordan networks proceeds as follows:
a. Apply the preprocessing scheme to scale down the input and target vectors so that they always fall within the range of -1 to 1.
b. Create an RNN structure and define the network training parameters, such as the error target and the number of epochs.
c. Present all treated inputs and the corresponding target outputs from step a to the network.
d. Generate initial weights and biases using the Nguyen-Widrow method.
e. Compute the output of each network, involving feedback from the first layer in the case of the Elman network, or feedback from the output layer in the case of the Jordan network.
f. Obtain the network outputs and errors e(x) with respect to all inputs.
g. Obtain the Jacobian matrix J(x).
h. Solve Equation (11) to obtain Δx.
i. Recompute V(x) using x + Δx. If this value is less than that computed in step f, then reduce μ by the factor γ, update x ← x + Δx, and return to step f; otherwise, increase μ by γ and go to step h.
j. The algorithm is completed when ∇V(x) has been reduced to a value equal to or lower than the predetermined error value.
Network performance is defined through a predetermined mean squared error (MSE). The error is calculated as the difference between the target output and the network output, as given by

MSE = (1/N) Σ_i e_i².

The numerical forecasting result is measured in terms of the mean absolute percentage error (MAPE) compared with the actual peak load in the respective year. MAPE is given by

MAPE = (100%/N) Σ_i |y_i - ŷ_i| / y_i,

where y_i is the actual peak load for year i, ŷ_i is the forecast peak load for year i, e is the network's error vector, and N is the number of inputs to the network.
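The two error measures used above, MSE for training performance and MAPE for forecasting accuracy, can be computed as follows (a straightforward sketch; the function names are our own):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between target and network output."""
    e = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(e ** 2)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent, relative to actual values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```

For example, forecasts of 99 and 202 against actual peak loads of 100 and 200 give a MAPE of 1.0%.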

Result and Discussion
Figures 3 and 4 give a graphical view of how the forecasting results are achieved in the first and second experiments by the proposed networks, together with a comparison to the results available from PLN. In this regard, the peak load forecasts by PLN are available in references [15,16] and were obtained using an econometric approach.
As shown in Table 2, the peak load forecasting results of both LMFN and DLMR are obtained from [1], whereas the forecasts provided by PLN are taken from [15,16] and are based on an econometric approach. In the first experiment, denoted E1 and J1, both the Elman and Jordan networks are well trained using the 1995-2005 input data and are considered to produce satisfactory forecasting outputs for 2006-2008, since the average errors in terms of MAPE are 0.18% and 0.16%, respectively. Meanwhile, the LMFN error is slightly higher at 0.22%, followed by DLMR at 0.39%. In addition, the yearly errors obtained by the proposed networks are less than 1%, compared with 3.16% for the PLN projection.
In the second experiment, denoted E2 and J2, both the Elman and Jordan networks are trained over the expanded period up to 2008 to strengthen the network pattern and generate the peak load forecast for 2009-2011. As can be observed, the forecast peak loads for 2009 are 17,229 MW, 17,232 MW, 17,269 MW, 18,788 MW, and 18,854 MW, exhibited by the proposed Elman network, the proposed Jordan network, LMFN, DLMR, and PLN, respectively. The least forecasting error, 0.10%-0.12%, is obtained by the proposed recurrent networks. On the other hand, the worst forecasting error, 9.55%, is exhibited by PLN. In addition, the differences between the forecasts of the proposed networks under the second experiment and those available from PLN are less than 7%, which is said to be acceptable for PLN's LTPF. Since no safety factor is found anywhere in the published documents of PLN's electricity expansion planning, the MAPE difference between RNN-LM and PLN, which is considered large, may occur mainly due to differences in selecting the factors thought to affect peak load forecasting. In this regard, elasticity and the possibility of captive power diversion to the grid are taken into consideration in the forecast made by PLN [16].
It should be noted that the main objective of this research is to compare the capability of the selected methods with respect to their forecasting accuracy over the given period. There is a difference in forecasting methodology between the ANN and the econometric approach, the latter previously applied using DLMR and the method used by PLN. When applying an ANN, the immediate concern is to achieve a reasonably accurate network training output, which is obtained by capturing the peak load pattern over the past period when the network is trained within the specified limited number of epochs. In this research, the MSE target is set to 1×10^-5, for which the network is expected to provide a good pattern for forecasting purposes, as succeeded in this study. In other words, we can determine how much error we allow in the network while it still generates a reasonably good pattern. On the other hand, with the regression approach, the error produced by the model over the several variables contributing to it can only be calculated afterwards. That is why the ANN fitted error for 1995-2008, in terms of MAPE or MSE, is far less than that generated by the regression model. The MAPE or MSE of the ANN can practically be considered zero during 1995-2005 for the first experiment and during 1995-2008 for the second experiment.

Conclusion
Experiments using the proposed Elman and Jordan networks have been conducted in this research to deal with the long-term peak load forecasting problem for JaMaLi, taking into account several factors thought to influence the region's peak load pattern. The ability of the networks to generate fairly good results is quite satisfactory in terms of low MAPE, although within limited forecasting periods, for which the network patterns were strengthened despite the limited training period. Future research may deal with the application of optimization techniques to further strengthen the network patterns and improve the results given the limited period of data.

Figure 1 .
Figure 1. Elman and Jordan Recurrent Neural Networks architecture [11]

Figure 2 .
Figure 2. Elman and Jordan RNN structure applied for training and testing

In this paper, two experiments through simulations are presented to obtain the proposed networks' responses in terms of the resulting output changes with respect to different sets of input and target training outputs. The objectives of each experiment are as follows:
a. The first experiment is the base case simulation. Its objectives are to justify the network structure by obtaining the forecast peak load for 2006-2008 and to compare the result with the corresponding actual peak load.
b. The second experiment is carried out to test the network response in producing the forecast peak load for 2009-2011. The length of data presented to the network is extended to achieve a more accurate result by strengthening the network output pattern.

Figure 3 .
Figure 3. Elman (a) and Jordan (b) network training and forecasting results, first experiment

Table 2 presents the actual historical peak load of 2006-2009 (APL), the peak load forecasts of all experiments by the Elman and Jordan networks (LM-Recurrent Network), by the double-log multiple regression (DLMR), by the LM-feedforward network (LMFN), and by PLN. The forecasting error in terms of MAPE, written in parentheses, is shown directly below each year's forecast for all methods.

Figure 4 .
Figure 4. Elman (a) and Jordan (b) network training and forecasting results, second experiment

Table 2 .
Comparison of forecasting results in MW and MAPE