Analysis of factors affecting the area of forest and land fires in Indonesia using spatial regression with GeoDa and SaTScan

I. Introduction
Indonesia has the largest tropical rainforest region in tropical Asia. At present, Indonesia's forest area is 144 million hectares, of which 64.4% is still forested. It contains 7 main forest types, with variations of up to 18 types of forest, including bamboo forest, nipah forest, sago forest, and savanna forest [1]. Among the factors affecting Indonesia's tropical forests are forest fires, the uneven distribution of rainfall within an area, and the effects of wind speed.
During the 1997 forest fires, national mass media reported that 176 companies were accused of starting forest fires during land clearing, 133 of which were plantation companies. The development of oil palm plantations was therefore one of the causes of the 10 million hectares of forest fires in 1997/98, with economic losses reaching US$9.3 billion [2]. The extent of forest and land fires can be estimated using spatial regression to analyze the relationship between rainfall, fire events, and wind speed. In this case, the existing data have a uniform distribution pattern and form a pattern of adjacent neighbors.
The regression methods used in the estimation are the Spatial Autoregressive (SAR) model, the Spatial Error Model (SEM), and the Spatial Autoregressive Moving Average (SARMA) model. The spatial weights are based on queen contiguity. The results of the analysis with these methods show whether the data exhibit a spatial effect on the variables studied.
The purpose of this study is to analyze the area of forest and land fires and to assess the influence of rainfall, fire events, and wind speed on it.

A. Spatial Statistics
Spatial data require special methods of analysis, and spatial statistics is the body of statistical methods used to analyze them. Spatial data are data that contain "location" information: they record not only "what" is measured but also where the measurement is located. Spatial data may include geographic information such as the latitude and longitude of each region and of the borders between regions. Put simply, spatial data can be expressed as address information. Alternatively, spatial data can be expressed as grid coordinates, as in a grid map, or as pixels, as in satellite imagery. Spatial statistical analysis is therefore usually presented in the form of thematic maps [3].

ABSTRACT
This study discusses the factors that influence the extent of forest and land fires in Indonesia, relating it to rainfall, fire events, and wind speed recorded during 2015. Forest fires are an environmental and forestry problem of both local and global concern. Countermeasures have been carried out for a long time, but their success has been relatively low. The best regression model, tested at a significance level of 0.05 (95% confidence), is the Spatial Autoregressive (SAR) model. It gives a coefficient of determination of 25.00%: the model explains 25.00% of the variation, and the remaining 75.00% is explained by other variables not included in the model.

B. Spatial Regression Model
The spatial regression model is formed from the general regression model by adding spatial (location) effects. In the spatial regression model, the value of the response variable at a location can depend on values observed at neighboring locations. Four models can be formed from the General Spatial Model, depending on which spatial coefficients are zero.

C. Spatial Autocorrelation
A spatial model requires a measure of the correlation between locations, which is called spatial autocorrelation. Spatial autocorrelation is an estimate of the correlation between observed values of the same variable at different spatial locations. When the spatial autocorrelation is positive, adjacent locations have similar values and tend to cluster. When it is negative, adjacent locations have different values and tend to be dispersed [4]. Kosfeld describes the characteristics of spatial autocorrelation as follows:
1. If there is a systematic pattern in the spatial distribution of the observed variable, then there is spatial autocorrelation.
2. If regions that are closer or adjacent have more similar values, there is positive spatial autocorrelation.
3. Negative spatial autocorrelation describes an unsystematic adjacency pattern.
4. A random pattern in the spatial data indicates no spatial autocorrelation.
There are many measures of spatial autocorrelation. Those commonly used are Moran's Index (Moran's I), Geary's C, and Tango's excess. In this study, the analysis is limited to Moran's I [4][7]. This measure can be used to detect departures from spatial randomness. Such departures may indicate clustering or a trend across space.
In the Pearson correlation equation, $\bar{x}$ and $\bar{y}$ are the sample means of the predictor and response variables. The value of $\rho$ measures whether the predictor and response variables are correlated.
According to [6], [7], and [8], the Moran's I coefficient is used to test for spatial dependence, or autocorrelation, between observations or locations.
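As an illustration of the coefficient discussed above, the following is a minimal sketch of the global Moran's I, assuming its usual definition: I = (n / S0) * sum over i, j of w_ij (x_i - mean)(x_j - mean), divided by the sum over i of (x_i - mean)^2, where S0 is the sum of all weights. The function name `morans_i`, the four-region chain of neighbors, and the example values are hypothetical and are not taken from the study data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    # S0: sum of all spatial weights.
    s0 = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of neighbors.
    numerator = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    denominator = sum(d * d for d in deviations)
    return (n / s0) * (numerator / denominator)


# Hypothetical example: four regions in a row, each adjacent to the next.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 2, 3, 4], chain_weights)    # similar neighbors
alternating = morans_i([1, 4, 1, 4], chain_weights)  # dissimilar neighbors
print(clustered, alternating)
```

In this toy case the smoothly increasing values give a positive index and the alternating values give a negative one, which matches the interpretation of positive and negative spatial autocorrelation given in this section.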

III. Method
The research method consists of the steps, or sequence of activities, that serve as general guidelines for carrying out the research so that its purpose is achieved. This research uses secondary data, which are first analyzed by multiple regression and then modeled with the SAR, SEM, and SARMA methods. These methods are used to determine the presence or absence of spatial effects in the data. Hypothesis testing is then carried out using the Lagrange Multiplier (LM) Error and LM Lag tests.
Neighborhood weighting uses the queen contiguity approach, named after the moves of the queen on a chessboard: only the areas directly surrounding an area are treated as its neighbors, so the neighbor order is 1. The contiguity elements are stored in vector and matrix form. The queen weighting matrix defines W_ij = 1 for areas whose side or corner meets the area of concern, and W_ij = 0 for all other areas.
The spatial weighting matrix is a symmetrical matrix and the main diagonal is always zero.
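The queen weighting rule can be sketched on a regular grid, where each cell plays the role of an area. This is an assumption made only for illustration: the provinces in the study are irregular polygons, and in practice the weights would be derived from their boundaries. The function name `queen_weights` and the 3 x 3 grid are hypothetical.

```python
def queen_weights(n_rows, n_cols):
    """Binary queen contiguity matrix for cells of an n_rows x n_cols grid.

    Cells are numbered row by row. W[i][j] = 1 when cells i and j share
    a side or a corner, and 0 otherwise (including i == j).
    """
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbor
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i][rr * n_cols + cc] = 1
    return w


grid_w = queen_weights(3, 3)
size = len(grid_w)
is_symmetric = all(
    grid_w[i][j] == grid_w[j][i] for i in range(size) for j in range(size)
)
zero_diagonal = all(grid_w[i][i] == 0 for i in range(size))
print(sum(grid_w[4]), sum(grid_w[0]), is_symmetric, zero_diagonal)
```

The center cell of the grid has eight neighbors and a corner cell has three. The two checks confirm the properties stated above: the matrix is symmetric and its main diagonal is zero.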

A. Descriptive Analysis
Descriptive analysis aims to describe the state of the data. It includes measures of central tendency (mean, median, and mode), measures of spread (range, standard deviation, and variance), and measures of shape (skewness and kurtosis). To display a summary of the data, use the command summary().
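The same kind of summary can be sketched in Python with the standard library `statistics` module. This is only an illustration of the measures listed above, not the command used in the study. The function name `describe` and the sample values are hypothetical.

```python
import statistics


def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }


# Hypothetical sample, not data from the study.
sample = [2, 4, 4, 5, 7, 9]
sample_summary = describe(sample)
for name, value in sample_summary.items():
    print(name, value)
```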

B. Multiple Linear Regression Analysis
Multiple linear regression analysis models a linear relationship between two or more independent variables (X1, X2, ..., Xn) and a dependent variable (Y). The analysis determines the direction of the relationship between each independent variable and the dependent variable, that is, whether it is positive or negative, and predicts the value of the dependent variable when the values of the independent variables increase or decrease. The data used are usually on an interval or ratio scale.
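A minimal sketch of fitting such a model is given below, assuming ordinary least squares estimation through the normal equations (X'X)b = X'y, solved by Gaussian elimination. The function names `solve_linear_system` and `fit_ols` are hypothetical, and the data are constructed so that Y = 2 + 3 X1 - X2 exactly. They are not the study data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [a, b1, ..., bn]: the constant followed by the slopes."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the constant
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from Y = 2 + 3*X1 - 1*X2 with no noise.
predictors = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
response = [3, 7, 6, 11, 9]
coefficients = fit_ols(predictors, response)
print(coefficients)
```

The signs of the fitted slopes give the direction of each relationship: here the first predictor is positively related to the response and the second is negatively related.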

C. Proportion
A proportion is the number (frequency) of units with a certain property relative to the total number of units. It is a special form of ratio in which the numerator is part of the denominator.
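As a worked example, the percentage of cases in an area can be computed as cases divided by population, times 100. The sketch below assumes this is how the case percentages in the results section are obtained, and uses two case and population counts reported there. The function name `case_percentage` is hypothetical.

```python
def case_percentage(cases, population):
    """Proportion of cases in a population, expressed as a percentage."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * cases / population


# Counts reported in the results section for two of the clusters.
first_cluster = case_percentage(2076, 2239802)
central_kalimantan = case_percentage(123, 196676)
print(round(first_cluster, 2), round(central_kalimantan, 2))
```

Rounded to two decimals these give 0.09 and 0.06, which agree with the percentages reported for those clusters.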

D. SaTScan
SaTScan is free software that analyzes spatial, temporal, and space-time data using spatial, temporal, or space-time scan statistics. The analysis is applied to the Y variable, namely the area of fire and the area of non-fire land. The data were obtained from https://www.bps.go.id and http://sipongi.menlhk.go.id, and the coordinates of each province in Indonesia were obtained using Google Maps.
The SaTScan software has three menus: Input, Analysis, and Output. The Input menu takes a case file, a control file, and a coordinates file.

IV. Results and Discussion
Figure 1 shows the area of fire, which is the variable Y in this case, and Figure 2 shows the conditional map of rainfall, fire events, and wind velocity.

A. Univariate Moran I Index
The fire area data (Figure 3) show a pattern that spreads around one point and contain no outliers. The straight line has a negative trend because it slopes downward, while the values themselves are positive rather than negative.

SaTScan reported that 33 locations were input, with a total population of 8,090,184 people. The total number of cases was 2,791, and the percentage of cases in the area was 3%. The next output is the division into clusters. Figure 10 shows that the first cluster has a population of 2,239,802 and is located in the provinces of West Nusa Tenggara, Bali, East Nusa Tenggara, East Java and South South. The percentage of cases in the area is 0.09%, with a p-value <0.00000000000000001. The number of cases is 2,076.

Conclusions obtained on provinces that have been obtained on the most extensive land

Figures 14 and 15 show a cluster with a population of 196,676 people, located in only one province, Central Kalimantan. The percentage of cases in the area is 0.06%, and the p-value, the same as for the other clusters, is 0.00000014. The number of cases is 123.


The data used in this study are secondary data obtained from the Central Agency on Statistics in 2015. The 34 observations were collected from provinces across Indonesia, with the following variables:

Y = Area of fire (ha)
X1 = Rainfall
X2 = Fire events
X3 = Wind velocity

The general regression model used is as follows:

$$y = \rho W_1 y + X\beta + \mu, \qquad \mu = \lambda W_2 \mu + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

where $y$ is the response variable vector ($n \times 1$), $X$ is the matrix of independent variables ($n \times (p+1)$), $\beta$ is the vector of regression coefficients ($(p+1) \times 1$), $\rho$ is the spatial autoregression coefficient, $\lambda$ is the spatial coefficient of the error term with $|\lambda| < 1$, $\mu$ is the error vector ($n \times 1$) that is assumed to contain autocorrelation, $\varepsilon$ is the error vector ($n \times 1$) that is normally distributed with zero mean and variance $\sigma^2 I$, $W_1$ and $W_2$ are spatial weight matrices ($n \times n$), and $n$ is the number of observations.

There are four models that can be formed from the General Spatial Model:

1. If $\rho = 0$ and $\lambda = 0$, then the equation becomes:
$$y = X\beta + \varepsilon$$
This equation is called the classical linear regression model, namely the regression model without spatial influence.
2. If $\rho \neq 0$ and $\lambda = 0$, then the equation becomes:
$$y = \rho W_1 y + X\beta + \varepsilon$$
This equation is called the Spatial Lag Model (SLM), also known as the Spatial Autoregressive Model (SAR).
3. If $\rho = 0$ and $\lambda \neq 0$, then the equation becomes:
$$y = X\beta + \mu, \qquad \mu = \lambda W_2 \mu + \varepsilon$$
This equation is called the Spatial Error Model (SEM).
4. If $\rho \neq 0$ and $\lambda \neq 0$, then the equation becomes:
$$y = \rho W_1 y + X\beta + \mu, \qquad \mu = \lambda W_2 \mu + \varepsilon$$
This equation is referred to as the General Spatial Model (GSM) or the Spatial Autoregressive Moving Average (SARMA) model.

The hypotheses for the spatial dependence tests are:
H0: $\rho = 0$, $\lambda = 0$ (no spatial lag or spatial error dependence)
H1: $\rho \neq 0$, $\lambda \neq 0$ (spatial lag and spatial error dependence are present)

Homoskedasticity was then tested with the Breusch-Pagan test and the Koenker-Bassett test, using the following hypotheses:
H0: the assumption of homogeneity of the residuals is fulfilled
H1: the assumption of homogeneity of the residuals is not fulfilled
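To make the spatial lag term of the SLM/SAR model concrete, the sketch below computes the product of a weight matrix and a response vector: for each area, the weighted average of its neighbors' values. Row-standardizing the weights is a common convention and an assumption here, since the source does not state how the weights were scaled. The function names and the three-area example are hypothetical.

```python
def row_standardize(weights):
    """Scale each row of a weight matrix so that it sums to 1."""
    standardized = []
    for row in weights:
        total = sum(row)
        # An area with no neighbors keeps an all-zero row.
        standardized.append([w / total if total else 0.0 for w in row])
    return standardized


def spatial_lag(weights, y):
    """Return W y: the weighted sum of neighboring values for each area."""
    return [sum(w * v for w, v in zip(row, y)) for row in weights]


# Hypothetical example: three areas in a row; area 1 borders areas 0 and 2.
binary_w = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
y_values = [10.0, 20.0, 60.0]
lagged = spatial_lag(row_standardize(binary_w), y_values)
print(lagged)
```

With row-standardized weights, each entry of the result is the mean of the neighboring values, so the middle area receives the average of its two neighbors. In the lag model this quantity is multiplied by the spatial autoregression coefficient.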
The multiple linear regression equation is as follows:

Y' = a + β1X1 + β2X2 + ... + βnXn

Explanation:
Y' = Dependent variable (predicted value)
X1, X2, ..., Xn = Independent variables
a = Constant (value of Y' when X1, X2, ..., Xn = 0)
β1, β2, ..., βn = Regression coefficients (value of increase or decrease)

The SaTScan input files are as follows:

1. Case file
Format: <zip=location ID> <number of cases> <date>
Location ID is the ID of the case location, number of cases is the number of cases at that location, and date is the date of the case in date format (example: 12/31/2017).
2. Control file
Format: <zip=location ID> <number of controls> <date>
Location ID is the ID of the case location, number of controls is the number of controls at that location, and date is the date in date format (example: 12/31/2017).
3. Coordinates file
Format: <zip=location ID> <latitude> <longitude>
Location ID is the ID of the case location. Latitude and longitude form the geographical coordinate system used to determine the location of a place on the surface of the earth.

Tuti Purwaningsih et al. (Analysis of factors affecting the area of forest and...)
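The three input file layouts listed above can be produced with a few lines of Python. This is a minimal sketch that assumes the fields are separated by single spaces, following the format strings above. The function names, the location ID "P01", the counts, and the coordinates are all hypothetical placeholders, not values from the study.

```python
def case_line(location_id, n_cases, date):
    """One line of a case file: <location ID> <number of cases> <date>."""
    return f"{location_id} {n_cases} {date}"


def control_line(location_id, n_controls, date):
    """One line of a control file: <location ID> <number of controls> <date>."""
    return f"{location_id} {n_controls} {date}"


def coordinate_line(location_id, latitude, longitude):
    """One line of a coordinates file: <location ID> <latitude> <longitude>."""
    return f"{location_id} {latitude} {longitude}"


# Hypothetical records for two locations (placeholder values only).
cases = [("P01", 12, "12/31/2017"), ("P02", 5, "12/31/2017")]
coordinates = [("P01", -2.5, 113.5), ("P02", -8.4, 115.2)]

case_file_text = "\n".join(case_line(*record) for record in cases)
coordinate_file_text = "\n".join(coordinate_line(*record) for record in coordinates)
print(case_file_text)
print(coordinate_file_text)
```

Each text block can then be written to disk with the usual `open(path, "w")` call and selected on the Input menu. The same location ID must appear in all three files so that cases, controls, and coordinates can be matched.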

Fig. 10. Cluster #1

Figure 11 shows that the population of the second cluster is 935,606 people, located in the provinces of North Maluku, East Nusa Tenggara, Southeast Sulawesi, Maluku, and South Sulawesi. The percentage of cases in the area is 0.1%, and the p-value is the same as for the other clusters, namely <0.00000000000000001. The number of cases was 1,212 during 2015.

Fig. 13. Cluster #4

Vol. 12, No. 2, July 2018, pp. 58-70

D. Moran's I
Moran's Index is the oldest measure of spatial autocorrelation [4]. Moran's I is developed from the Pearson correlation for univariate series data. The Pearson correlation ($\rho$) between a predictor variable and a response variable with $n$ observations can be formulated as follows [4]:

$$\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$