Determining Effective Factors on Forest Fire Using the Compound of Multivariate Adaptive Regression Spline and Genetic Algorithm, a Case Study: Golestan, Iran
Pahlavani, P., Assistant professor at School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran
Raei, A., PhD Candidate of GIS at School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran
Bigdeli, B., Assistant professor at School of Civil Engineering, Shahrood University of Technology
Keywords: Forest Fire, Multivariate adaptive regression spline, Multiple linear regression, Logistic regression, Genetic Algorithm.
- Introduction
Determining the factors that drive forest fires is important because large areas of forest around the world are destroyed by fire every year, and repeated burning over the long term can cause irreparable damage to the earth and its inhabitants. Knowing these factors helps identify the most dangerous locations and times for forest fire, so that many of the driving factors can be controlled through law enforcement, efficient forest management policies, and closer supervision. In the current study, we identified the factors affecting fire in the Golestan forest by integrating each of three methods (multiple linear regression, logistic regression, and multivariate adaptive regression spline) with a Genetic Algorithm.
- Study Area
Golestan Province is in the north of Iran, and 18% of it is covered by forests. It is a tourist destination with several roads passing through its forests, and according to statistical records most fires have occurred close to these roads. Our study area lies between 36°53′-37°25′N and 55°5′-55°50′E and covers about 3719.5 km². We selected this area because it includes most of the fires that have occurred in Golestan Province in recent years.
- Materials and Methods
A large fire occurred on 12 December 2010 in our study area, and we used it as the dependent variable. The actual burnt area and other data, such as the Digital Elevation Model (DEM), the road network, the rivers, the land uses, and the soil types in the area, were provided by the Golestan Province Department of Natural Resources. In addition, the geographic coordinates of the synoptic weather stations near the area and their data for December 2010, including maximum, minimum, and mean temperature, total rainfall, and maximum wind speed and azimuth, were obtained from the National Meteorological Organization of Iran.
The land use and soil layers were at a scale of 1:100000, the road and river layers were at 1:5000, and all of them date from 2006. The DEM of the region was generated from 1:25000 topographic maps of the Iran National Cartographic Center with a spatial resolution of 30 m, and we derived the slope and aspect layers from it in ArcGIS at the same resolution. The roads and rivers were in vector format, so we used Euclidean Distance analysis to generate rasters in which each cell stores the distance to the nearest road or river.
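The distance rasters described above can be illustrated with a minimal NumPy sketch. It is not the ArcGIS tool itself: the function name `euclidean_distance_raster`, the toy 4x5 grid, and the brute-force search are assumptions made for illustration, and only the 30 m cell size comes from the text.

```python
import numpy as np

def euclidean_distance_raster(feature_mask, cell_size=30.0):
    """Distance (in map units) from every cell to the nearest feature cell.

    feature_mask: 2-D boolean array, True where a road or river cell lies.
    cell_size: raster resolution in metres (30 m in the text).
    """
    rows, cols = np.indices(feature_mask.shape)
    feat_r, feat_c = np.nonzero(feature_mask)
    if feat_r.size == 0:
        raise ValueError("feature_mask contains no feature cells")
    # Brute force: distance from every cell to every feature cell,
    # then keep the minimum. Fine for a toy grid, too slow for real rasters.
    dr = rows[..., None] - feat_r
    dc = cols[..., None] - feat_c
    return cell_size * np.sqrt(dr ** 2 + dc ** 2).min(axis=-1)

# Hypothetical toy raster: a "road" running down the first column.
road = np.zeros((4, 5), dtype=bool)
road[:, 0] = True
dist = euclidean_distance_raster(road)
print(dist[2, 3])  # three cells away from the road
```

For rasters of realistic size a distance transform would normally replace the brute-force minimum; the sketch only shows what each output cell contains.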
Initially we had data from only 5 weather stations, which is very few for GWR. We therefore generated 1000 random points in the area and interpolated the station data to these points using Ordinary Kriging with an exponential semivariogram model at 30 m resolution in ArcGIS.
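As a rough illustration of this interpolation step, the sketch below solves the ordinary kriging system with an exponential semivariogram in NumPy. The station coordinates, temperatures, and semivariogram parameters (`sill`, `range_`, `nugget`) are hypothetical; the fitted values used in ArcGIS are not given in the text.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_=10.0, nugget=0.0):
    """Exponential semivariogram, with gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / range_)), 0.0)

def ordinary_kriging(xy, z, targets, **vario):
    """Predict z at the target points; returns (predictions, weights)."""
    xy, z, targets = (np.asarray(a, dtype=float) for a in (xy, z, targets))
    n = len(z)
    # Semivariances between all pairs of stations.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Kriging matrix with a Lagrange multiplier forcing the weights to sum to 1.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, **vario)
    A[n, n] = 0.0
    # Semivariances between each station and each target point.
    d0 = np.linalg.norm(targets[:, None, :] - xy[None, :, :], axis=-1)
    b = np.ones((n + 1, len(targets)))
    b[:n, :] = exp_variogram(d0, **vario).T
    weights = np.linalg.solve(A, b)[:n, :]
    return weights.T @ z, weights

# Hypothetical stations (km) and mean temperatures (degrees C).
stations = [[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]]
temps = [4.0, 6.0, 5.0, 7.0, 5.5]
targets = [[0, 0], [2.5, 7.5]]  # the first target coincides with a station
pred, weights = ordinary_kriging(stations, temps, targets, range_=10.0)
print(pred)
```

With a zero nugget, kriging reproduces the observed value at a station location, which is a convenient sanity check on the system.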
The multiple linear regression (MLR) model generalizes simple linear regression and models the linear relation between one dependent variable and several independent variables. The general formula of MLR is given below:
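$$ y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \varepsilon \qquad (1) $$
where $y$ is the dependent variable, $x_i$ are the $n$ independent variables, $\beta_i$ are the unknown coefficients, and $\varepsilon$ is the error term.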
The unknown coefficients are obtained using least squares adjustment as follows:
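$$ \hat{\boldsymbol{\beta}} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y} \qquad (2) $$
where $\mathbf{X}$ is the design matrix of the independent variables (with a leading column of ones) and $\mathbf{y}$ is the vector of observations.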
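The least squares step can be sketched in a few lines of NumPy. The data below are synthetic and noiseless, chosen only so that the recovered coefficients are easy to check; `fit_mlr` is a hypothetical helper name.

```python
import numpy as np

def fit_mlr(X, y):
    """Least squares estimate of [intercept, slopes...] for y ~ X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    # lstsq solves the same problem as the normal equations, more stably.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

# Synthetic example: y = 1 + 2*x1 - 0.5*x2 exactly.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]
beta = fit_mlr(X, y)
print(beta)
```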
The logistic regression (LR) model is a nonlinear model for determining the relation between a binary dependent variable and several independent variables. If we use the values 0 and 1 for non-fire and fire points respectively, then the probability that a point is a fire point is given by Eq. (3):
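$$ P(y = 1 \mid x_1, \dots, x_n) = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{n} \beta_i x_i\right)}} \qquad (3) $$
where $x_i$ are the independent variables and $\beta_i$ are the unknown coefficients.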
If the number of parameters is small compared with the number of observations, the unknown coefficients of this model are computed by unconditional maximum likelihood estimation, shown in Eq. (4).
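$$ L(\boldsymbol{\beta}) = \prod_{j=1}^{N} P_j^{\,y_j} \left(1 - P_j\right)^{1 - y_j} \qquad (4) $$
where $y_j$ is the 0/1 label of the $j$-th of $N$ observations and $P_j$ is its fire probability from Eq. (3).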
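A minimal sketch of maximum likelihood estimation for logistic regression follows, using Newton-Raphson iterations on the log-likelihood. The solver choice, the iteration count, and the synthetic fire/non-fire labels are assumptions for illustration; the text does not say which optimizer was used.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_likelihood(beta, A, y):
    """Log of the Bernoulli likelihood for labels y and design matrix A."""
    p = np.clip(sigmoid(A @ beta), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_logistic(X, y, n_iter=25):
    """Maximize the log-likelihood with Newton-Raphson steps."""
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (y - p)                       # gradient of log-likelihood
        hess = A.T @ (A * (p * (1 - p))[:, None])  # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta, A

# Synthetic labels drawn from a known model (intercept -0.5, slope 1.5).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = (rng.random(1000) < sigmoid(-0.5 + 1.5 * x)).astype(float)
beta, A = fit_logistic(x, y)
print(beta, log_likelihood(beta, A, y))
```

The fitted coefficients should land near the generating values, and the log-likelihood at the fit should exceed that of the all-zero starting point.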
The multivariate adaptive regression spline (MARS) model is a flexible non-parametric model that requires no assumption about the relation between the dependent and independent variables. It is therefore well suited to capturing complex nonlinear relations among the variables. The general formula of MARS is given below:
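$$ \hat{f}(\mathbf{x}) = \beta_0 + \sum_{m=1}^{M} \beta_m B_m(\mathbf{x}) \qquad (5) $$
where $B_m(\mathbf{x})$ is the $m$-th basis function, obtained by Eq. (6):
$$ B_m(\mathbf{x}) = \prod_{k=1}^{K_m} \max\left(0,\; s_{km}\left(x_{v(k,m)} - t_{km}\right)\right) \qquad (6) $$
Here $K_m$ is the number of hinge terms in the $m$-th basis function, $s_{km} = \pm 1$ is the direction of each hinge, $x_{v(k,m)}$ is the variable it acts on, and $t_{km}$ is its knot.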
These basis functions are chosen so as to minimize the RMSE of the model.
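The idea of choosing basis functions by RMSE can be sketched for a single variable. The code below tries every observed value as a knot, fits a mirrored pair of hinge functions by least squares, and keeps the knot with the lowest RMSE. It is a single step of a forward search under simplifying assumptions, not a full MARS implementation, and the V-shaped data are synthetic.

```python
import numpy as np

def hinge(x, knot, sign):
    """Hinge basis function max(0, sign * (x - knot)), with sign = +1 or -1."""
    return np.maximum(0.0, sign * (x - knot))

def best_hinge_pair(x, y):
    """Return (knot, rmse, coefficients) of the best mirrored hinge pair."""
    best = None
    for t in np.unique(x):
        A = np.column_stack([np.ones_like(x), hinge(x, t, +1), hinge(x, t, -1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
        if best is None or rmse < best[1]:
            best = (float(t), rmse, coef)
    return best

# Synthetic V-shaped response with its kink at x = 2.
x = np.linspace(0.0, 4.0, 9)
y = np.abs(x - 2.0)
knot, rmse, coef = best_hinge_pair(x, y)
print(knot, rmse)
```

A linear model cannot fit this response, while one hinge pair placed at the kink fits it exactly, which is the kind of nonlinearity MARS is meant to capture.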
We use the genetic algorithm (GA), with the normalized RMSE as the fitness function, to select the optimum combination of factors affecting forest fire.
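A minimal sketch of GA-based factor selection is given below. Several choices are assumptions, since the text does not specify them: the RMSE is normalized by the range of the observations, the fitness is evaluated on a hold-out split, the wrapped model is a plain MLR, and the GA uses binary tournament selection, single-point crossover, bit-flip mutation, and elitism with hypothetical parameter values. The data are synthetic, with only columns 0 and 3 informative.

```python
import numpy as np

def nrmse_fitness(mask, X_tr, y_tr, X_va, y_va):
    """Hold-out RMSE of an MLR on the selected columns, divided by range(y)."""
    cols = np.flatnonzero(mask)
    A_tr = np.column_stack([np.ones(len(y_tr)), X_tr[:, cols]])
    A_va = np.column_stack([np.ones(len(y_va)), X_va[:, cols]])
    beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    rmse = np.sqrt(np.mean((A_va @ beta - y_va) ** 2))
    return float(rmse / (y_va.max() - y_va.min()))

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=20, n_gen=30, p_mut=0.05, seed=0):
    """Return (best feature mask, best fitness per generation)."""
    rng = np.random.default_rng(seed)
    n_feat = X_tr.shape[1]

    def fit(mask):
        return nrmse_fitness(mask, X_tr, y_tr, X_va, y_va)

    pop = rng.random((pop_size, n_feat)) < 0.5  # random binary chromosomes
    history = []
    for _ in range(n_gen):
        scores = np.array([fit(ind) for ind in pop])
        history.append(float(scores.min()))
        children = [pop[scores.argmin()].copy()]  # elitism: keep the best
        while len(children) < pop_size:
            # Binary tournament selection for each parent.
            i = rng.integers(pop_size, size=2)
            j = rng.integers(pop_size, size=2)
            pa = pop[i[scores[i].argmin()]]
            pb = pop[j[scores[j].argmin()]]
            cut = rng.integers(1, n_feat)         # single-point crossover
            child = np.concatenate([pa[:cut], pb[cut:]])
            child ^= rng.random(n_feat) < p_mut   # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    scores = np.array([fit(ind) for ind in pop])
    history.append(float(scores.min()))
    return pop[scores.argmin()], history

# Synthetic data: 6 candidate factors, only columns 0 and 3 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)
best, history = ga_select(X[:150], y[:150], X[150:], y[150:])
print(best.astype(int), history[-1])
```

Evaluating the fitness on held-out points matters in this sketch: an in-sample RMSE can never get worse when a factor is added, so it would always favor selecting every factor.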
- Results and Discussion
In this paper we study the dependence of forest fire on the 14 factors listed in Table 1 for the study area. Our results are shown in Figures 1 to 3.
Table 1. The studied factors in the present research
Num. | Factor | Num. | Factor | Num. | Factor
1 | Maximum Temperature (℃) | 6 | Maximum Wind Speed (m/s) | 11 | Aspect
2 | Minimum Temperature (℃) | 7 | Soil Type | 12 | Slope
3 | Mean Temperature (℃) | 8 | Land Use | 13 | Elevation (m)
4 | Total Rainfall (mm) | 9 | Distance from the Roads (m) | 14 | Distance from the Residential Zones (m)
5 | Maximum Wind Azimuth | 10 | Distance from the Rivers (m) | |
Figure 1. (a) The best and the mean values of fitness, (b) The last best individuals, (c) The average distance between individuals, (d) The fitness of each individual in the last generation using MLR
Figure 2. (a) The best and the mean values of fitness, (b) The last best individuals, (c) The average distance between individuals, (d) The fitness of each individual in the last generation using LR
Figure 3. (a) The best and the mean values of fitness, (b) The last best individuals, (c) The average distance between individuals, (d) The fitness of each individual in the last generation using MARS
- Conclusion
This research shows that both biophysical and anthropogenic factors have significant effects on forest fire in our study area. Only two factors, the minimum temperature and the maximum wind speed, were identified as effective in all three cases. The study yielded NRMSE=0.4291 and R2=0.9862 for the multiple linear regression, NRMSE=0.9416 and R2=0.9912 for the logistic regression, and NRMSE=0.1757 and R2=0.9886 for the multivariate adaptive regression spline. Overall, the multivariate adaptive regression spline performed better than the other two methods.