APP下载

Improving forest burn severity estimations with partial least squares regression and orthogonal signal correction methods in Daxing’an Mountains, China

2021-04-30CunyongJuTijiuCaiWenhongLiGeSunChengliangLeiXueyingDiXiulingMan

Journal of Forestry Research 2021年3期

Cunyong Ju · Tijiu Cai · Wenhong Li · Ge Sun ·Chengliang Lei · Xueying Di · Xiuling Man

Abstract Several indices and simple empirical models and ratios of single band from pre- and post-f ire Landsat images have been developed to estimate and/or map burn severity. However, these models and indices are usually site-, time- and vegetation-dependent and their applications are limited. The Daxing’an Mountains range has the largest forested area in China and is prone to wildf ires. Whether or not the existing models can Effectively characterize the burn severity over a large region is unclear. In this study, we used the orthogonal signal correction method based on partial least squares regression (PLSR) to select those variables that better interpret the variance of burn severity. A new index and other commonly used indices were used to construct a new, multivariate PLSR model which was compared with the popular single variable models, according to three assessment indices: relative root mean square error ( RMSE%),relative bias ( RE%) and Nash-Sutcliff e effi ciency ( NSE%).The results indicate that the multivariate PLSR model performed better than the other single variable models with higher NSE% (68.2% vs. 67.8%) and less RE% (3.7% vs.− 8.7%), while achieving almost the same RMSE%. We also discuss the spectral characteristics of the four selected variables for constructing the multivariate PLSR model and their correlation with the f ield burn severity data. The new model developed from this study should help to better understand the patterns of forest burn severity and assist in vegetation restoration eff orts in the region.

Keywords Satellite data · Normalized burn ratio ·Variable selection · Multiple regression

Introduction

Forest f ires have signif icant impact on landscape structure,composition and plant species diversity, and can threaten lives and property (Lentile et al. 2006; Soverel et al. 2010;Zheng et al. 2016; Whitman et al. 2018). Ecologists and resource managers need burning information to understand vegetation recovery, succession, and for planning post-f ire rehabilitation (Roy et al. 2006; Haire and McGarigal 2009).Burn severity is def ined as the direct Effects of the combustion process on vegetation and soil, and is usually described as tree mortality and soil organic material loss (Langford 1976; Epting et al. 2005; Chuvieco et al. 2006; Lentile et al.2006; Soverel et al. 2010). Estimation of burn severity is essential for local governments to assess the volume loss of trees and to decide whether the regeneration of vegetation covers after a high severity f ire is needed to restore f ireimpacted watersheds and ecosystems (Epting et al. 2005;Lentile et al. 2006; Morgan et al. 2014).

Due to the cost and the lack of spatial representation of traditional f ield methods, remote sensing data, especially the Landsat Thematic Mapper (TM) or the Enhanced Thematic Mapper (ETM+), have been widely used to discriminate burn severity levels since these data can help avoid the cost of time, manpower and capital expenses in f ield inventories(Quintano et al. 2017). Therefore, near infrared (NIR, Band 4 of TM/ETM+) and short-wave infrared (SWIR, Band 7 of TM/ETM+) bands are widely applied because they are sensitive to changes in moisture content of vegetation, litter and soils, as well as to changes in composition, density and vitality of plants (White et al. 1996; Epting et al. 2005; Carreiras et al. 2006; Chuvieco et al. 2006; Soverel et al. 2010).Using these two bands, researchers successively constructed diff erent spectral indices to detect burn severity, and the normalized burn ratio (NBR) (Key and Benson 2006), differenced normalized burn ratio (dNBR) and relative dNBR(RdNBR) (Miller and Thode 2007) were widely used to map burn severity because of their simple form and clear physical meaning. The NBR used the spectral data only after the f ire while the dNBR and RdNBR took into account the spectral data before and after the f ire (Epting et al. 2005; Hudak et al.2007; Chu et al. 2016; Quintano et al. 2017). The NBR,dNBR and RdNBR are calculated as follows:

where B4 and B7 are the fourth band and seventh band of Landsat TM/ETM+ data, respectively, and the subscriptspref ireandpostf ireimply the variable calculated in terms of the pre-f ire image and post-f ire image, respectively.

The formula for NBR is similar to the normalized diff erence vegetation index (NDVI) except the red band 3 in the vegetation index is replaced by the middle infrared band 7.The dNBR is obtained by subtracting a post-f ire NBR image from a pre-f ire NBR image for absolute change detection.Further, the use of RdNBR is to remove the biases of the pre-f ire vegetation by dividing dNBR by the square root of the pre-f ire NBR (Miller and Thode 2007; Miller et al. 2009;Chu et al. 2016). So in theory, an empirical model with the latter indices as the parameter may perform better in estimating burn severity, especially in ecosystems with low biomass(Miller and Thode 2007).

However, these empirical models are usually site-, timeand vegetation-dependent (Jakubauskas and Price 1997; Chu et al. 2016). As a result, models or spectral indices suitable for one area may not be appropriate for another. For instance,Roy et al. ( 2006) suggested that NBR and dNBR were not sensitive to burn severity over a savanna in Australia, boreal forests in Russia, and a tropical forest in South America.There were no correlations between burn severity measured in the f ield and satellite-derived NBR or dNBR. Soverel et al. ( 2010) also found that, compared to dNBR, RdNBR was not as Effective at the regional, individual and f ine-scale levels. Therefore, building an improved index should incorporate knowledge of multispectral space (Roy et al. 2006;Chu et al. 2016). In addition, partial least square regression(PLSR) as an Effective approach dealing with multivariate was f irst developed by Wold et al. ( 1983) (Wang et al.2006). Since then, many successful applications have been documented in various f ield studies due to its robustness for multicollinearity existing in variables. The method has been successfully used in chemotetrics (Wold et al. 2001),agriculture (Nguyen and Lee 2006; Farifteh et al. 2007), and forestry (Naesset et al. 2005; Arcenegui et al. 2010; Lei et al.2012; Vyas and Krishnayya 2014). It can be used to model a possible linear relationship between forest burn severity and remotely sensed spectra data.

The Daxing’an Mountains in northeast China has the largest extent of forest area in the country. The forest coverage is about 25 million ha or 76% of its entire land area,representing half of the country’s forest timber stock. Forests in Daxing’an Mountains play an important role in ecological protection of agriculture of local and adjacent Songnen Plain (Xu 1998). Historically, the area is prone to forest f ires.During 1966-2005, forest f ires occurred 1483 times with an accumulated burned area of 6.38 million ha (Lu et al. 2019).A mega forest f ire on May 6, 1987, lasted for 21 days, and burned a total of 1.3 million ha forests and caused a direct economic loss of ~ 100 million US dollars (Wang 1987).With the global climate becoming warmer, an increasing trend in forest f ires is especially obvious (Yang et al. 2012).Whether or not existing models could Effectively estimate the burn severity over the Daxing’an Mountains is unknown,which will signif icantly constrain government decisions and regional economic planning.

This study aims to evaluate the applicability of existing models for the Daxing’an Mountains and to construct a new multiple linear regression model using PLSR and orthogonal signal correction (OSC) methods. Results from the major f ire assessment indicators demonstrate that our new model performs much better than other existing models for the study area.

The remainder of the paper is organized as follows. Section 2 presents the datasets and methods used in this study.Evaluation results of existing models and our new model are described in Sect. 3. The benef it and uncertainties of the new model are discussed in Sect. 4, followed by conclusions in Sect. 5.

Materials and methods

Study area and characteristics

The Daxing’an Mountains in northeast China is a key forest zone and a f ire-prone region (Xu 1998). The Nanweng River f ire (125° E and 51.5° N, Fig. 1) was ignited by lighting on May 22, 2006, and was suppressed on June 2, 2006. The total area burned was 1.7 million ha.

The climate in this area is continental. Mean annual precipitation is approximately 500 mm and most falls as rain in June, July and August. Mean annual air temperature is− 3 °C while the lowest and highest temperatures appear in January and July, respectively. The regional terrain is gentle foothills with most elevations ranging between 500 m and 800 m a.s.l. Soil is characterized as brown coniferous forest soil with depths between 15 cm and 30 cm. The dominant tree species are Chinese larch (Larix gmelinii(Rupr.) Rupr.),birch (Betula platyphyllaSukaczv) and aspen (Populus davidianaDode.), and pre-f ire percent canopy covers about 61% of the area according to inventory data conducted by the local forest service (Liu et al. 2014).

Data processing

Although burn severity generally represents the Effect of f ire on an ecosystem as a measure of changes in soil, surface fuels, understory shrubs and dominant trees (Key and Benson 2006; Miller and Thode 2007), in China, the forest management and production departments, in view of volume loss, still assigns burn severity principally according to tree mortality just as described by Larson and Franklin( 2005). In this paper, burn severity as a predictor variable was picked up from f ire burn severity thematic maps produced by the Daxing’anling Forestry Inventory and Planning Institute (DFIPI). DFIPI integrated f ield investigation of tree mortality of every 28.26 m × 28.26 m (0.08 ha) plot with visual interpretation of high-resolution (2.5-m) Spot 5 fusion images to produce the thematic map with a scale of 1: 25,000 in 2007. In the thematic map, low severity corresponds to < 20% tree mortality, middle severity indicates mortality was ≥ 20% and < 50%. Tree mortality of high severity ranges from 50 to 90%, and mortality > 90% implies extreme severity (DFIPI, Principle of Forest Fire Thematic Investigation).

Fig. 1 Location of the study area and sample plots with burn severity levels

We collected 82 points of burn severity data that included all four levels in burn severity from the thematic map; meanwhile, we recorded their geographic coordinates. We collected these burn severity data as close to the center of the corresponding plaque as possible, therefore, not resulting in a problem of spatial matching to remote sensing data. In the following analysis, the four burn severity levels are referred to by numbers 1, 2, 3, and 4, where 1 means low severity and“4” is for extreme severity.

The remote sensing images used were Landsat 5 TM 30-m multi-spectral data (Bands 1-5 and 7, and downloaded from https://earth explo rer.usgs.gov/). Based on minimal cloud cover, f ire ignition site and date, the pre-f ire image was set from June 2, 2005, path 120 and row 24, and the post-f ire image was from June 28, 2006, path 121 and row 24. These two images were geometrically rectif ied to the Chinese national grid of Beijing 54 using a map of stocked forest of 1:25,000 scale by the second-order polynomial method and resampled using the bilinear interpolation method (Cocke et al. 2005). After the rectif ication, Landsat TM digital numbers were converted to exoatmospheric ref lectance (Chander and Markham 2003; Chander et al. 2009). The COST correction, due to no need of additional parameters other than those of image, were used to eliminate or decrease the impact of haze or atmospheric absorption and scattering on remote sensing data (Mahiny and Turner 2007).

According to the geographic coordinates of 82 sample points, we picked up the relevant ref lectance from pre- and post-f ire remote sensing data. Twenty-four satellite indices were chosen and calculated as explanatory variables to evaluate the relationship between remote sensing data and burn severity derived from the thematic map (Table 1). We selected these indices mainly because they have been widely used to estimate or map burn severity and forest biophysical parameters as presented by the documents listed in Table 1.The composite index (CI) proposed is based on the following f indings. Since NDVI saturates with increased values of LAI in dense forest (Cocke et al. 2005; Jiang et al. 2006),change of dNDVI in the moderate burn severity region can be underestimated if the forests were denser during the pref ire period. However, as Miller and Thode ( 2007) illustrated,for pre-f ire high percent canopy cover, dNBR has limitations in distinguishing the diff erence between a moderate burn severity and high burn severity. Thus, the composite indexmultiplying dNBR by dNDVI could reduce the residual of dNBR in the region with moderate burn severity and impose less Effect on dNBR in the regions with low and high burn severities.

All indices were computed using the single date from the post-f ire Landsat TM data. However, dNBR, RdNBR,dNDVI and CI were calculated using bi-temporal, pre- and post-f ire Landsat TM data.

Table 1 Detailed description of composition and derivation as explanatory variables of burn severity

Methods

The partial least squares regression method (PLSR) was used to specify the relationship between field burn severity and satellite remote sensing spectra. For a detailed description of the PLSR technique, refer to Wold et al.( 2001) and Wang et al. ( 2006).

Considering that these variables are derived from limited six bands and are often highly correlated with each other, orthogonal signal correction (OSC) based on PLSR is often used to find the most correlated variables or subsets and lower the complexity of consequent PLSR models (Höskuldsson 2001; Wold et al. 2001; Zeng et al.2016). Höskuldsson ( 2001 ) described the OSC in details and a MATLAB program was provided in an appendix.This program works well in MATLAB 2010, provided that a parameter pi is replaced with another name because pi is already used as a default constant that is close to 3.1416. Before following this OSC program, all variables were normalized to the same range of − 1 to 1.

Hereafter, we named the simplified PLSR model by the OSC method as the OSC-PLSR model. To evaluate the performances of burn severity models, three indicators such as relative root mean square error (RMSE%), relative bias (RE%) and Nash-Sutcliffe efficiency (NSE) were calculated by four-fold cross validation (Farifteh et al. 2007;Legates and McCabe 1999; Quintano et al. 2017). The 82 samples were split into four groups as evenly as possible,and three split data were then used to derive the models,and one split data were used to calculate the accuracy indicators until all samples were included in the validation processes. These indicators are defined as follows:

wherePiis theith predicted or simulated value,Oiis theith observed or actual value,nis the total number of observations, and̄Ois the average of observed values. LowerRMSE% andRE% as well as higherNSE% imply better model performance.

Results

Correlation analysis

To better explore variables best related to burn severity and their relationships, correlation analysis was implemented between all variables. Considering the difficulties and crowdedness of including numerous variables on a page,we only partially plotted the Spearman’s correlation coeffi -cients (r), i.e., those coeffi cients being more than 0.8 (except variable CI) were included in Fig. 2.

All variables were signif icantly linearly correlated to burn severity (the signif icantris 0.22 at 0.05 signif icance level and 0.28 at 0.01 level for the 82 samples), except for TM2 and TC1 with the correlation coeffi cients of − 0.05 and 0.00 to burn severity, respectively (not shown). TM2 was sensitive to green vegetation, and TC1, also referred to as brightness, and mainly represented the diff erence of soil ref lectivity (Carreiras et al. 2006). For serious burn cases,green vegetation and shallow soils showed almost homogeneous characteristics in space, leading to insignif icant correlation coeffi cients of TM2 (TC1) with the burn severity.In contrast, TM7 had the highest correlation coeffi cient over all variables (Fig. 2), indicating its robust ability of separating vegetation, regardless live or dead, from soil and ash(Miller and Thode 2007). The high correlations between these variables implied that these variables can substitute for each other within a multivariate statistical model.

Fig. 2 Correlation coeffi cients between every two variables calculated using all 82 sample data (the signif icant r is 0.22 at 0.05 signif icance level). The larger or deeper color the dots, the closer their relationships; the crimson indicates a negative relationship, the blue a positive relationship

The OSC-PLSR modeling process

The principle of the PLSR method is to extract the maximum variance in turn from originally independent variables and construct new orthogonal PLS components, interpreting as much variation as possible. With all the observation and leave-one-out cross validation, the two f irst PLS components were regarded as appropriate for the PLSR burn severity estimation model. The third component decreased prediction accuracy as the variation decreased by 0.02, while it enhanced the model f it accuracy (determination coeffi cient R2 changed from 0.73 to 0.76) (Fig. 3).

We ran the PLSR-based OSC program with the f irst two PLS components and let two OSC components regress the model. The regression process can be carried out stepwise by each component, or by all components at one time. Whether one or two OSC components at one time were introduced to adjust the original variables, the four variables, i.e., TM4,TM7, CI, and RdNBR had the highest values of regression coeffi cients, though their values were diff erent. The high regression coeffi cient values also implied the index CI was not less important than other indices, since they had almost the same amplitude (Table 2).

Fig. 3 Percent cumulated variance of the f irst three components of the PLSR model; R2 is determination coeffi cient, a measure of f it,and Q2 is cross-validated

Accuracy comparison of models on estimating burn severity

To evaluate the performances of the PLSR model on estimating burn severity, we compared the PLSR model with other commonly used single variable models f itted by the linear, second-order polynomial and exponential models as in previous research (Epting et al. 2005; Miller and Thode 2007; Hall et al. 2008). After a four-fold cross validation was carried out, the average values of the three indicators were produced and are listed in Table 3.

NBR had the highest accuracy among the three remotesensing-derived indices. The assessment indicator values were lower (~ 10%) forRMSE% and higher (~ 20%)forNSE% than dNBR and RdNBR. At the same time, the polynomial models illustrated better prediction with lowerRMSE% andRE% and higherNSE% than the linear and exponential models. As we expected, compared with the best model of a single variable, the OSC-PLSR model especially improved the ability of estimating burn severity, asNSEandRE% increased by 0.4% and 5%, respectively, with the indicator ofRMSE% decreased only by 0.2%.

Table 3 Accuracy comparison of single-variable models and OSCPLSR model according to four-fold cross validation

Table 2 Coeffi cients of 24 standardized variables derived by the PLSR-based OSC model and the bold fonts indicating a higher correlation than others

Discussion

The multivariate model

Accurate estimations of burn severity of post-f ire forests can help forest managers and ecologists determine f ire behavior and subsequent vegetation succession regimes (White et al.1996; Keeley 2009; Soverel et al. 2010; Morgan et al. 2014;Quintano et al. 2017). However, despite the existence of abundant peer-viewed papers on estimating burn severity,there is no consensus on standard methods. One reason is the intrinsic uncertainty of remote sensing data (Jakubauskas and Price 1997; Roy et al. 2006; Houborg et al. 2007).For instance, diff erent objects can produce similar spectra on a satellite image (Eriksson et al. 2006). In addition, burn severity has no common def inition, although a number of quantitative and qualitative indices (tree mortality, burnt soil depth, char and ash cover) were used as guides and implemented in f ield investigations (Cocke et al. 2005; Eriksson et al. 2006; De Santis and Chuvieco 2009; Keeley 2009;Quintano et al. 2017). In China, burn severity f ield investigations use only one factor, tree mortality, and is inevitably diff erent from the composite burn index (CBI) that considers several factors. The simple empirical relationships were usually site-, time- and vegetation-specif ic (Jakubauskas and Price 1997); a multivariate regression and selecting method,OSC-PLSR was developed to improve burn severity mapping by incorporating knowledge of multispectral spaces.Our work indicates that a multivariate model worked generally better than simple models with one variable in burn severity estimation. More case studies are needed for other areas to conf irm the performance of a multivariate model developed from this study.

Performance of the OSC-PLSR model

Fire causes biotic and abiotic changes of a forest such as canopy consumption, ground charring and soil color alteration. These changes are detectable using satellite sensors represented by Landsat TM/ETM+. This is mainly because the near infrared (NIR, Band 4 of TM/ETM+) is sensitive to chlorophyll content and leaf morphology, and f ire damaged leaf tissues will result in the ref lectance of near infrared to decrease. Meanwhile, middle infrared (MIR, Band 7 of TM/ETM+) is susceptible to moisture changes in both soil and vegetation and consequently, as f ire results in a decrease of canopy moisture, the middle infrared ref lectance distinctly increases (Epting et al. 2005; Miller and Thode 2007;Fernández-Manso et al. 2016). Therefore, these two bands are widely used to construct the indicators for mapping burn severity (White et al. 1996; Epting et al. 2005; Carreiras et al. 2006; Chuvieco et al. 2006; Soverel et al. 2010),although a few studies also highlighted some shortcomings(Roy et al. 2006; Quintano et al. 2017). In our OSC-PLSR model, both bands worked well independently in line with their spectral characteristics, but only one index derived from them was picked out (Table 2). Though NDVI or dNBR individually showed a less close relation to burn severity as their lower regression coeffi cients suggested, CI with information from both NDVI and dNBR showed higher ability to interpret burn severity variance. The results indicate that the OSC method could separate the dominated information from noises to a certain extent. The multivariate OSC-PLSR model using these variables especially improved the accuracy of burn severity estimation with NSE% increased by 0.4% and RE% by 5%.

Other variables such as NDVI, ARVI and ratio7/5 showed strong linear correlation to burn severity but they all closely correlated to Band 7 and Band 4 (Fig. 2). The close relationship indicated similarity among them. PLSR can synthesize their inherent information sensitive to burn severity and avoid potential adverse impacts resulting from collinearity,subsequently constructing a higher accuracy model (Wold et al. 2001; Naesset et al. 2005; Vyas and Krishnayya 2014).NBR was excluded from the f inal OSC-PLSR model mainly because it highly correlates to Band 4 and Band 7 which are already included in the model. As a good indicator of burn severity due to containing pre- and post-f ire information (Miller and Thode 2007; Miller et al. 2009; Soverel et al. 2010), RdNBR was retained in the OSC-PLSR model and can be applied in low hilly regions as in the Daxing’an Mountains.

Among all the single variable models, NBR performed the best in capturing burn severity in this study, consistent with the f indings by Epting et al. ( 2005). In our study area,where the coniferous forest dominated the mixed forest,dNBR performed better than RdNBR; this is in agreement with studies in western Canada (Soverel et al. 2010). On the other hand, the good performance of dNBR was also attributed to moderate percent canopy cover (~ 60%). Miller and Thode ( 2007) also showed that RdNBR don’t generally perform better than dNBR in black oak, ponderosa pine and mixed conifer with homogeneous 52% canopy cover.Besides, one particular parameter or method did not consistently produce the best f it to f ield investigations (French et al. 2008). These conclusions show again the uncertainty of estimating surface conditions of the Earth using remote sensing (Jakubauskas and Price 1997; Houborg et al. 2007;Rocchini et al. 2013). In addition, diff erent standards to estimate burn severity in the f ield may also be an important factor causing these disagreements (Cocke et al. 2005; Roy et al. 2006; Meng et al. 2017). Another adverse aspect may result from the practice that continuous data were discretized to class numbers. Due to limited forest inventory data,we did not further analyze the inf luence of tree species on burn severity. Under global warming, the consensus on burn severity measured in f ield investigations will contribute signif icantly to research on the prevention of forest f ire ignition,the behavior of forest f ires and post-f ire recovery.

Conclusions

[Remote sensing information can ref lect the degree of forest damage by f ire..] A multivariate PLSR model was applied to simulate the relationship between burn severity and remote sensing factors. Our work conf irmed that Band 4 and Band 7 of Landsat TM data record important spectral characteristics related to forest burn severity. As a result, each of these bands or their derivations showed higher capability to estimate burn severity. The multivariate model designed for the Daxing’an Mountains enhanced the accuracy of estimating burn severity based on Landsat TM data compared to single variable models. Our work also illustrated that the OSC method was an Effective approach in variable selection and improved burn severity estimation. The model built by our methods can provide useful information from a large amount of remotely sensed data.

Acknowledgements We are very grateful to the U.S. Geological Survey for providing the free remote sensing data. The authors also gratefully acknowledge anonymous reviewers for their patient comments.

Compliance with ethical standards

Conf lict of interestThe authors declare that they have no conf lict of interest.

Publisher’s NoteSpringer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affi liations.