
Random Forest-Based Snow Cover Mapping in China Using Fengyun-3B VIRR Data

2023-11-10

Journal of Meteorological Research, 2023, Issue 5

Yuchen XIE, Yonghong LIU, Yeping ZHANG, Fuzhong WENG, Shanyou ZHU, Zhaojun ZHENG, and Shihao TANG

1 School of Remote Sensing & Geomatics Engineering, Nanjing University of Information Science & Technology, Nanjing 210044

2 CMA Earth System Modeling and Prediction Centre, China Meteorological Administration (CMA), Beijing 100081

3 State Key Laboratory of Severe Weather, Chinese Academy of Meteorological Sciences, China Meteorological Administration, Beijing 100081

4 National Satellite Meteorological Centre, China Meteorological Administration, Beijing 100081

ABSTRACT Currently, the spectral band thresholds used for snow cover recognition by remote sensing vary among regions and over complex terrain. Using Fengyun-3B Visible and Infra-Red Radiometer (FY-3B VIRR) satellite data, we applied the random forest (RF) methodology and selected 13 feature variables to obtain snow cover. A training set was generated, containing approximately 1 million snow and nonsnow samples obtained in China from the snow monitoring reports issued by the National Satellite Meteorological Centre and four snow cover products from the Interactive Multi-sensor Snow and Ice Mapping System (IMS), the FY-3B Multi-Sensor Synergy (MULSS), the Moderate Resolution Imaging Spectroradiometer (MODIS) snow cover product (MYD10A1), and the National Cryosphere Desert Data Center (NCDC). This training set contained many different samples of cloud types and snow under forest cover to help effectively distinguish snow and clouds and improve the recognition rate of snow under forest cover. Then, two RF snow cover recognition models were constructed for the snow and nonsnow seasons, and they were used to conduct daily snow cover recognition in China from 2011 to 2020. The results show that the RF models constructed based on FY-3B VIRR data have good recognition performance for shallow snow, understory snow, and snow on the Qinghai-Tibetan Plateau. The recognition accuracy against weather stations and the spatial consistency with the IMS product are better than those of the MULSS, MYD10A1, and NCDC products. The overall accuracy of the RF product is 90.6%, and the recall rate is 93.8%. The omission and commission errors are 6.2% and 11.1%, respectively. Unlike other existing snow cover algorithms, the established RF model skips the complicated atmospheric correction and cloud identification processes and does not involve external auxiliary data; thus, it is more easily popularized and operationally applicable to generating long time-series snow cover products.

Key words: snow identification, random forest, Fengyun-3B Visible and Infra-Red Radiometer, feature selection

1. Introduction

Snow is one of the most widely distributed and seasonally and interannually variable elements in the cryosphere; the largest extent of snow cover is 49% of the land surface in the Northern Hemisphere (Qin et al., 2020a). Snow is also an indicator of climate change, and its thermal and radiation characteristics have important impacts on the radiation budget and thermal balance of the land surface, as well as nonnegligible impacts on climate, the natural environment, and human activities (Qiao and Zhang, 2020; Qin et al., 2020b). Snow also stores fresh water. Alpine snowmelt is an important source of freshwater for inland rivers and serves as a lifeline for survival in arid regions (Shen et al., 2013). Therefore, accurate and rapid snow cover monitoring is critical for ecological protection and climate change research (Han et al., 2018; Tong et al., 2020).

With the rise of remote sensing technology, satellite sensors have become the primary means of identifying and monitoring snow. Snow has the characteristics of high reflection in the visible light band and low reflection in the near-infrared band, and these features are the basis for identifying snow cover using visible light and infrared remote sensing techniques (Dozier, 1989; Wu et al., 2018). Traditional snow identification methods mainly include the spectral threshold method, remote sensing index method, and decision tree method. Among these methods, the normalized difference snow index (NDSI) is the most widely used index for discriminating snow (Hall et al., 1995; Salomonson and Appel, 2006; Sood et al., 2020; Zhang et al., 2020). However, NDSI thresholds vary greatly among different sensors, regions, and ground surface types. For example, when identifying snow in the Three-River Headwaters region of China, the dynamic NDSI threshold derived based on the Landsat-8 satellite is between 0.29 and 0.37 (Sun et al., 2020); for identifying snow on the Qinghai-Tibetan Plateau and in Xinjiang, the optimal NDSI threshold for the Fengyun-3 (FY-3) satellites is 0.35 (Zhang et al., 2015; Chen et al., 2020). The NDSI threshold method therefore generalizes poorly when it is extended to large-scale areas or complex regional terrains. Although the NDSI can discriminate between snow cover and most clouds, it is easily affected by cirrus clouds (Riggs et al., 1994; Zeng and Yan, 2005). To differentiate snow and cirrus clouds, scholars have leveraged the fact that the brightness temperature of snow in the far-infrared channel (10.3-11.3 μm) is significantly higher than that of cirrus clouds (Zhang X. et al., 2016). In addition, the water vapor channel located at approximately 1.35 μm can also be used to effectively identify cirrus clouds (Ishida et al., 2018). In vegetation-covered areas, some remote sensing indices, such as the normalized difference vegetation index (NDVI) and normalized difference forest snow index (NDFSI), can be used, in addition to the NDSI, for snow cover identification (Hou and Yang, 2009; Wang et al., 2017; Wang et al., 2019). In addition, external auxiliary data, such as land cover type data, are often introduced to snow cover mapping (Coll and Li, 2018; Gao et al., 2019).

Cloud identification or removal processing is an important step before snow cover identification can be implemented (Zhang H. et al., 2016). Products such as the current Moderate Resolution Imaging Spectroradiometer (MODIS) daily snow cover products (MOD10A1/MYD10A1) and the 1-km-resolution FY-3 Multi-Sensor Synergy (MULSS) fusion daily snow cover product (Yang and Dong, 2011) are based on cloud detection, so the accuracy of cloud identification affects that of the subsequent snow identification to a large extent. Many scholars have evaluated the accuracy of MODIS snow products and have shown that the overall quality of MODIS products is good, with accuracy rates reaching 90% in clear weather conditions; however, their use in dense forests is challenging (Simic et al., 2004; Hall and Riggs, 2007; Hall et al., 2010). Based on MOD10C1/MYD10C1 global daily snow cover data, monthly and seasonal average FY-3A/B MULSS snow cover product evaluations were conducted from 2010 to 2014 in previous studies. The results indicate an overall snow cover identification accuracy of 60%-90%, but this accuracy varied greatly (Zhang et al., 2018). Min et al. (2021) evaluated the FY-3C MULSS snow cover products on the Qinghai-Tibetan Plateau based on observation stations and showed that the overall snow cover identification accuracy was 87.2%, while the snow recognition rate (referred to as the recall rate in this study) was only 66.7%.

In the process of snow identification, accurately constructing a classification model from independent parameters, such as the NDSI, brightness temperature, and NDVI, and target parameters (snow or nonsnow categories), is key. Because of the complex nonlinear relationship between the selected independent parameters and target parameter, the traditional multi-source linear fitting method is not sufficient to accurately determine a nonlinear relationship. Therefore, machine learning techniques have been applied in remote sensing information recognition and are becoming increasingly widespread. Artificial neural networks (ANNs), support vector machines (SVMs), and random forests (RFs) are now commonly used machine learning algorithms. Some scholars have used ANNs to identify and map snow; ANNs have the advantages of being able to easily incorporate auxiliary information, such as land cover types, and to learn the nonlinear relationship between surface reflectance and fractional snow cover (Dobreva and Klein, 2011; Moosavi et al., 2014; Hou et al., 2020). The applicability and capability of SVM and RF models in snow recognition have been thoroughly studied and discussed, and both methods have been shown to achieve high accuracy, with correlations to reference datasets reaching 93% (Kuter, 2021). Based on the RF algorithm, MODIS reflectivity, altitude, surface type, wind speed, and other data have been used as predictors to identify snow cover, with overall accuracies reaching as high as 97% (Rittger et al., 2021).

RF is an enhanced nonparametric decision tree model that bases its decisions on the average of multiple regression trees, and it has been widely used in classification and regression problems (Kavzoglu, 2017; Xie et al., 2021). It has also been used for remote sensing extraction of surface parameters from FY-3B data (Liu et al., 2021; Sheng and Rao, 2021). Compared with other machine learning methods, such as ANNs and SVMs, RF has the advantages of fewer parameter requirements and faster implementation (Breiman, 2001). In addition, the RF model is more adaptive than other models in terms of processing outliers in the training process through parameter-free features and randomization (Kavzoglu, 2017). Therefore, in this study, the RF methodology is used to identify snow cover.

Currently, international long-term (time series lasting over 10 yr) global- and regional-scale snow cover satellite products are mainly based on foreign satellites, such as MODIS and NOAA satellites, while similar products based on domestic satellites have not yet been established. The FY-3B satellite, the second of China’s second-generation polar-orbiting FY-3 meteorological satellites, was launched on 5 November 2010 and has been accumulating data for more than 10 yr. Establishing an algorithm for identifying snow cover in China based on the FY-3B satellite is important for developing long-term snow cover products based on domestic satellites to dynamically monitor China’s ecological environment and associated climate change. Therefore, in this study, we use FY-3B data to build an RF model that omits the complex cloud identification and atmospheric correction processes, and then use the FY-3B-derived top-of-atmosphere reflectance and brightness temperature to conduct snow cover identification in China. Building on this approach, we expect to produce long-term FY-3-based daily snow cover products in China in the future.

2. Instruments and datasets

2.1 FY-3B VIRR

The Visible and Infra-Red Radiometer (VIRR) onboard the FY-3B polar-orbiting meteorological satellite was designed with 10 bands ranging from visible light to mid-infrared light, with a spectral range of 0.43-12.5 μm and a spatial resolution of 1 km. The spectral characteristics of each VIRR band are listed in Table 1.

In this study, FY-3B VIRR L1 data are used to identify snow cover, and a radiometric calibration is performed according to the calibration coefficient in the L1 level data file to obtain the apparent reflectance and brightness temperature corresponding to each VIRR band for use in the subsequent feature selection and classification.

2.2 Snow remote sensing monitoring reports

The China Meteorological Administration Decision-making Service Information Sharing Platform (http://10.1.64.187/jcxt/home) provides snow remote sensing monitoring reports issued by the National Satellite Meteorological Centre (NSMC) for typical dates from 2011 to 2022. These reports include the specific occurrence times and areas of snow cover, and are generally corroborated by meteorological observation data to ensure the reliability and accuracy of these samples in snow-covered areas. Therefore, these reports can be used to select the dates of the snow training samples and the main snow areas. Figure 1 shows the identified snow areas in the snow remote sensing monitoring reports released by the NSMC on 14 June 2015 and 29 March 2016 (National Satellite Meteorological Center, 2015, 2016).

Table 1. Spectral characteristics of the FY-3B VIRR bands

2.3 Snow cover products

2.3.1 IMS snow cover synthesized products

The Interactive Multi-sensor Snow and Ice Mapping System (IMS) provides snow and sea ice products synthesized from multi-source data, such as passive microwave, visible imagery, meteorological station observations, and other ancillary data (Ramsay, 1998; Helfrich et al., 2007). The IMS products became operational in 1997 and were first available at a 24-km resolution. Since then, the spatial resolution of IMS snow products has been greatly improved to 1 km, with more data sources being combined. The most prominent advantage of the IMS products lies in the interaction of analysts, who manually judge which data sources are the most reliable in different regions and conditions (Frei et al., 2012). The IMS products have been well applied in China. The annual accuracy of the IMS products in northern Xinjiang, Northeast China, and the Qinghai-Tibetan Plateau is more than 92%, but the snow cover area is overestimated to some extent (Liu et al., 2014).

In this study, we used the IMS Northern Hemisphere snow products in GeoTIFF format at a spatial resolution of 1 km, which were downloaded from the US National Ice Center at https://usicecenter.gov/Products/ImsData/. The data validity period of the products is from December 2014 to the present.

2.3.2 FY-3B MULSS snow cover products

Fig. 1. Snow cover identification by the NSMC on (a) 14 June 2015 and (b) 29 March 2016. The cyan color represents snow.

The Fengyun Satellite Remote Sensing Data Service Network (http://satellite.nsmc.org.cn/) provides daily FY-3B MULSS-based snow cover products, referred to herein as MULSS products, at a spatial resolution of 1 km. The product uses the FY-3B VIRR and Medium Resolution Spectral Imager (MERSI) sensors to identify snow-covered areas based on a combination of the NDSI and multispectral band thresholds, and uses the FY-3B cloud mask product to remove clouds before identification (Yang and Dong, 2011). The valid period of the FY-3B MULSS products provided by the website spans from 2015 to 2020.

2.3.3 MODIS snow cover products

NASA (https://earthdata.nasa.gov/) provides two MODIS-derived daily snow cover products (MOD10A1 and MYD10A1) with different equatorial crossing times at a spatial resolution of 500 m. Considering the observation time of FY-3B, the C6 version of the Aqua MODIS snow product (MYD10A1) was used in this study (Riggs et al., 2017; Hall and Riggs, 2021). Previous studies have shown that the accuracy of the MYD10A1 C6 version is better than that of the C5 version in China (Zhang et al., 2019). The MYD10A1 datasets provide information about snow cover in the form of NDSI values. This product is mainly based on the NDSI indicator and uses a set of grouped decision support algorithms to extract snow cover information (Salomonson and Appel, 2006). Similarly, clouds are culled prior to the snow identification step using MODIS-derived cloud product data that are valid from 2000 to the present.

2.3.4 NCDC snow cover products

The National Cryosphere Desert Data Center (NCDC; http://www.ncdc.ac.cn) provides daily snow cover products at a resolution of 500 m in China from 2000 to 2020, referred to herein as NCDC products (Hao et al., 2022). These products are based on the MODIS-derived daily reflectance product MOD09GA/MYD09GA and use land cover type data and multiple band indices to identify snow cover, thus improving the snow cover identification accuracy in forested and mountainous areas. Before the snow recognition process is conducted, the hidden Markov model and multisource data fusion method are used to realize complete cloud removal from the product. The overall accuracy of the product in China is over 93%, and the snow omission rate is less than 9%. The product accuracy of NCDC is significantly better than that of the C6.1 version of MOD10A1 and MYD10A1 (Gao et al., 2019).

2.4 Daily snow depth measurements at weather stations

In this study, the daily snow depths from 2521 weather stations in China are used to evaluate the snow cover products. The data are obtained from the “China Daily Surface Data (National Station)” of the National Meteorological Information Centre. As shown in Fig. 2a, the weather stations are distributed on a variety of terrains at different altitudes in China, and the stations are densely concentrated in the east and sparse in the west, especially on the Qinghai-Tibetan Plateau.

3. Methodology

RF is a kind of ensemble learning method. The principle of ensemble learning is that different prediction models can yield different results using the same inputs (Aria et al., 2021). An RF is a collection of decision trees, which are simple statistical models. Although individual trees tend to yield poor predictions, by combining a series of trees, the RF methodology provides accurate results. Combining decision trees reduces the variability in the prediction results and provides a stable and robust prediction framework. An RF is a nonlinear model, the complexity of which depends on the number of trees in the ensemble, the number of tree branches, and the depth of the trees. RFs tend to avoid overfitting, and outliers in the training set do not have a great impact on the model training process. The process of implementing the RF methodology includes the following steps: (1) the selection of training samples, (2) the selection of feature variables, (3) model training and construction, and (4) model accuracy verification.
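The way an ensemble combines weak trees can be illustrated with a minimal majority-vote sketch. The three one-rule "trees", their thresholds, and the feature keys `ndsi`, `ref6`, and `tb4` are hypothetical and chosen only for illustration; they are not the trees or thresholds learned in this study.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the label predicted by most trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical one-rule trees; each looks at a single feature only.
trees = [
    lambda s: "snow" if s["ndsi"] > 0.35 else "nonsnow",
    lambda s: "snow" if s["ref6"] < 0.20 else "nonsnow",
    lambda s: "snow" if s["tb4"] < 275.0 else "nonsnow",
]

# Illustrative pixel: two of the three trees vote for snow.
sample = {"ndsi": 0.42, "ref6": 0.25, "tb4": 268.0}
label = forest_predict(trees, sample)
print(label)
```

Here the second tree disagrees with the other two, and the vote still returns a single stable label, which is the variance-reducing effect described above.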

Fig. 2. Spatial distributions of (a) topography and 2521 weather stations with daily snow depth measurements and (b) forest coverage with a 1-km resolution in China.

3.1 Selection of training samples

The RF algorithm is a supervised learning method. Accurate, large, and complete training samples are very important prerequisites for the establishment of a regional snow recognition model in China. The training samples must be selected to cover various terrains. Figure 2a shows the large spatial extent of China, with a land area of approximately 9.6 × 10^6 km^2 and altitudes of 0-8848 m. The Chinese forest map product, which includes 6 main forest types with a resolution of 30 m in 2010 (Li et al., 2014), was used to estimate forest coverage with a resolution of 1 km (Fig. 2b) for the selection of training samples of understory snow cover with different coverages.

The first step is to select typical training sample dates. In this study, the snow areas listed in 18 snow remote sensing monitoring reports issued by the NSMC are selected as the basis for the training sample date selection from 2011 to 2019, including most months of the year and covering most of the snow-covered areas in China, as shown in Table 2.

After determining the sample selection dates shown in Table 2, it is necessary to accurately select snow samples. We used the snow remote sensing monitoring reports on the above dates to determine the main snow cover areas, as shown in Table 2. In these snow cover areas, four snow cover products (IMS, MULSS, MYD10A1, and NCDC) are used to accurately select snow sample points. A snow sample is accepted only when at least three of the snow cover products agree. Compared with selecting snow samples, nonsnow sample selection is much easier. The nonsnow sample points, including various types of clouds and nonsnow-covered surface types, are selected by visual interpretation from FY-3B VIRR RGB (red-green-blue) false color composite images on the dates listed in Table 2.
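The agreement rule for snow samples can be sketched as a simple vote over the four products. The function name `is_snow_sample` and the per-pixel dictionaries are assumptions introduced here for illustration; the flags are made-up examples, not data from the study.

```python
def is_snow_sample(product_flags, min_agree=3):
    """Accept a pixel as a snow sample when at least `min_agree`
    products map it as snow.

    product_flags: dict mapping product name -> True if that product
    classifies the pixel as snow.
    """
    return sum(bool(flag) for flag in product_flags.values()) >= min_agree


# Hypothetical pixels: three products agree on A, only two on B.
pixel_a = {"IMS": True, "MULSS": False, "MYD10A1": True, "NCDC": True}
pixel_b = {"IMS": True, "MULSS": False, "MYD10A1": False, "NCDC": True}

print(is_snow_sample(pixel_a), is_snow_sample(pixel_b))
```

Requiring agreement among several independent products trades sample quantity for label reliability, which matters because errors in the training labels propagate directly into the trained model.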

Table 2. Snow sample selection dates and main snow cover areas

Because seasonal and interannual snow cover changes are significant, as was also evident during the actual training process, it is difficult to achieve good results throughout the year using a single model. Therefore, in this study, we defined the snow season as lasting from October of a given year to May of the following year and the nonsnow season as lasting from June to September. Thus, the training sample selection and model construction processes were carried out for these two periods. In addition, because distinguishing between snow and clouds is difficult in snow recognition research, the selection of samples of different cloud types under various complex terrains and weather conditions was emphasized during the training sample selection process. Furthermore, to improve the recognition rate of snow under forest cover, combined with the forest coverage image with a 1-km resolution (Fig. 2b), many snow or nonsnow samples with different forest coverages were selected. Figure 3 shows the training sample point selection results obtained for the snow and nonsnow seasons. Light blue indicates snow sample points, and pink indicates nonsnow sample points. Notably, this sample distribution displays all sample points together, providing the distribution of sample points obtained throughout dozens of snow events from 2011 to 2019. That is, it is possible that snow and nonsnow samples collected at different times were in the same spatial position; however, they were still considered different samples. Ultimately, a total of approximately 1 million snow and nonsnow sample points were obtained. Once the location information and time of every sample were determined, the sample datasets for the snow and nonsnow seasons were constructed separately from different data sources for feature selection and RF model training and construction in the next step.

Fig. 3. Distributions of snow and nonsnow training sample points in China and its surrounding areas: (a) snow samples (363,603 samples) and (b) nonsnow samples (478,681 samples) during the snow season; and (c) snow samples (11,569 samples) and (d) nonsnow samples (154,888 samples) during the nonsnow season.
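The season split and the sample totals can be checked with a few lines of Python. This is a minimal sketch: the names `season_of`, `SNOW_SEASON_MONTHS`, and `SAMPLE_COUNTS` are assumptions introduced here, while the month ranges and the four sample counts are those given in the text and in the Fig. 3 caption.

```python
# Snow season: October to May of the following year; nonsnow season: June to September.
SNOW_SEASON_MONTHS = {10, 11, 12, 1, 2, 3, 4, 5}


def season_of(month):
    """Return 'snow' or 'nonsnow' for a calendar month (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return "snow" if month in SNOW_SEASON_MONTHS else "nonsnow"


# Sample counts per (season, class), as reported in the Fig. 3 caption.
SAMPLE_COUNTS = {
    ("snow", "snow"): 363_603,
    ("snow", "nonsnow"): 478_681,
    ("nonsnow", "snow"): 11_569,
    ("nonsnow", "nonsnow"): 154_888,
}

total_samples = sum(SAMPLE_COUNTS.values())
print(season_of(11), season_of(6), total_samples)
```

The four counts sum to 1,008,741, consistent with the "approximately 1 million" samples stated above. The nonsnow season is strongly imbalanced (11,569 snow against 154,888 nonsnow samples), which is worth keeping in mind when reading accuracy figures for that model.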

3.2 Selection of feature variables

The selection of feature variables is another important step when establishing an RF model. The selection principle applied in this study is to obtain as much information as possible from the image features contained in the FY-3B data; thus, the primary features are identified based on the following aspects:

(1) Spectral features: Following previous studies (Zhang et al., 2015; Kan et al., 2016; Chen et al., 2020), we select bands 1, 2, 3, 4, 5, 6, 9, and 10, which contain information related to snow, clouds, water, forest vegetation, shadows, and other features. Bands 1, 2, 3, 4, 5, 6, 9, and 10 are denoted as Ref1, Ref2, Tb3, Tb4, Tb5, Ref6, Ref9, and Ref10, respectively, where Ref is the apparent reflectance and Tb is the brightness temperature.

(2) Band index features: Based on the apparent reflectance of the FY-3B VIRR L1 data, Ref9 and Ref6 are selected to calculate the NDSI values (Han et al., 2018); Ref1 and Ref2 are selected to calculate the NDVI values (Zeng and Yan, 2005). Finally, the Tb difference between Tb3 and Tb4 is used as an additional feature (Zheng et al., 2004; Li and Zhang, 2009).

(3) Texture features: Considering the obvious gaps between the texture features of snow and those of other ground objects (Xu and Xu, 2021), shortwave infrared Ref6, which is sensitive to snow, is selected to calculate the homogeneity (hereafter Homo) texture feature attribute within a 9 × 9 sliding window using the gray level co-occurrence matrix method (Liu and Niu, 2004).

(4) Geographical features: Considering the strong geographical characteristics of snow cover in China (Kan et al., 2016), three characteristics are selected: elevation (Dem), longitude (Lon), and latitude (Lat).

(5) Vegetation type: Considering the strong spatial heterogeneity of forestland in China and because snow under the forest is not easy to identify (Salminen et al., 2009; Wang et al., 2015), forest coverage (FC) in Fig. 2b is selected.

(6) Time: Considering that there may be a relationship between snow cover and occurrence time (Zhang et al., 2018), the month when snow cover is identified is selected as a feature in this study.
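The band index features in item (2) can be sketched as follows. This is a minimal illustration under stated assumptions: the NDSI is taken as the usual normalized difference (Ref9 − Ref6)/(Ref9 + Ref6), the NDVI as (Ref2 − Ref1)/(Ref2 + Ref1), and the brightness temperature difference as Tb3 − Tb4 (the sign convention is an assumption). The function names and the input values are hypothetical.

```python
import math


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    if denom == 0:
        return math.nan
    return (a - b) / denom


def band_index_features(ref1, ref2, ref6, ref9, tb3, tb4):
    """Compute the three band index features for one pixel."""
    return {
        "NDSI": normalized_difference(ref9, ref6),  # visible vs. shortwave infrared
        "NDVI": normalized_difference(ref2, ref1),  # near-infrared vs. red
        "dTb": tb3 - tb4,                           # assumed sign: Tb3 minus Tb4
    }


# Illustrative snow-like pixel: bright in the visible, dark at 1.6 um.
features = band_index_features(ref1=0.6, ref2=0.5, ref6=0.1, ref9=0.7,
                               tb3=270.0, tb4=265.0)
print(features)
```

Because the RF model takes apparent (top-of-atmosphere) reflectance directly, these indices are computed without atmospheric correction, in line with the approach described in the Introduction.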

In summary, 17 features are initially selected to be used in the feature importance assessment, and the appropriate features are selected according to the importance. The RF method is used to evaluate the importance of these features. This function can not only eliminate unimportant features in the training process to simplify the model, but also can be used in the model interpretation and error analysis steps. In this paper, we used the Gini index (GI) to calculate the feature importance; the equations used to calculate the importance of each feature in the RF (GIimp) are as follows:

where K represents the number of categories, p_k is the probability that the sample is divided into category k, M represents that the current node has M child nodes, N is the total number of samples, and n is the number of sample points in the current node. The larger the GIimp value of the feature is, the greater the influence the feature has on the prediction accuracy of the model.
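For reference, a standard form of the Gini index and of the Gini-based importance contributed by a single split, consistent with the symbols defined above, is sketched below. The child-node quantities n_m (number of samples in child node m) and GI_m (Gini index of child node m) are notation assumed here; the importance of a feature is then commonly taken as the sum of these contributions over all nodes that split on that feature, averaged over the trees.

```latex
\begin{align}
GI &= 1 - \sum_{k=1}^{K} p_k^{2}, \\
GI_{imp} &= \frac{n}{N}\left( GI - \sum_{m=1}^{M} \frac{n_m}{n}\, GI_m \right).
\end{align}
```

For a two-class problem (K = 2), a node with equal shares of snow and nonsnow samples has GI = 0.5, and a pure node has GI = 0, so a split that separates the classes well yields a large decrease and hence a large contribution.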

After evaluating the importance of the 17 primary features listed above, the 4 least important features are removed, and the remaining 13 features are selected for use in the model training and construction processes.
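The pruning step can be sketched as a ranking followed by truncation. The function name `drop_least_important`, the generic feature names `f1` to `f17`, and the placeholder scores are all assumptions for illustration; they are not the importance values obtained in this study.

```python
def drop_least_important(importances, n_drop=4):
    """Rank features by importance (descending) and drop the last n_drop.

    importances: dict mapping feature name -> importance score.
    Returns the list of retained feature names, most important first.
    """
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:len(ranked) - n_drop]


# Placeholder scores for 17 generic features (f1 least, f17 most important).
placeholder = {f"f{i}": float(i) for i in range(1, 18)}

kept = drop_least_important(placeholder, n_drop=4)
print(len(kept), kept[:3])
```

With 17 candidate features and the 4 least important removed, 13 features remain, matching the number used for model training.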

Figure 4 shows the feature importance levels in descending order obtained for the two snow recognition models during the snow and nonsnow seasons. In both models, the importance levels of the NDSI and Ref6 exceeded 0.15, regardless of the season. The importance levels of other features differed significantly between the snow and nonsnow seasons. The third most important feature during the snow season is Lat (Fig. 4a), indicating that during the snow season, Lat has an important impact on the distribution of snow in China, which is due to the influence of the monsoon climate. Snow is rarely found in low-latitude regions, and significant snow occurs only at relatively high latitudes. The third and fourth most important features during the nonsnow season are Lon and Dem (Fig. 4b), indicating that Lon and Dem are important factors that limit the distribution of snow in China during the nonsnow season.

After feature selection, according to the location and time information of the selected sample points and the FY-3B VIRR, Lat, Lon, Dem, and month data, the 13 feature variable values and 1 target variable (snow or nonsnow category) of the snow samples (363,603) and nonsnow samples (478,681) selected during the snow season (Figs. 3a, b) are extracted and used as the training set for the snow season. The 13 feature variable values and 1 target variable (snow or nonsnow category) of the snow samples (11,569) and nonsnow samples (154,888) selected during the nonsnow season (Figs. 3c, d) are used as the training set for the nonsnow season. Then, the snow and nonsnow season RF models are constructed separately on these two training sets.

Fig. 4. Importance of different features in the RF model. (a) Snow season and (b) nonsnow season.

3.3 Model training and evaluation

During the training process, the model is trained on the snow and nonsnow season sets; in this study, the accuracy rate (Ac), recall rate (Re), omission error (OE), and commission error (CE) are used to evaluate the performance of the model, and the formulas used to calculate these metrics are given as follows:

where TP represents the number of true positive samples (correctly classified as snow), TN represents the number of true negative samples (correctly classified as nonsnow), FP is the number of false positive samples (nonsnow samples that are misclassified as snow), and FN is the number of false negative samples (snow samples that are misclassified as nonsnow).
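The four metrics can be computed from the confusion-matrix counts as sketched below. The definitions used are the standard ones and are an assumption here: Ac = (TP + TN)/(TP + TN + FP + FN), Re = TP/(TP + FN), OE = FN/(TP + FN), and CE = FP/(TP + FP). They are consistent with the reported recall rate (93.8%) and omission error (6.2%) summing to 100%. The function name and the counts in the example are hypothetical.

```python
def snow_metrics(tp, tn, fp, fn):
    """Return accuracy, recall, omission error, and commission error."""
    total = tp + tn + fp + fn
    return {
        "Ac": (tp + tn) / total,  # share of all samples classified correctly
        "Re": tp / (tp + fn),     # share of true snow that is detected
        "OE": fn / (tp + fn),     # share of true snow that is missed
        "CE": fp / (tp + fp),     # share of predicted snow that is wrong
    }


# Hypothetical counts, for illustration only.
metrics = snow_metrics(tp=90, tn=80, fp=10, fn=20)
print(metrics)
```

Under these definitions Re and OE always sum to 1, so reporting both is redundant but convenient, whereas CE is independent of Re and captures false snow detections such as clouds misclassified as snow.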

When constructing the RF model, the most important parameters are f_max and n_e, where f_max represents the maximum number of features considered for node splitting in the decision trees, and n_e (number of estimators) affects the fitting performance; increasing the value of n_e improves the results but requires more computational resources (Liaw and Wiener, 2002). Typically, f_max of the RF model is set to √n, where n is the number of input variables (Breiman, 2001). To balance the stability, accuracy, and efficiency of the model, it is necessary to select appropriate values for f_max and n_e. In addition to these two important parameters, there are other parameters, such as d_max (maximum depth of trees) and l_max (maximum number of leaf nodes of trees), that also need to be calibrated. To obtain the optimal model parameters, the parameters of the RF classifier were calibrated by using the fivefold cross-validation method. That is, in the training process, the training sets are divided into five equal parts: in each iteration, one part is used as the test set and the other four parts are used as the training set. Finally, Ac and Re of the model are calculated five times and averaged. During model tuning, we calibrated the above four parameters. For the snow season model, f_max, n_e, d_max, and l_max are set to 4, 100, 50, and 250, respectively. For the nonsnow season model, f_max, n_e, d_max, and l_max are set to 4, 100, 50, and 150, respectively. With the optimal parameter settings, Ac and Re values for the snow (nonsnow) season model reach 97.7% (99.7%) and 98.4% (99.1%), respectively. Thus, the constructed RF models can be used for daily snow cover identification in China.
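The fivefold split and the two parameter sets can be sketched as follows. The text does not state which software was used, so the dictionary keys below follow the argument names of scikit-learn's `RandomForestClassifier` (`max_features`, `n_estimators`, `max_depth`, `max_leaf_nodes`) purely as an assumed mapping for f_max, n_e, d_max, and l_max. The helper `k_fold_indices` is a hypothetical, dependency-free illustration that uses contiguous folds; in practice the samples would normally be shuffled first.

```python
# Assumed mapping of the calibrated parameters to scikit-learn argument names.
SNOW_SEASON_PARAMS = {"max_features": 4, "n_estimators": 100,
                      "max_depth": 50, "max_leaf_nodes": 250}
NONSNOW_SEASON_PARAMS = {"max_features": 4, "n_estimators": 100,
                         "max_depth": 50, "max_leaf_nodes": 150}


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def mean_score(scores):
    """Average a metric (e.g., Ac or Re) over the folds."""
    return sum(scores) / len(scores)


# Ten toy samples split into five folds: each fold tests on 2 and trains on 8.
folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each sample appears in exactly one test fold, so the averaged Ac and Re reflect performance on data not seen during the corresponding training run.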

4. Results

4.1 Typical application case analysis

Based on the snow remote sensing monitoring reports released by the NSMC, we selected 3 days in the snow season (26 November 2015, 29 March 2016, and 6 April 2018) and 1 day in the nonsnow season (14 June 2015), independent of the training sample dates, from the daily FY-3B VIRR snow cover identification results in China from 2011 to 2020, and compared the results of the previously established RF model (RF for short) with the IMS, FY-3B MULSS, MODIS MYD10A1, and NCDC products. Among the five snow products, IMS is a fusion product, and the other four products are optical remote sensing products. In addition, since FY-3B did not release snow products after 2020, to test the performance of the RF model on real-time data beyond the training samples, we used FY-3D satellite data to implement snow recognition. FY-3D was successfully launched on 15 January 2017. The Medium Resolution Spectral Imager II (MERSI-II) mounted on the FY-3D satellite has 25 bands, including all bands of FY-3B VIRR, so the RF model used in this study can be applied to FY-3D data. The date selected for snow recognition is 20 January 2022.

To evaluate the snow recognition performance, false color RGB synthesis was performed on bands 6, 2, and 1 of the FY-3B VIRR product. In the RGB image, snow generally appeared cyan or dark cyan and was obviously different from other features; this difference is useful for comparing and analyzing the snow recognition effects of different algorithms through visual interpretations. To evaluate the effect of the RF algorithm, an accuracy comparison was made based on weather stations for different snow products in China. In addition, the average NDSI and Ref9 values of the typical selected regions were calculated as an aid to distinguishing snow from clouds. Generally, both clouds and snow have higher Ref9 values, but there are significant differences in the NDSI. In snow-covered areas within a certain range of values, the greater the NDSI is, the higher the snow depth, and the higher Ref9 is, the higher the snow albedo (Liang et al., 2009; Jiang et al., 2011). However, the range may not be fixed and probably varies with different environments and sensors with different spatial resolutions.

In this analysis, the NDSI and Ref9 comparison of some snow or cloud areas is performed. The snow area is mainly determined by the snow area of the IMS product combined with the visual interpretation of the RGB map, and the cloud area is mainly determined by the visual interpretation of the RGB map combined with the cloud identification area of the MYD10A1 product. In addition, the cloud recognition results from MULSS and NCDC are used for auxiliary analysis of accuracy.

4.1.1 Results derived for 26 November 2015

Figure 5 shows the FY-3B VIRR RGB false color composite map, snow depth at weather stations, snow recognition accuracy, and the snow recognition results of 5 snow cover products (IMS, RF, MULSS, MYD10A1, and NCDC) on 26 November 2015.

Snow was detected at 636 weather stations in China, of which 587, 516, 154, 222, and 453 were located within the snow cover of the IMS, RF, MULSS, MYD10A1, and NCDC products, respectively. The IMS product had the highest recognition accuracy (92.3%) owing to its all-weather multisource data fusion; accordingly, the snow recognition results of the other four optical remote sensing products fell almost entirely within the IMS snow cover. The RF product, with a snow recognition accuracy of 81.8%, was the best among the four optical remote sensing products. The MULSS product had the lowest accuracy (24.2%) and showed obvious omissions in central Inner Mongolia (the blue circle in the RGB image in Fig. 5) and northern Heilongjiang (Area B in the RGB image in Fig. 5). The MYD10A1 product also showed obvious snow omissions in northern Hebei and central Inner Mongolia, with an accuracy of only 34.9%.

To assess the snow distribution results in greater detail, the snow recognition results derived for central Gansu (Area A) and the northeastern forest (Area B) in the RGB image shown in Fig. 5 were selected for a comparative analysis. As shown in Fig. 6, snow cover was distributed in northern, central, and southeastern Area A (cyan in the RGB image), and it was lighter in the northern snow region (average NDSI of 0.310) than in the southern snow region (average NDSI of 0.478). Although the IMS snow product identified almost all the snow, snow was overestimated in northwestern and central China. The RF algorithm identified most snow areas well, covering the 3 snow monitoring weather stations in this area; the MYD10A1 product also identified most snow areas, but its recognition performance was poor in the northern shallow snow region (snow depth ≤ 1 cm), and neither the MULSS nor the NCDC product could effectively identify the northern shallow snow. For Area B, the IMS snow product covered all weather stations and identified all snow cover, but it significantly overestimated the snow cover compared with the RGB map. Although the depth of most understory snow (dark cyan in the RGB image) was high (7-8 cm), the NDSI of the understory snow area (mostly < 0.40) was lower than that of other snow cover areas (cyan in the RGB image, NDSI > 0.45). The RF algorithm recognized most of these snow areas well. The MYD10A1 product recognized understory snow well, but it classified snow in many other areas as clouds, while the NCDC product failed to recognize most of the understory snow. The snow recognition performance of the MULSS product was the worst. Overall, the RF algorithm effectively identified both shallow snow and understory snow in China on 26 November 2015.

4.1.2 Results derived for 29 March 2016

Figure 7 shows the FY-3B VIRR RGB false color composite map, snow depth at weather stations, snow recognition accuracy, and the snow recognition results of the RF model and the other four snow cover products (IMS, MULSS, MYD10A1, and NCDC) on 29 March 2016.

Fig. 5. Comparisons of FY-3B VIRR false color composite RGB images, snow depth of weather stations, snow recognition accuracy, and overlaid snow cover results derived from different snow products in China on 26 November 2015. (a) RGB image, (b) IMS, (c) RF, (d) MULSS, (e) MYD10A1, and (f) NCDC.

Fig.6.Comparisons of FY-3B VIRR false color composite RGB images and overlaid snow cover results derived from different snow products in (a1-f1) Area A and (a2-f2) Area B on 26 November 2015.(a) RGB image, (b) IMS, (c) RF, (d) MULSS, (e) MYD10A1, and (f) NCDC.

Significant snow cover was observed at 51 weather stations in China, but only 31 of them were covered by the IMS snow product, giving an accuracy of 60.8%. The weather stations showed the presence of snow in northeastern Inner Mongolia and central Sichuan, but the IMS product did not. This is probably because the microwave snow products fused by IMS cannot effectively monitor local snow owing to their coarse spatial resolution (usually 10-36 km). The accuracies of the other four optical products were very low; the highest, that of the RF product, was only 35.3%. The RF algorithm identified most of the snow on the central Qinghai-Tibetan Plateau and in northern Xinjiang, but it produced some misclassifications in the northeast region, where some clouds were mistakenly classified as snow. The other three products showed obvious omissions on the Qinghai-Tibetan Plateau and in northern Xinjiang (the blue circle in the RGB image in Fig. 7), and their accuracies were all below 10%.
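The station-based accuracy quoted in these case studies is consistent with the share of snow-reporting stations that fall on pixels flagged as snow by a product (for example, 31 of 51 stations for IMS on this date). The following minimal sketch illustrates that reading; the function name, the mask layout, and the station positions are hypothetical.

```python
def station_accuracy(snow_mask, snow_stations):
    """Fraction of snow-reporting stations whose pixel is flagged as snow.

    snow_mask: 2-D list of booleans for one product (True = snow).
    snow_stations: (row, col) pixel positions of stations that observed snow.
    """
    if not snow_stations:
        raise ValueError("no snow-reporting stations")
    hits = sum(1 for row, col in snow_stations if snow_mask[row][col])
    return hits / len(snow_stations)


# Hypothetical 2 x 3 product mask and four station positions.
mask = [[True, False, False],
        [True, True, False]]
stations = [(0, 0), (0, 2), (1, 1), (1, 2)]
toy_accuracy = station_accuracy(mask, stations)  # 2 of 4 stations on snow

# Station counts quoted in the text for the IMS product on 29 March 2016.
ims_accuracy = round(100 * 31 / 51, 1)
print(toy_accuracy, ims_accuracy)
```

Because only stations that observed snow enter the ratio, this measure captures omissions at stations and says nothing about false snow detections.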

Fig.7.As in Fig.5, but for 29 March 2016.

Fig.8.As in Fig.6, but for 29 March 2016.

Owing to its unique topographic (altitudes of more than 4000 m; Fig. 2a) and climatic characteristics, the Qinghai-Tibetan Plateau has become an important area in snow cover research. Here, western Area C and central Area D of the Qinghai-Tibetan Plateau (the extents of which are shown in the RGB image in Fig. 7) were selected for further analysis (Fig. 8). A large snow-covered area was distributed in western Area C on the Qinghai-Tibetan Plateau (cyan in the RGB image, NDSI larger than 0.45 and Ref9 larger than 0.55); obvious clouds occurred in the northeast area (gray-white in the RGB image, NDSI below 0.40 and Ref9 greater than 0.50), and scattered cloud patches occurred in the central region (bright white in the RGB image, NDSI below 0.35 and Ref9 greater than 0.70). Almost all snow was effectively recognized by the IMS product, the RF model, and the NCDC product, while the MULSS and MYD10A1 products both had obvious omissions because these regions were marked as clouds in their cloud recognition results. However, the RF model misclassified many clouds in the central and northeast areas as snow. For Area D, combining the results with the snow remote sensing monitoring report (Fig. 1b) shows that a large snow-covered area (cyan and gray-cyan in the RGB image, an average NDSI of 0.523 and an average Ref9 of 0.497) and relatively broken cloud blocks (white in the RGB image, an average NDSI of 0.343 and an average Ref9 of 0.659) were present on the central Qinghai-Tibetan Plateau. The IMS product and the RF model identified almost all snow, including shallow snow with a depth of only 1 cm. The MYD10A1 product also identified most of the snow but had omissions caused by misclassifying part of the snow cover as clouds, while the other two products missed snow, especially low albedo snow (gray-cyan in the RGB image, which refers to snow with NDSI below 0.40 and Ref9 below 0.40). These results demonstrate that the RF model can identify the snow-covered areas on the Qinghai-Tibetan Plateau well and that its performance is clearly better than that of the other products for recognizing shallow snow and low albedo snow; however, the model does misclassify some clouds as snow.

4.1.3 Results for 6 April 2018 and 14 June 2015

Figure 9 shows the FY-3B RGB false color composite map, snow depth at weather stations, snow recognition accuracy, and snow recognition results obtained by using the RF model and two other snow cover products (IMS and MYD10A1) on 6 April 2018 and 14 June 2015.

Fig.9.As in Fig.5, but for (a1-c1) 6 April 2018 and (a2-c2) 14 June 2015.

On 6 April 2018, 126 weather stations in China detected significant snow cover, and the IMS snow product identified only 66 of them; that is, the snow monitoring accuracy of the IMS snow product was only 52.4%. Although most of the snow-covered areas of the RF and MYD10A1 products were within the IMS area, the identifiable area was significantly reduced, probably owing to the influence of clouds and rain, and the snow recognition accuracies were only 3.2% and 1.6%, respectively.

On 14 June 2015, only one weather station detected obvious snow cover, while the IMS product showed that snow cover mainly appeared on the Qinghai-Tibetan Plateau.The RF and MYD10A1 products detected the same snow cover with a small area.

Area E in central Inner Mongolia on 6 April 2018 and Area F in central Gansu on 14 June 2015 were selected for further analysis in Fig. 10. Area E is a cloud-snow mixed area, and 21 weather stations detected snow in this area. The IMS snow product identified most snow, covering 20 weather stations. The RF model identified part of the snow, covering 9 weather stations, including some shallow snow with a depth of no more than 1 cm, but it omitted some snow cover areas with low albedo (gray-blue in the RGB image, Ref9 less than 0.40). The other products failed to identify the main snow cover area because they classified most snow as clouds in their cloud identification results. Combining the results with the satellite remote sensing monitoring report of the NSMC for 14 June 2015 (Fig. 1a) shows that a large snow-covered area was present in central Area F in northern Qinghai (cyan and dark cyan in the RGB image, an average NDSI of 0.718 and an average Ref9 of 0.562), but many clouds (mainly white regions in the RGB image with some light cyan, NDSI below 0.60 and Ref9 greater than 0.70) also persisted in this region. The IMS snow product and the RF model identified most areas of snow, while the MULSS, MYD10A1, and NCDC products missed large snow-covered areas because they could not distinguish between snow and clouds. These results demonstrate that the RF model can distinguish clouds from snow cover well, including some shallow snow.

Fig. 10. Comparison of FY-3B VIRR false color composite RGB images and overlaid snow cover results derived from different snow products in (a1-f1) Area E on 6 April 2018 and (a2-f2) Area F on 14 June 2015. (a) RGB image, (b) IMS, (c) RF, (d) MULSS, (e) MYD10A1, and (f) NCDC.

4.1.4 Results derived for 20 January 2022

Figure 11 shows an overlaid display of the FY-3D MERSI-II false color RGB composite image, snow depth at weather stations, snow recognition accuracy, and the snow recognition results of the RF model, IMS, and MYD10A1 products on 20 January 2022. Significant snow was observed at 282 weather stations in China, and the IMS product covered 259 of them. The snow monitoring accuracy of the IMS product thus reached 91.8%, showing that it identified most of the snow. The RF product identified the main snow cover areas with an accuracy of 68.4%, whereas MYD10A1 showed obvious omissions of snow cover in northern Xinjiang and on the northwestern Qinghai-Tibetan Plateau (the blue circle area in the RGB image), with an accuracy of only 28.7%.

Areas G and H in the RGB map (Fig. 11a) were selected for further analysis, as shown in Fig. 12. Area G had a large area of snow with a depth of 0-18 cm, including understory snow in the southwest. The IMS snow product identified all snow cover but overestimated its extent; RF also identified all snow cover well and covered all snow monitoring weather stations, including shallow snow with a depth of no more than 2 cm; however, MYD10A1 showed obvious omissions of understory snow and of shallow snow in some plains.

Fig.11.As in Fig.5, but for FY-3D MERSI-II on 20 January 2022.

Fig.12.As in Fig.6, but for (a1-d1) Area G and (a2-d2) Area H on 20 January 2022.

There is an obvious cloud-snow mixed area in Area H (in the RGB image, snow is cyan, and clouds are white or light gray). The IMS snow product identified all snow; RF effectively distinguished clouds from snow, identified most of the snow, and missed snow only in some southern areas; and MYD10A1 showed obvious omissions because it classified much snow as clouds in its cloud identification result. These results show that the RF model established from FY-3B data can be effectively applied to FY-3D MERSI-II, which is important for the development of subsequent FY-3 long-term series products.

4.2 Product accuracy evaluation

4.2.1 Accuracy evaluation based on snow stations

To quantitatively evaluate the recognition effect of the RF model on shallow snow and understory snow with forest coverage greater than or equal to 0.30, a comparison was made based on snow depth data from weather stations for different snow products in China, including shallow snow (snow depth not larger than 1 cm), medium snow (snow depth greater than 1 cm but less than 10 cm),and thick snow (snow depth greater than or equal to 10 cm), as well as snow recognition results with forest coverage greater than 0.30, as shown in Table 3.

Based on the results of snow recognition at different depths by the various products, we found that the snow recognition rates of all products increased significantly with snow depth. Among them, the IMS product had the highest recognition accuracy for shallow, medium, and thick snow owing to its all-weather multisource data fusion (except on 26 November 2015, when the NCDC product had the highest recognition accuracy of 94.4% for thick snow). The RF product had the second highest recognition accuracy, which was significantly higher than that of the MULSS, MYD10A1, and NCDC products. For shallow snow, the RF product had a recognition accuracy of 59.4% on 26 November 2015, which was significantly higher than that of the MULSS (12.5%), MYD10A1 (27.1%), and NCDC (32.3%) products. On 29 March 2016, the RF product had the highest recognition accuracy for shallow snow among all products (35.0%). Although the recognition accuracy of the RF product for shallow snow was low on 6 April 2018 and 20 January 2022, at 4.4% and 23.3%, respectively, it was still higher than that of the MULSS product (0), the MYD10A1 product (2.2% and 13.3%), and the NCDC product (0). These results indicate that the RF product has advantages over other optical remote sensing products in shallow snow recognition.
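The stratification by snow depth used in this comparison can be sketched as follows. The class limits follow the definitions given above (shallow: not larger than 1 cm; medium: greater than 1 cm but less than 10 cm; thick: greater than or equal to 10 cm); the function names and the sample records are hypothetical and serve only to illustrate the bookkeeping.

```python
def depth_class(depth_cm):
    """Map a station snow depth (cm) to the depth class used in the comparison."""
    if depth_cm <= 0:
        return None  # no snow observed, not part of the evaluation
    if depth_cm <= 1:
        return "shallow"
    if depth_cm < 10:
        return "medium"
    return "thick"


def accuracy_by_class(records):
    """Per-class recognition rate.

    records: iterable of (depth_cm, detected), where detected is True when the
    product flags the station pixel as snow.
    """
    totals, hits = {}, {}
    for depth_cm, detected in records:
        cls = depth_class(depth_cm)
        if cls is None:
            continue
        totals[cls] = totals.get(cls, 0) + 1
        hits[cls] = hits.get(cls, 0) + (1 if detected else 0)
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Hypothetical station records: (snow depth in cm, detected by the product).
records = [(0.5, True), (1, False), (3, True), (9.9, True),
           (10, True), (25, False), (0, True)]
rates = accuracy_by_class(records)
print(rates)
```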

Table 3.Comparison of snow recognition accuracy with different snow depths and understory snow for different snow products based on weather stations in China

The results of understory snow classification by the different products show that the IMS product again had the highest recognition accuracy. On 26 November 2015, the RF product achieved a recognition accuracy of 54.5% for understory snow, which was lower than that of the NCDC product (59.1%). On 29 March 2016 and 6 April 2018, the recognition accuracies of the RF product were 50.0% and 18.2%, respectively, whereas the MULSS, MYD10A1, and NCDC products were unable to identify the understory snow. On 20 January 2022, the recognition accuracy of the RF product reached 71.0%, significantly higher than that of the MYD10A1 product (29.0%). These results indicate that the RF product generally has advantages over other optical remote sensing products in understory snow recognition.

4.2.2 Spatial consistency evaluation

In the previous analyses, the IMS snow product identified the largest range of snow that was not affected by weather conditions.Therefore, based on the IMS snow products, the spatial consistency ratios between the IMS and the other four products in China and the Qinghai-Tibetan Plateau region were calculated, as shown in Fig.13.Note that there were no data from MULSS products or NCDC products on 20 January 2022.There were no IMS snow cover data with a spatial resolution of 1 km before December 2014.
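One plausible way to compute such a ratio is sketched below. It assumes that the spatial consistency ratio is the share of IMS snow pixels that a second product also labels as snow; this definition is an assumption made for illustration, as are the function name and the toy masks.

```python
def spatial_consistency(ims_mask, product_mask):
    """Share of IMS snow pixels that the other product also flags as snow.

    Both masks are 2-D lists of equal shape with truthy values for snow.
    Assumed definition: |IMS snow AND product snow| / |IMS snow|.
    """
    ims_snow = 0
    both_snow = 0
    for ims_row, product_row in zip(ims_mask, product_mask):
        for ims_pixel, product_pixel in zip(ims_row, product_row):
            if ims_pixel:
                ims_snow += 1
                if product_pixel:
                    both_snow += 1
    if ims_snow == 0:
        raise ValueError("IMS mask contains no snow pixels")
    return both_snow / ims_snow


# Hypothetical 2 x 3 masks (1 = snow, 0 = no snow).
ims = [[1, 1, 0],
       [1, 1, 0]]
other = [[1, 0, 0],
         [1, 1, 1]]
ratio = spatial_consistency(ims, other)  # 3 of 4 IMS snow pixels agree
print(ratio)
```

Under this definition, snow flagged by the optical product outside the IMS snow area does not affect the ratio, so the measure reflects how much of the IMS extent an optical product recovers.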

In general, in China (Fig.13a), the spatial consistency ratio between RF products and IMS products was the highest for all dates except for 14 June 2015, followed by NCDC products, MYD10A1 was third, and MULSS products were the worst.The spatial consistency of RF products and IMS products reached 59%, indicating that at least 40% of the snow cover in optical remote sensing could not be detected due to the influence of clouds and rain or other factors.

For the Qinghai-Tibetan Plateau (Fig.13b), the spatial consistency of the RF and IMS products was significantly higher than that of the MYD10A1 and MULSS products but lower than that of the NCDC.This may be because the NCDC combines MODIS/Terra and MODIS/Aqua data to better remove clouds over the plateau area and thus more effectively identified snow cover.

4.2.3 Snow cover accuracy evaluation

Owing to the uneven distribution and limited representative range of weather stations, the monitoring accuracy of 1-km snow products evaluated against weather stations is very uncertain. For example, Figs. 5-11 show that the snow recognition accuracy of the MYD10A1 product based on snow monitoring stations is only 1.6%-34.9%, and its highest recognition accuracy for thick snow cover in Table 3 does not exceed 60%. This finding is notable because MYD10A1 is a mature snow product that is already widely used internationally. Therefore, the accuracy evaluation method based on meteorological stations cannot effectively evaluate snow products. To this end, an additional sample-based evaluation was used to assess the accuracy of the RF snow product. Data from 7 days, all independent of the training sample dates, were selected for the accuracy evaluation. For these 7 images, the main areas in which snow was identified in China were selected and randomly sampled at a proportion of 2%-4% to generate snow and nonsnow samples, and each scene, with more than 2000 samples, was evaluated by visual interpretation based on the snow remote sensing monitoring reports released by the NSMC for the monitoring area and on the IMS snow products. The specific evaluation results are listed in Table 4.

Fig.13.Comparison of spatial consistency ratios between IMS snow products and other snow products in (a) China and (b) the Qinghai-Tibetan Plateau region.

During the snow season (20 March 2013, 8 February 2014, 26 November 2015, 29 March 2016, and 6 April 2018), Ac was significantly lower than Re. The previous analysis showed that the RF algorithm used in this study had a very low OE but a certain CE; thus, Re was high, whereas Ac, which also reflects CE, was lower than Re. For example, on 29 March 2016, the Re value was 96.8%, while the Ac value was only 84.5%. During the nonsnow season (14 June 2015), because the proportion of snow samples was much smaller than that of nonsnow samples, the nonsnow sample recognition rate was significantly higher than the snow sample recognition rate, so Ac was higher than Re. The RF algorithm constructed in this paper achieved an average Re value of 93.8% and an average Ac value of 90.6%, while the average OE and CE were 6.2% and 11.1%, respectively. These results meet the accuracy requirements for long-term snow cover products.
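The relationship among Ac, Re, OE, and CE can be illustrated with a confusion-matrix sketch. The averages reported above are consistent with OE being the complement of Re (93.8% and 6.2%); the formulas for Ac and CE used here are common conventions and are assumptions in this sketch, and the sample counts are hypothetical.

```python
def snow_metrics(tp, fp, fn, tn):
    """Accuracy measures from snow/nonsnow confusion counts.

    tp: snow samples labelled snow       fn: snow samples labelled nonsnow
    fp: nonsnow samples labelled snow    tn: nonsnow samples labelled nonsnow
    Assumed definitions:
      Ac = (tp + tn) / all samples       (overall accuracy)
      Re = tp / (tp + fn)                (recall)
      OE = fn / (tp + fn) = 1 - Re       (omission error)
      CE = fp / (tp + fp)                (commission error)
    """
    total = tp + fp + fn + tn
    return {
        "Ac": (tp + tn) / total,
        "Re": tp / (tp + fn),
        "OE": fn / (tp + fn),
        "CE": fp / (tp + fp),
    }


# Hypothetical counts with few omissions but a noticeable commission error.
metrics = snow_metrics(tp=90, fp=15, fn=10, tn=85)
print(metrics)
```

In this toy case Re is 0.90 while Ac is 0.875: false snow detections lower Ac without affecting Re, which is the pattern described for the snow-season scenes.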

Table 4.Snow cover recognition accuracies in China based on the FY-3B RF algorithm

5. Conclusions and discussion

5.1 Conclusions

In this study, we implemented snow cover recognition in China using the RF algorithm, which has performed well in information extraction applications. Based on FY-3B VIRR L1 data and other data, 13 characteristic variables were selected to construct 2 RF snow recognition models, one for the snow season and one for the nonsnow season. The recognition results of the RF product on 4 snow-season days and 1 nonsnow-season day were compared with those of the MULSS, MYD10A1, and NCDC optical remote sensing snow products and the IMS synthesized product, using snow depth data from weather stations.

The model yields good recognition results for shallow snow, understory snow, low albedo snow, and snow on the Qinghai-Tibetan Plateau. The station-based recognition accuracy of the RF product is significantly higher than those of the other optical remote sensing snow products in most cases, whether for shallow snow, medium snow, thick snow, or understory snow with forest vegetation coverage greater than 30%. However, the RF model exhibits a certain misclassification rate. In addition, the model was applied to FY-3D data, which is conducive to the effective connection and continuation of subsequent long-term FY-3 snow products. The snow recognition accuracy evaluation on 7 days of independent sample data shows that Re and Ac are 93.8% and 90.6%, respectively, and that OE and CE are 6.2% and 11.1%, respectively; these values meet the accuracy requirements for producing FY-3 long-time-series snow cover products.

5.2 Discussion

Owing to the important role of snow cover in the global- and regional-scale ecological environment, rapidly and accurately identifying snow cover is critical. Until now, most existing optical remote sensing snow cover products have been developed based on physical snow cover models. A physical model mainly recognizes snow from the spectral characteristics of snow cover. Its recognition principles are clear and understandable, and its most important step is the determination of spectral thresholds. Obtaining accurate spectral thresholds requires consideration of atmospheric conditions, clouds, observation conditions, regional characteristics, and land cover types, together with a series of complex processing steps such as cloud removal, atmospheric correction, angle correction, and classification of surface types. Because the thresholds are affected by sensors, weather conditions, and spatial scale, whether they are suitable for large areas or complex terrains remains to be determined. Based on the NDSI and Ref9 calculated for the snow and clouds in the areas selected in Figs. 5-12, it was difficult to effectively distinguish snow and clouds with a single NDSI or Ref9 threshold. For example, the average NDSI values in the northern and southern snow regions in Area A in Fig. 6 were 0.310 and 0.478, respectively, while the average NDSI for the broken cloud blocks in Area D in Fig. 8 was 0.343, which lies between the two snow values and shows that it is difficult to identify clouds and snow by spectral thresholds alone. Compared with the physical model, RF models have a “black box” feature, which means that they do not consider the mechanism linking spectral features and snow information. Instead, they build forests of decision trees, which avoids the uncertainty of fixed spectral thresholds, by selecting many feature variables related to snow cover and clouds and by assembling large training sets for snow cover (marked as snow) and different types of clouds (all marked as nonsnow) over a large-scale area. Therefore, the RF model can skip complex preprocessing procedures such as atmospheric correction and separate cloud recognition and still achieve good recognition results. Thus, it is easier and more practical to extract snow cover through machine learning techniques based on remote sensing data over a large-scale region, which is of great reference value for the production of long-term remote sensing datasets.
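The threshold argument above can be checked numerically with the three regional averages quoted in the text (Area A northern snow 0.310, Area A southern snow 0.478, and Area D broken clouds 0.343). The following minimal sketch scans candidate NDSI thresholds under the simple rule "snow if NDSI is at or above the threshold"; the rule and the scan step are illustrative assumptions, and the values are regional means rather than pixel data.

```python
# Regional average NDSI values quoted in the text: (label, NDSI, is_snow).
samples = [
    ("Area A northern snow", 0.310, True),
    ("Area A southern snow", 0.478, True),
    ("Area D broken clouds", 0.343, False),
]


def errors_for_threshold(threshold):
    """Labels of samples misclassified by the rule 'snow if NDSI >= threshold'."""
    return [name for name, value, is_snow in samples
            if (value >= threshold) != is_snow]


# Scan thresholds from 0.00 to 1.00 in steps of 0.01.
candidates = [step / 100 for step in range(0, 101)]
perfect = [t for t in candidates if not errors_for_threshold(t)]
fewest_errors = min(len(errors_for_threshold(t)) for t in candidates)
print(perfect, fewest_errors)
```

No threshold classifies all three regions correctly, because the cloud value lies between the two snow values: a threshold low enough to keep the northern shallow snow also admits the clouds.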

In addition, many snow cover products rely on cloud recognition and certain auxiliary data, such as land cover type or meteorological data, which adds complexity to the snow cover recognition process. Because the cloud recognition process itself carries large uncertainties, the analysis described above shows that the FY-3B MULSS, MODIS MYD10A1, and NCDC products all exhibit cloud misjudgments, which result in missed snow. Moreover, an increase in the use of auxiliary data also introduces multiple sources of error. When these errors are large, it can be difficult to judge their source. The RF algorithm used in this paper includes many different types of cloud samples in the training set, which largely overcomes the cloud recognition problem. Moreover, the algorithm is based only on FY-3B L1 remote sensing data and does not use other auxiliary data, thus further reducing the sources of error. This algorithm is simpler and more convenient than other algorithms and is also conducive to further improvements and operational applications. In addition, the RF model based on FY-3B VIRR has also been successfully applied to FY-3D MERSI-II, which has spectral characteristics similar to those of FY-3B VIRR, demonstrating its robustness and generalization performance.

Because the RF algorithm is based on the selection of suitable feature bands and many training sets, there are some potential limitations to the results of this study in the following perspectives.

(1) Feature band selection. Most of the feature bands selected in this study were spectral features, and it was still difficult to effectively resolve the “different objects with the same spectrum” phenomenon with these features, especially when snow and clouds have similar spectral characteristics, which will inevitably lead to some confusion between snow and clouds with this algorithm. In the future, other nonspectral features will be needed to further reduce the snow cover misclassification rate. In addition, in our RF algorithm, the feature variables were selected by ranking their importance with the Gini index, which is likely to result in high correlations among the selected feature variables, so that redundant predictor variables affect the efficiency and accuracy of the model. For example, the Pearson correlation coefficient between the NDSI and Ref6 (ranking first and second in importance, respectively, in Fig. 4a) in the snow cover model was 0.66, while the Pearson correlation coefficients of Ref2 and Ref9 with Ref1 were 0.96 and 0.97, respectively. Therefore, adopting more optimized feature selection algorithms, such as the Max-Relevance and Min-Redundancy (mRMR) method (Ding and Peng, 2005), which maximizes the correlation between features and classification variables and minimizes the correlations among features, may help to improve the efficiency and accuracy of the model.

(2) Applicability of the RF model. Compared with the physical model, which has a clear processing procedure, the RF model has black-box characteristics: the modeler cannot inspect the internal operation of the model and can only tune it iteratively between different feature parameters and target variables. In addition, many high-quality and representative samples are needed to train the model, which results in a high cost. The training samples selected in this study were mainly concentrated in China and the surrounding areas. Therefore, these results are not sufficient to represent the globe. To extend this algorithm to the global scale, many samples from other regions must be collected.

(3) Fitting of the RF model. The test set Ac and Re values obtained when the RF model was established reached 97% or even exceeded 99%, but the average Ac and Re values on the independent samples were below 94% and even below 88% in individual cases. On the one hand, this finding suggests that the training sets may not fully cover all existing scenarios; on the other hand, the model may exhibit overfitting due to excessive noise, although the RF model is theoretically not prone to overfitting. Therefore, in the process of establishing the RF model, overfitting must be carefully examined and addressed.

(4) Accuracy evaluation.In this study, both the sample selection and accuracy evaluation processes were based mainly on the pixel scale with a resolution of 1 km.The analysis results show that the accuracy evaluation based on the weather station could not effectively evaluate the snow recognition results at the scale of 1 km, creating challenges for the accuracy evaluation of different snow products, especially for long-time series snow products.

Moreover, although many references have been used, there is still a certain degree of subjectivity when performing visual interpretations and comparative analyses. In particular, deviations may arise in the distinction between snow cover and some clouds, and thus some uncertainty still exists in the accuracy evaluation results. High-resolution optical satellite images can be used to address the uncertainty because they fit better with the “ground truth” than medium- and coarse-resolution images. However, due to limited coverage, long revisit periods, and the influence of clouds and rain, they can only be used to evaluate limited ranges and targets, which greatly limits the accuracy evaluation of snow products over a large range. For example, on 26 November 2015 (the case in this study), 253 Landsat-8 images with a resolution of 30 m were needed to cover all of China. Determining how to use these massive data to assess snow cover over a large area is still a challenge, especially on a long timescale.

Furthermore, owing to the revisit cycle of Landsat-8, Areas A and B (both 160,000 km²) selected in Fig. 5 on 26 November 2015 either had no suitable clear-sky Landsat-8 images (one scene covers only approximately 35,000 km²) or were covered only in small part, a gap that could be filled by finding other high-resolution images. Therefore, in the future, it is very important to develop a set of regional and even global high-resolution long-time-series snow cover verification databases from the many historical high-resolution images to precisely evaluate long-time-series snow products.

In addition, there may have been inconsistencies in the validation between the IMS product and the RF recognition results. If a pixel is classified as snow under clouds by the IMS product but as cloud by the RF model, the RF recognition is in fact correct for the optical image. However, when this result is compared with the IMS product, it is counted as incorrect, because the RF model works on an optical image, which, unlike the multisource IMS product, cannot penetrate clouds to identify the snow cover beneath them.
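Regarding limitation (1) above, the idea of trading relevance against redundancy in feature selection can be sketched as follows. This is a simplified, mRMR-style greedy selection that uses the absolute Pearson correlation as a stand-in for both relevance and redundancy; that simplification is an assumption of this sketch (the formulation of Ding and Peng is based on mutual information), and the feature names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def greedy_select(features, target, k):
    """Pick k features, each maximizing relevance minus mean redundancy.

    relevance  = |corr(feature, target)|
    redundancy = mean |corr(feature, already selected feature)|
    """
    selected = []
    remaining = list(features)

    def score(name):
        relevance = abs(pearson(features[name], target))
        if not selected:
            return relevance
        redundancy = sum(abs(pearson(features[name], features[s]))
                         for s in selected) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: feat_a_dup nearly duplicates feat_a; feat_b is less
# relevant to the target but carries different information.
target = [0, 0, 0, 1, 1, 1]
features = {
    "feat_a": [0.10, 0.20, 0.15, 0.80, 0.90, 0.85],
    "feat_a_dup": [0.15, 0.15, 0.20, 0.75, 0.95, 0.80],
    "feat_b": [0.50, 0.10, 0.30, 0.60, 0.40, 0.90],
}
chosen = greedy_select(features, target, k=2)
print(chosen)
```

A ranking by relevance alone would keep both near-duplicate features, whereas the redundancy penalty leads the second pick to the less correlated feature, which is the behavior sought when highly correlated bands rank close together in importance.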

Acknowledgments. We thank the China National Satellite Meteorological Centre for providing the FY-3B and FY-3D data, the LP-DAAC and MODIS science teams for providing free MODIS MYD10A1 products, the Interactive Multisensor Snow and Ice Mapping System (IMS) for providing IMS snow products, and the National Cryosphere Desert Data Center for providing snow products.