Geometric-Spectral Reconstruction Learning for Multi-Source Open-Set Classification With Hyperspectral and LiDAR Data
2022-10-26
Leyuan Fang, Dingshun Zhu, Jun Yue, Bob Zhang, and Min He
Dear editor,
This letter presents an open-set classification method for remote sensing images (RSIs) based on geometric-spectral reconstruction learning. More specifically, to improve the ability of RSI classification models to adapt to the open-set environment, an open-set classification method based on geometric and spectral feature fusion is proposed. This is the first method to realize RSI open-set classification from geometric and spectral features using hyperspectral and light detection and ranging (LiDAR) data. Among the many remote sensing data sources, hyperspectral images (HSIs) and LiDAR data provide rich spectral and geometric information about target objects. This letter combines HSIs and LiDAR data to recognize unknown classes and classify known classes. Experiments show that the proposed method outperforms previous state-of-the-art methods.
With the development of deep learning, the performance of RSI classification has improved rapidly. However, most existing RSI classification methods assume that every object class in the test set also appears in the training set. The classification task under this assumption is called closed-set classification (CSC) [1], [2]. In many application scenarios this assumption does not hold: the object classes in the test set may not exist in the training set. In an open-set environment (OSE), the set of object classes in the training set is a subset of the set of object classes in the test set. Because the object classes on Earth are very rich and change dynamically, collecting all of them is almost impossible. Therefore, an RSI classifier will inevitably have to deal with unknown open classes. Handling a large number of unknown classes effectively is a key step toward the practical application of RSI classification methods.
Since the training set can hardly cover all object classes in the actual scene, the practicability of a model trained in a closed-set environment (CSE) is greatly limited in OSE. The problem studied in this letter is to accurately classify the known classes while identifying the unknown open classes. This task is called open-set classification (OSC) [3], [4]. An open-set classifier trained under OSE can effectively bound the known classes in the feature space, so that samples that do not belong to any known class are detected and unknown classes are recognized accurately. Most existing OSC methods rely entirely on training with samples of the known classes [5]. In this case, however, the trained feature extractors tend to extract features that are useful for classifying known classes, and these features may not help identify unknown open classes [6]–[8]. This approach may therefore ignore features that can help reject unknown open classes. For OSC, fully retaining the features that can be used to identify unknown open classes is very important.
In HSIs, objects with the same spectrum may belong to different classes, so unknown classes can be incorrectly classified as known classes, or known classes as unknown classes, during open-set HSI classification [9], [10]. Therefore, this letter uses both LiDAR data and hyperspectral data for OSC. LiDAR data provide geometric texture features of ground objects [11]; combined with spectral information, they allow better rejection of unknown classes and identification of known classes. Since preserving the features that can be used to reject unknown classes is particularly important in OSC, we design a framework that retains the most important spectral and geometric features in hyperspectral data and LiDAR data.
In this letter, we propose a novel open-set RSI classification method called geometric-spectral reconstruction learning (GSRL). Our goal is to learn efficient geometric-spectral feature representations for each sample and to retain features that are capable of classifying known classes and identifying unknown open classes. By adding an unsupervised regularizer, the learned geometric-spectral representation retains the features useful for identifying unknown open classes and for classifying known classes at the same time. The proposed GSRL method is mainly composed of two modules, the geometric-spectral reconstruction module and the geometric-spectral open-set adaptation module, as shown in Fig. 1. The geometric-spectral reconstruction module generates the geometric-spectral representation and the reconstructed geometric-spectral feature matrix through the geometric-spectral encoder and decoder networks. Then, the mean absolute error (MAE) between the reconstructed and the original geometric-spectral feature matrices is calculated. The geometric-spectral open-set adaptation module performs extreme value analysis on the generated MAEs and generates a cumulative distribution function of the Weibull distribution [1]. For each test sample, the cumulative probability of the Weibull distribution is used for adaptive adjustment to reject the unknown classes. Through these two modules, the geometric and spectral features of each sample are retained, which helps identify the unknown classes and classify the known classes.
Geometric-spectral reconstruction learning:
Fig. 1. Overview of the proposed geometric-spectral reconstruction learning for multi-source open-set classification with hyperspectral and LiDAR data.
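Algorithm 2 Geometric-Spectral Open-Set Adaptation Module
Input:
The number of training and testing instances $N_{TR}$, $N_{TS}$
The training geometric-spectral instances $I_{GS}^{\tau}$, $\tau \in [1, N_{TR}]$
The testing geometric-spectral instances $I_{GS}^{\upsilon}$, $\upsilon \in [1, N_{TS}]$
The number of tails $\sigma$
1: Compute geometric-spectral MAE $L_{GS}^{\tau}$ for $I_{GS}^{\tau}$ by Algorithm 1
2: Weibull fitting: $W_{\alpha,\beta}^{cdf}(\cdot) = \mathrm{FitHigh}([L_{GS}^{1}, L_{GS}^{2}, \ldots, L_{GS}^{N_{TR}}], \sigma)$
3: For $\upsilon = 1, \ldots, N_{TS}$ do
4: Compute geometric-spectral MAE $L_{GS}^{\upsilon}$ for $I_{GS}^{\upsilon}$ by Algorithm 1
5: Calculate the predicted label $y_{I_{GS}^{\upsilon}}$ of $I_{GS}^{\upsilon}$ by (3)
6: Calculate the cumulative probability $W_{\alpha,\beta}^{cdf}(L_{GS}^{\upsilon})$
7: If $W_{\alpha,\beta}^{cdf}(L_{GS}^{\upsilon}) > \gamma$ then
8: $y_{I_{GS}^{\upsilon}} = \text{unknown}$
9: End if
10: End for
11: Return $y_{I_{GS}^{\upsilon}}$, $\upsilon \in [1, N_{TS}]$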
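The open-set adaptation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain two-parameter maximum-likelihood Weibull fit stands in for the FitHigh routine named in Algorithm 2 (its exact behaviour is not specified here), α is taken to be the shape and β the scale, and the function names, the toy error values, and the threshold γ = 0.5 are all hypothetical.

```python
import math

def fit_weibull_tail(errors, tail_size):
    """Fit (shape, scale) of a Weibull to the `tail_size` largest errors by MLE.

    Assumption: a simple stand-in for FitHigh, not the authors' routine.
    """
    tail = sorted(errors)[-tail_size:]
    top = tail[-1]
    if tail[0] <= 0 or tail[0] == top:
        raise ValueError("tail must hold positive, non-identical errors")
    # Work with log(x / top) <= 0 so that powers never overflow.
    logs = [math.log(x / top) for x in tail]
    mean_log = sum(logs) / len(logs)

    def score(shape):
        # MLE condition for the shape; it increases monotonically with shape.
        weights = [math.exp(shape * l) for l in logs]
        weighted = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / shape - mean_log

    low, high = 1e-3, 1.0
    while score(high) < 0:          # bracket the root
        high *= 2.0
    for _ in range(100):            # bisection
        mid = 0.5 * (low + high)
        if score(mid) < 0:
            low = mid
        else:
            high = mid
    shape = 0.5 * (low + high)
    mean_power = sum(math.exp(shape * l) for l in logs) / len(logs)
    scale = top * mean_power ** (1.0 / shape)
    return shape, scale

def weibull_cdf(x, shape, scale):
    """Cumulative probability 1 - exp(-(x / scale) ** shape) for x > 0."""
    if x <= 0:
        return 0.0
    exponent = shape * math.log(x / scale)
    if exponent > 700:              # (x / scale) ** shape would overflow
        return 1.0
    return 1.0 - math.exp(-math.exp(exponent))

def reject_unknown(predicted_label, error, shape, scale, gamma):
    """Steps 6 to 8: relabel a sample as unknown when the CDF exceeds gamma."""
    return "unknown" if weibull_cdf(error, shape, scale) > gamma else predicted_label

# Toy training reconstruction errors (hypothetical values).
train_errors = [0.10, 0.12, 0.11, 0.13, 0.15, 0.14, 0.20, 0.22, 0.25, 0.30]
shape, scale = fit_weibull_tail(train_errors, tail_size=5)
low_error_label = reject_unknown("wood", 0.05, shape, scale, gamma=0.5)
high_error_label = reject_unknown("wood", 0.60, shape, scale, gamma=0.5)
```

In this toy run the sample with a small reconstruction error keeps its predicted label, while the sample whose error lies far beyond the fitted tail is relabeled as unknown.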
Algorithm 1 Geometric-Spectral Reconstruction Module
Input:
A given geometric-spectral instance $I_{GS} \in \mathbb{R}^{U \times U \times (T+1)}$
Geometric-spectral encoder network $F_{\theta_{EN}}(\cdot)$
Geometric-spectral decoder network $D_{\theta_{DE}}(\cdot)$
1: Generate geometric-spectral representation $R_{GS}^{ins}$ of $I_{GS}$ by (2)
2: Generate reconstructed geometric-spectral matrix $\Theta_{GS}$ of $R_{GS}^{ins}$ by (4)
3: Compute the MAE between $I_{GS}$ and the reconstructed geometric-spectral matrix $\Theta_{GS}$ as the geometric-spectral reconstruction loss $L_{GS}$
4: Return $L_{GS}$
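The reconstruction loss can be illustrated with a short Python sketch. It uses the mean absolute error described for the reconstruction module and treats the instance as a flattened list of values; the encoder and decoder here are toy stand-ins (hypothetical), not the networks used in GSRL.

```python
from typing import Callable, Sequence

Vector = Sequence[float]

def reconstruction_mae(instance: Vector,
                       encoder: Callable[[Vector], Vector],
                       decoder: Callable[[Vector], Vector]) -> float:
    """Mean absolute error between an instance and its reconstruction."""
    representation = encoder(instance)        # step 1: latent representation
    reconstructed = decoder(representation)   # step 2: reconstructed matrix
    if len(reconstructed) != len(instance):
        raise ValueError("decoder output must match the input size")
    # step 3: average the absolute differences over all entries
    return sum(abs(a - b) for a, b in zip(instance, reconstructed)) / len(instance)

# Toy stand-ins (hypothetical): keep the first two entries, pad with zeros.
def toy_encoder(x: Vector) -> Vector:
    return list(x[:2])

def toy_decoder(r: Vector) -> Vector:
    return list(r) + [0.0, 0.0]

loss = reconstruction_mae([1.0, 2.0, 3.0, 5.0], toy_encoder, toy_decoder)
```

The toy decoder loses the last two entries, so the absolute differences are 0, 0, 3, and 5, and the loss is their mean.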
Experiments: To verify the effectiveness of the proposed method, comparative experiments are conducted in the OSE on two popular HSI datasets with corresponding LiDAR data, namely the Houston (HU) dataset and the Trento (TR) dataset. In this experiment, two different openness environments (i.e., low openness and high openness) are created. Among them, the low openness environ-
1) Datasets: HU dataset: The HSI and the corresponding LiDAR-derived digital surface model (DSM) have a spatial resolution of 2.5 m and an image size of 349×1905 pixels [18], [19]. For the HSI, after preprocessing (attitude processing and radiation correction), 144 spectral channels are left for experiments, and the wavelength range is 0.38 μm to 1.05 μm. This dataset contains 15 object classes. To conduct the open-set classification experiment, grandstand is additionally labeled as a new class and the remaining 15 classes are selected as known classes. Fig. 2 shows the color composite image, the ground-truth map and the legend of the HU dataset.
Fig. 2. The composite image, the ground-truth map and the legend of the HU dataset.
Fig. 3. The composite image, the ground-truth map and the legend of the TR dataset.
TR dataset: The HSI and the corresponding LiDAR-derived DSM have a spatial resolution of 1 m and an image size of 600×166 pixels. The HSI has 63 spectral channels with wavelengths ranging from 402.89 nm to 989.09 nm. This dataset contains 6 object classes (i.e., wood, buildings, roads, ground, apple trees, and vineyard). To conduct the open-set experiment, two classes (i.e., grass and soil) are additionally labeled as new classes and the remaining 6 classes are selected as known classes. Fig. 3 shows the color composite image, the ground-truth map and the legend of the TR dataset.
2) Parameter settings: At the beginning of model training, the learning rate is set to 0.5. After every 200 epochs, the learning rate is multiplied by 0.1. In this experiment, the number of tails σ is set to 40 [6], [16] and the number of principal components T is set to 2. The backbone used in GSRL is the same as that used in [16]. The computer environment for model training is as follows: the processor is an Intel i9-10850K; the graphics card is an NVIDIA GeForce RTX 3090 with CUDA 11.0; the programming language and the deep learning platform are Python (version 3.8.8) and PyTorch (version 1.7.1), respectively.
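The learning-rate schedule described above amounts to a step decay. A minimal Python sketch follows, assuming epochs are counted from 0 and the first decay happens at epoch 200; the function name and the indexing convention are assumptions, since the letter does not state them.

```python
def learning_rate(epoch: int, base_lr: float = 0.5,
                  step: int = 200, factor: float = 0.1) -> float:
    """Step decay: multiply base_lr by `factor` once per completed `step` epochs."""
    return base_lr * factor ** (epoch // step)

# Learning rate at a few epochs (0-based indexing assumed).
schedule = {epoch: learning_rate(epoch) for epoch in (0, 199, 200, 400)}
```

In PyTorch the same behaviour is conventionally obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.1)`; whether the authors used that scheduler is not stated.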
3) Accuracies: To verify the performance of the open-set classification method with multi-source data, we conducted several experiments on two datasets. We use three indicators to evaluate the classification methods: the overall accuracy (OA), the average accuracy (AA), and the Kappa coefficient (κ). To avoid extreme values, we repeated each experiment five times and report the average accuracies and standard deviations. To verify the effectiveness of the proposed method with limited training samples, L samples of each class are randomly selected. In this experiment, 20 samples from each known class are used as training samples (L = 20). We compare several state-of-the-art (SOTA) classification methods with the proposed method to verify the performance.
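For reference, the three indicators can be computed from a confusion matrix as in the Python sketch below. It uses the standard definitions of OA, AA, and Cohen's kappa; treating the unknown class as one extra row and column is an assumption made for illustration, and the matrix values are toy numbers.

```python
def classification_scores(confusion):
    """Return (OA, AA, kappa) for confusion[i][j] = count of true class i predicted as j.

    Assumes every class has at least one true sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    correct = sum(confusion[i][i] for i in range(n))

    oa = correct / total                                          # overall accuracy
    aa = sum(confusion[i][i] / row_sums[i] for i in range(n)) / n  # mean per-class accuracy
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - expected) / (1.0 - expected)                    # chance-corrected agreement
    return oa, aa, kappa

# Toy 3-class confusion matrix (hypothetical); the last row/column is "unknown".
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
oa, aa, kappa = classification_scores(toy_confusion)
```

For this toy matrix, 24 of 30 samples lie on the diagonal, the chance agreement is 1/3, and all three scores can be checked by hand.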
The open-set classification methods for comparison include classification-reconstruction learning for open-set recognition (CROSR) [6] and the multitask deep learning method for the open world (MDL4OW) [16]. The multi-source method for comparison is the spectral-spatial residual network (SSRN) [20]. For SSRN, if the maximum SoftMax value of a test sample is less than 0.5, the sample is assigned to the unknown class. These comparison methods cover both SOTA open-set methods and joint HSI and LiDAR data classification methods, so they can be used to comprehensively verify the performance of the proposed method.
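The SoftMax rejection rule applied to SSRN can be written as a short Python sketch. The 0.5 threshold comes from the text; the function names, the logits, and the use of three TR class names as labels are illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable SoftMax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_rejection(logits, class_names, threshold=0.5):
    """Return the top class, or "unknown" if its SoftMax value is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return "unknown" if probs[best] < threshold else class_names[best]

names = ["wood", "buildings", "roads"]
confident = predict_with_rejection([4.0, 0.5, 0.2], names)   # one dominant logit
ambiguous = predict_with_rejection([1.0, 0.9, 0.8], names)   # nearly uniform logits
```

The first sample has a dominant logit and keeps its class, while the second has a nearly uniform SoftMax output, so its maximum falls below 0.5 and it is rejected.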
The accuracies in terms of OA, AA and κ of the proposed method and the comparison methods are reported in Table 1. For the HU dataset, compared with the MDL4OW method, the OA, AA and κ of the proposed GSRL method increase by 6%, 4.3%, and 5.6%, respectively. The classification maps of the HU dataset are shown in Fig. 4. For the TR dataset, compared with the MDL4OW method, the open OA and AA increase by 7.3% and 10%, respectively. Compared with the CROSR method, the κ of GSRL increases by 11%. The classification maps of the TR dataset are shown in Fig. 5.
Table 1. Classification Results of the Proposed Method and Several SOTA Methods. The Highest Accuracies are Highlighted in Bold
To verify the effectiveness of each module in GSRL, an ablation study is conducted on the HU dataset. After removing the geometric-spectral reconstruction module, the OA decreases by 2.5%. After removing the open-set adaptation module, the OA decreases by 2.1%. It can be concluded that each module in the proposed method contributes to the improvement in accuracy.
Conclusions: Although deep learning based methods have achieved success in RSI classification, they still lack robustness when dealing with unknown classes in OSE. Existing open-set classification methods rely on training with known samples in the training set, so the learned features tend to retain information that helps classify the known classes and to ignore information that can be used to reject the unknown classes. To improve the adaptability of RSI classification methods in new environments and to make full use of the characteristics of hyperspectral and LiDAR data, an open-set classification method based on geometric-spectral feature reconstruction and pixel-by-pixel classification in OSE is proposed. By reconstructing the geometric-spectral feature, the method retains the features useful for classifying known classes and for rejecting unknown classes at the same time, and enhances the ability to separate unknown classes from known classes. Experiments on two multi-source datasets show that the proposed method performs better than existing open-set classification methods. In future research, we will continue to study how to achieve target detection, instance segmentation, significance detection and other tasks in the presence of unknown classes.
Fig. 4. The classification maps of the HU dataset (black represents unknown classes). (a) SSRN; (b) CROSR; (c) MDL4OW; (d) The proposed GSRL.
Fig. 5. The classification maps of the TR dataset (black represents unknown classes). (a) SSRN; (b) CROSR; (c) MDL4OW; (d) The proposed GSRL.
Acknowledgments: This work was supported in part by the National Natural Science Foundation of China (61922029, 62101072), the Hunan Provincial Natural Science Foundation of China (2021JJ30003, 2021JJ40570), the Science and Technology Plan Project Fund of Hunan Province (2019RS2016), the Key Research and Development Program of Hunan (2021SK2039), and the Scientific Research Foundation of Hunan Education Department (20B022, 20B157).