A Difference Enhanced Neural Network for Semantic Change Detection of Remote Sensing Images

Renfang Wang, Hucheng Wu, Hong Qiu, Xiufeng Liu, Feng Wang, Xu Cheng

Research output: Journal article (peer-reviewed)



Change detection (CD) aims to identify changed objects through the joint analysis of two (or more) remote sensing images (RSIs) acquired over the same area at different times [1], [2], [3]. It has been applied to a variety of real-world applications, including land and resource surveys, environmental monitoring, and urban management [4], [5].

Deep learning provides powerful tools for feature extraction and has become very popular in the CD community. Zhang and Lu [6] proposed a spectral–spatial joint learning network that uses a Siamese convolutional neural network (CNN) to extract a dual-temporal spectral–spatial joint representation. To address the lack of resilience to pseudochange information, Chen et al. [1] introduced a dual-attentive fully convolutional Siamese network (DASNet) that employs a weighted double-margin contrastive (WDMC) loss to mitigate the sample-imbalance issue. Chen et al. [7] and Dong et al. [8] improved the network's capacity to describe contextual information with Transformers operating over the bitemporal spatial and temporal domains.
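The key property of the Siamese designs cited above is weight sharing: both temporal branches apply the same learned filters, so that feature differences reflect scene changes rather than branch-specific biases. The following NumPy sketch illustrates that idea with a single hand-rolled convolution; the kernel, image sizes, and the absolute-difference comparison are illustrative choices, not the actual architectures of [1] or [6].

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation with a single-channel kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
t1 = rng.standard_normal((8, 8))      # image patch at time 1
t2 = rng.standard_normal((8, 8))      # image patch at time 2
kernel = rng.standard_normal((3, 3))  # weights SHARED by both branches

f1 = conv2d(t1, kernel)   # branch-1 features
f2 = conv2d(t2, kernel)   # branch-2 features, same weights
diff = np.abs(f1 - f2)    # dual-temporal comparison; zero where nothing changed
```

Because the weights are shared, identical inputs always produce a zero difference map, which is the property a change detector relies on.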

Although binary CD (BCD) can provide the location and geometry of changes, this information is usually coarse-grained and does not describe the types of changes. In contrast, semantic CD (SCD) approaches provide the location and geometry of changes as well as their types. Yang et al. [9], Ding et al. [10], and Mou et al. [11] introduced a triple-branch CD paradigm in which two semantic segmentation branches map the bitemporal images into land cover and land use (LCLU) maps, respectively, while the third branch identifies the changes. Yang et al. [9] developed an asymmetric Siamese network (ASN) that locates and identifies semantic changes by incorporating gating and weighting schemes into the decoder. Ding et al. [10] discussed possible network architectures for SCD and demonstrated that a late-fusion design that separates the semantic segmentation and CD tasks (SSCD1) is appropriate for SCD. Yang et al. [9], Ding et al. [10], and Daudt et al. [12] designed deep neural networks with three CNNs that extract the semantic information and the changes individually.
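The triple-branch paradigm can be pictured as two segmentation heads plus a change head whose binary mask gates the "from-to" semantic labels. The sketch below is a minimal illustration of that data flow only; the linear heads, the fixed threshold, and the `from_to` encoding are hypothetical stand-ins for the learned components in [9], [10], [11].

```python
import numpy as np

rng = np.random.default_rng(1)
H, W, C, n_classes = 4, 4, 8, 5

def seg_head(feat, w):
    """Per-pixel LCLU class scores via a 1x1-convolution-like linear map."""
    return feat @ w  # (H, W, C) @ (C, n_classes) -> (H, W, n_classes)

feat_t1 = rng.standard_normal((H, W, C))     # encoder features, time 1
feat_t2 = rng.standard_normal((H, W, C))     # encoder features, time 2
w_seg = rng.standard_normal((C, n_classes))  # weights shared by both seg branches

lc_t1 = seg_head(feat_t1, w_seg).argmax(-1)  # LCLU map at time 1
lc_t2 = seg_head(feat_t2, w_seg).argmax(-1)  # LCLU map at time 2

# Third branch: binary change mask from the feature-difference magnitude.
w_cd = np.abs(rng.standard_normal((C, 1)))
change = (np.abs(feat_t1 - feat_t2) @ w_cd).squeeze(-1) > 2.0

# Semantic change: a "from-to" label where change fires, -1 (no change) elsewhere.
from_to = np.where(change, lc_t1 * n_classes + lc_t2, -1)
```

Gating the semantic labels with the change mask is what distinguishes SCD output from two independent segmentation maps.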

The above-mentioned SCD approaches have shown outstanding performance; however, two main issues remain: 1) for large-scale variations in RSIs, existing models are not sensitive enough to the edges of changed objects, so false alarms and missed alarms often occur at the edges of changed targets, and 2) many of them fail to capture tiny discontinuous changes (e.g., vegetation degradation) in localized objects. In addition, the nonlocal block [13] can obtain global correlations via a self-attention mechanism, which helps capture long-range dependencies among inputs. Lei et al. [2] and Yuan et al. [14] improved feature extraction by enlarging the receptive field.
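The nonlocal block of [13] computes, for every position, a response that is a weighted sum over all positions, with weights given by pairwise feature similarity. A minimal embedded-Gaussian variant on a flattened feature map can be sketched as follows; the projection shapes and the single-head form are simplifications of the original block.

```python
import numpy as np

def nonlocal_block(x, w_theta, w_phi, w_g):
    """Embedded-Gaussian nonlocal operation over N flattened positions (N, C)."""
    theta, phi, g = x @ w_theta, x @ w_phi, x @ w_g  # queries, keys, values
    attn = theta @ phi.T                             # (N, N) pairwise similarity
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)          # softmax over ALL positions
    return x + attn @ g                              # residual connection

rng = np.random.default_rng(2)
C = 8
x = rng.standard_normal((16, C))  # a 4x4 feature map flattened to 16 positions
w = [rng.standard_normal((C, C)) for _ in range(3)]
y = nonlocal_block(x, *w)
```

Every output position attends to every input position, which is exactly the long-range dependency the letter invokes; a plain convolution would only see a local window.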

In the remainder of this letter, we first describe how to construct DESNet by embedding the DE module into the adjacent layers of ResNet so that the network focuses on the changes between bitemporal RSIs. We then introduce the spatial–spectral nonlocal (SSN) module, which combines multiscale spatial global features to model large-scale variations and thereby enhance the integrity of the changed objects. Finally, experiments on the SECOND dataset and the Landsat-SCD dataset demonstrate the superiority of DESNet in terms of SCD accuracy and preservation of the integrity of changed objects.
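The letter names a difference-enhancement (DE) step without giving its equations in this excerpt, so the sketch below is one plausible reading only: re-injecting the bitemporal feature difference into each branch between adjacent backbone layers, so that subsequent layers attend to changed regions. The function name and the additive form are assumptions, not DESNet's actual definition.

```python
import numpy as np

def difference_enhance(f1, f2):
    """Hypothetical DE step: re-inject the bitemporal difference into each branch."""
    d = np.abs(f1 - f2)       # difference features highlight changed regions
    return f1 + d, f2 + d     # enhanced features passed to the next backbone layer

rng = np.random.default_rng(3)
f1 = rng.standard_normal((4, 4, 8))  # layer-k features, time 1
f2 = rng.standard_normal((4, 4, 8))  # layer-k features, time 2
e1, e2 = difference_enhance(f1, f2)
```

Under this reading, unchanged regions (where the two feature maps agree) pass through untouched, while changed regions are amplified in both branches.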
Original language: English
Journal: IEEE Geoscience and Remote Sensing Letters
Publication status: Published - 2023

