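Remote Sensing Image Scene Classification Method Integrating Spatial Transformation Structure and Depth Residual Network

Meng Yifei¹, Zheng Guizhou¹, Ji Weizhen²

doi: 10.3799/dqkx.2021.218

1. School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
2. School of Architectural and Surveying and Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China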

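Funding:

National Natural Science Foundation of China, Key Program 42130309

Urban Geological Survey Project of the Datong Economic and Technological Development Zone, Shanxi Province 2022030115

About the author:
Meng Yifei (born 1998), female, master's degree; research interests: deep learning and remote sensing scene classification. ORCID: 0000-0002-5699-7837. E-mail: cugmyf@cug.edu.cn

Corresponding author:
Zheng Guizhou, ORCID: 0000-0002-2890-6395. E-mail: zhenggz@cug.edu.cn

CLC number: P237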
Publication history
- Received: 2021-07-07
- Published online: 2023-10-07
- Issue date: 2023-09-25


Abstract

Abstract: Traditional scene classification of high-resolution remote sensing imagery is inefficient, models trained on small remote sensing sample sets over-fit easily, and convolutional neural networks reach only limited accuracy on this task because of their limited spatial invariance. To address these problems, a high-resolution remote sensing image scene classification algorithm that combines a spatial transformer network with transfer learning is proposed. First, the deep residual network ResNet101 is trained on the ImageNet dataset to obtain a pre-trained model, and knowledge transfer is used to improve the training efficiency of the model. Then, a spatial transformation structure is embedded in the model so that it can actively transform feature maps in space, which improves robustness. Finally, a Dropout layer is added to reduce the probability of over-fitting. The method is validated on two high-resolution remote sensing image datasets of different scales, AID and NWPU-RESISC45, and reaches classification accuracies of 94.30% and 93.63%, respectively, with only 20% of the samples used for training. The experimental results show that the improved model has better feature extraction ability and gives better results on easily misclassified scenes.
- deep learning
- residual network
- spatial transformer network
- transfer learning
- scene classification
- remote sensing
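
The spatial transformation step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single-channel feature map stored as nested Python lists, takes the 2×3 affine matrix `theta` as given instead of predicting it with a localization network, and follows the normalized [-1, 1] sampling grid and bilinear interpolation described by Jaderberg et al. (2015). The function names `affine_grid`, `bilinear_sample` and `spatial_transform` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Grid = List[List[Tuple[float, float]]]


def affine_grid(theta: Sequence[Sequence[float]], height: int, width: int) -> Grid:
    """Map every output pixel to a source location with a 2x3 affine matrix.

    Output coordinates are normalized to [-1, 1] along both axes.
    """
    grid = []
    for i in range(height):
        # Normalized y of output row i (0.0 if the map has a single row).
        y_t = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x_t = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid


def bilinear_sample(feature: Matrix, grid: Grid) -> Matrix:
    """Sample the feature map at the grid locations with bilinear weights.

    Source locations outside the map contribute zero (zero padding).
    """
    h, w = len(feature), len(feature[0])
    out = []
    for row in grid:
        out_row = []
        for x_s, y_s in row:
            # Convert normalized source coordinates back to pixel indices.
            px = (x_s + 1.0) * (w - 1) / 2.0
            py = (y_s + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            value = 0.0
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    if 0 <= xx < w and 0 <= yy < h:
                        weight = (1.0 - abs(px - xx)) * (1.0 - abs(py - yy))
                        value += weight * feature[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


def spatial_transform(feature: Matrix, theta: Sequence[Sequence[float]]) -> Matrix:
    """Warp a feature map with an affine transform, keeping its size."""
    grid = affine_grid(theta, len(feature), len(feature[0]))
    return bilinear_sample(feature, grid)


if __name__ == "__main__":
    toy_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    mirror = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # horizontal flip
    for line in spatial_transform(toy_map, mirror):
        print(line)
```

Because the sampling weights are piecewise linear in the source coordinates, gradients can flow back to `theta`, which is what lets a network learn the transform. In a full model the matrix would come from a small regression sub-network rather than being fixed by hand as it is here.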

Fig. 1. Residual learning module

Fig. 2. The backbone of ResNet101

Fig. 3. Flow chart of remote sensing image scene classification

Fig. 4. Spatial transformation structure

Fig. 5. Schematic diagram of the Dropout principle

Fig. 6. Example scenes from the AID dataset

Fig. 7. Example scenes from the NWPU-RESISC45 dataset

Fig. 8. Training of ResNet101 and SF-ResNet101 on the AID dataset

Fig. 9. Training of ResNet101 and SF-ResNet101 on the NWPU-RESISC45 dataset

Fig. 10. Training with different Dropout rates on the AID dataset

Fig. 11. Training with different Dropout rates on the NWPU-RESISC45 dataset

Fig. 12. Performance comparison of the network models on easily misclassified scenes

Table 1. Test accuracy (%) of the SF-ResNet101 model under different training ratios

Training ratio	10%	20%	50%	80%
AID	91.65	94.30	96.52	96.81
NWPU-RESISC45	91.66	93.63	93.75	93.77
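
A training ratio such as 20% means that this fraction of the labeled images is used for training and the rest for testing. The sketch below shows one way to build such a split. It assumes the ratio is applied separately within each scene class (a stratified split), which this page does not state, and the function name `stratified_split`, the seed, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], train_ratio: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split sample indices into train/test sets, class by class."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class, key=str):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_ratio)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    # Toy data: 3 hypothetical classes with 10 images each.
    toy_labels = [cls for cls in ("airport", "beach", "forest") for _ in range(10)]
    for ratio in (0.1, 0.2, 0.5, 0.8):
        train_idx, test_idx = stratified_split(toy_labels, ratio)
        print(ratio, len(train_idx), len(test_idx))
```

Splitting per class keeps every scene category represented in the training set even at the 10% ratio, which matters when some classes have few images.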

Table 2. Comparison of test accuracy (%) for different Dropout rates

Dropout rate	0.1	0.2	0.4
AID	94.30	94.24	94.02
NWPU-RESISC45	93.47	93.63	93.60
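
The Dropout rate compared in Table 2 is the probability that a unit is zeroed during training. The following is a minimal sketch of inverted dropout, the common formulation in which surviving activations are rescaled by 1/(1 - rate) so that nothing changes at inference time. The page does not say which variant the authors used, and the function name `dropout` is hypothetical.

```python
import random
from typing import List, Optional


def dropout(
    values: List[float],
    rate: float,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Apply inverted dropout to a flat list of activations."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the activations pass through unchanged.
        return list(values)

    rng = rng or random.Random()
    keep_probability = 1.0 - rate
    # Each unit is dropped with probability `rate`; kept units are scaled up
    # so the expected activation matches the no-dropout case.
    return [
        value / keep_probability if rng.random() >= rate else 0.0
        for value in values
    ]


if __name__ == "__main__":
    activations = [1.0] * 10
    for rate in (0.1, 0.2, 0.4):  # the rates compared in Table 2
        print(rate, dropout(activations, rate, rng=random.Random(0)))
```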

Table 3. Overall accuracy (%) of different models on the AID dataset

Model	20% training ratio	50% training ratio
GoogLeNet (Xia et al., 2017)	83.44±0.40	89.36±0.55
VGG-VD16+MSCP+MRA (He et al., 2018)	92.21±0.17	96.56±0.18
CNN-CapsNet (Zhang et al., 2019)	93.79±0.13	96.32±0.12
D-CNNs (Cheng et al., 2018)	90.82±0.16	96.89±0.10
ResNet101	92.32±0.23	95.49±0.38
ResNet101+STN	93.58±0.22	95.89±0.27
Proposed method	94.30±0.29	96.52±0.10
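
Entries of the form 94.30±0.29 combine a mean overall accuracy with a spread. A short sketch of how such an entry can be produced is given here. It assumes the values come from repeated runs on different random splits and that the spread is the sample standard deviation; the page states neither the number of runs nor the convention, and the accuracies in the usage example are made-up illustrative values, not results from the paper.

```python
from statistics import mean, stdev
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Percentage of test samples whose predicted class is correct."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(truth == guess for truth, guess in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def summarize(accuracies: Sequence[float]) -> str:
    """Format repeated-run accuracies as 'mean±std' with two decimals."""
    return f"{mean(accuracies):.2f}±{stdev(accuracies):.2f}"


if __name__ == "__main__":
    # One toy run: 3 of 4 predictions match the ground truth.
    print(overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))
    # Hypothetical accuracies from three repeated runs.
    print(summarize([90.0, 92.0, 94.0]))
```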

Table 4. Overall accuracy (%) of different models on the NWPU-RESISC45 dataset

Model	10% training ratio	20% training ratio
Fine-tuned VGGNet-16 (Cheng et al., 2017)	87.15±0.45	90.36±0.18
VGG-VD16+MSCP+MRA (He et al., 2018)	88.07±0.18	90.81±0.13
CNN-CapsNet (Zhang et al., 2019)	89.03±0.21	92.60±0.11
D-CNNs (Cheng et al., 2018)	89.22±0.50	91.89±0.22
ResNet101	89.27±0.21	91.81±0.19
ResNet101+STN	90.72±0.23	92.47±0.28
Proposed method	91.66±0.15	93.63±0.22

References

Berman, M., Triki, A. R., Blaschko, M. B., 2018. The Lovasz-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City. https://doi.org/10.1109/CVPR.2018.00464
Cheng, G., Guo, L., Zhao, T. Y., et al., 2013. Automatic Landslide Detection from Remote-Sensing Imagery Using a Scene Classification Method Based on BoVW and pLSA. International Journal of Remote Sensing, 34(1): 45-59. https://doi.org/10.1080/01431161.2012.705443
Cheng, G., Han, J. W., Lu, X. Q., 2017. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE, 105(10): 1865-1883. https://doi.org/10.1109/JPROC.2017.2675998
Cheng, G., Yang, C. Y., Yao, X. W., et al., 2018. When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs. IEEE Transactions on Geoscience and Remote Sensing, 56(5): 2811-2821. https://doi.org/10.1109/TGRS.2017.2783902
Cheng, G. X., Niu, R. Q., Zhang, K. X., et al., 2018. Opencast Mining Area Recognition in High-Resolution Remote Sensing Images Using Convolutional Neural Networks. Earth Science, 43(S2): 256-262 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2018.987
Dalal, N., Triggs, B., 2005. Histograms of Oriented Gradients for Human Detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego. https://doi.org/10.1109/CVPR.2005.177
Donahue, J., Jia, Y. Q., Vinyals, O., et al., 2014. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Proceedings of the 31st International Conference on Machine Learning, Beijing. https://doi.org/10.5555/3044805.3044879
Han, X., Zhang, Z. Y., Ding, N., et al., 2021. Pre-Trained Models: Past, Present and Future. AI Open, 2: 225-250. https://doi.org/10.1016/j.aiopen.2021.08.002
He, K. M., Zhang, X. Y., Ren, S. Q., et al., 2016. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas. https://doi.org/10.1109/CVPR.2016.90
He, N. J., Fang, L. Y., Li, S. T., et al., 2018. Remote Sensing Scene Classification Using Multilayer Stacked Covariance Pooling. IEEE Transactions on Geoscience and Remote Sensing, 56(12): 6899-6910. https://doi.org/10.1109/TGRS.2018.2845668
He, X., Chen, Y. S., 2019. Optimized Input for CNN-Based Hyperspectral Image Classification Using Spatial Transformer Network. IEEE Geoscience and Remote Sensing Letters, 16(12): 1884-1888. https://doi.org/10.1109/LGRS.2019.2911322
Jaderberg, M., Simonyan, K., Zisserman, A., et al., 2015. Spatial Transformer Networks. arXiv, 1506.02025. https://arxiv.org/abs/1506.02025
Jia, Y. Q., Shelhamer, E., Donahue, J., et al., 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. Proceedings of the 22nd ACM international conference on Multimedia, Orlando. https://doi.org/10.1145/2647868.2654889
Krizhevsky, A., Sutskever, I., Hinton, G. E., 2012. ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe. https://doi.org/10.5555/2999134.2999257
Li, D. R., Wang, M., Shen, X., et al., 2017. From Earth Observation Satellite to Earth Observation Brain. Geomatics and Information Science of Wuhan University, 42(2): 143-149 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201702001.htm
Li, G. D., Zhang, C. J., Wang, M. K., et al., 2019. Transfer Learning Using Convolutional Neural Network for Scene Classification within High Resolution Remote Sensing Image. Science of Surveying and Mapping, 44(4): 116-123, 174 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-CHKD201904021.htm
Oliva, A., Torralba, A., 2001. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision, 42(3): 145-175. https://doi.org/10.1023/A:1011139631724
Oquab, M., Bottou, L., Laptev, I., et al., 2014. Learning and Transferring Mid-Level Image Representations Using Convolutional Neural Networks. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus. https://doi.org/10.1109/CVPR.2014.222
Pan, S. J., Yang, Q., 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22(10): 1345-1359. https://doi.org/10.1109/TKDE.2009.191
Perronnin, F., Sánchez, J., Mensink, T., 2010. Improving the Fisher Kernel for Large-Scale Image Classification. European Conference on Computer Vision, Berlin. https://doi.org/10.1007/978-3-642-15561-1_11
Simonyan, K., Zisserman, A., 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv, 1409.1556. https://arxiv.org/abs/1409.1556
Srinivas, A., Lin, T. Y., Parmar, N., et al., 2021. Bottleneck Transformers for Visual Recognition. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville. https://doi.org/10.1109/CVPR46437.2021.01625
Szegedy, C., Liu, W., Jia, Y. Q., et al., 2015. Going Deeper with Convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston. https://doi.org/10.1109/CVPR.2015.7298594
Văduva, C., Gavăt, I., Datcu, M., 2013. Latent Dirichlet Allocation for Spatial Analysis of Satellite Images. IEEE Transactions on Geoscience and Remote Sensing, 51(5): 2770-2786. https://doi.org/10.1109/TGRS.2012.2219314
van de Sande, K., Gevers, T., Snoek, C., 2010. Evaluating Color Descriptors for Object and Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9): 1582-1596. https://doi.org/10.1109/TPAMI.2009.154
Wallraven, C., Caputo, B., Graf, A., 2003. Recognition with Local Features: The Kernel Recipe. Proceedings Ninth IEEE International Conference on Computer Vision, Nice. https://doi.org/10.1109/ICCV.2003.1238351
Wang, R. C., 2018. Feature Learning and Patch Matching of Multispectral Images Based on Deep Neural Networks (Dissertation). Beijing University of Posts and Telecommunications, Beijing (in Chinese with English abstract).
Weiss, K., Khoshgoftaar, T. M., Wang, D. D., 2016. A Survey of Transfer Learning. Journal of Big Data, 3(1): 1-40. https://doi.org/10.1186/s40537-016-0043-6
Xia, G. S., Hu, J. W., Hu, F., et al., 2017. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification. IEEE Transactions on Geoscience and Remote Sensing, 55(7): 3965-3981. https://doi.org/10.1109/TGRS.2017.2685945
Xu, Y. Y., Li, Z. X., Xie, Z., et al., 2020. Prediction of Copper Mineralization Based on Semi-Supervised Neural Network. Earth Science, 45(12): 4563-4573 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2020.297
Yang, Y., Newsam, S., 2008. Comparing SIFT Descriptors and Gabor Texture Features for Classification of Remote Sensed Imagery. 2008 15th IEEE International Conference on Image Processing, San Diego. https://doi.org/10.1109/ICIP.2008.4712139
Yang, Y., Newsam, S., 2013. Geographic Image Retrieval Using Local Invariant Features. IEEE Transactions on Geoscience and Remote Sensing, 51(2): 818-832. https://doi.org/10.1109/TGRS.2012.2205158
Yu, D. H., Zhang, B. M., Zhao, C., et al., 2020. Scene Classification of Remote Sensing Image Using Ensemble Convolutional Neural Network. Journal of Remote Sensing, 24(6): 717-727 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-YGXB202006006.htm
Yu, S. C., Yu, D. Q., Wang, L. C., et al., 2019. Remote Sensing Study of Dongting Lake Beach Changes before and after Operation of Three Gorges Reservoir. Earth Science, 44(12): 4275-4283 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2019.182
Zhang, K., Hei, B. Q., Li, S. Y., et al., 2018. Complex Scene Classification of Remote Sensing Images Based on CNN. Remote Sensing for Land & Resources, 30(4): 49-55 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-GTYG201804008.htm
Zhang, W., Tang, P., Zhao, L. J., 2019. Remote Sensing Image Scene Classification Using CNN-CapsNet. Remote Sensing, 11(5): 494. https://doi.org/10.3390/rs11050494
Zuo, R. G., Peng, Y., Li, T., et al., 2021. Challenges of Geological Prospecting Big Data Mining and Integration Using Deep Learning Algorithms. Earth Science, 46(1): 350-358 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2020.111
