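    Remote Sensing Image Scene Classification Method Integrating Spatial Transformation Structure and Depth Residual Network

    Meng Yifei, Zheng Guizhou, Ji Weizhen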

    Citation: Meng Yifei, Zheng Guizhou, Ji Weizhen, 2023. Remote Sensing Image Scene Classification Method Integrating Spatial Transformation Structure and Depth Residual Network. Earth Science, 48(9): 3526-3538. doi: 10.3799/dqkx.2021.218

    doi: 10.3799/dqkx.2021.218
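    Funding:

    Key Program of the National Natural Science Foundation of China, No. 42130309

    Urban Geological Survey Project of the Datong Economic and Technological Development Zone, Shanxi Province, No. 2022030115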

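    Details
      About the author:

      Meng Yifei (born 1998), female, master's degree; research interests: deep learning and remote sensing scene classification. ORCID: 0000-0002-5699-7837. E-mail: cugmyf@cug.edu.cn

      Corresponding author:

      Zheng Guizhou, ORCID: 0000-0002-2890-6395. E-mail: zhenggz@cug.edu.cn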

    • CLC number: P237

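    • Abstract: Traditional scene classification of high-resolution remote sensing images is inefficient, and convolutional neural networks lose classification accuracy on such images because of their limited spatial invariance. To address both problems, this paper proposes a scene classification algorithm for high-resolution remote sensing images that combines a spatial transformer network with transfer learning. First, the deep residual network ResNet101 is trained on the ImageNet dataset to obtain a pre-trained model, and knowledge transfer is used to improve the model's target detection efficiency. Next, a spatial transformation structure is embedded in the model so that it can actively transform feature maps in space, which improves robustness. Finally, a Dropout layer is added to reduce the probability of overfitting. The method was validated on two high-resolution remote sensing image datasets of different scales, AID and NWPU-RESISC45, and reached classification accuracies of 94.30% and 93.63%, respectively, with only 20% of the samples used for training. The experimental results indicate that the improved model extracts features more effectively and classifies easily misclassified scenes more accurately.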
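The abstract's second step embeds a spatial transformation structure so that the model can transform feature maps in space. The core sampling operation of such a structure can be sketched as below. This is a minimal pure-Python sketch, not the authors' implementation: it assumes an affine transform given by a hand-supplied 2x3 matrix `theta`, normalized coordinates in [-1, 1], bilinear interpolation, and zero padding outside the map. In a spatial transformer network a small localization sub-network would predict `theta`; that part is omitted, and all function names are illustrative.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Grid = List[List[Tuple[float, float]]]


def affine_grid(theta: Matrix, height: int, width: int) -> Grid:
    """For every output pixel, compute the source location (x, y) in [-1, 1]."""
    grid = []
    for i in range(height):
        # Normalized y coordinate of output row i (corner pixels sit at -1 and 1).
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            # Affine map: [xs, ys]^T = theta @ [x, y, 1]^T
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(feature: Matrix, grid: Grid) -> Matrix:
    """Sample the feature map at the grid locations with bilinear interpolation."""
    height, width = len(feature), len(feature[0])

    def pixel(r: int, c: int) -> float:
        # Zero padding: locations outside the feature map contribute nothing.
        if 0 <= r < height and 0 <= c < width:
            return feature[r][c]
        return 0.0

    out = []
    for grid_row in grid:
        out_row = []
        for xs, ys in grid_row:
            # Convert normalized coordinates back to pixel coordinates.
            col = (xs + 1.0) * (width - 1) / 2.0
            row = (ys + 1.0) * (height - 1) / 2.0
            c0, r0 = math.floor(col), math.floor(row)
            dc, dr = col - c0, row - r0
            value = (pixel(r0, c0) * (1 - dr) * (1 - dc)
                     + pixel(r0, c0 + 1) * (1 - dr) * dc
                     + pixel(r0 + 1, c0) * dr * (1 - dc)
                     + pixel(r0 + 1, c0 + 1) * dr * dc)
            out_row.append(value)
        out.append(out_row)
    return out


def spatial_transform(feature: Matrix, theta: Matrix) -> Matrix:
    """Warp a single-channel feature map with the affine matrix theta."""
    grid = affine_grid(theta, len(feature), len(feature[0]))
    return bilinear_sample(feature, grid)
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` the feature map is returned unchanged, and `[[-1, 0, 0], [0, 1, 0]]` mirrors it left to right. Because every step is differentiable with respect to `theta`, a network can learn the transform by backpropagation, which is what allows the structure to be embedded in a classifier.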

       

    • Fig. 1. Residual learning module

      Fig. 2. Backbone of ResNet101

      Fig. 3. Flow chart of remote sensing image scene classification

      Fig. 4. Spatial transformation structure

      Fig. 5. Schematic diagram of the Dropout principle

      Fig. 6. Example scenes from the AID dataset

      Fig. 7. Example scenes from the NWPU-RESISC45 dataset

      Fig. 8. Training of ResNet101 and SF-ResNet101 on the AID dataset

      Fig. 9. Training of ResNet101 and SF-ResNet101 on the NWPU-RESISC45 dataset

      Fig. 10. Training with different Dropout rates on the AID dataset

      Fig. 11. Training with different Dropout rates on the NWPU-RESISC45 dataset

      Fig. 12. Performance comparison of the network models on easily misclassified scenes

      Table 1. Test-set accuracy of the SF-ResNet101 model under different training ratios

      Dataset          Test accuracy (%) by training ratio
                       10%      20%      50%      80%
      AID              91.65    94.30    96.52    96.81
      NWPU-RESISC45    91.66    93.63    93.75    93.77
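Table 1 reports accuracy when 10%, 20%, 50%, or 80% of each dataset is used for training. One way such a split could be produced is sketched below. The per-class (stratified) sampling, the fixed seed, and the function name `split_by_ratio` are assumptions for illustration; the page does not state how the authors drew their splits.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def split_by_ratio(labels: Sequence[Hashable], train_ratio: float,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices), drawing train_ratio of every class."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")

    # Group sample indices by scene class.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # assumption: fixed seed for a reproducible split
    train, test = [], []
    for label in sorted(by_class, key=str):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_ratio)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)
```

Sampling within each class keeps the class proportions of the training set equal to those of the full dataset, which matters when only 10% or 20% of the images are used for training.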

      Table 2. Comparison of test-set accuracy for different Dropout rates

      Dataset          Test accuracy (%) by Dropout rate
                       0.1      0.2      0.4
      AID              94.30    94.24    94.02
      NWPU-RESISC45    93.47    93.63    93.60
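Table 2 varies the Dropout rate between 0.1 and 0.4. What that rate controls can be sketched as below: a minimal sketch of the commonly used "inverted" Dropout formulation, in which each activation is zeroed with probability `rate` during training and the survivors are rescaled by `1 / (1 - rate)`. Whether the authors' framework uses exactly this variant is an assumption; the function name and signature are illustrative.

```python
import random
from typing import List, Sequence


def dropout(activations: Sequence[float], rate: float,
            rng: random.Random, training: bool = True) -> List[float]:
    """Apply inverted dropout to a flat list of activations."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time dropout is the identity.
        return list(activations)
    keep = 1.0 - rate
    # Zero each unit with probability `rate`; rescale the rest so the
    # expected value of every activation is unchanged.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]
```

For example, `dropout([1.0] * 8, 0.2, random.Random(0))` returns a list whose entries are either `0.0` or `1.25`. The rescaling is why no correction is needed when the layer is switched off at test time.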

      Table 3. Classification accuracy of different models on the AID dataset

      Model                                  Overall accuracy (%)
                                             20% training ratio    50% training ratio
      GoogLeNet (Xia et al., 2017)           83.44±0.40            89.36±0.55
      VGG-VD16+MSCP+MRA (He et al., 2018)    92.21±0.17            96.56±0.18
      CNN-CapsNet (Zhang et al., 2019)       93.79±0.13            96.32±0.12
      D-CNNs (Cheng et al., 2018)            90.82±0.16            96.89±0.10
      ResNet101                              92.32±0.23            95.49±0.38
      ResNet101+STN                          93.58±0.22            95.89±0.27
      Proposed method                        94.30±0.29            96.52±0.10
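The entries in Table 3 have the form mean±deviation of the overall accuracy. How such an entry could be computed from repeated runs is sketched below. The number of runs, the use of the sample standard deviation, and the function names are assumptions, since the page does not describe the protocol; any numbers passed to these functions in examples are hypothetical, not results from the paper.

```python
import statistics
from typing import Hashable, Sequence


def overall_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Overall accuracy in percent: correctly classified samples / all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def summarize_runs(accuracies: Sequence[float]) -> str:
    """Format repeated-run accuracies as 'mean±std' with two decimals."""
    mean = statistics.mean(accuracies)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return f"{mean:.2f}±{std:.2f}"
```

Reporting the deviation alongside the mean shows how much of the gap between two models could be explained by the random choice of training samples.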

      Table 4. Classification accuracy of different models on the NWPU-RESISC45 dataset

      Model                                        Overall accuracy (%)
                                                   10% training ratio    20% training ratio
      Fine-tuned VGGNet-16 (Cheng et al., 2017)    87.15±0.45            90.36±0.18
      VGG-VD16+MSCP+MRA (He et al., 2018)          88.07±0.18            90.81±0.13
      CNN-CapsNet (Zhang et al., 2019)             89.03±0.21            92.60±0.11
      D-CNNs (Cheng et al., 2018)                  89.22±0.50            91.89±0.22
      ResNet101                                    89.27±0.21            91.81±0.19
      ResNet101+STN                                90.72±0.23            92.47±0.28
      Proposed method                              91.66±0.15            93.63±0.22
    • Berman, M., Triki, A. R., Blaschko, M. B., 2018. The Lovasz-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City. https://doi.org/10.1109/CVPR.2018.00464
      Cheng, G., Guo, L., Zhao, T. Y., et al., 2013. Automatic Landslide Detection from Remote-Sensing Imagery Using a Scene Classification Method Based on BoVW and pLSA. International Journal of Remote Sensing, 34(1): 45-59. https://doi.org/10.1080/01431161.2012.705443
      Cheng, G., Han, J. W., Lu, X. Q., 2017. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE, 105(10): 1865-1883. https://doi.org/10.1109/JPROC.2017.2675998
      Cheng, G., Yang, C. Y., Yao, X. W., et al., 2018. When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs. IEEE Transactions on Geoscience and Remote Sensing, 56(5): 2811-2821. https://doi.org/10.1109/TGRS.2017.2783902
      Cheng, G. X., Niu, R. Q., Zhang, K. X., et al., 2018. Opencast Mining Area Recognition in High-Resolution Remote Sensing Images Using Convolutional Neural Networks. Earth Science, 43(S2): 256-262 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2018.987
      Dalal, N., Triggs, B., 2005. Histograms of Oriented Gradients for Human Detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego. https://doi.org/10.1109/CVPR.2005.177
      Donahue, J., Jia, Y. Q., Vinyals, O., et al., 2014. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Proceedings of the 31st International Conference on Machine Learning, Beijing. https://doi.org/10.5555/3044805.3044879
      Han, X., Zhang, Z. Y., Ding, N., et al., 2021. Pre-Trained Models: Past, Present and Future. AI Open, 2: 225-250. https://doi.org/10.1016/j.aiopen.2021.08.002
      He, K. M., Zhang, X. Y., Ren, S. Q., et al., 2016. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas. https://doi.org/10.1109/CVPR.2016.90
      He, N. J., Fang, L. Y., Li, S. T., et al., 2018. Remote Sensing Scene Classification Using Multilayer Stacked Covariance Pooling. IEEE Transactions on Geoscience and Remote Sensing, 56(12): 6899-6910. https://doi.org/10.1109/TGRS.2018.2845668
      He, X., Chen, Y. S., 2019. Optimized Input for CNN-Based Hyperspectral Image Classification Using Spatial Transformer Network. IEEE Geoscience and Remote Sensing Letters, 16(12): 1884-1888. https://doi.org/10.1109/LGRS.2019.2911322
      Jaderberg, M., Simonyan, K., Zisserman, A., et al., 2015. Spatial Transformer Networks. arXiv, 1506.02025. https://arxiv.org/abs/1506.02025
      Jia, Y. Q., Shelhamer, E., Donahue, J., et al., 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. Proceedings of the 22nd ACM International Conference on Multimedia, Orlando. https://doi.org/10.1145/2647868.2654889
      Krizhevsky, A., Sutskever, I., Hinton, G. E., 2012. ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe. https://doi.org/10.5555/2999134.2999257
      Li, D. R., Wang, M., Shen, X., et al., 2017. From Earth Observation Satellite to Earth Observation Brain. Geomatics and Information Science of Wuhan University, 42(2): 143-149 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201702001.htm
      Li, G. D., Zhang, C. J., Wang, M. K., et al., 2019. Transfer Learning Using Convolutional Neural Network for Scene Classification within High Resolution Remote Sensing Image. Science of Surveying and Mapping, 44(4): 116-123, 174 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-CHKD201904021.htm
      Oliva, A., Torralba, A., 2001. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision, 42(3): 145-175. https://doi.org/10.1023/A:1011139631724
      Oquab, M., Bottou, L., Laptev, I., et al., 2014. Learning and Transferring Mid-Level Image Representations Using Convolutional Neural Networks. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus. https://doi.org/10.1109/CVPR.2014.222
      Pan, S. J., Yang, Q., 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22(10): 1345-1359. https://doi.org/10.1109/TKDE.2009.191
      Perronnin, F., Sánchez, J., Mensink, T., 2010. Improving the Fisher Kernel for Large-Scale Image Classification. European Conference on Computer Vision, Berlin. https://doi.org/10.1007/978-3-642-15561-1_11
      Simonyan, K., Zisserman, A., 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv, 1409.1556. https://arxiv.org/abs/1409.1556
      Srinivas, A., Lin, T. Y., Parmar, N., et al., 2021. Bottleneck Transformers for Visual Recognition. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville. https://doi.org/10.1109/CVPR46437.2021.01625
      Szegedy, C., Liu, W., Jia, Y. Q., et al., 2015. Going Deeper with Convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston. https://doi.org/10.1109/CVPR.2015.7298594
      Văduva, C., Gavăt, I., Datcu, M., 2013. Latent Dirichlet Allocation for Spatial Analysis of Satellite Images. IEEE Transactions on Geoscience and Remote Sensing, 51(5): 2770-2786. https://doi.org/10.1109/TGRS.2012.2219314
      van de Sande, K., Gevers, T., Snoek, C., 2010. Evaluating Color Descriptors for Object and Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9): 1582-1596. https://doi.org/10.1109/TPAMI.2009.154
      Wallraven, C., Caputo, B., Graf, A., 2003. Recognition with Local Features: The Kernel Recipe. Proceedings Ninth IEEE International Conference on Computer Vision, Nice. https://doi.org/10.1109/ICCV.2003.1238351
      Wang, R. C., 2018. Feature Learning and Patch Matching of Multispectral Images Based on Deep Neural Networks (Master's Thesis). Beijing University of Posts and Telecommunications, Beijing (in Chinese with English abstract).
      Weiss, K., Khoshgoftaar, T. M., Wang, D. D., 2016. A Survey of Transfer Learning. Journal of Big Data, 3(1): 1-40. https://doi.org/10.1186/s40537-016-0043-6
      Xia, G. S., Hu, J. W., Hu, F., et al., 2017. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification. IEEE Transactions on Geoscience and Remote Sensing, 55(7): 3965-3981. https://doi.org/10.1109/TGRS.2017.2685945
      Xu, Y. Y., Li, Z. X., Xie, Z., et al., 2020. Prediction of Copper Mineralization Based on Semi-Supervised Neural Network. Earth Science, 45(12): 4563-4573 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2020.297
      Yang, Y., Newsam, S., 2008. Comparing SIFT Descriptors and Gabor Texture Features for Classification of Remote Sensed Imagery. 2008 15th IEEE International Conference on Image Processing, San Diego. https://doi.org/10.1109/ICIP.2008.4712139
      Yang, Y., Newsam, S., 2013. Geographic Image Retrieval Using Local Invariant Features. IEEE Transactions on Geoscience and Remote Sensing, 51(2): 818-832. https://doi.org/10.1109/TGRS.2012.2205158
      Yu, D. H., Zhang, B. M., Zhao, C., et al., 2020. Scene Classification of Remote Sensing Image Using Ensemble Convolutional Neural Network. Journal of Remote Sensing, 24(6): 717-727 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-YGXB202006006.htm
      Yu, S. C., Yu, D. Q., Wang, L. C., et al., 2019. Remote Sensing Study of Dongting Lake Beach Changes before and after Operation of Three Gorges Reservoir. Earth Science, 44(12): 4275-4283 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2019.182
      Zhang, K., Hei, B. Q., Li, S. Y., et al., 2018. Complex Scene Classification of Remote Sensing Images Based on CNN. Remote Sensing for Land & Resources, 30(4): 49-55 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-GTYG201804008.htm
      Zhang, W., Tang, P., Zhao, L. J., 2019. Remote Sensing Image Scene Classification Using CNN-CapsNet. Remote Sensing, 11(5): 494. https://doi.org/10.3390/rs11050494
      Zuo, R. G., Peng, Y., Li, T., et al., 2021. Challenges of Geological Prospecting Big Data Mining and Integration Using Deep Learning Algorithms. Earth Science, 46(1): 350-358 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2020.111
    Publication history
    • Received: 2021-07-07
    • Published online: 2023-10-07
    • Issue date: 2023-09-25
