    Knowledge Graph and Question-Answering Model for Geological Prospecting Empowered by Large Language Models

    Zhang Baoyi, Tang Jiacheng, Zhang Tongyun, Wang Binhai, Shi Yuzheng, Zhan Qingzhong, Fang Zhenxi, Kablan Or Aimon Brou Koffi, Ma Kai

    Citation: Zhang Baoyi, Tang Jiacheng, Zhang Tongyun, Wang Binhai, Shi Yuzheng, Zhan Qingzhong, Fang Zhenxi, Kablan Or Aimon Brou Koffi, Ma Kai, 2026. Knowledge Graph and Question-Answering Model for Geological Prospecting Empowered by Large Language Models. Earth Science, 51(3): 982-995. doi: 10.3799/dqkx.2025.176

    doi: 10.3799/dqkx.2025.176
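    Funding:

    National Science and Technology Major Project 2024ZD1001201

    Major Scientific Research Project of the Hunan Institute of Geology (湖南省地质院) HNGSTP202301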

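    Details
      About the first author:

      Zhang Baoyi (b. 1979), male, associate professor, Ph.D., doctoral supervisor. His teaching and research focus on geoscience big-data mining and machine learning, 3D geological modeling, and geographic information application engineering. ORCID: 0000-0001-6075-9359. E-mail: zhangbaoyi@csu.edu.cn

      Corresponding author:

      Ma Kai, ORCID: 0000-0001-5432-1166. E-mail: makai@ctgu.edu.cn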

    • Chinese Library Classification (CLC) number: P628+.3


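    • Abstract:

      Applications of large language models (LLMs) in geological prospecting currently face insufficient domain knowledge, data privacy and security concerns, and model hallucination, and they still lack an efficient and fast means of knowledge recommendation. This study proposes KG-RAG (Knowledge Graph Retrieval-Augmented Generation), a framework that combines a knowledge graph with retrieval-augmented generation. Using an LLM as the tool and a geological ontology as the constraint, the framework automatically extracts a geological prospecting knowledge graph and expresses it in structured form. A multi-hop retrieval algorithm over the knowledge graph then improves the depth and breadth of the retrieved content, and on this basis an intelligent question-answering model for geological prospecting knowledge is built. Experimental results show that, in the knowledge graph construction task, KG-RAG reaches a precision, recall, and F1-score of 0.807, 0.833, and 0.819, improvements of about 50%, 8%, and 29% over direct knowledge extraction with the base LLM GLM4-9B. In the question-answering task, KG-RAG reaches a recall of 0.917 and an accuracy of 0.88, improvements of about 24% and 22% over retrieval-augmented generation based on document vector retrieval. KG-RAG performs well in both knowledge graph construction and intelligent question answering: it can effectively collect and express prospecting knowledge from geological materials and supports geologists in geological survey and prospecting prediction. The study offers a reference for the joint application of LLMs and knowledge graphs.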
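The multi-hop retrieval idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain breadth-first expansion over (head, relation, tail) triples, assuming the graph fits in memory. The function names `build_index` and `multi_hop` are hypothetical, and the "hosts" triple is an invented link added only so that a second hop exists; the other triples follow the examples in Table 2.

```python
from collections import defaultdict, deque

# Toy triples (head, relation, tail). The "hosts" link is hypothetical and
# exists only to connect the examples; the rest mirror Table 2.
TRIPLES = [
    ("A mining area", "mineral deposit type", "Intrusion-related tungsten deposit"),
    ("A mining area", "hosts", "Granite pluton group"),  # hypothetical link
    ("Granite pluton group", "lithology", "Fine-grained muscovite granite"),
    ("Granite pluton group", "form", "Stock"),
]


def build_index(triples):
    """Map every entity to the triples it takes part in (as head or tail)."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        index[tail].append((head, relation, tail))
    return index


def multi_hop(index, seed, max_hops=2):
    """Breadth-first expansion from a seed entity, up to max_hops edges away.

    Returns a list of (hop, triple) pairs in the order they were reached.
    """
    seen_entities = {seed}
    seen_triples = set()
    found = []
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for triple in index.get(entity, []):
            if triple not in seen_triples:
                seen_triples.add(triple)
                found.append((depth + 1, triple))
            head, _, tail = triple
            neighbor = tail if head == entity else head
            if neighbor not in seen_entities:
                seen_entities.add(neighbor)
                queue.append((neighbor, depth + 1))
    return found


index = build_index(TRIPLES)
for hop, (head, relation, tail) in multi_hop(index, "A mining area", max_hops=2):
    print(f"hop {hop}: {head} -[{relation}]-> {tail}")
```

Raising `max_hops` widens the retrieved context at the cost of more, and possibly less relevant, triples; a practical system would likely rank or prune the results before placing them in the prompt.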

       

    • Fig. 1.  KG-RAG: Knowledge graph-embedded retrieval-augmented generation framework for geological prospecting

      Fig. 2.  Ontology schema of the geological prospecting knowledge graph

      Fig. 3.  Pseudocode of triple extraction through chain-of-thought

      Fig. 4.  Prompt engineering framework for triple (head entity, relationship, tail entity) extraction constrained by the geological prospecting ontology schema

      Fig. 5.  Knowledge graph-embedded retrieval-augmented generation

      Fig. 6.  Prompt engineering for the geological prospecting question-answering model

      Fig. 7.  Question-answering examples: an LLM using KG-RAG (a) and the LLM answering directly (b)

      Fig. 8.  Question-answering results for geological prospecting knowledge in the study area: prospecting indicators (a) and ore-controlling factors (b)

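      Table 1.  Geological prospecting documents of the study area

      | No. | Document type | Character count |
      |---|---|---|
      | 1 | Design document for the tungsten general survey of the mining area | 25 096 |
      | 2 | Summary report of the tungsten general survey of the mining area | 46 826 |
      | 3 | Appendix tables of the tungsten general survey of the mining area | 33 239 |
      | 4 | Geophysical exploration report of the mining area | 16 323 |
      | 5 | Field work summary of the mining area | 20 434 |
      | 6 | Engineering survey report of the mining area | 1 507 |
      | 7 | Study of tungsten metallogenic regularity and target prediction in the periphery of the ore field | 30 260 |
      | 8 | Quality summary report of the mining area | 6 605 |
      | 9 | Original geological record sheet of measured geological sections in the mining area | 72 |
      | 10 | Data calculation sheet of measured sections in the mining area | 87 |
      | 11 | Summary of measured sections in the mining area | 2 683 |
      | 12 | External check samples of the mining area | 63 |
      | 13 | Summary of geological mapping in the mining area | 4 320 |
      | 14 | Internal check samples of the mining area | 71 |
      | 15 | Drill-hole sampling register of the mining area | 338 |
      | 16 | Chemical sample analysis results of the mining area | 275 |
      | 17 | Explanatory note on the mining area | 81 |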

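      Table 2.  Some triple instances in the geological prospecting knowledge graph

      | Entity A | Relation or attribute | Entity B |
      |---|---|---|
      | Devonian Tiaomajian Formation | Lithology | Very thick-bedded quartz conglomerate |
      | Devonian Tiaomajian Formation | Lithology | Sandy conglomerate |
      | Devonian Tiaomajian Formation | Lithology | Sandstone |
      | Devonian Tiaomajian Formation | Lithology | Argillaceous and calcareous sandstone |
      | Devonian Tiaomajian Formation | Upper member thickness | About 35 m |
      | Devonian Tiaomajian Formation | Middle member thickness | About 37 m |
      | Devonian Tiaomajian Formation | Lower member thickness | About 46 m |
      | Study area | Climate type | Subtropical monsoon climate |
      | Granite pluton group | Lithology | Medium- to fine-grained two-mica granite |
      | Granite pluton group | Lithology | Fine-grained muscovite granite |
      | Granite pluton group | Lithology | Porphyritic biotite granite |
      | Granite pluton group | Form | Stock |
      | An anticline | Core stratum | Qiziqiao Formation |
      | A fault | Length | 17.35 km |
      | A mining area | Mineral deposit type | Intrusion-related tungsten deposit |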
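Extraction "under ontology constraints" can be pictured as a type check applied to each candidate triple. The sketch below is an assumption about how such a check might look, not the schema in Fig. 2: the `SCHEMA` and `ENTITY_TYPES` dictionaries, the type names, and the two invalid candidates are all hypothetical, while the valid candidates follow Table 2.

```python
# Hypothetical ontology fragment:
# relation -> (allowed head entity types, allowed tail entity types).
SCHEMA = {
    "lithology": ({"Stratum", "Intrusion"}, {"Rock"}),
    "core stratum": ({"Fold"}, {"Stratum"}),
    "form": ({"Intrusion"}, {"IntrusionForm"}),
}

# Hypothetical entity typing; a real pipeline would obtain this from the
# extraction step rather than from a hand-written table.
ENTITY_TYPES = {
    "Devonian Tiaomajian Formation": "Stratum",
    "Qiziqiao Formation": "Stratum",
    "Granite pluton group": "Intrusion",
    "An anticline": "Fold",
    "Sandstone": "Rock",
    "Stock": "IntrusionForm",
}


def conforms(triple, schema=SCHEMA, entity_types=ENTITY_TYPES):
    """Return True if the relation is known and both entity types are allowed."""
    head, relation, tail = triple
    if relation not in schema:
        return False
    head_types, tail_types = schema[relation]
    return (entity_types.get(head) in head_types
            and entity_types.get(tail) in tail_types)


def filter_triples(candidates):
    """Split candidate triples into schema-conforming and rejected lists."""
    kept = [t for t in candidates if conforms(t)]
    rejected = [t for t in candidates if not conforms(t)]
    return kept, rejected


candidates = [
    ("Devonian Tiaomajian Formation", "lithology", "Sandstone"),
    ("Granite pluton group", "form", "Stock"),
    ("An anticline", "core stratum", "Qiziqiao Formation"),
    ("An anticline", "lithology", "Stock"),             # invalid on purpose: wrong types
    ("Granite pluton group", "weather", "Sandstone"),   # invalid on purpose: unknown relation
]
kept, rejected = filter_triples(candidates)
print(f"kept {len(kept)} triples, rejected {len(rejected)}")
```

Rejecting triples whose relation or entity types fall outside the schema is one simple way to suppress hallucinated output from the extraction prompt; rejected triples could also be sent back to the model for correction instead of being dropped.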

      Table 3.  Evaluation metrics of the constructed knowledge graph

      | $\mathrm{ave}D$ | MRR | Hits@1 | Hits@3 | Hits@10 |
      |---|---|---|---|---|
      | 3.05 | 0.631 | 0.476 | 0.741 | 0.933 |
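MRR and Hits@k in Table 3 are standard ranking metrics. As a reminder of how they are usually computed, here is a minimal sketch assuming each test query yields the 1-based rank of the correct entity; the `ranks` list is made-up illustration data, not the paper's results.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Made-up ranks of the correct entity for five hypothetical queries.
ranks = [1, 2, 4, 1, 12]

print(f"MRR     = {mean_reciprocal_rank(ranks):.3f}")
for k in (1, 3, 10):
    print(f"Hits@{k:<2} = {hits_at_k(ranks, k):.3f}")
```

Hits@k can only grow as k increases, which matches the ordering Hits@1 < Hits@3 < Hits@10 in the table.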

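      Table 4.  Manually annotated corpus for KG comparison

      | No. | Corpus type | Count | Description |
      |---|---|---|---|
      | 1 | Geography | 10 | Geographic location, coordinate ranges, etc. of regions and mining areas |
      | 2 | Strata | 25 | Rock characteristics, thickness, composition, etc. of stratigraphic units |
      | 3 | Intrusions | 5 | Type, composition, texture, etc. of intrusive bodies |
      | 4 | Structures | 20 | Folds, faults, anticlines, and other geological structures |
      | 5 | Geological age | 8 | Geological eras, stages, and other time-scale information |
      | 6 | Ore-forming factors | 20 | Metallogenic systems, ore-controlling factors, mineralization characteristics |
      | 7 | Geophysics | 6 | Geophysical anomalies, physical field characteristics |
      | 8 | Geochemistry | 6 | Geochemical anomalies, element distribution |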

      Table 5.  Evaluation of knowledge graphs for geological prospecting

      | Model | Precision | Recall | F1-score |
      |---|---|---|---|
      | KG-RAG | 0.807 | 0.833 | 0.819 |
      | ChatGLM-9B | 0.537 | 0.768 | 0.632 |
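For readers who want to reproduce this kind of evaluation, the sketch below shows the usual set-based computation of precision, recall, and F1 for extracted triples. It assumes exact-match comparison against a gold annotation, which may differ from the matching rule used in the paper; the `gold` and `predicted` sets are illustrative only.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall, and F1 between two sets of triples."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = harmonic_f1(precision, recall)
    return precision, recall, f1


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Illustrative data: two of the three predicted triples match the gold set.
gold = {
    ("Granite pluton group", "form", "Stock"),
    ("An anticline", "core stratum", "Qiziqiao Formation"),
    ("A fault", "length", "17.35 km"),
}
predicted = {
    ("Granite pluton group", "form", "Stock"),
    ("A fault", "length", "17.35 km"),
    ("A fault", "form", "Stock"),  # spurious triple
}
print(precision_recall_f1(predicted, gold))

# Consistency check on the values reported in Table 5.
print(round(harmonic_f1(0.807, 0.833), 3), round(harmonic_f1(0.537, 0.768), 3))
```

The harmonic mean of the reported precision and recall comes to about 0.820 for KG-RAG and 0.632 for ChatGLM-9B, which agrees with the F1 column to within rounding.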

      Table 6.  Evaluation metrics of question-answering models

      | Model | Recall | Faithfulness | Accuracy |
      |---|---|---|---|
      | KG-RAG | 0.917 | 0.808 | 0.88 |
      | GraphRAG | 0.785 | 0.808 | 0.73 |
      | Doc RAG | 0.735 | 0.813 | 0.72 |
      | No RAG | - | - | 0.38 |
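The percentage improvements quoted in the abstract are relative gains over a baseline. A short sketch, using the Recall and Accuracy values from Table 6 and a hypothetical helper `relative_gain`, shows the arithmetic:

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Values copied from Table 6 (KG-RAG versus Doc RAG).
recall_gain = relative_gain(0.917, 0.735)
accuracy_gain = relative_gain(0.88, 0.72)

print(f"recall gain:   {recall_gain:.1%}")
print(f"accuracy gain: {accuracy_gain:.1%}")
```

The gains work out to roughly 24.8% and 22.2%, in line with the "about 24% and 22%" stated in the abstract.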
    Publication history
    • Received:  2025-07-22
    • Published:  2026-03-25
