Geological Knowledge Extraction and Information Mining via the Fusion of Knowledge Graphs and LLMs: A Case Study of Carlin-type Gold Deposits

doi: 10.3799/dqkx.2026.036

Guoqing Liu^1,2,
Guoxiong Chen^1

1. State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences, Wuhan 430074, China;
2. School of Earth and Planetary Sciences, China University of Geosciences, Wuhan 430074, China

Funding:

Young Scientist Project of the National Deep Earth Major Science and Technology Program (2024ZD10019007)

Geological Research Project of the Guizhou Bureau of Geology and Mineral Resources (黔地质科合[2025]01号)

Fundamental Research Funds for the Central Universities (GUG-DMX2025-01)

National Undergraduate Innovation Training Program (202510491034)

About the author:
Guoqing Liu (born 2004), male, undergraduate student, works mainly on big-data-driven mineral exploration. ORCID: 0009-0003-1518-6258, E-mail: liuguoqing@cug.edu.cn

Corresponding author:
Guoxiong Chen (born 1988), male, research professor, works mainly on teaching and research in mathematical geosciences. ORCID: 0000-0002-6785-9675, E-mail: gxchen@cug.edu.cn

CLC number: P628; P618.51

Publication history
- Received: 2025-12-30
- Published online: 2026-02-28

Abstract: Massive volumes of unstructured data in geological exploration are difficult to use effectively, and general-purpose large language models (LLMs) suffer from factual hallucination and a lack of domain-specific reasoning. To address these problems, we propose an intelligent knowledge mining framework for vertical domains that integrates a Knowledge Graph (KG) with Retrieval-Augmented Generation (RAG). The framework is validated by summarizing and comparing the metallogenic regularities of Carlin-type gold deposits in southwestern Guizhou, China, and Nevada, USA. First, a RAG-based question-answering system was built on a locally deployed DeepSeek-32B model. Through vector retrieval and generative reading comprehension, the system traces professional knowledge precisely to its sources and answers questions with high reliability. Second, using Supervised Fine-Tuning (SFT) of the LLM, a cross-regional metallogenic knowledge graph that systematically covers stratigraphy, structure, alteration minerals, and ore-controlling factors was built efficiently from hundreds of multi-source, heterogeneous geological documents. Experimental results show that the proposed system significantly outperforms GPT-4o in objective accuracy. In subjective generation it exhibits high faithfulness and full traceability, effectively mitigating the hallucination problem. Analyses based on graph topology quantitatively reveal the macroscopic similarities and differences in mineralization between the two regions. They also quantify the cascading indicative pathways from ore entities, through alteration assemblages, to geochemical element anomalies, which confirms the framework's ability to uncover implicit exploration clues. This study achieves the intelligent transformation of unstructured text into structured knowledge and its in-depth mining, offering a new technical pathway for the "data-rich yet knowledge-poor" dilemma in the geosciences.

Keywords:
- Carlin-type gold deposit
- Large language model (LLM)
- Knowledge graph
- Retrieval-augmented generation (RAG)
- Knowledge extraction
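
The graph-topology analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The triples below are toy placeholders rather than data or results from the paper, and the relation names (`associated_with`, `indicated_by`) and the helpers `entities`, `jaccard`, and `cascade_paths` are hypothetical. Jaccard overlap is used here only as one simple way to quantify regional similarity, since the abstract does not name the metrics actually used.

```python
from collections import defaultdict

# Toy triples: (region, head, relation, tail).
# Illustrative placeholders only, NOT data or results from the paper.
TRIPLES = [
    ("SW Guizhou", "arsenian pyrite", "associated_with", "silicification"),
    ("SW Guizhou", "silicification", "indicated_by", "As"),
    ("SW Guizhou", "silicification", "indicated_by", "Sb"),
    ("Nevada", "arsenian pyrite", "associated_with", "silicification"),
    ("Nevada", "arsenian pyrite", "associated_with", "decarbonatization"),
    ("Nevada", "silicification", "indicated_by", "As"),
    ("Nevada", "decarbonatization", "indicated_by", "Tl"),
]


def entities(region):
    """Return the set of entities that appear in one region's triples."""
    found = set()
    for reg, head, _relation, tail in TRIPLES:
        if reg == region:
            found.update((head, tail))
    return found


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cascade_paths(region, start):
    """List every path from `start` to a leaf node in one region's subgraph."""
    graph = defaultdict(list)
    for reg, head, _relation, tail in TRIPLES:
        if reg == region:
            graph[head].append(tail)

    paths = []

    def walk(node, path):
        successors = graph.get(node)
        if not successors:  # leaf: no outgoing edges
            paths.append(path)
            return
        for nxt in successors:
            if nxt not in path:  # skip nodes already visited (cycle guard)
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


if __name__ == "__main__":
    overlap = jaccard(entities("SW Guizhou"), entities("Nevada"))
    print(f"Entity overlap between regions: {overlap:.2f}")
    for path in cascade_paths("Nevada", "arsenian pyrite"):
        print(" -> ".join(path))
```

On a real graph, the same two operations would run over the extracted triples: set overlap gives a coarse measure of how similar two regions are, and path enumeration lists ore, alteration, and element chains that can then be ranked or counted.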
