Citation: Zhang Baoyi, Tang Jiacheng, Zhang Tongyun, Wang Binhai, Shi Yuzheng, Zhan Qingzhong, Fang Zhenxi, Kablan Or Aimon Brou Koffi, Ma Kai, 2026. Knowledge Graph and Question-Answering Model for Geological Prospecting Empowered by Large Language Models. Earth Science, 51(3): 982-995. doi: 10.3799/dqkx.2025.176
Applications of large language models (LLMs) in geological prospecting currently face three challenges: insufficient domain expertise, data privacy concerns, and model hallucinations. The field also lacks efficient, rapid knowledge recommendation methods for LLMs. This study proposes KG-RAG (Knowledge Graph-Embedded Retrieval-Augmented Generation), a framework that uses LLMs as tools to automate the extraction and structured representation of geological prospecting knowledge under the constraints of a geological ontology. It then applies multi-hop retrieval algorithms within the knowledge graph to increase the depth and breadth of retrieved content, and on this basis constructs an intelligent question-answering model for geological prospecting. In knowledge graph construction tasks, KG-RAG achieves a precision of 0.807, a recall of 0.833, and an F1-score of 0.819. Compared with direct knowledge extraction using the baseline LLM (GLM4-9B), these are improvements of approximately 50% (precision), 8% (recall), and 29% (F1-score). In question-answering tasks, KG-RAG achieves a recall of 0.917 and a precision of 0.88, outperforming document vector-embedded retrieval-augmented generation methods by approximately 24% (recall) and 22% (precision). KG-RAG thus performs better than the compared baselines in both knowledge graph construction and intelligent question answering. It effectively collects and represents geological prospecting and mineral exploration knowledge, and offers geologists a useful reference for the combined application of LLMs and knowledge graphs.
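The multi-hop retrieval idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithm, so this assumes a plain breadth-first expansion over (head, relation, tail) triples; the triples, entity names, and the functions `build_adjacency` and `multi_hop_retrieve` are hypothetical placeholders, not the authors' implementation or data.

```python
from collections import defaultdict, deque

# Hypothetical toy triples (head, relation, tail); placeholders only,
# not taken from the paper's knowledge graph.
TRIPLES = [
    ("deposit_A", "hosted_by", "granite_B"),
    ("granite_B", "formed_in", "epoch_C"),
    ("deposit_A", "contains", "mineral_D"),
    ("mineral_D", "associated_with", "mineral_E"),
    ("mineral_E", "indicates", "alteration_F"),
]


def build_adjacency(triples):
    """Index triples by head entity for fast neighbour lookup."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency


def multi_hop_retrieve(adjacency, seed, max_hops):
    """Breadth-first expansion from `seed`, collecting every triple
    reachable within `max_hops` edges (assumed strategy, see lead-in)."""
    visited = {seed}
    queue = deque([(seed, 0)])
    collected = []
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, tail in adjacency.get(entity, []):
            collected.append((entity, relation, tail))
            if tail not in visited:
                visited.add(tail)
                queue.append((tail, depth + 1))
    return collected


adjacency = build_adjacency(TRIPLES)
one_hop = multi_hop_retrieve(adjacency, "deposit_A", 1)
two_hop = multi_hop_retrieve(adjacency, "deposit_A", 2)
print(len(one_hop), len(two_hop))
```

Raising `max_hops` widens the context passed to the LLM: a one-hop query from the seed entity returns only its direct relations, while a two-hop query also returns the relations of its neighbours. In a retrieval-augmented setup, the collected triples would typically be serialized into the prompt as supporting facts.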