Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop

Stefani Tsaneva, Danilo Dessì, Francesco Osborne, Marta Sabou

Publication: Contribution to book/conference proceedings - Contribution to conference proceedings

Abstract

Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated KG generation methodologies, leveraging both a human-in-the-loop (HiL) and a large language model (LLM)-in-the-loop. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12 percentage points (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4 percentage point increase in the F1 score (from 77% to 81%).
Original language: English
Title of host publication: Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment (ISWC 2024)
Subtitle of host publication: Sci-K 2024, 12 Nov 2024, Baltimore
Place of publication: Aachen
Publisher: RWTH Aachen University
Number of pages: 10
Publication status: Published - 2024

Publication series

Series: CEUR Workshop Proceedings
Volume: 3780
ISSN: 1613-0073
