Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop

Stefani Tsaneva, Danilo Dessì, Francesco Osborne, Marta Sabou

Publication: Chapter in book/Conference proceeding > Contribution to conference proceedings

Abstract

Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated KG generation methodologies, leveraging both a human-in-the-loop (HiL) and a large language model (LLM)-in-the-loop. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12% (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4% increase in the F1 score (from 77% to 81%).
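The metrics quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions of precision, recall, and F1 over validated triples; the function names and the counts are hypothetical and are not taken from the paper. It also shows that the reported gains read as absolute differences in percentage points (87% - 75% = 12 points), which is distinct from a relative gain.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from triple counts.

    tp: generated triples judged correct
    fp: generated triples judged incorrect
    fn: correct triples the pipeline failed to generate
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain) between two scores in [0, 1]."""
    return after - before, (after - before) / before


# Hypothetical counts, chosen only so that precision comes out at 0.75.
p, r, f1 = precision_recall_f1(tp=75, fp=25, fn=25)

# Precision figures from the abstract: 75% before, 87% with LLM validation.
absolute_gain, relative_gain = gains(0.75, 0.87)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"absolute gain={absolute_gain:.2f} relative gain={relative_gain:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a filter that removes incorrect triples raises precision but improves F1 only if recall does not fall by a comparable amount.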
Original language: English
Title of host publication: Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment (ISWC 2024)
Subtitle of host publication: Sci-K 2024, 12 Nov 2024, Baltimore
Place of Publication: Aachen
Publisher: RWTH Aachen University
Number of pages: 10
Publication status: Published - 2024

Publication series

Series: CEUR Workshop Proceedings
Volume: 3780
ISSN: 1613-0073
