Abstract
The medical context of a drug indication provides crucial information on how the drug can be used in practice. However, the extraction of medical context from drug indications remains poorly explored, as most research concentrates on recognizing medications and their associated diseases. Indeed, most databases that catalog drug indications do not contain their medical context in a machine-readable format. This paper proposes the use of a large language model to construct DIAMOND-KG, a knowledge graph of drug indications and their medical context. The study 1) examines how accuracy and precision change when the language model is given additional instructions, 2) estimates the prevalence of medical context in drug indications, and 3) assesses the quality of DIAMOND-KG against NeuroDKG, a small manually curated knowledge graph. The results reveal that more elaborate prompts improve the quality of medical context extraction, that 71% of indications had at least one medical context, and that 63.52% of extracted medical contexts correspond to those identified in NeuroDKG. This paper demonstrates the utility of large language models for specialized knowledge extraction, with a particular focus on extracting drug indications and their medical context. We provide DIAMOND-KG as a FAIR RDF graph supported by an ontology. Being openly accessible, DIAMOND-KG may be useful for downstream tasks such as semantic query answering, recommendation engines, and drug repositioning research.
Original language | English |
---|---|
Title of host publication | SWAT4HCLS’24: The 15th International SWAT4HCLS conference |
Subtitle of host publication | February 26-29, 2024 : Leiden, The Netherlands |
Publication status | Published - March 2024 |
Austrian Fields of Science (ÖFOS)
- 102015 Information systems
- 102030 Semantic technologies
- 102028 Knowledge engineering
- 305905 Medical informatics