Abstract
The medical context of a drug indication provides crucial information on how the drug can be used in practice. However, the extraction of medical context from drug indications remains poorly explored, as most research concentrates on recognizing medications and their associated diseases. Indeed, most databases cataloging drug indications do not contain their medical context in a machine-readable format. This paper proposes the use of a large language model to construct DIAMOND-KG, a knowledge graph of drug indications and their medical context. The study 1) examines how accuracy and precision change when the language model is given additional instructions, 2) estimates the prevalence of medical context in drug indications, and 3) assesses the quality of DIAMOND-KG against NeuroDKG, a small manually curated knowledge graph. The results reveal that more elaborate prompts improve the quality of medical context extraction, that 71% of indications had at least one medical context, and that 63.52% of extracted medical contexts correspond to those identified in NeuroDKG. This paper demonstrates the utility of large language models for specialized knowledge extraction, with a particular focus on extracting drug indications and their medical context. We provide DIAMOND-KG as a FAIR RDF graph supported by an ontology. Openly accessible, DIAMOND-KG may be useful for downstream tasks such as semantic query answering, recommendation engines, and drug repositioning research.
| Original language | English |
|---|---|
| Title of host publication | SWAT4HCLS’24: The 15th International SWAT4HCLS conference |
| Subtitle of host publication | February 26-29, 2024 : Leiden, The Netherlands |
| Publication status | Published - Mar 2024 |
Austrian Classification of Fields of Science and Technology (ÖFOS)
- 102015 Information systems
- 102030 Semantic technologies
- 102028 Knowledge engineering
- 305905 Medical informatics
Keywords
- Knowledge Graph Construction
- Medical Knowledge Graph
- LLMs in KGC