Abstract
Applying machine learning in healthcare, especially to heart disease prediction, is crucial for saving lives through accurate diagnosis. Yet the effectiveness of these models is constrained by the scarcity of large, annotated datasets, which are needed to train robust models. In addition, existing knowledge graphs (KGs), which contain structured domain knowledge that could improve model performance, remain underused. To address these challenges, our study introduces approaches that integrate KG embeddings with tabular data, with the goal of improving the performance of machine learning algorithms for heart disease prediction. We conduct a comparative analysis of methods for merging tabular data with KGs, focusing on heart disease, and evaluate how well two embedding algorithms augment datasets for more accurate machine learning applications. Our methodology, which involves testing embeddings from diverse KGs, consistently improved model performance. Specifically, the accuracy of the Feed-Forward Neural Network increased from 82% to 85%, and the F2 score of the K-Nearest Neighbors model increased from 71% to 80%. These results offer a promising direction for leveraging semantic information from KGs for knowledge-enhanced machine learning in healthcare.
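As a rough illustration of the two ideas the abstract relies on, the sketch below shows one possible way to augment a tabular row with a KG embedding (plain concatenation) and how an F2 score is computed. The abstract does not specify the merging strategy, so concatenation is an assumption, and the column names, embedding values, and helper names (`augment_row`, `f_beta`) are hypothetical.

```python
from typing import Dict, List, Sequence


def augment_row(
    row: Dict[str, object],
    numeric_cols: Sequence[str],
    entity_col: str,
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Concatenate numeric features with the KG embedding of one categorical value.

    Assumption: each categorical value maps to at most one KG entity, whose
    embedding was computed beforehand and stored in `embeddings`.
    """
    numeric = [float(row[c]) for c in numeric_cols]
    # Fall back to a zero vector when the value has no matching KG entity.
    vector = embeddings.get(str(row[entity_col]), [0.0] * dim)
    return numeric + list(vector)


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 2.0) -> float:
    """F-beta score for binary labels; beta=2 weights recall higher than precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical patient record and a made-up 2-dimensional embedding.
row = {"age": 54, "chol": 230, "chest_pain": "typical_angina"}
kg_embeddings = {"typical_angina": [0.1, -0.2]}
features = augment_row(row, ["age", "chol"], "chest_pain", kg_embeddings, dim=2)
```

The F2 score is a natural choice for a screening task like this one because it penalizes missed positive cases (false negatives) more heavily than false alarms.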
Original language | English |
---|---|
Title of host publication | Austrian Symposium on AI, Robotics, and Vision (AIROV24) |
Publication status | Published - 2024 |