Knowledge Graph Construction using Information Extraction of Indonesia Cosmetic Product Text in Bahasa Indonesia

Deborah Aprilia Josephine, Ayu Purwarianti, Fajar J. Ekaputra

Publication: Contribution to book/conference proceedings (contribution to conference proceedings)


Knowledge graphs can be used for entity recognition in text, for graph visualization, and to improve business processes such as information retrieval in e-commerce. One source of information for building knowledge graphs is the text data available in many digital systems, such as e-commerce platforms. In this paper, we propose an approach to extract knowledge graph entities from product texts available on e-commerce platforms. Because labeled data is limited, we use transfer learning with full fine-tuning of an existing pretrained model to recognize the entities. Since some English terms appear in the product texts, we use multilingual pretrained models based on the Transformer architecture, namely multilingual-BERT-base-cased (mBERT) and XLM-RoBERTa-base (XLMR). The extracted entities are then mapped into a knowledge graph by adopting components of the Text to Knowledge Graph (T2KG) framework, namely entity mapping and triple integration. The training data contains 1,500 labeled texts and the test data contains 216 labeled texts; experiments were conducted on three versions of the data and in four scenarios. Our evaluation shows that the XLMR model performs better than mBERT on the entity extraction task, with an average F1-score of 0.895. Furthermore, we manually evaluated the knowledge graph mapping and construction using 1,445 product texts from two e-commerce platforms, which resulted in 338 entities in the knowledge graph with a mapping precision of 0.94.
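The entity mapping and triple integration steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation or the T2KG code: the entity labels (`BRAND`, `INGREDIENT`), the node identifier scheme, the `has<Label>` predicate naming, and the normalization rule (lowercasing and collapsing whitespace) are all assumptions made for illustration. The sketch only shows the general idea that different surface forms of the same extracted entity are mapped to one graph node before triples are added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """An extracted entity: its surface text and a (hypothetical) label."""
    text: str
    label: str


def normalize(surface: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(surface.lower().split())


def map_entity(entity: Entity, graph_index: dict) -> str:
    """Entity mapping: reuse the node for a known entity, else create one."""
    key = (entity.label, normalize(entity.text))
    if key not in graph_index:
        # Hypothetical node identifier scheme: "<label>/<running number>".
        graph_index[key] = f"{entity.label.lower()}/{len(graph_index)}"
    return graph_index[key]


def integrate_triples(product_id: str, entities: list,
                      graph_index: dict, triples: set) -> None:
    """Triple integration: add one (product, predicate, node) triple per entity."""
    for entity in entities:
        node = map_entity(entity, graph_index)
        predicate = f"has{entity.label.capitalize()}"  # e.g. "hasBrand"
        triples.add((product_id, predicate, node))


# Illustrative data only; these product texts are not from the paper.
graph_index = {}
triples = set()
integrate_triples("product/1",
                  [Entity("Wardah", "BRAND"), Entity("Niacinamide", "INGREDIENT")],
                  graph_index, triples)
integrate_triples("product/2",
                  [Entity("  WARDAH ", "BRAND")],
                  graph_index, triples)
```

In this sketch both products end up linked to the same brand node, because the two spellings normalize to the same key. A real system would need a more careful matching strategy than exact string equality after normalization.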

Title of host publication: Proceedings: 2021 8th International Conference on Advanced Informatics
Subtitle of host publication: Concepts, Theory, and Application, ICAICTA 2021
Place of publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781665417433
ISBN (print): 978-1-6654-1744-0
Publication status: Published - 2021
Externally published: Yes
Event: 8th International Conference on Advanced Informatics: Concepts, Theory, and Application, ICAICTA 2021 - Virtual, Bandung, Indonesia
Duration: 29 Sept. 2021 to 30 Sept. 2021


Conference: 8th International Conference on Advanced Informatics: Concepts, Theory, and Application, ICAICTA 2021
Location: Virtual, Bandung

Bibliographic note

Publisher Copyright:
© 2021 IEEE.
