Knowledge Graph Construction using Information Extraction of Indonesia Cosmetic Product Text in Bahasa Indonesia

Deborah Aprilia Josephine, Ayu Purwarianti, Fajar J. Ekaputra

Publication: Chapter in book/Conference proceeding › Contribution to conference proceedings

Abstract

Knowledge graphs can be used for entity recognition in text, for graph visualization, and to improve business processes such as information retrieval in e-commerce. One source of information for building knowledge graphs is the text data available in many digital systems, such as e-commerce platforms. In this paper, we propose an approach to extract knowledge graph entities from product text available on e-commerce platforms. Because labeled data is limited, we apply transfer learning with full fine-tuning of an existing pretrained model to recognize the entities. Since product texts contain some English terms, we use multilingual pretrained models based on the Transformer architecture, i.e. multilingual-BERT-base-cased (mBERT) and XLM-RoBERTa-base (XLMR). The extracted entities are then mapped into a knowledge graph by adopting components of the Text to Knowledge Graph (T2KG) framework, i.e. entity mapping and triple integration. The training data contains 1,500 labeled texts and the test data contains 216 labeled texts; experiments were conducted on three versions of the data and in four scenarios. Our evaluation shows that the XLMR model performs better than mBERT on the entity extraction task, with an average F1-score of 0.895. Furthermore, we manually evaluated the knowledge graph mapping and construction using 1,445 product texts from two e-commerce platforms, which resulted in 338 entities in the knowledge graph with a mapping precision of 0.94.
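The entity mapping and triple integration steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the T2KG framework itself: the entity labels, the predicate names, the `normalize` and `integrate` functions, and the sample product records are all hypothetical, chosen only to show how surface forms extracted from different product texts might be collapsed onto one graph node and linked to their products as deduplicated triples.

```python
# Hypothetical label-to-predicate table; the paper's actual schema is not given here.
PREDICATE_BY_LABEL = {
    "BRAND": "hasBrand",
    "CATEGORY": "hasCategory",
    "INGREDIENT": "hasIngredient",
}


def normalize(surface: str) -> str:
    """Build a canonical key: lowercase text with whitespace collapsed."""
    return " ".join(surface.lower().split())


def integrate(products):
    """Map extracted entities to canonical nodes and collect unique triples.

    `products` is an iterable of (product_id, [(surface_text, label), ...]),
    i.e. the assumed output of an upstream entity extraction model.
    """
    nodes = {}       # (label, normalized text) -> node id
    triples = set()  # (product_id, predicate, node id), deduplicated
    for product_id, entities in products:
        for surface, label in entities:
            predicate = PREDICATE_BY_LABEL.get(label)
            if predicate is None:
                continue  # ignore labels that have no mapping
            key = (label, normalize(surface))
            default_id = f"{label.lower()}:{key[1].replace(' ', '_')}"
            node_id = nodes.setdefault(key, default_id)
            triples.add((product_id, predicate, node_id))
    return nodes, triples


if __name__ == "__main__":
    # Made-up records: two spellings of the same brand map to a single node.
    sample = [
        ("p1", [("Brand A", "BRAND"), ("Lip Cream", "CATEGORY")]),
        ("p2", [("brand  a ", "BRAND"), ("Serum", "CATEGORY")]),
    ]
    nodes, triples = integrate(sample)
    for triple in sorted(triples):
        print(triple)
```

String normalization is the simplest possible mapping rule; a real pipeline would likely need alias tables or similarity matching to reach a mapping precision comparable to the figure reported above.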

Original language: English
Title of host publication: Proceedings: 2021 8th International Conference on Advanced Informatics
Subtitle of host publication: Concepts, Theory, and Application, ICAICTA 2021
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665417433
ISBN (Print): 978-1-6654-1744-0
Publication status: Published - 2021
Externally published: Yes
Event: 8th International Conference on Advanced Informatics: Concepts, Theory, and Application, ICAICTA 2021 - Virtual, Bandung, Indonesia
Duration: 29 Sept 2021 to 30 Sept 2021

Conference

Conference: 8th International Conference on Advanced Informatics: Concepts, Theory, and Application, ICAICTA 2021
Country/Territory: Indonesia
City: Virtual, Bandung
Period: 29/09/21 to 30/09/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • entity
  • information extraction
  • knowledge graph
  • mapping
  • model
