LOD-a-lot: A single-file enabler for data science

Wouter Beek, Javier David Fernandez Garcia, Ruben Verborgh

Publication: Contribution to book/conference proceedings › Contribution to conference proceedings


Abstract

Many data scientists make use of Linked Open Data (LOD) as a huge interconnected knowledge base represented in RDF. However, the distributed nature of the information and the lack of a scalable approach to manage and consume such Big Semantic Data make it difficult and expensive to conduct large-scale studies. As a consequence, most scientists restrict their analyses to one or two datasets (often DBpedia) that contain at most hundreds of millions of triples.
LOD-a-lot is a dataset that integrates a large portion (over 28 billion triples) of the LOD Cloud into a single ready-to-consume file that can be easily downloaded, shared and queried with a small memory footprint. This paper shows there exists a wide collection of Data Science use cases that can be performed over such a LOD-a-lot file. For these use cases LOD-a-lot significantly reduces the cost and complexity of conducting Data Science.
Original language: English
Title of host publication: International Conference on Semantic Systems
Editors: Rinke Hoekstra, Catherine Faron-Zucker, Tassilo Pellegrini, Victor de Boer
Place of publication: Amsterdam
Publisher: ACM Press
Pages: 181-184
Publication status: Published - 2017
