Generating RDF from Streaming and Archival Data

Overview

Recent state-of-the-art approaches and technologies for generating RDF graphs from non-RDF data use languages designed for specifying transformations or mappings over data in various formats. This work presents a new approach for generating ontology-annotated RDF graphs that links data from multiple heterogeneous streaming and archival data sources with high throughput and low latency. To support this, and in contrast to existing approaches, we propose embedding in the RDF generation process a close-to-sources data processing and linkage stage, which supports the fast template-driven generation of triples in a subsequent stage. This approach, called RDF-Gen, has a SPARQL-based implementation. RDF-Gen is evaluated against the latest related work, RML and SPARQL-Generate, using real-world datasets.
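The two-stage idea described above (processing close to the source, followed by template-driven triple generation) can be illustrated with a minimal Python sketch. This is not the RDF-Gen implementation: the template format, the `example.org` IRIs, the field names `id` and `name`, and all function names are hypothetical, and the sketch omits link discovery and proper literal escaping.

```python
import csv
import io

# Hypothetical triple template: each entry is (subject, predicate, object),
# where "{field}" placeholders refer to fields of a processed record.
TEMPLATE = [
    ("<http://example.org/person/{id}>",
     "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
     "<http://example.org/Person>"),
    ("<http://example.org/person/{id}>",
     "<http://example.org/name>",
     '"{name}"'),
]


def process_records(source):
    """Stage 1 (close to the source): parse rows, clean them, skip unusable ones."""
    for row in csv.DictReader(source):
        if not row["id"]:
            continue  # no identifier, so no subject IRI can be built
        yield {"id": row["id"].strip(), "name": row["name"].strip()}


def generate_triples(records, template):
    """Stage 2: fill the template once per record, yielding N-Triples-like lines."""
    for record in records:
        for s, p, o in template:
            yield f"{s.format(**record)} {p.format(**record)} {o.format(**record)} ."


def run_example():
    # Made-up input standing in for an archival file or a stream of lines.
    data = io.StringIO("id,name\n1, Alice \n,Nobody\n2,Bob\n")
    return list(generate_triples(process_records(data), TEMPLATE))
```

Because both stages are generators, records flow through one at a time, so the same code shape works for a finite file and for an unbounded stream. Cleaning and filtering happen before the template is touched, which keeps the triple-generation step a plain substitution.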

Experimental Results

The evaluation uses three different data sets, covering typical to large volumes of data, with sizes varying between 100 and 100,000 entries.

  • An artificial dataset of Persons, generated by GenerateData.com, mapping 8 properties
  • A real-life archival dataset of aircraft (compiled from FlightRadar24.com), mapping 9 properties
  • Aircraft surveillance streaming data, mapping 5 properties

Figures 1, 2 and 3 present the throughput achieved by RDF-Gen for each of the data sets as their size varies.
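As a rough illustration of how throughput can be measured while varying input size, the following Python sketch times a toy triple generator over inputs of increasing size. It is not the benchmark harness used for the figures: the generator, the intermediate sizes, and the choice of input records per second as the throughput unit are all assumptions made for this example.

```python
import time


def measure_throughput(generate, records):
    """Return (triples_produced, input_records_per_second) for one run."""
    start = time.perf_counter()
    produced = sum(1 for _ in generate(records))  # drain the generator
    elapsed = time.perf_counter() - start
    rate = len(records) / elapsed if elapsed > 0 else float("inf")
    return produced, rate


def toy_generator(records):
    """Hypothetical stand-in for an RDF generator: one triple per record."""
    for i, rec in enumerate(records):
        yield f'<http://example.org/r/{i}> <http://example.org/value> "{rec}" .'


def sweep(sizes=(100, 1000, 10000, 100000)):
    """Measure throughput for each input size; the sizes are illustrative."""
    results = {}
    for n in sizes:
        records = [str(k) for k in range(n)]
        results[n] = measure_throughput(toy_generator, records)
    return results
```

Calling `sweep()` returns a dictionary keyed by input size. A single timed run per size is noisy, so a real comparison would repeat each measurement and report an aggregate.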


Figure 1: Achieved throughput on “Persons” data set (series shown: RML, SPARQL-Generate, RDF-Gen)

Figure 2: Achieved throughput on “Aircrafts” data set (series shown: RML, SPARQL-Generate, RDF-Gen)

Figure 3: Achieved throughput on “Surveillance” data set (series shown: RML, SPARQL-Generate, RDF-Gen)

Publications

Georgios Santipantakis, George Vouros, Apostolos Glenis, Christos Doulkeridis, Akrivi Vlachou. Generating linked RDF data from heterogeneous streaming and archival data sources: Populating the datAcron ontology

Georgios M. Santipantakis, Konstantinos I. Kotis, George A. Vouros, Christos Doulkeridis. RDF-Gen: Generating RDF from Streaming and Archival Data

Giorgos Santipantakis, Apostolos Glenis, Nikolaos Kalaitzian, Akrivi Vlachou, Christos Doulkeridis, George Vouros. FAIMUSS: Flexible Data Transformation to RDF from Multiple Streaming Sources

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

(c) AI-Group/UNIVERSITY OF PIRAEUS RESEARCH CENTER (UPRC)