TripleGeo utility


Welcome to TripleGeo: An open-source tool for extracting geospatial features into RDF triples

TripleGeo is a utility developed by the Institute for the Management of Information Systems at Athena Research Center under the EU/FP7 project GeoKnow: Making the Web an Exploratory for Geospatial Knowledge. This general-purpose, open-source tool can be used for transforming features from geospatial databases into RDF triples.

TripleGeo is based on the open-source utility geometry2rdf. However, this earlier tool (2010) has been substantially modified and enhanced to extract non-geographical attributes and to interact with diverse geographical and triple formats. TripleGeo is written in Java and is still under development; more enhancements will be included in future releases. All currently supported features have been tested and work smoothly on both MS Windows and Linux platforms.

The Java source code for TripleGeo is freely available from here.

Converting geospatial features into triples

From a user’s perspective, the utility runs from the command line according to preconfigured settings. Execution is parameterized with a configuration file that declares user preferences for the conversion. TripleGeo provides the following functionality:

When initiated, this process iterates through all features in the original dataset and emits a series of triples per record. Every geometric feature is turned into properly formatted triple(s), according to the specified vocabulary. Additional descriptive attributes can be extracted, including identifiers, names, or feature types. For the time being, such attributes are exported as literals, without taking into account any underlying ontology.
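The per-record conversion described above can be pictured with a minimal sketch. It is not TripleGeo's actual code: the class name, the example.org namespace, and the sample feature are hypothetical, and the use of the GeoSPARQL vocabulary with a WKT literal is an assumption made only for illustration. It emits one geometry link, one geometry literal, and one plain literal per descriptive attribute, in N-Triples syntax.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: not TripleGeo's actual classes or URI scheme.
class FeatureToTriplesSketch {

    // GeoSPARQL namespace; assumed here as the target vocabulary.
    static final String GEO = "http://www.opengis.net/ont/geosparql#";
    // Hypothetical namespace for the generated resources and attribute properties.
    static final String NS = "http://example.org/resource/";

    /** Returns N-Triples lines for one feature: geometry link, WKT literal, one literal per attribute. */
    static List<String> toTriples(String id, String wkt, Map<String, String> attributes) {
        String feature = "<" + NS + id + ">";
        String geometry = "<" + NS + id + "/geometry>";
        List<String> triples = new ArrayList<String>();
        triples.add(feature + " <" + GEO + "hasGeometry> " + geometry + " .");
        triples.add(geometry + " <" + GEO + "asWKT> \"" + wkt + "\"^^<" + GEO + "wktLiteral> .");
        // Descriptive attributes become plain literals, with no ontology mapping.
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            triples.add(feature + " <" + NS + a.getKey() + "> \"" + escape(a.getValue()) + "\" .");
        }
        return triples;
    }

    /** Escapes backslashes and double quotes so the value is a valid N-Triples literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Sample record with made-up values.
        Map<String, String> attrs = new LinkedHashMap<String, String>();
        attrs.put("name", "Acropolis");
        attrs.put("type", "monument");
        for (String t : toTriples("poi_1", "POINT(23.7257 37.9715)", attrs)) {
            System.out.println(t);
        }
    }
}
```

A real converter would loop over every record of the input table or shapefile and append the resulting lines to the output file.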

Architecture

TripleGeo has been implemented with several Java classes in a modular fashion as illustrated in the following flow diagram:

[Figure: TripleGeo flow diagram]

Input

The current version of TripleGeo utility can access geometries from:

Geospatial data must reside in a single table (in the case of a database) or in a single shapefile. Currently, there is no support for combining information from several sources (e.g., by joining two or more tables).

Output

In terms of output serializations, triples can be obtained in one of the following formats:

Concerning geospatial representations, triples can be exported according to:

Results are written into a local file, so that they can be readily imported into a triple store.

Configuration settings

Before attempting any conversion using TripleGeo, a configuration file must be prepared. This file lists the properties that define how input data will be accessed, where the output will be written and in which format, as well as optional features (e.g., reprojection into another spatial reference system).

These settings include properties concerning:

You may consult these sample configurations that cover several indicative cases in terms of data access and supported geometric types.
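As a rough illustration of how such a file could be read and checked, the sketch below parses key/value settings with java.util.Properties. All property names (inputFile, outputFile, format, targetRS) and their values are hypothetical placeholders, not TripleGeo's documented keys; the sample configurations remain the authoritative reference.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative sketch only: the keys below are hypothetical, not TripleGeo's real settings.
class ConfigSketch {

    // Inline stand-in for a file such as options.conf.
    static final String SAMPLE =
          "# Hypothetical settings for illustration\n"
        + "inputFile = ./data/points.shp\n"
        + "outputFile = ./output/points.nt\n"
        + "format = N-TRIPLES\n"
        + "# targetRS is optional and left unset here\n";

    /** Parses key/value pairs in java.util.Properties syntax. */
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read configuration", e);
        }
        return p;
    }

    /** Returns a mandatory property, failing early with a clear message if it is absent. */
    static String require(Properties p, String key) {
        String v = p.getProperty(key);
        if (v == null || v.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required property: " + key);
        }
        return v.trim();
    }

    public static void main(String[] args) {
        Properties p = load(SAMPLE);
        System.out.println("input  = " + require(p, "inputFile"));
        System.out.println("output = " + require(p, "outputFile"));
        System.out.println("format = " + require(p, "format"));
        // Optional setting: fall back to a default when it is not declared.
        System.out.println("reprojection = " + p.getProperty("targetRS", "(none)"));
    }
}
```

Checking mandatory properties before any data is read lets a misconfigured run fail immediately rather than partway through a long conversion.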

Execution

In order to use TripleGeo for extracting triples from a spatial dataset, the user should follow these steps:

  1. Open a terminal window and navigate to the directory where TripleGeo has been extracted. Normally, this folder includes a lib/ subdirectory with the required libraries, as well as a configuration file (e.g., named options.conf).
  2. Verify that Java JRE (or SDK) version 1.7 or later is installed. The currently installed version of Java can be checked by running java -version from the command line.
  3. Next, check all properties in the required configuration file, as explained under Configuration settings. This file must be located in the same folder as the executable TripleGeo.jar package. If triples are to be extracted from a DBMS, make sure that the correct credentials are given in the configuration file.
  4. If triples will be extracted from ESRI shapefiles, give the following command (on Linux, replace the ; classpath separator with : and enclose the classpath in quotes):
    java -cp lib/*;TripleGeo.jar eu.geoknow.athenarc.triplegeo.ShpToRdf options.conf
  5. Alternatively, if triples will be extracted from a geospatially-enabled DBMS (e.g., Oracle Spatial), give the following command (with the same adjustment on Linux):
    java -cp lib/*;TripleGeo.jar eu.geoknow.athenarc.triplegeo.wkt.RdbToRdf options.conf
  6. While the conversion is running, it periodically issues notifications about its progress. Note that for large datasets (e.g., hundreds of thousands of records), conversion may take several minutes.

As soon as processing is finished and all triples are written into a file, the user is notified about the total number of extracted triples and the overall execution time.
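The commands in steps 4 and 5 separate classpath entries with ; which is the Windows convention, whereas Linux uses : instead. The following sketch shows one way a wrapper could assemble the same command for whichever platform it runs on, using File.pathSeparator. The wrapper class and its method are hypothetical; only the jar name, the lib/ folder, the main class names, and options.conf come from the steps above.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a hypothetical helper, not part of TripleGeo.
class LaunchCommandSketch {

    /** Builds the java command with the given classpath separator (";" on Windows, ":" on Linux). */
    static List<String> command(String separator, String mainClass, String configFile) {
        String classpath = "lib/*" + separator + "TripleGeo.jar";
        return Arrays.asList("java", "-cp", classpath, mainClass, configFile);
    }

    public static void main(String[] args) {
        // File.pathSeparator resolves to the separator of the current platform.
        List<String> cmd = command(File.pathSeparator,
                "eu.geoknow.athenarc.triplegeo.ShpToRdf", "options.conf");
        System.out.println(cmd);
        // To actually launch the conversion from the TripleGeo folder, the list
        // could be passed to: new ProcessBuilder(cmd).inheritIO().start()
    }
}
```

Because the arguments are passed as a list rather than through a shell, the lib/* wildcard reaches the Java launcher unchanged and no quoting is needed.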

Resources for testing

License

The contents of this project are licensed under the GPL v3 License.



Development: © 2013 Institute for the Management of Information Systems,
Athena Research Center, Greece.
Last updated: 14 June 2013 11:00:00 EET.