--------------------------------------------------------------------------
## Project: GeoKnow, http://geoknow.eu
## Testing geospatial support in Virtuoso RDF store ver. 7.1.
## Using INSPIRE-compliant DATA for Greece taken from geodata.gov.gr.
## Tests performed by Kostas Patroumpas
## Date: 3/4/2014
## Revised: 9/5/2014
-------------------------------------------------------------------------
## Tests against a VM installation of Virtuoso ColumnStore edition 7.1 in a Linux Ubuntu 8 64-bit machine
-------------------------------------------------------------------------

## Create graph from ISQL interface:
SPARQL CREATE GRAPH ;
=> RUNNING!
==> Done. -- 6 msec.

## SIMPLE METHOD for bulk loading of RDF triples:
## Use interactive SQL to import geometric points in RDF/XML format
## IMPORTANT: Use RDF/XML in order to keep Greek characters in string literals (other serializations such as N-TRIPLES do NOT preserve the encoding!)

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/ad_Kalamaria_addresses_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 19299 msec.
277838 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/au_Kallikratis_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 4090 msec.
9454 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/cp_Kalamaria_parcels_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 1083 msec.
13510 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/gn_settlements_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 2310 msec.
304957 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/ps_natura2000_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 1171 msec.
10894 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/tn_Kalamaria_roads_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 1117 msec.
59432 triples

DB.DBA.RDF_LOAD_RDFXML (file_to_string_output ('/home/virtuoso/scripts/data/tmp/inspire/data/hy_rivers_GR_virt.rdf'), '', 'urn:x-geoknow-eu:sparql:virtuoso:data:inspire');
=> LOADED!
==> Done. -- 2441 msec.
120372 triples

## Get total count of triples in the graph (initially, it must be empty)
SPARQL SELECT (COUNT(*) AS ?num) WHERE { GRAPH { ?s ?p ?o } } ;
==> 796457 statements in the triple store (of 796457 originally submitted) --> ALL inserted successfully.

********************************PREFIXES**************************************
## Must use the following namespaces with data and spatial queries on INSPIRE schemata:

## For countries
PREFIX gmd:

## GeoSPARQL (Virtuoso instantiation):
PREFIX geo:

## INSPIRE schemata:
PREFIX base:
PREFIX gn:
PREFIX au:
PREFIX hy:
PREFIX hy-n:
PREFIX net:
PREFIX ad:
PREFIX cp:
PREFIX tn:
PREFIX tn-ro:
PREFIX ps:

*******************************QUERIES******************************
## All queries about DATA should target the following named graph URI in the SPARQL endpoint:
GRAPH
## Queries are very verbose, because they must reflect the corresponding INSPIRE schema. If this is respected, then they run successfully.
## In all queries, the condition '?f geo:hasGeometry ?fGeom .' can be used interchangeably with '?f au:geometry ?fGeom .' (or another INSPIRE data theme prefix instead of 'au').
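
## Cross-check of the loading log: the per-file triple counts reported in the
## loading section can be summed and compared with the COUNT(*) result.
## The sketch below is plain Python, independent of Virtuoso. The variable
## names and the shortened file labels are illustrative only; the numbers are
## copied from the log.

```python
# Triple counts per loaded RDF/XML file, copied from the loading log.
# The dictionary keys are shortened file labels chosen for illustration.
loaded_triples = {
    "ad_Kalamaria_addresses": 277838,
    "au_Kallikratis": 9454,
    "cp_Kalamaria_parcels": 13510,
    "gn_settlements": 304957,
    "ps_natura2000": 10894,
    "tn_Kalamaria_roads": 59432,
    "hy_rivers": 120372,
}

# Value returned by the COUNT(*) query against the named graph.
reported_total = 796457

computed_total = sum(loaded_triples.values())
print(computed_total, computed_total == reported_total)
```

## If the two totals differ, some triples were dropped or duplicated during loading.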

## D0: Count all stored triples in the graph (including those with blank nodes!):
---------------------------------------------------------------------------------
SELECT (COUNT(*) AS ?num) WHERE { ?s ?p ?o }

should be REWRITTEN as:

SELECT (COUNT(*) AS ?num) WHERE { GRAPH { ?s ?p ?o } }


## D1: "Which administrative unit (AU) is at the specified geographic location":
------------------------------------------------------------------------------
PREFIX gn:
PREFIX au:
PREFIX geo:
SELECT ?nCode ?aName
WHERE {
  ?f au:nationalCode ?nCode .
  ?f au:name ?a .
  ?a gn:GeographicalName ?gnName .
  ?gnName gn:spelling ?spn .
  ?spn gn:SpellingOfName ?spt .
  ?spt gn:text ?aName .
  ?f geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  FILTER (bif:st_contains (?fWKT, bif:st_point (22.952149, 40.582051)))
}

==> RUNNING! (and it SHOWS Greek characters in the result!).
==> The given point is specified in WGS84 lon/lat coordinates, as well as the data. Currently, Virtuoso 7.1 does not support any other georeference systems.
==> Two results are returned (instead of one), because "in current version of Virtuoso, only a combination of bounding box and a point is supported; the functionality will be extended in the next release." (OpenLink comment)


## D2: "Find protected sites (PS) within a distance of 20 km from the given location":
------------------------------------------------------------------------------------
PREFIX gn:
PREFIX ps:
PREFIX geo:
SELECT ?fName ?dist_km
WHERE {
  ?f ps:siteName ?p .
  ?p gn:GeographicalName ?gnName .
  ?gnName gn:spelling ?spn .
  ?spn gn:SpellingOfName ?spt .
  ?spt gn:text ?fName .
  ?f geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  BIND (bif:st_distance(bif:st_point(23.735933, 37.975598),
                        bif:st_point( (bif:st_xmin(?fWKT)+bif:st_xmax(?fWKT))/2.0 ,
                                      (bif:st_ymin(?fWKT)+bif:st_ymax(?fWKT))/2.0) ) AS ?dist_km) .
  FILTER (?dist_km < 20)
}
ORDER BY ?dist_km

==> RUNNING!
==> Currently, Virtuoso 7.1 does NOT support distances between shapes other than points.
Therefore, centroids of the stored geometries (approximated here by the centers of their bounding boxes) must be used in such comparisons.


## D3: "Identify the administrative unit (AU) to which each protected site (PS) belongs":
---------------------------------------------------------------------------------------
PREFIX gn:
PREFIX au:
PREFIX ps:
PREFIX geo:
SELECT ?adminName ?siteName
WHERE {
  ?q au:name ?r .
  ?r gn:GeographicalName ?rnName .
  ?rnName gn:spelling ?rspn .
  ?rspn gn:SpellingOfName ?rspt .
  ?rspt gn:text ?adminName .
  ?q geo:hasGeometry ?qGeom .
  ?qGeom geo:asWKT ?qWKT .
  ?f ps:siteName ?p .
  ?p gn:GeographicalName ?gnName .
  ?gnName gn:spelling ?spn .
  ?spn gn:SpellingOfName ?spt .
  ?spt gn:text ?siteName .
  ?f geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  FILTER (bif:st_intersects(bif:st_get_bounding_box(?qWKT),bif:st_get_bounding_box(?fWKT)))
}
LIMIT 50

==> RUNNING, but exact geometric calculations based on predicate 'st_within' are NOT possible.
==> That's why it has been replaced by 'st_intersects' against the bounding boxes of the geometries involved in the comparison.
==> Still, "in current version of Virtuoso, st_intersects is not complete and does not support arcs of all sorts and rings of polygons; this will be fixed in the next release."
==> The query should specify a 'LIMIT k' clause; otherwise this spatial join search may take a long time to respond.


## D4: "Find settlements (GN) that are contained within the given administrative unit (AU)":
------------------------------------------------------------------------------------------
PREFIX gn:
PREFIX au:
PREFIX geo:
SELECT ?sName
WHERE {
  ?f au:name ?a .
  ?a gn:GeographicalName ?gnName .
  ?gnName gn:spelling ?spn .
  ?spn gn:SpellingOfName ?spt .
  ?spt gn:text "Ρόδου"@el .
  ?f geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  ?s gn:name ?p .
  ?p gn:GeographicalName ?stName .
  ?stName gn:spelling ?stn .
  ?stn gn:SpellingOfName ?sptn .
  ?sptn gn:text ?sName .
  ?s geo:hasGeometry ?sGeom .
  ?sGeom geo:asWKT ?sWKT .
  FILTER (bif:st_within(bif:st_point( (bif:st_xmin(?sWKT)+bif:st_xmax(?sWKT))/2.0 ,
                                      (bif:st_ymin(?sWKT)+bif:st_ymax(?sWKT))/2.0),
                        bif:st_get_bounding_box(?fWKT)))
}
LIMIT 20

==> RUNNING! (and it supports Greek literals in the condition!)
==> Again, geometric comparison must be based on centroids and bounding boxes, because topological predicates in the current version 7.1 of Virtuoso are not yet fully implemented.


## D5: "Retrieve all cadastral parcels (CP) of area greater than 20000 sq.m.":
----------------------------------------------------------------------------
PREFIX cp:
SELECT ?pCode ?area
WHERE {
  ?f cp:label ?pCode .
  ?f cp:areaValue ?area .
  FILTER ( ?area > 20000)
}
ORDER BY ?area

==> RUNNING!


## D6: "Find all road names (TN) in lexicographical order":
---------------------------------------------------------
PREFIX tn:
PREFIX gn:
PREFIX geo:
SELECT DISTINCT((?roName) AS ?roadName)
WHERE {
  ?r geo:hasGeometry ?rGeom .
  ?rGeom geo:asWKT ?rWKT .
  ?r tn:geographicalName ?p .
  ?p gn:GeographicalName ?rdName .
  ?rdName gn:spelling ?rpn .
  ?rpn gn:SpellingOfName ?rpt .
  ?rpt gn:text ?roName .
}
ORDER BY ?roadName
LIMIT 10

==> RUNNING!


## D7: "Identify waterstreams (HY) of length longer than 30km":
-------------------------------------------------------------
PREFIX gn:
PREFIX hy-n:
PREFIX xsd:
SELECT ?riverName ?len
WHERE {
  ?q hy-n:geographicalName ?r .
  ?r gn:GeographicalName ?rnName .
  ?rnName gn:spelling ?rspn .
  ?rspn gn:SpellingOfName ?rspt .
  ?rspt gn:text ?riverName .
  ?q hy-n:length ?len .
  FILTER ( xsd:decimal(?len) > 30000)
}

==> RUNNING! (results refer to parts of rivers, due to geometry splits at intersections with tributaries).


## D8: "Find rivers (HY) that intersect with protected sites (PS)":
-----------------------------------------------------------------
PREFIX gn:
PREFIX hy-n:
PREFIX ps:
PREFIX geo:
SELECT ?riverName ?siteName
WHERE {
  ?q hy-n:geographicalName ?r .
  ?r gn:GeographicalName ?rnName .
  ?rnName gn:spelling ?rspn .
  ?rspn gn:SpellingOfName ?rspt .
  ?rspt gn:text ?riverName .
  ?q geo:hasGeometry ?qGeom .
  ?qGeom geo:asWKT ?qWKT .
  ?f ps:siteName ?p .
  ?p gn:GeographicalName ?gnName .
  ?gnName gn:spelling ?spn .
  ?spn gn:SpellingOfName ?spt .
  ?spt gn:text ?siteName .
  ?f geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  FILTER (bif:st_intersects(bif:st_get_bounding_box(?fWKT), bif:st_get_bounding_box(?qWKT)))
}
LIMIT 10

==> RUNNING!
==> Again, geometric comparison must be based on bounding boxes, because topological predicates in the current version 7.1 of Virtuoso are not yet fully implemented.
==> The query should specify a 'LIMIT k' clause; otherwise it may take a long time to respond.


## D9: Identify all addresses (AD) that refer to a particular road:
-----------------------------------------------------------------
PREFIX ad:
PREFIX gn:
PREFIX geo:
SELECT ?sName ?addNum ?pCode
WHERE {
  ?f ad:locator ?aLoc .
  ?aLoc ad:AddressLocator ?addDes .
  ?addDes ad:designator ?aDes .
  ?aDes ad:LocatorDesignator ?aLocDes .
  ?aLocDes ad:designator ?addNum .
  ?f ad:component ?cp .
  ?cp ad:postCode ?pCode .
  ?f ad:component ?t .
  ?t ad:name ?thName .
  ?thName gn:GeographicalName ?rnName .
  ?rnName gn:spelling ?rspn .
  ?rspn gn:SpellingOfName ?rspt .
  ?rspt gn:text "ΠΟΝΤΟΥ"@el .
  ?rspt gn:text ?sName
}

==> RUNNING! (and it supports Greek literals in the condition!)


## D10: Identify addresses (AD) within 2km from the given location:
-----------------------------------------------------------------
PREFIX ad:
PREFIX gn:
PREFIX geo:
SELECT ?tName ?addNum ?pCode ?dist
WHERE {
  ?f ad:locator ?aLoc .
  ?aLoc ad:AddressLocator ?addDes .
  ?addDes ad:designator ?aDes .
  ?aDes ad:LocatorDesignator ?aLocDes .
  ?aLocDes ad:designator ?addNum .
  ?f ad:component ?cp .
  ?cp ad:postCode ?pCode .
  ?f ad:component ?c .
  ?c ad:name ?t .
  ?t gn:GeographicalName ?rnName .
  ?rnName gn:spelling ?rspn .
  ?rspn gn:SpellingOfName ?rspt .
  ?rspt gn:text ?tName .
  ?f ad:position ?addr .
  ?addr ad:GeographicPosition ?aPos .
  ?aPos geo:hasGeometry ?fGeom .
  ?fGeom geo:asWKT ?fWKT .
  BIND (bif:st_distance(?fWKT, bif:st_point (22.952149, 40.582051)) AS ?dist) .
  FILTER (?dist < 2)
}
ORDER BY ?dist
LIMIT 10

==> RUNNING!!!
==> The query should specify a 'LIMIT k' clause; otherwise it may take some time to respond.

****************************************************************************************
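
## Illustration of the workarounds used in D2, D3, D4 and D8: bounding-box
## centers stand in for centroids, and bounding boxes stand in for exact
## geometries. The sketch below restates that logic in plain Python so it can
## be checked outside the database. It does NOT call Virtuoso. All names
## (BBox, bbox_center, bbox_intersects, bbox_contains_point, haversine_km) are
## hypothetical helpers, and the haversine great-circle formula with a mean
## Earth radius of 6371 km is an assumption; the values returned by
## bif:st_distance may differ slightly.

```python
import math
from typing import NamedTuple, Tuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in WGS84 lon/lat degrees (hypothetical helper)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def bbox_center(b: BBox) -> Tuple[float, float]:
    """Midpoint of the box: the centroid stand-in computed in D2 and D4."""
    return ((b.xmin + b.xmax) / 2.0, (b.ymin + b.ymax) / 2.0)


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch (the box-vs-box test of D3 and D8)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def bbox_contains_point(b: BBox, lon: float, lat: float) -> bool:
    """True if the point lies inside or on the box (the point-in-box test of D4)."""
    return b.xmin <= lon <= b.xmax and b.ymin <= lat <= b.ymax


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float,
                 radius_km: float = 6371.0) -> float:
    """Great-circle distance in km between two lon/lat points (assumed formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(a))


# Usage: a made-up site box near the D2 query point (23.735933, 37.975598).
site_box = BBox(23.70, 37.95, 23.80, 38.05)   # hypothetical coordinates
cx, cy = bbox_center(site_box)
dist_km = haversine_km(23.735933, 37.975598, cx, cy)
print(dist_km < 20)                            # same threshold as in D2
```

## Because only boxes and box centers are compared, such tests can report false
## positives for irregular shapes, as noted for the two results returned in D1.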