Stardog Geospatial

Learn how to manage geospatial data in the Knowledge Graph.

Background

Stardog’s geospatial index is a powerful tool. Many users have augmented their knowledge graphs with spatial data to great success, adding another layer of utility to the enterprise. However, it is often one of the more troublesome features, having the potential to cause a few headaches when getting started. In this post, I intend to provide a detailed primer to help alleviate those headaches.

Stardog supports two geospatial specs: W3C’s WGS 84 and OGC’s GeoSPARQL. In this post I will, for the sake of clarity and readability, combine the two by using GeoSPARQL’s hasGeometry predicate to map locations and areas to nodes of type geo:Geometry. While this is technically unneeded for WGS 84 features, it makes the queries we will be running on the data much easier to follow. We will be using a DC Landmarks data set. Feel free to load it yourself and play along!

It is worth noting that geospatial features are not enabled in the Community version of Stardog. You must have an Enterprise license, or trial thereof, to enable spatial.

Creating Geographical Data

By default, the spatial index is not enabled when creating a new database. It can be enabled by setting the spatial.enabled=true option when creating the database via CLI, or setting GeospatialOptions#SPATIAL_ENABLED to true in the SNARL API.

Our toy data set has about 10 nodes representing various landmarks in the Washington, DC area. Besides any domain knowledge we wish to attach to these nodes, in order to perform any spatial operations on them we need to associate them with a Geometry entity.

Representing single points

WGS latitude and longitude

For a simple latitude/longitude pair, we have a couple of choices available, the simplest of which is to use WGS 84 to specify them in our Geometry:

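@prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://blog.stardog.com/geons/> .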

# Create geo:Geometry
:WhiteHouseGeom a geo:Geometry ;
  wgs:lat "38.89761"^^xsd:float ;
  wgs:long "-77.03637"^^xsd:float .

# Link it to our entity
:WhiteHouse a :Location ;
  rdfs:label "The White House" ;
  geo:hasGeometry :WhiteHouseGeom .

WKT

Our second option is to define our point’s Geometry using the OGC’s Well-Known Text (WKT) format. While it’s a fair bit easier to make mistakes this way, representing points with WKT will be more congruous with the rest of our data set, not to mention others’ data sets, as WKT is very widely used.

@prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://blog.stardog.com/geons/> .

# geo:Geometry is still used as the type
# A big tripping point is that WKT points are expressed
# as (LONG LAT), with no comma!
:WashingtonMonGeom a geo:Geometry ;
  geo:asWKT "Point(-77.03525 38.88956)"^^geo:wktLiteral .

# Link this Geometry to an entity in our graph
:WashingtonMon a :Location ;
  rdfs:label "Washington Monument" ;
  geo:hasGeometry :WashingtonMonGeom .
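Since the (LONG LAT) ordering is so easy to get wrong, it can help to generate WKT points from code rather than typing them by hand. Here is a minimal Python sketch; the wkt_point helper is my own hypothetical name, not part of any Stardog API, and its range check only catches some swapped arguments (a latitude beyond ±90), not all of them.

```python
def wkt_point(lat: float, long: float) -> str:
    """Format a latitude/longitude pair as the text of a WKT point."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= long <= 180.0:
        raise ValueError(f"longitude out of range: {long}")
    # WKT puts longitude (x) first, then latitude (y), separated by a space.
    return f"POINT({long} {lat})"


# The Washington Monument coordinates from the Turtle above.
print(wkt_point(38.88956, -77.03525))
```

The resulting string is what goes inside the geo:wktLiteral.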

More complicated shapes

For any shape more complex than a latitude/longitude point, WKT is our only option. Lots of shapes are supported; here are some of the ones we most commonly see:

  • Point(LONG LAT): A single point as described above
    • Note the lack of a comma
  • Linestring(LONG1 LAT1, LONG2 LAT2, ..., LONGN LATN): A line connecting the specified points
    • Commas between each point
  • Envelope(minLong, maxLong, maxLat, minLat): A rectangle with the specified corners
    • Note the commas between each
    • Especially note the somewhat odd ordering of (min, max, max, min).
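The envelope ordering is another common tripping point, so here is a small Python sketch that builds the literal text from two opposite corners. The wkt_envelope name is hypothetical, and the sketch assumes the box does not cross the antimeridian (where the minimum longitude would be numerically larger than the maximum).

```python
def wkt_envelope(lat1: float, long1: float, lat2: float, long2: float) -> str:
    """Build ENVELOPE(minLong, maxLong, maxLat, minLat) from two corners."""
    min_long, max_long = sorted((long1, long2))
    min_lat, max_lat = sorted((lat1, lat2))
    # Longitudes ascend, latitudes descend: (min, max, max, min).
    return f"ENVELOPE({min_long}, {max_long}, {max_lat}, {min_lat})"


# Lower-left and upper-right corners of a box, given as (lat, long) pairs.
print(wkt_envelope(38.855, -77.111, 38.885, -77.052))
```

The corners can be passed in either order; the helper sorts them before formatting.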

For more complex shapes, Stardog supports JTS. By downloading and enabling this library, you gain access to these shapes, most notably:

  • Polygon((LONG1 LAT1, LONG2 LAT2, ..., LONGN LATN, LONG1 LAT1)): A filled-in shape with the specified points
    • Note the doubled parentheses: in WKT, the ring of points is itself wrapped in parentheses
    • Note that a polygon must start and end with the same point, i.e., be closed
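Forgetting to close the ring is easy when assembling polygons by hand. The following Python sketch closes the ring automatically and emits standard WKT, which wraps the ring in a second pair of parentheses. The wkt_polygon helper and the triangle coordinates are made up for illustration.

```python
def wkt_polygon(points: list[tuple[float, float]]) -> str:
    """Format (lat, long) pairs as a single-ring WKT polygon."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three distinct points")
    # Swap each pair into WKT's (long, lat) order.
    ring = [(long, lat) for lat, long in points]
    # Close the ring if the caller did not repeat the first point.
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


# A hypothetical triangle, not taken from the DC Landmarks data set.
print(wkt_polygon([(38.0, -77.0), (38.0, -76.0), (39.0, -76.0)]))
```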

Querying Geographical Data

Now that we have inserted Geometries into Stardog’s spatial index, it would be nice to query them spatially. Stardog supports five of the major operators defined by GeoSPARQL. These functions require units of measurement to be passed; we support the QUDT ontology for this, prefixed in our dataset by unit:.

geof:within

This will return true when a given Geometry is contained within another. It has a few accepted forms:

  • <Geometry> geof:within <WKT Literal>: Specifying a WKT Literal for an area
  • <Geometry> geof:within <Geometry>: Passing in another Geometry
  • <Geometry> geof:within (LAT1 LONG1 LAT2 LONG2): Specifying Lat/Long of the lower-left and upper-right corner of a box
  • <Geometry> geof:within (<WKT Literal> <WKT Literal>): Specifying the lower-left and upper-right corners as WKT Points

Imagine we wish to retrieve a list of the landmarks in our dataset that are in the Arlington, VA area. We can do that a few different ways:

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://blog.stardog.com/geons/>

# All of these SPARQL queries are equivalent
# Pay special attention to the various ways the lat/long pairs are ordered

SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within "ENVELOPE(-77.111, -77.052, 38.885, 38.855)"^^geo:wktLiteral .
}

SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  # We define :ArlingtonGeom elsewhere in our data set as the envelope from above
  ?geom geof:within :ArlingtonGeom .
}

SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within (38.855 -77.111 38.885 -77.052) .
}

SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within ("POINT(-77.111 38.855)"^^geo:wktLiteral "POINT(-77.052 38.885)"^^geo:wktLiteral) .
}

Retrieving landmarks in Arlington
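To make the box semantics concrete, here is a rough Python sketch of the test that the (LAT1 LONG1 LAT2 LONG2) form describes: a point is inside when its latitude and longitude both fall between the lower-left and upper-right corners. This only illustrates the idea and is not how Stardog’s spatial index evaluates the query; the in_box name and the third test point are hypothetical.

```python
def in_box(lat: float, long: float,
           lat1: float, long1: float,
           lat2: float, long2: float) -> bool:
    """True if (lat, long) lies in the box with lower-left (lat1, long1)
    and upper-right (lat2, long2). Assumes no antimeridian crossing."""
    return lat1 <= lat <= lat2 and long1 <= long <= long2


ARLINGTON = (38.855, -77.111, 38.885, -77.052)  # box used in the queries above

print(in_box(38.89761, -77.03637, *ARLINGTON))  # White House
print(in_box(38.88956, -77.03525, *ARLINGTON))  # Washington Monument
print(in_box(38.87, -77.08, *ARLINGTON))        # hypothetical point in the box
```

Both landmarks defined earlier in the post sit north and east of the box, so only the hypothetical point passes.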

We can also use geof:within as a filter by passing in our Geometry as the first argument and then using any of the accepted sets of parameters.

Retrieving landmarks in the DC Metro Area

geof:nearby

This will return all Geometries that are within a specified radius of a given point. It has two forms:

  • <Geometry> geof:nearby (<Geometry> <Number of units> <Unit>)
  • <Geometry> geof:nearby (LAT LONG <Number of units> <Unit>)

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://blog.stardog.com/geons/>

# Get all features within a mile of the White House
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:nearby (:WhiteHouseGeom 1 unit:MileUSStatute) .
}

# Get all features within 2km of the Kennedy Center
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:nearby (38.896004 -77.054995 2 unit:Kilometer) .
}

Retrieving landmarks within 2km of the Kennedy Center
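To build some intuition for what a radius query is checking, here is a rough Python sketch of a great-circle (haversine) distance test. This is not how Stardog’s spatial index computes distances; the Earth radius constant and the function names are assumptions of mine. The coordinates are the White House and Washington Monument points defined earlier in the post.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch
KM_PER_MILE = 1.609344       # international (statute) mile


def haversine_km(lat1: float, long1: float, lat2: float, long2: float) -> float:
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(long2 - long1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_nearby(lat1: float, long1: float, lat2: float, long2: float,
              radius_km: float) -> bool:
    """True if the two points are no more than radius_km apart."""
    return haversine_km(lat1, long1, lat2, long2) <= radius_km


white_house = (38.89761, -77.03637)
washington_mon = (38.88956, -77.03525)

print(round(haversine_km(*white_house, *washington_mon), 2), "km")
print(is_nearby(*white_house, *washington_mon, 1 * KM_PER_MILE))
```

By this estimate the two landmarks are roughly 0.9 km apart, so the monument falls inside a one-mile radius of the White House.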

geof:area

This returns the area of a given Geometry in the specified unit. It can be used either to bind a variable or as part of a filter.

  • geof:area(<Geometry|WKT Literal>, <Unit>)

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://blog.stardog.com/geons/>

# Retrieve the area in km^2 of each shape in our dataset
SELECT ?feature ?area {
  ?f a :Area ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  BIND(geof:area(?geom, unit:Kilometer) as ?area)
}

# Retrieve the shapes in our dataset that are bigger than 100 km^2
SELECT ?feature {
  ?f a :Area ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  FILTER(geof:area(?geom, unit:Kilometer) > 100)
}

Using geof:area

geof:distance

This returns the distance between two Geometries in the specified unit. It can also be used as a variable binding or as a filter.

  • geof:distance(<Geometry|WKT Literal>, <Geometry|WKT Literal>, <Unit>)

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://blog.stardog.com/geons/>

# Retrieve each feature and its distance in Yards from the White House
SELECT ?feature ?distance {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  BIND(geof:distance(?geom, :WhiteHouseGeom, unit:Yard) as ?distance)
}
ORDER BY DESC(?distance)

# Retrieve the features in our dataset that are more than 2 miles from the White House
SELECT ?feature  {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  FILTER(geof:distance(?geom, :WhiteHouseGeom, unit:MileUSStatute) > 2)
}

Using geof:distance

geof:relate

This returns the relationship between two Geometries. Possible results are geo:contains, geo:within, geo:intersects, geo:equals, geo:disjoint.

This function has slightly different forms, depending on whether you’re using it as a BGP or a filter:

  • ?relation geof:relate (<Geometry> <Geometry>)
  • FILTER(geof:relate(<Geometry>, <Geometry>, <desired result>))

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://blog.stardog.com/geons/>

# Retrieve each area and its relation to the others
SELECT ?feature1 ?feature2 ?rel {
  ?f a :Area ;
    rdfs:label ?feature1 ;
    geo:hasGeometry ?geom1 .
  ?f2 a :Area ;
    rdfs:label ?feature2 ;
    geo:hasGeometry ?geom2 .
  ?rel geof:relate (?geom1 ?geom2) .
}

# Retrieve the areas in our dataset where one contains the other
SELECT ?feature1 ?feature2 {
  ?f a :Area ;
    rdfs:label ?feature1 ;
    geo:hasGeometry ?geom1 .
  ?f2 a :Area ;
    rdfs:label ?feature2 ;
    geo:hasGeometry ?geom2 .
  FILTER(geof:relate(?geom1, ?geom2, geo:contains))
}

Using geof:relate
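If the five relations feel abstract, the following Python sketch classifies two axis-aligned boxes into the same categories. It is only an illustration under my own assumptions: the Box type and relate function are hypothetical, boxes that merely touch along an edge are counted as intersecting, and Stardog’s actual handling of arbitrary shapes may differ.

```python
from typing import NamedTuple


class Box(NamedTuple):
    min_long: float
    max_long: float
    min_lat: float
    max_lat: float


def covers(a: Box, b: Box) -> bool:
    """True if box a fully encloses box b."""
    return (a.min_long <= b.min_long and a.max_long >= b.max_long
            and a.min_lat <= b.min_lat and a.max_lat >= b.max_lat)


def relate(a: Box, b: Box) -> str:
    """Classify how box a relates to box b."""
    if a == b:
        return "equals"
    if (a.max_long < b.min_long or b.max_long < a.min_long
            or a.max_lat < b.min_lat or b.max_lat < a.min_lat):
        return "disjoint"
    if covers(a, b):
        return "contains"
    if covers(b, a):
        return "within"
    return "intersects"


arlington = Box(-77.111, -77.052, 38.855, 38.885)  # envelope used earlier
region = Box(-77.2, -76.9, 38.8, 39.0)             # hypothetical larger box

print(relate(region, arlington))
print(relate(arlington, region))
```

Note that the result depends on argument order: swapping the two boxes turns a contains into a within.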

