Geospatial: A Primer
Geospatial data in the Knowledge Graph is awesome. Let’s dive in!
Stardog’s geospatial index is a powerful tool. Many users have augmented their knowledge graphs with spatial data to great success, adding another layer of utility to the enterprise. However, it is often one of the more troublesome features, having the potential to cause a few headaches when getting started. In this post, I intend to provide a detailed primer to help alleviate those headaches.
Stardog supports two geospatial specs: W3C’s WGS 84 and OGC’s GeoSPARQL. In this post I will, for the sake of clarity and readability, combine the two by using GeoSPARQL’s hasGeometry predicate to link every location and area to a node of type geo:Geometry. While this is technically unneeded for WGS 84 features, it makes the queries we will be running on the data much easier to follow. We will be using a DC Landmarks data set.
It is worth noting that geospatial features are not enabled in the Community version of Stardog. You must have an Enterprise license, or trial thereof, to enable spatial.
By default, the spatial index is not enabled when creating a new database. It can be enabled by setting the spatial.enabled=true option when creating the database via CLI, or setting GeospatialOptions#SPATIAL_ENABLED to true in the SNARL API.
Our toy data set has about 10 nodes representing various landmarks in the Washington, DC area. Besides any domain knowledge we wish to attach to these nodes, in order to perform any spatial operations on them we need to associate them with a Geometry entity.
For a simple latitude/longitude pair, we have a couple of choices available, the simplest of which is to use WGS 84 to specify them in our Geometry:
@prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix : <https://blog.stardog.com/geons/> .
# Create geo:Geometry
:WhiteHouseGeom a geo:Geometry ;
  wgs:lat "38.89761"^^xsd:float ;
  wgs:long "-77.03637"^^xsd:float .
# Link it to our entity
:WhiteHouse a :Location ;
  rdfs:label "The White House" ;
  geo:hasGeometry :WhiteHouseGeom .
Our second option is to define our point’s Geometry using the OGC’s Well-Known Text (WKT) format. While it’s a fair bit easier to make mistakes this way, representing points with WKT will be more congruous with the rest of our data set, not to mention others’ data sets, as WKT is very widely used.
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <https://blog.stardog.com/geons/> .
# geo:Geometry is still used as the type
# A big tripping point is that WKT points are expressed
# as (LONG LAT), with no comma!
:WashingtonMonGeom a geo:Geometry ;
  geo:asWKT "Point(-77.03525 38.88956)"^^geo:wktLiteral .
# Link this Geometry to an entity in our graph
:WashingtonMon a :Location ;
  rdfs:label "Washington Monument" ;
  geo:hasGeometry :WashingtonMonGeom .
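Because the (LONG LAT) ordering is so easy to get wrong, it can help to build the literal in code rather than by hand. Here is a minimal Python sketch of that idea. Note that wkt_point is a hypothetical helper name of my own, not part of Stardog or any library; it accepts latitude first, the way most people say coordinates out loud, and swaps the order when writing the WKT.

```python
def wkt_point(lat: float, lon: float) -> str:
    """Format a latitude/longitude pair as a WKT point, which is ordered (LONG LAT)."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # WKT puts the x coordinate (longitude) first, separated by a space, no comma.
    return f"POINT({lon} {lat})"


# The Washington Monument coordinates from the example above.
print(wkt_point(38.88956, -77.03525))  # POINT(-77.03525 38.88956)
```

The range checks only catch impossible values. A swapped pair for a location in DC would still pass them, so naming the arguments at the call site is the real safeguard.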
For any shape more complex than a latitude/longitude point, WKT is our only option. Lots of shapes are supported; here are some of the ones we most commonly see:
Point(LONG LAT): A single point as described above
Linestring(LONG1 LAT1, LONG2 LAT2, ..., LONGN LATN): A line connecting the specified points
Envelope(minLong, maxLong, maxLat, minLat): A rectangle with the specified corners. Note the (min, max, max, min) ordering.
For more complex shapes, Stardog supports JTS. By downloading and enabling this library, you gain access to these shapes, most notably:
Polygon((LONG1 LAT1, LONG2 LAT2, ..., LONGN LATN, LONG1 LAT1)): A filled-in shape bounded by the specified points. The ring is wrapped in a second pair of parentheses, and the last point repeats the first to close it.
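The argument orderings above are easy to mix up, so here is a small Python sketch that builds the Envelope and Polygon literals from plain latitude/longitude values. Both function names are hypothetical helpers of my own, not a Stardog API. The polygon helper assumes standard WKT syntax, in which the ring of points is wrapped in a second pair of parentheses.

```python
def wkt_envelope(min_lat, min_lon, max_lat, max_lon):
    """Build an ENVELOPE literal in (minLong, maxLong, maxLat, minLat) order."""
    if min_lat > max_lat or min_lon > max_lon:
        raise ValueError("min values must not exceed max values")
    return f"ENVELOPE({min_lon}, {max_lon}, {max_lat}, {min_lat})"


def wkt_polygon(points):
    """Build a POLYGON literal from (lat, lon) pairs, closing the ring if needed."""
    ring = list(points)
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # the last point must repeat the first
    # Each vertex is written as "LONG LAT"; vertices are separated by commas.
    coords = ", ".join(f"{lon} {lat}" for lat, lon in ring)
    return f"POLYGON(({coords}))"


# Lower-left (38.855, -77.111) and upper-right (38.885, -77.052) corners.
print(wkt_envelope(38.855, -77.111, 38.885, -77.052))
```

The caller passes corners as (lat, lon) pairs throughout, and the reordering happens in one place.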
Now that we have inserted Geometries into Stardog’s spatial index, it would be nice to query them spatially. Stardog supports five of the major operators defined by GeoSPARQL. Several of these functions require a unit of measurement to be passed; we support the QUDT ontology for this, prefixed in our dataset by unit:.
geof:within returns true when a given Geometry is contained within another. It has a few accepted forms:
<Geometry> geof:within <WKT Literal>: Specifying a WKT Literal for an area
<Geometry> geof:within <Geometry>: Passing in another Geometry
<Geometry> geof:within (LAT1 LONG1 LAT2 LONG2): Specifying Lat/Long of the lower-left and upper-right corner of a box
<Geometry> geof:within (<WKT Literal> <WKT Literal>): Specifying the lower-left and upper-right corners as WKT Points
Imagine we wish to retrieve a list of DC landmarks in our dataset that are in the Arlington, VA area. We can do that a few different ways:
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <https://blog.stardog.com/geons/>
# All of these SPARQL queries are equivalent
# Pay special attention to the various ways the lat/long pairs are ordered
# 1. A WKT envelope literal: (minLong, maxLong, maxLat, minLat)
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within "ENVELOPE(-77.111, -77.052, 38.885, 38.855)"^^geo:wktLiteral .
}
# 2. Another Geometry
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  # We define :ArlingtonGeom elsewhere in our data set as the envelope from above
  ?geom geof:within :ArlingtonGeom .
}
# 3. Box corners as numbers: (LAT1 LONG1 LAT2 LONG2)
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within (38.855 -77.111 38.885 -77.052) .
}
# 4. Box corners as WKT points: (LONG LAT)
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:within ("POINT(-77.111 38.855)"^^geo:wktLiteral "POINT(-77.052 38.885)"^^geo:wktLiteral) .
}
We can also use geof:within as a filter by passing in our Geometry as the first argument and then using any of the accepted sets of parameters.
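To make the box form concrete, here is a rough Python sketch of the check a (LAT1 LONG1 LAT2 LONG2) box implies for a single point. This is only an illustration of the idea, not how Stardog's index evaluates it: within_box is a hypothetical name, boundary handling in Stardog may differ, and the sketch ignores boxes that cross the antimeridian.

```python
def within_box(lat, lon, lat1, lon1, lat2, lon2):
    """True if (lat, lon) lies in the box with lower-left (lat1, lon1)
    and upper-right (lat2, lon2) corners."""
    return lat1 <= lat <= lat2 and lon1 <= lon <= lon2


# The Arlington box from the queries above, as (LAT1, LONG1, LAT2, LONG2).
ARLINGTON = (38.855, -77.111, 38.885, -77.052)

# The White House sits north-east of the box, so it is not within it.
print(within_box(38.89761, -77.03637, *ARLINGTON))  # False

# A made-up point near the middle of the box, for illustration only.
print(within_box(38.870, -77.080, *ARLINGTON))  # True
```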
geof:nearby returns all Geometries that are within a specified radius of a given point. It has two forms:
<Geometry> geof:nearby (<Geometry> <Number of units> <Unit>)
<Geometry> geof:nearby (LAT LONG <Number of units> <Unit>)
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <https://blog.stardog.com/geons/>
# Get all features within a mile of the White House
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:nearby (:WhiteHouseGeom 1 unit:MileUSStatute) .
}
# Get all features within 2km of the Kennedy Center
SELECT ?geom ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  ?geom geof:nearby (38.896004 -77.054995 2 unit:Kilometer) .
}
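If you want to sanity-check a radius before running a query, a great-circle distance is a reasonable approximation. The Python sketch below uses the haversine formula on a spherical Earth; that is my assumption for illustration, and Stardog's own distance calculation may differ slightly. The names haversine_km and nearby are hypothetical helpers, not Stardog functions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(lat, lon, center_lat, center_lon, radius_km):
    """True if the point is within radius_km of the center."""
    return haversine_km(lat, lon, center_lat, center_lon) <= radius_km


# Is the White House within 2 km of the Kennedy Center point used above?
print(nearby(38.89761, -77.03637, 38.896004, -77.054995, 2.0))  # True
```

The two points are roughly 1.6 km apart, which is very close to one statute mile, so a 1-mile query around either of them is a borderline case worth checking against real results.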
geof:area returns the area of a given Geometry in the specified unit. It can be used either to bind a variable or as part of a filter.
geof:area(<Geometry|WKT Literal>, <Unit>)
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <https://blog.stardog.com/geons/>
# Retrieve the area in km^2 of each shape in our dataset
SELECT ?feature ?area {
  ?f a :Area ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  BIND(geof:area(?geom, unit:Kilometer) as ?area)
}
# Retrieve the shapes in our dataset that are bigger than 100 km^2
SELECT ?feature {
  ?f a :Area ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  FILTER(geof:area(?geom, unit:Kilometer) > 100)
}
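For a rough idea of what numbers to expect, the area of a latitude/longitude rectangle on a sphere has a simple closed form: R² × Δlongitude (in radians) × (sin maxLat − sin minLat). The Python sketch below applies it to an envelope. It assumes a spherical Earth, and envelope_area_km2 is a hypothetical helper name; Stardog's result for the same shape may not match exactly.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def envelope_area_km2(min_lat, min_lon, max_lat, max_lon):
    """Approximate area in km^2 of a lat/long rectangle on a sphere."""
    dlon = math.radians(max_lon - min_lon)
    # Difference of sines accounts for meridians converging toward the poles.
    band = math.sin(math.radians(max_lat)) - math.sin(math.radians(min_lat))
    return EARTH_RADIUS_KM ** 2 * dlon * band


# The Arlington envelope used earlier: roughly 5 km wide by 3.3 km tall.
area = envelope_area_km2(38.855, -77.111, 38.885, -77.052)
print(round(area, 1))
print(area > 100)  # would this shape pass the "bigger than 100 km^2" filter?
```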
geof:distance returns the distance between two Geometries in the specified unit. It can also be used as a variable binding or as a filter.
geof:distance(<Geometry|WKT Literal>, <Geometry|WKT Literal>, <Unit>)
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix unit: <http://qudt.org/vocab/unit#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <https://blog.stardog.com/geons/>
# Retrieve each feature and its distance in yards from the White House, farthest first
SELECT ?feature ?distance {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  BIND(geof:distance(?geom, :WhiteHouseGeom, unit:Yard) as ?distance)
}
ORDER BY DESC(?distance)
# Retrieve the features in our dataset that are more than 2 miles from the White House
SELECT ?feature {
  ?f a :Location ;
    rdfs:label ?feature ;
    geo:hasGeometry ?geom .
  FILTER(geof:distance(?geom, :WhiteHouseGeom, unit:MileUSStatute) > 2)
}
geof:relate returns the relationship between two Geometries. Possible results are geo:contains, geo:within, geo:intersects, geo:equals, and geo:disjoint.
This function has slightly different forms, depending on whether you’re using it as a BGP or a filter:
?relation geof:relate (<Geometry> <Geometry>)
FILTER(geof:relate(<Geometry>, <Geometry>, <desired result>))
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix geof: <http://www.opengis.net/def/function/geosparql/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <https://blog.stardog.com/geons/>
# Retrieve each area and its relation to the others
SELECT ?feature1 ?feature2 ?rel {
  ?f a :Area ;
    rdfs:label ?feature1 ;
    geo:hasGeometry ?geom1 .
  ?f2 a :Area ;
    rdfs:label ?feature2 ;
    geo:hasGeometry ?geom2 .
  ?rel geof:relate (?geom1 ?geom2) .
}
# Retrieve the areas in our dataset where one contains the other
SELECT ?feature1 ?feature2 {
  ?f a :Area ;
    rdfs:label ?feature1 ;
    geo:hasGeometry ?geom1 .
  ?f2 a :Area ;
    rdfs:label ?feature2 ;
    geo:hasGeometry ?geom2 .
  FILTER(geof:relate(?geom1, ?geom2, geo:contains))
}
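To build some intuition for the five possible results, here is a small Python sketch that classifies two axis-aligned lat/long boxes. It is a simplification for illustration only: it handles rectangles rather than arbitrary shapes, it treats touching edges as intersecting, and Box and relate are hypothetical names rather than anything Stardog exposes. The precedence (equals, then contains, then within, then intersects) is also my assumption.

```python
from typing import NamedTuple


class Box(NamedTuple):
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def contains(a: Box, b: Box) -> bool:
    """True if box b lies entirely inside box a."""
    return (a.min_lat <= b.min_lat and a.max_lat >= b.max_lat
            and a.min_lon <= b.min_lon and a.max_lon >= b.max_lon)


def relate(a: Box, b: Box) -> str:
    """Name the relationship of a to b, most specific result first."""
    if a == b:
        return "equals"
    if contains(a, b):
        return "contains"
    if contains(b, a):
        return "within"
    overlaps = (a.min_lat <= b.max_lat and b.min_lat <= a.max_lat
                and a.min_lon <= b.max_lon and b.min_lon <= a.max_lon)
    return "intersects" if overlaps else "disjoint"


arlington = Box(38.855, -77.111, 38.885, -77.052)
region = Box(38.8, -77.2, 39.0, -76.9)  # made-up larger box for illustration
print(relate(region, arlington))  # contains
print(relate(arlington, region))  # within
```

Note that the relation is directional: swapping the arguments turns contains into within, which matters when you filter on geo:contains.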
Geospatial processing is a very powerful tool and an excellent enhancement to an enterprise knowledge graph. Hopefully this blog post can help to ease some of the headaches that can come along with navigating the specs and getting this information properly into the Stardog spatial index.
Download Stardog today to start your free 30-day evaluation.