Conceptual Friday, 11 April 2014

Presentation slides are here: Media:Kyzirakos-presentation.pdf

Minutes:

12 people were present at Conceptual Friday, 11 April 2014:

  • Hans Overbeek
  • Rein van 't Veer
  • Pieter van Everdingen
  • Kostis Kyzirakos
  • Marc van Opijnen
  • Rob van Dort
  • Roel Stap
  • Jeroen Baltussen
  • Arjen Santema
  • Bart van Leeuwen
  • Linda van den Brink
  • Leo Meerman

Presentation by Kostis

Kostis Kyzirakos was present as guest speaker on spatial-temporal aspects of linked data. His PhD was about representing and querying spatial-temporal information in the semantic web. Currently he is working as a postdoc at CWI (UvA).

GeoSPARQL is very good for representing geographic information, so the spatial part is covered. It has been well described by OGC, which has been working on the standardization of geospatial information for 20 years. The work on temporal standardization, however, is only about 2 years old.

Temporal aspects:

  • Points in time
  • Valid time. Period during which a triple is valid.
  • Transaction time (snapshots)


There are many approaches for describing time, with many different requirements underlying them. There is not yet one standardized approach. W3C may make an effort to standardize the Time ontology or something else; a working group on time is being created.

With regard to time there is a granularity problem: some contexts require recording time in microseconds, while others require prehistoric dating or geological periods. Kostis suggests that the solution may lie in a general temporal vocabulary, like GML for the spatial aspect, which can be extended with more specific application profiles, narrower or simpler subsets, etc.

Kostis presents stRDF: spatial-temporal RDF, an extension of RDF. It supports spatial literals (WKT and GML, both already standardized encodings of geometry) and a temporal literal strdf:period. The temporal aspect is modeled as the fourth component of each triple: each triple has a period in which it is valid. These 'quads' are used and stored for convenience, but you can express them as triples as well.
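The idea of a triple with a valid period as fourth component can be illustrated with a small Python sketch. This is not Strabon code or stRDF syntax; the class names, the `ex:` identifiers, the sample data and the choice of half-open periods are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Period:
    """Valid time as a half-open interval [start, end) (an assumption)."""
    start: datetime
    end: datetime

    def contains(self, instant: datetime) -> bool:
        return self.start <= instant < self.end


@dataclass(frozen=True)
class TemporalTriple:
    subject: str
    predicate: str
    obj: str
    valid: Period  # the fourth component: when the triple holds


def valid_at(triples, instant):
    """Return only the triples whose valid period contains the instant."""
    return [t for t in triples if t.valid.contains(instant)]


# Hypothetical data: a building whose function changed over time.
triples = [
    TemporalTriple("ex:building1", "ex:function", "ex:School",
                   Period(datetime(1990, 1, 1), datetime(2005, 1, 1))),
    TemporalTriple("ex:building1", "ex:function", "ex:Office",
                   Period(datetime(2005, 1, 1), datetime(2014, 1, 1))),
]

print([t.obj for t in valid_at(triples, datetime(2010, 6, 1))])
# prints ['ex:Office']
```

The subject keeps one identifier throughout; only the statements about it carry a period.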

In addition to stRDF there is also stSPARQL, which adds extension functions for querying the spatial and temporal aspects of linked data. It has more functions than, for example, GeoSPARQL. In stSPARQL you can ask for geometries in a specific CRS, for example. In addition to spatial queries you can query using temporal keywords such as 'before' and 'during'. Also, in stSPARQL you can add a time instant as a fourth query component to the WHERE clause, allowing you to ask only for triples that were valid at the time you specified.
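To make temporal keywords such as 'before' and 'during' concrete, the following Python sketch defines them on periods in the usual interval-algebra sense. The exact semantics of the corresponding stSPARQL functions were not discussed in the meeting, so the definitions, names and dates here are assumptions, not the stSPARQL specification.

```python
from datetime import date
from typing import NamedTuple


class Period(NamedTuple):
    start: date
    end: date


def before(a: Period, b: Period) -> bool:
    """True when period a ends strictly before period b starts."""
    return a.end < b.start


def during(a: Period, b: Period) -> bool:
    """True when period a lies strictly inside period b."""
    return b.start < a.start and a.end < b.end


# Hypothetical periods for illustration only.
fire = Period(date(2007, 8, 24), date(2007, 8, 30))
summer = Period(date(2007, 6, 1), date(2007, 9, 1))
cleanup = Period(date(2007, 9, 5), date(2007, 9, 10))

print(during(fire, summer))   # prints True
print(before(fire, cleanup))  # prints True
print(before(summer, fire))   # prints False
```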

stRDF and stSPARQL were used in an application for fire monitoring. For example, it lets you find all burned forest within 10 km of a city.
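A "within 10 km" question of this kind can be sketched in plain Python with a great-circle distance. This is only an illustration of the filter, not how Strabon evaluates it: burned areas are simplified to single points, and the names and coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS 84 lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_km(city, areas, limit_km):
    """Names of the areas whose point lies within limit_km of the city."""
    lat, lon = city
    return [name for name, (alat, alon) in areas.items()
            if haversine_km(lat, lon, alat, alon) <= limit_km]


# Hypothetical coordinates (lat, lon), not real fire data.
city = (37.98, 23.73)
burned_areas = {
    "area_a": (38.03, 23.73),  # roughly 5.6 km north of the city
    "area_b": (38.20, 23.73),  # roughly 24 km north of the city
}

print(within_km(city, burned_areas, 10.0))  # prints ['area_a']
```

A spatial store would run this test on real geometries with a spatial index rather than on points in a loop.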

stRDF and stSPARQL are implemented in Strabon. Strabon is an extension of Sesame and storage is handled with PostgreSQL and MonetDB. The system stores stRDF graphs; either stSPARQL or GeoSPARQL queries are possible. The output can be SPARQL results, KML, and GeoJSON. Performance is good compared to other current systems.

Sextant is an interesting component for visualizing time-evolving geometries.

Examples: http://bit.ly/sextant-rapid-mapping-attica http://bit.ly/FiresInGreece

The use case here is fire monitoring. The web application combines sensing data (satellite images etc.) with other linked data such as data on hospitals, roads, and so on. Both the current situation and the changes over time of forest fires are available. Emergency responders thus have the data they need immediately. Combining all this data on a map also improved the accuracy of the data: errors became visible and could be corrected.

A small ontology for fire data was created (with terms like 'hotspot') and combined with SWEET (an interesting ontology by NASA, with temporal aspects). The data was linked with GeoNames, OpenStreetMap, CORINE land use / land cover.

Questions and answers

  • Linda: From your experience with GeoSPARQL, what is your opinion on the coordinate reference system being stored together with the coordinates in a literal? I heard a lot of comments about this not being clean design. Kostis: I am in favour, because you cannot separate the coordinates from the CRS into two separate triples; you need the information directly with the coordinates. It has to be self-contained. Also, with the open world assumption, someone else could state that the coordinates are in another CRS.
  • Rein: With stSPARQL, can you convert on the fly to another projection? Kostis: Yes, this is possible during querying. Also stSPARQL has a transform function for this.
  • Jeroen: Does Strabon support the whole of GML? Kostis: No. The good thing about GML is that it is very expressive. However, this also makes it more difficult. E.g. the support does not include arcs, because these are computationally problematic. Only simple features are supported. Note that Oracle Graph does not support GML in its implementation either and does not plan to.
  • Linda: GeoSPARQL supports three different topological relation families. This seems overly complex. What is your opinion? Kostis: There are small differences between them, and users from different backgrounds have different needs. E.g. people from the database world want simple features; for reasoning purposes you would want RCC8.
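The point about keeping the CRS together with the coordinates can be illustrated with the GeoSPARQL WKT literal, where an optional CRS URI in angle brackets precedes the WKT text and CRS84 is assumed when it is absent. The Python sketch below splits such a literal; the function name and sample values are hypothetical, and the parsing is deliberately minimal.

```python
import re

# Default CRS that GeoSPARQL assumes when a WKT literal names none.
DEFAULT_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<crs-uri>" prefix, then the WKT text itself.
_PATTERN = re.compile(r"^\s*(?:<([^>]+)>\s+)?(.+?)\s*$", re.DOTALL)


def split_wkt_literal(literal: str):
    """Return (crs_uri, wkt) for a GeoSPARQL-style WKT literal."""
    match = _PATTERN.match(literal)
    if match is None:
        raise ValueError("empty WKT literal")
    crs, wkt = match.group(1), match.group(2)
    return (crs or DEFAULT_CRS, wkt)


# Hypothetical literal in the Dutch RD New system (EPSG:28992).
print(split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT(155000 463000)"))
# prints ('http://www.opengis.net/def/crs/EPSG/0/28992', 'POINT(155000 463000)')
```

Because the literal is one value, nobody can attach a second, conflicting CRS statement to the same coordinates.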

Insights from discussion:

  • Arjen: Time is connected to the triples, not to the subjects and objects. E.g. a building has a timeless URI; information about the building is time-dependent.
  • Hans: URI strategy should be just about identification. A URI identifier looks like a URL, but is only meant for identification.
  • Kostis agrees: prefers not to add things to the URI. Don't add metadata about the URI in the URI. It is just an identifier.
  • Then how do you point to a previous state of a thing? Different ways:
    • There can be URLs that let you find versions (e.g. the W3C approach: dateless URI for current thing, URIs with dates at the end for earlier versions)
    • The provenance ontology also offers a solution.
    • You can use the URI or the HTTP header to access data; e.g. type a date in the URI and get the information as it was at that time.
  • Bart: we should try to model this in various ways and then see if problems occur. Developers will not like having to do querying/reasoning to get to versions of things. Therefore having versions in the URI will promote adoption.
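The "dateless URI for the current thing, dated URIs for earlier versions" approach mentioned above can be sketched as a small lookup. The URI pattern (base URI plus `/` plus an ISO date), the function name and the version dates are assumptions for illustration; the meeting did not settle on a pattern.

```python
from bisect import bisect_right
from datetime import date


def version_uri(base_uri, version_dates, requested):
    """Return the URI of the version that was current on the requested date.

    The newest version keeps the dateless URI; older versions get the
    date on which they became current appended (hypothetical pattern).
    """
    dates = sorted(version_dates)
    index = bisect_right(dates, requested)
    if index == 0:
        raise LookupError("no version existed on that date")
    chosen = dates[index - 1]
    if chosen == dates[-1]:
        return base_uri
    return f"{base_uri}/{chosen.isoformat()}"


# Hypothetical identifier and version history.
base = "http://example.org/id/building/1"
versions = [date(2010, 1, 1), date(2012, 6, 1), date(2014, 4, 11)]

print(version_uri(base, versions, date(2013, 1, 1)))
# prints http://example.org/id/building/1/2012-06-01
print(version_uri(base, versions, date(2014, 5, 1)))
# prints http://example.org/id/building/1
```

The same lookup could sit behind an HTTP header instead of the URI; only the way the requested date reaches the server would differ.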