Requirements for a Linked Open Data Roadmap

The development of the Internet can be described in three stages: from the web of documents (Web 1.0), via Web 2.0, where the Internet is regarded as an interactive communication medium to which users can upload information, to Web 3.0: the Web of Linked Data, where Internet applications and data can be linked to each other via Web services. These smart applications can follow the links between data sets and easily align, connect and integrate information. Linked Data is an essential part of the Semantic Web; Tim Berners-Lee described its principles in 2006 and later extended them with his five star model for open data.

In practice, most organizations that publish open data get stuck at the second or third star. The data is publicly available online and, provided it is accompanied by some documentation, it can be interpreted and reused by other people. Nevertheless, it still takes great effort to do something useful with the data. First, it has to be understood by reading the documentation. Then it has to be imported and converted to a format that the user’s tools can work with. Only then can something useful be done with the data. When open data is published as linked data at four or five stars, the life of a data consumer becomes much easier, because the data becomes self-describing, machine-interpretable and accessible in a uniform manner. For a data publisher, however, taking a dataset from three to four or even five stars takes considerably more effort than getting to the third star.
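
The cumulative nature of the five star model can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only for illustration; the five criteria follow the commonly cited wording of the model (online under an open license, structured, non-proprietary format, URIs to identify things, links to other data).

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical checklist for one published dataset (names are illustrative)."""
    online_with_open_license: bool  # 1 star
    structured: bool                # 2 stars: machine-readable structure
    open_format: bool               # 3 stars: non-proprietary format such as CSV
    uses_uris: bool                 # 4 stars: things are identified by URIs
    links_to_other_data: bool       # 5 stars: links to other datasets


def star_rating(ds: Dataset) -> int:
    """Return the number of stars; each star presupposes all previous ones."""
    criteria = [
        ds.online_with_open_license,
        ds.structured,
        ds.open_format,
        ds.uses_uris,
        ds.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # stop at the first unmet criterion
        stars += 1
    return stars


# A documented CSV download: open and structured, but without URIs or links.
csv_dump = Dataset(True, True, True, False, False)
print(star_rating(csv_dump))  # prints 3
```

A dataset that skips a lower criterion does not earn the higher stars in this sketch, which mirrors why moving from the third to the fourth star requires a deliberate change in how the data is modelled rather than a change of file format.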

Although the five star model is widely used and without doubt valuable, focusing only on providing sustainable five star data is not sufficient. We therefore define additional requirements that our LOD roadmap needs to address:

• Metadata: provide rich metadata, preferably including provenance data and insight into the quality of the data.

• Multiple formats: the roadmap needs to address how to deal with multiple data formats, depending on the demands of potential users, e.g. the advice to provide raw data when possible (potentially alongside other “enhanced versions” of the data).

• Provide the data as a service (queryable, e.g. via SPARQL or JSON-LD).

• Use existing vocabularies, or include links to existing vocabularies.

• Use a URI strategy, preferably one that already exists in your context.

• Organize management, including governance, of the data (see BOMOD deliverable).
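
Several of the requirements above (metadata with provenance, multiple formats, existing vocabularies and a URI strategy) can be combined in a single dataset description. The following is a minimal sketch, not a prescribed implementation: the base URI, the URI pattern and the function names are hypothetical, while the vocabularies used (DCAT, Dublin Core Terms and PROV) are existing W3C and DCMI vocabularies. Quality metadata and the query service are left out for brevity.

```python
import json

# Hypothetical base URI; a real publisher would use its own domain.
BASE = "https://data.example.org"
IANA_MEDIA_TYPES = "https://www.iana.org/assignments/media-types/"


def dataset_uri(dataset_id: str) -> str:
    """Assumed URI strategy for illustration: {base}/id/dataset/{id}."""
    return f"{BASE}/id/dataset/{dataset_id}"


def describe_dataset(dataset_id: str, title: str, source_uri: str,
                     formats: dict) -> dict:
    """Build a JSON-LD description that reuses existing vocabularies."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "prov": "http://www.w3.org/ns/prov#",
        },
        "@id": dataset_uri(dataset_id),
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Provenance: the source this dataset was derived from.
        "prov:wasDerivedFrom": {"@id": source_uri},
        # Multiple formats: one distribution per file extension / media type.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:mediaType": {"@id": IANA_MEDIA_TYPES + media_type},
                "dcat:downloadURL": {"@id": f"{dataset_uri(dataset_id)}.{ext}"},
            }
            for ext, media_type in formats.items()
        ],
    }


doc = describe_dataset(
    "air-quality",                        # hypothetical dataset identifier
    "Air quality measurements",           # hypothetical title
    f"{BASE}/id/dataset/air-quality-raw", # hypothetical raw source dataset
    {"csv": "text/csv", "ttl": "text/turtle"},
)
print(json.dumps(doc, indent=2))
```

Here the raw CSV and an enhanced Turtle version are offered side by side as distributions of the same dataset, so users can pick the format that suits their tools.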

The five star model and the additional requirements presented above have been used as requirements for the roadmap. The goal of the roadmap is to give practical guidance to data publishers who want to publish good data that is easy for users to use and sustainable in the long run.