BoekTNO/stappenplan

Introduction

“Data is the new gold” or “data is the new oil”. Most people have probably come across one of these expressions in recent years. This is no surprise, as more data has been generated in the past two years than in the entire prior history of mankind. Data and analytics are changing the way companies make decisions, creating a new world of opportunities. Open data, although not a new phenomenon, has recently received a lot of attention from governmental organizations as well as private companies. Open data is the idea that certain data, such as governmental data, should be freely available for everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.

Big data usually refers to data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. Big data is often defined in terms of three dimensions: volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). According to a Gartner survey, variety rather than volume is the most challenging aspect nowadays.

Linked Data describes a method of publishing structured data so that it can be interlinked and become more useful, which addresses the variety dimension of big data. Linked Open Data (LOD) is a practical way to contribute to the Semantic Web, because it defines a set of standards and guidelines. “Semantic” means “relating to meaning”. The Semantic Web can be defined as a web of connections between pieces of information that allows new insights to arise. Publishing information as Linked Open Data makes that data easier to reuse, because it includes many references to other sources of knowledge, which makes the information easier to access.

We hope that this roadmap inspires you to start working and experimenting with linked data, and that it helps you take the first steps. We are open to suggestions for improving this roadmap. If you need help, you can either turn to the Platform Linked Data the Netherlands community or contact us directly.

The need for Linked Open Data

Let’s illustrate the need for LOD using the example of Den Haag. If you use a search engine to look for information about “The Hague”, you will not find results in which “Den Haag” occurs instead of “The Hague”, although both names refer to the same city. The reason is that web documents are often linked together, but their content is not. A search engine can therefore only match keywords, not the actual meaning of the content. Linked Data offers a solution to this problem by defining words as unique concepts and describing them in one or, preferably, several subject-predicate-object relations. A city is a concept and can therefore have multiple attributes, and each attribute is in turn a concept of its own. Subject, predicate and object are thus themselves unique concepts. Each concept becomes more significant as more descriptions are linked to it. In this way, the content of web documents becomes more meaningful and search results become more accurate. Eventually you reach language independence, where it no longer matters whether you search for “The Hague” or “Den Haag”.
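The subject-predicate-object idea can be sketched in a few lines of Python. This is only an illustration, not a real triple store: the example.org URIs are hypothetical placeholders, the helper names find_concepts and labels_of are made up for this sketch, and literals with a language tag are modelled as simple (text, language) tuples. Only the rdfs:label and rdf:type predicate URIs are taken from the W3C vocabularies.

```python
# Hypothetical identifiers: the example.org URIs are placeholders for illustration.
CITY = "http://example.org/id/city/the-hague"
CITY_CLASS = "http://example.org/def/City"

# Predicates from the W3C RDF and RDF Schema vocabularies.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Each statement is a (subject, predicate, object) triple.
# A literal with a language tag is modelled as a (text, language) tuple.
triples = [
    (CITY, RDF_TYPE, CITY_CLASS),
    (CITY, RDFS_LABEL, ("The Hague", "en")),
    (CITY, RDFS_LABEL, ("Den Haag", "nl")),
]


def find_concepts(search_term, triples):
    """Return the subjects that carry a label equal to the search term, in any language."""
    wanted = search_term.casefold()
    return {
        s
        for s, p, o in triples
        if p == RDFS_LABEL and isinstance(o, tuple) and o[0].casefold() == wanted
    }


def labels_of(subject, triples):
    """Return all labels of a subject, sorted alphabetically."""
    return sorted(
        o[0]
        for s, p, o in triples
        if s == subject and p == RDFS_LABEL and isinstance(o, tuple)
    )


# Both names resolve to the same concept, so a search on either one
# can return everything that is linked to that concept.
same = find_concepts("The Hague", triples) == find_concepts("Den Haag", triples)
print(same)
```

Because both labels hang off one identifier, anything else attached to that identifier (population, location, links to other datasets) becomes reachable from either name. In practice an RDF library and a SPARQL endpoint would take the place of the list and the two helper functions.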

Requirements for a Linked Open Data Roadmap

Importance of Data Quality

Roadmap Linked Open Data

Many guides, textbooks, tutorials and best practices about Linked Open Data are already available. As part of our investigation we reviewed several of these, but found none of them practical, concise and concrete enough for data publishers to apply directly. In this roadmap we have attempted to collect several of these best practices and combine them into a practical guide for publishing Linked Open Data. Our steps are largely based on the best practices from the W3C Linked Data Cookbook and Heath and Bizer’s Linked Data book. To provide context, we refer to the lifecycle model for Linked Open Data.

We propose the following nine steps for our Linked Open Data Roadmap:

Step 1: Select data

Step 2: Prepare the data

Step 3: Model the data

Step 4: Define a naming scheme

Step 5: Convert the data

Step 6: Organize Governance

Step 7: Add metadata

Step 8: Publish the data

Step 9: Link the data

Running Example

To illustrate the roadmap for publishing Linked Open Data described in this document, we apply it to an example dataset. The example is an existing, non-governmental open dataset from Liander, one of the Dutch regional energy distributors.

Application of the roadmap to the example of Liander

Contact us!

The information provided in this roadmap worked for us. We do not claim that it is complete, and we invite you to share your experience and feedback with us! You can reach us at silja.eckartz@tno.nl or erwin.folmer@kadaster.nl