While following the roadmap, and especially during the last step (organizing governance), you will have realized that metadata about your dataset is crucial. This step introduces three levels of metadata that you can use to describe your dataset.
To make the dataset self-describing, and thus support reuse of the data, the data supplier needs to add extra information about the data. Self-describing data means that information about the encodings used for each representation is provided explicitly within the representation. Such data about data is called metadata; it includes information about the origin of the data, its production date, and the applications for which it can be used. Metadata that describes the process by which the data was developed is also referred to as provenance [Freire et al. 2008]. Provenance gives an indication of the reliability of the data. Another aspect of metadata that matters for reuse is information about the usability of the data: data users may want to learn about successful applications built by other users. Such usability information is also very valuable for Linked Data, because it can indicate how likely similar applications are to succeed in the future. Metadata can be added simply by adding triples that describe facts about the dataset to the RDF version of the dataset obtained in Step 5.
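As a minimal sketch of what adding such triples can look like, the Python snippet below builds a few metadata statements about a dataset (its origin, production date, and intended applications) and serializes them as N-Triples. The dataset URI, the literal values, and the helper names `iri`, `literal`, and `to_ntriples` are hypothetical; the properties come from the Dublin Core Terms vocabulary, which is one common choice and not necessarily the one this roadmap prescribes.

```python
# Minimal sketch: describe a dataset with a few metadata triples (N-Triples).
# The dataset URI and all values below are hypothetical examples.

DCT = "http://purl.org/dc/terms/"
XSD = "http://www.w3.org/2001/XMLSchema#"
DATASET = "http://example.org/dataset/my-dataset"  # assumed dataset URI


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Quote a string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


# Facts about the dataset itself: title, origin, production date, intended use.
metadata = [
    (iri(DATASET), iri(DCT + "title"), literal("My example dataset")),
    (iri(DATASET), iri(DCT + "source"), iri("http://example.org/source/registry")),
    (iri(DATASET), iri(DCT + "created"), literal("2014-01-15", XSD + "date")),
    (iri(DATASET), iri(DCT + "description"),
     literal("Suitable for route planning applications.")),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples) + "\n"


print(to_ntriples(metadata))
```

The resulting lines can be appended to the RDF version of the dataset or published alongside it, since metadata triples are ordinary triples whose subject is the dataset.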
Linked Data published on the Web should be as self-describing as possible, so that clients can understand and use the data more easily. Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in Publishing RDF Vocabularies, using terms from common vocabularies, and providing vocabulary mappings for proprietary vocabulary terms.
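A vocabulary mapping for a proprietary term can itself be expressed as triples. The sketch below links hypothetical terms in an assumed `http://example.org/vocab#` namespace to terms from Dublin Core and VoID using `rdfs:subPropertyOf` and `rdfs:subClassOf`. Which mapping property fits depends on how closely the terms actually correspond; the helper name `mapping_to_ntriples` is illustrative only.

```python
# Minimal sketch: publish mappings from proprietary terms to common vocabularies.
# The EX terms are hypothetical; the target terms exist in RDFS, Dublin Core, and VoID.

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCT = "http://purl.org/dc/terms/"
VOID = "http://rdfs.org/ns/void#"
EX = "http://example.org/vocab#"  # assumed proprietary namespace

# Each row: (proprietary term, mapping property, common-vocabulary term)
mappings = [
    (EX + "datasetName", RDFS + "subPropertyOf", DCT + "title"),
    (EX + "productionDate", RDFS + "subPropertyOf", DCT + "created"),
    (EX + "LinkedDataset", RDFS + "subClassOf", VOID + "Dataset"),
]


def mapping_to_ntriples(rows):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in rows)


print(mapping_to_ntriples(mappings))
```

With such mappings published, a client that only understands the common vocabulary can still interpret data expressed in the proprietary terms.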
We structure this section using the three levels of metadata described by CKAN:
We extend the aspects mentioned in that classification with aspects from the quality model we developed in earlier research. The Dutch government has published a list of elements that the metadata of datasets published at data.overheid.nl should include. Most of these elements are compulsory. The elements fall into four categories: context, data source, characteristics, and involved organizations. We add these elements to the tables provided for the three levels of metadata, using their original identifiers from data.overheid.nl.
[Freire et al. 2008] Freire, J., Koop, D., & Moreau, L. (2008). Second International Provenance and Annotation Workshop. Paper presented at IPAW 2008, Salt Lake City, Utah.
Representation of a fact, concept, or instruction, suitable for transfer, interpretation, or processing by a person or device.
Resource Description Framework (RDF) is a standard model for data interchange on the Web. RDF has features that facilitate merging data even when the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all data consumers to be changed.
A dataset is a collection of RDF triples that is published, maintained, or aggregated by a single provider.