Level 2: minimal metadata

Level 2 metadata should include the basic aspects from Level 1 and, in addition, the aspects shown in the table below, which are based on a literature review the authors performed during earlier research. To measure these aspects we use, where possible, the metrics defined by Zaveri et al. [1].

| Dimension | Definition | Source | Metrics |
|---|---|---|---|
| Release | One-off vs. ongoing release. Single dataset vs. a set or series of related datasets. Is it a service or API for accessing data? | ODI [2] | Frequency of release |
| Potential Use | What can users do with it? What sort of question can it answer? Topic tags can be used to structure this aspect. | LATC [3] | Topic tags |
| Compliance | With which regulations or rules does it comply? | ISO 9126 [4] | Link to regulations |
| Production Date | The date the dataset was created. This might also include the last modification date or the version of the dataset. | Ehling & Körner [5] | Date and time |
| Format: Open Format | The format in which the dataset is provided, focusing especially on whether the data is available in a standard open format. | ODI, Dutch Government | Format used (JSON, XML, RDF, CSV, etc.) |
| Kind of data (Type of data) | Unstructured (human-readable data), statistical data (counts, percentages), geo data (points, boundaries), other structured data. | ODI, Dutch Government | Various from Zaveri [1] |
| Language | The language used in the dataset. | Dutch Government | Natural language used |
| Spatial | The area or territory covered by the dataset. | Dutch Government | Province, national, international |
| Semantics | Understandability: extent to which data are clear, unambiguous and easily comprehended. | ODI, Knight & Burn [6] | Various from Zaveri |
| Data model | Is there a data model describing the objects represented by a computer system, together with their properties and relationships? | | Availability of data model |
| Links | Coherent links to other datasets. | Ehling & Körner | Various from Zaveri |
| Size | The size of the dataset, e.g. the number of triples, or megabytes. | Dutch Government | Various from Zaveri |
| Concise | Extent to which information is compactly represented without being overwhelming. | Knight & Burn | Various from Zaveri |
| Complete | Is the dataset complete or are certain parts missing? | Knight & Burn | Various from Zaveri |
| Believability | Extent to which the dataset is regarded as true and credible. | Knight & Burn | Meta-information about the identity of the information provider, various from Zaveri |
| Reputation | Extent to which the dataset is highly regarded in terms of source or content. | Knight & Burn | Various from Zaveri |

The last four metadata dimensions (Concise, Complete, Believability and Reputation) are less objective than the other dimensions. Data publishers might first need to gather input from users, such as subjective judgments, before they can provide metadata about these aspects for a specific dataset.
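
As an illustration, a Level 2 record could be held in a simple structure that keeps the four subjective dimensions separate from the rest, so a publisher can see which entries still await user input. The sketch below is only one possible shape: the class name, field names, value formats and the helper `missing_dimensions` are assumptions made for this example and are not prescribed by the sources cited in the table.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple

# Hypothetical field names, derived from the dimension names in the table.
# The table does not prescribe a naming scheme or a serialization format.
SUBJECTIVE = ("concise", "complete", "believability", "reputation")


@dataclass
class Level2Metadata:
    release: Optional[str] = None            # e.g. "one-off" or "monthly"
    potential_use: Optional[List[str]] = None  # topic tags
    compliance: Optional[str] = None         # link to regulations
    production_date: Optional[str] = None    # date and time, e.g. ISO 8601
    file_format: Optional[str] = None        # JSON, XML, RDF, CSV, ...
    kind_of_data: Optional[str] = None       # unstructured, statistical, geo, ...
    language: Optional[str] = None           # natural language used
    spatial: Optional[str] = None            # province, national, international
    semantics: Optional[str] = None
    data_model: Optional[str] = None         # availability of a data model
    links: Optional[List[str]] = None        # links to other datasets
    size: Optional[str] = None               # e.g. number of triples, megabytes
    # Subjective dimensions: may need user input before they can be filled.
    concise: Optional[str] = None
    complete: Optional[str] = None
    believability: Optional[str] = None
    reputation: Optional[str] = None


def missing_dimensions(record: Level2Metadata) -> Tuple[List[str], List[str]]:
    """Return (objective, subjective) names of dimensions that are not filled in."""
    missing = [f.name for f in fields(record) if getattr(record, f.name) is None]
    objective = [name for name in missing if name not in SUBJECTIVE]
    subjective = [name for name in missing if name in SUBJECTIVE]
    return objective, subjective


# Example record with made-up values for a few objective dimensions.
record = Level2Metadata(
    release="one-off",
    potential_use=["transport"],
    file_format="CSV",
    language="nl",
    spatial="national",
)
objective_gaps, subjective_gaps = missing_dimensions(record)
```

Here `objective_gaps` lists the dimensions the publisher could fill in directly, while `subjective_gaps` lists those that may first require judgments collected from users.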


[1] Zaveri, Amrapali, et al. "Quality assessment methodologies for linked open data." Submitted to Semantic Web Journal (2013).

[2] https://certificates.theodi.org

[3] https://docs.google.com/document/d/150dJSMZk5W5ucF23hGj62DaoKtTk9qeaEPBN_VCCihI/edit?pli=1

[4] ISO/IEC. (2003). ISO/IEC 9126-2 Software engineering - Product quality - Part 2: External metrics.

[5] Ehling, M., & Körner, T. (Eds.). (2007). Handbook on Data Quality Assessment Methods and Tools. Wiesbaden.

[6] Knight, S. A., & Burn, J. (2005). Developing a framework for assessing information quality on the World Wide Web. Informing Science, 8, 159-172.

