Example Step 1: Select data

Revision by Seckartz, 12 Dec 2014, 10:48

The first step is to identify and select the data to be published.

Liander collects data on energy consumption and (local) production, for example through energy meters. Liander would like to publish this data in order to:

  • Be transparent as a public utility company
  • Stimulate open innovation
  • Gain insight into data needs
  • Improve data quality by receiving feedback


The table below shows a snapshot of the raw metering data. It contains the electricity consumption, at 15-minute intervals, of a number of households with smart meters.

[Image: LianderSnapshotStep1.PNG, snapshot of the raw metering data]


This data can be interesting for data consumers, e.g., to visualize the energy consumption of individual households at different times of the day. However, there are several issues with it, and publication has to be restricted in a number of ways.

Firstly, this data is subject to data protection laws. It is personal data, and publishing it would violate the privacy of the households concerned. Therefore, it cannot be published as is.

Secondly, Liander provides a commercial service based on this data to large energy consumers. Publishing the data as open data would cannibalize one of its own revenue streams.

Thirdly, the quality of the data varies a lot. Households with smart meters may provide measurements at 15-minute intervals, but not all households have smart meters yet. In the worst case, meter readings for households without smart meters are validated only once every three years. Even for households with smart meters, readings are sometimes received only once a quarter.

In order to deal with these issues, the data is restricted in the following ways:

  • The data is anonymized. Rather than publishing the annual usage for each individual household, the annual usage is aggregated for all households in the geographical area determined by the 6-character postcode (four digits and two letters). If there are fewer than ten households in one postcode, the annual usage of two or more consecutive postcode areas is aggregated.
  • Commercially sensitive data is removed from the dataset, i.e., only the energy usage of private households, the so-called small users, is published.
  • The data quality is standardized. Rather than publishing actual meter readings at regular intervals, Liander publishes only the estimated, standardized annual usage. This value is recalibrated once a quarter using recent readings, but is published only once a year.
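
The anonymization rule in the first bullet can be illustrated with a minimal Python sketch. It is not Liander's actual procedure. It assumes that "consecutive" means adjacent in sorted postcode order, that the input holds one (postcode, annual usage in kWh) pair per household, and that a trailing group still below the threshold is folded into the preceding area. The names `Area`, `aggregate_by_postcode` and `MIN_HOUSEHOLDS`, as well as the sample postcodes and usage figures, are hypothetical.

```python
from dataclasses import dataclass

MIN_HOUSEHOLDS = 10  # threshold taken from the text: at least ten households per area


@dataclass
class Area:
    postcode_from: str       # first postcode covered by this published area
    postcode_to: str         # last postcode covered (equal to postcode_from if not merged)
    households: int
    annual_usage_kwh: float


def aggregate_by_postcode(records, min_households=MIN_HOUSEHOLDS):
    """Aggregate (postcode, annual_usage_kwh) pairs, one per household.

    Assumption: postcodes that sort next to each other count as consecutive.
    """
    # Step 1: household count and total usage per postcode.
    totals = {}
    for postcode, usage in records:
        count, total = totals.get(postcode, (0, 0.0))
        totals[postcode] = (count + 1, total + usage)

    # Step 2: walk the postcodes in order and keep merging until the
    # running group contains at least min_households households.
    areas = []
    current = None
    for postcode in sorted(totals):
        count, total = totals[postcode]
        if current is None:
            current = Area(postcode, postcode, count, total)
        else:
            current.postcode_to = postcode
            current.households += count
            current.annual_usage_kwh += total
        if current.households >= min_households:
            areas.append(current)
            current = None

    # Step 3: a leftover group below the threshold is folded into the
    # previous area; if there is none, nothing is published at all.
    if current is not None and areas:
        last = areas[-1]
        last.postcode_to = current.postcode_to
        last.households += current.households
        last.annual_usage_kwh += current.annual_usage_kwh
    return areas


if __name__ == "__main__":
    # Invented sample: 12, 4 and 7 households in three neighbouring postcodes.
    sample = ([("1011AB", 3000.0)] * 12
              + [("1011AC", 2500.0)] * 4
              + [("1011AD", 2000.0)] * 7)
    for area in aggregate_by_postcode(sample):
        print(area)
```

In this sample, postcode 1011AB is published on its own, while 1011AC and 1011AD are merged into one area because neither reaches ten households alone.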

