https://data.blog.gov.uk/2010/07/05/avoiding-a-standards-roadblock-in-releasing-public-data/

Avoiding a standards roadblock in releasing public data

We've had some great feedback on our draft guidance on publishing local government spend data both on the blog post and in conversation with practitioners. We hope to improve the guidance soon.

The role of data standards has come up in discussion several times, along the lines of:

"Surely we need to agree data standards for machine readable data before we publish it?"

The answer, of course, is 'no'. Local authorities should not wait for the process of agreeing standards or ontologies. They should publish now, in line with the Berners-Lee principles and noting the guidance we set out. By all means engage in standards-setting processes in the long term, if you have the spare resources, but the data should be published first.

Standards exercises can be valuable. But they can take months or years, consume scarce resources and blunt early enthusiasm. Tim Berners-Lee notes:

"There are two philosophies to putting data on the web. The top-down one is to make a corporate or national plan, by getting committees together of all the interested parties, and make a consistent set of terms (ontology) into which everything fits. This in fact takes so long it is often never finished, and anyway does not in fact get corporate or national consensus in the end. The other method experience recommends is to do it bottom up. A top-level mandate is extremely valuable, but grass-roots action is essential. Put the data up where it is: join it together later."
http://www.w3.org/DesignIssues/GovData.html

Local authorities do need to share understanding about how to publish data. Our view is that such understanding will only arise through the public using the data for transparency, and through local authorities actually going through the process of publishing it. We can use this blog and other fora, such as the communities of practice, to share the experience in real time. Boroughs in London are already using the London Data Store as a way of sharing experiences around publishing data. And colleagues in Warwickshire are using their Open Data store and a Google Group to collaborate.

Rooms full of officials setting abstract standards are unlikely to achieve timely, useful results that provide the transparency Ministers seek by the end of the year. Transparency is about the practical use of the data by people who are not inside government. This community must be engaged at all stages - both as core beneficiaries and to drive the process to meet their need. It is the people's data, after all.

William Perrin
Chris Taggart

4 comments

  1. Anonymous

    Good point well made; but it would be wrong to overlook existing standards (including data ontologies), or to fail to reuse them where appropriate.

    Perhaps an early action should be to draw up a list of such reusable standards, and check to see which can be quickly and easily applied?

    • william perrin

      thanks - a caveat to your point would be 'existing standards and ontologies that are used and useful'

      this will be drawn out in iteration with the potential and actual users of the data

      but again, after the raw data has been published in a csv file or similar

      • Anonymous

        @WillPerrin - Quite; hence my choice of words: "reuse them *where appropriate*"

        BTW, the CAPTCHA here sucks.

  2. CountCulture

    It's also worth pointing to the closed data approach taken in the local spending scandal, as another way of *not* doing it. Someone who's rather better than I am at putting this into words came up with a concise summary:

    (1) transparency needs to be principles-led – and one of those has to be “it’s not open unless it’s open”

    (2) it’s better to consult potential data users on useful, practical formats than try and work them out in committee rooms.

    (3) it’s better for LAs to share experience and approaches – saving the risks and costs of working out afresh, and avoiding data users having to develop 400 different data handling functions

    (4) formal standards may come later

    Rather than struggling to do its own licensing, an LA should just pick up the recognised data.gov.uk Crown Copyright-interoperable licence in the way Windsor and Maidenhead have done.

    Follow those, and we'll avoid not only the rooms full of quangos divvying things up to ensure their survival, but also the wholesale handing of data to selected private companies, who then publish the crumbs to the community.

    Chris Taggart
