https://data.blog.gov.uk/2016/10/05/what-we-discovered-about-how-government-data-is-published/

What we discovered about how government data is published

We want more citizens and businesses to find open government data - and for that data to be as useful as possible. Our users have told us that problems with the quality of the data available through data.gov.uk mean that they often have to do additional work to make it usable. They have also told us that it’s sometimes hard to find what they need. We set out to find out why this is the case, and we started by speaking to data.gov.uk publishers.

We spoke to 45 people from different parts of government including the Department for Environment, Food and Rural Affairs, Department for Transport, Department for Business, Energy and Industrial Strategy, NHS Digital and Leeds Council. Between them, these organisations publish lots of different kinds of data, from geographic information to how money is spent in government departments.

We looked at how these organisations publish data and, by doing so, found out what problems publishers are having, how they’ve tried to solve them, and what they need from us.

It’s no longer just about publishing

As well as getting a better understanding of publishers’ internal processes, it was fascinating to learn more about how publishers’ interaction with data.gov.uk has evolved over the past 5 years.  

Data.gov.uk was designed to be a catalogue, providing the public with links to government data published, primarily, under an Open Government Licence. This was part of a wider government commitment to be more transparent and accountable, and to unlock value for businesses and citizens by opening up government data for public reuse.

It was important to get some momentum behind delivering this commitment, and as a result quantity was prioritised over quality. The upshot was that the UK government became one of the most open governments in the world, and data.gov.uk an exemplar source of open data. As of September 2016, we have 36,322 datasets of varying quality and 3,694 records of data that is held but not released.

With all of that data already out there, publishers across government are increasingly focused on updating existing datasets, not just pushing out new ones. This has implications for their publishing processes and, by extension, the ways they need data.gov.uk to support them.

Knowing your user

Publishers are motivated to make the information they release more transparent, accessible and usable. For example, publishers in local authorities maintain the data that drives transport information apps. They also find innovative ways of discovering and meeting users’ needs, such as working with regional data scientists to better understand the problems of their communities. More generally, several local authorities and central government departments work hard to make their internal processes more supportive of new publishers and to help them overcome their worries about making a mistake or being criticised by the media.

Publishers want to know if they’re doing a good job and they recognise that their users are a rich source of useful feedback. When you have a relationship with the people using your data, you are more likely to know what’s wrong with it and then be able to improve it.

Departments, agencies and local authorities are engaging more and more with data users, whether it’s through consultations or social media. Through feedback from users, data publishers find out which formats are most useful, how often a dataset needs to be updated and what information would make the data more complete.

Doing the hard work to make it simpler for publishers

While publishing a new dataset on data.gov.uk is fairly straightforward, keeping things up to date and consistent isn’t always so.

We have known for a while that there are quite a few problems with publishing to data.gov.uk. Some are due to internal processes in departments; others are issues with the service or existing IT. Some colleagues struggle to update their records on data.gov.uk, while others find it difficult to turn unpublished records into published datasets.

Data can be published in several different places. This confuses both publishers and the users who are looking for the data. We need to understand how citizens and businesses look for the data they need so that we can make it easier for them to find.

Users of data need consistent contextual information about datasets - such as a description of what’s in the data, how it was produced, when it was made open and by whom. This helps them know if the dataset is useful or reliable.  
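As a rough illustration of what consistent contextual information could look like, the sketch below models the items listed above as a simple record and reports which of them are missing. It is not how data.gov.uk stores metadata: the `DatasetMetadata` class, its field names and the `missing_fields` helper are all hypothetical, chosen only to mirror the description, production method, opening date and publisher mentioned in the text.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical record of the contextual information users ask for."""
    title: str
    description: Optional[str] = None        # what is in the data
    production_method: Optional[str] = None  # how it was produced
    date_opened: Optional[date] = None       # when it was made open
    publisher: Optional[str] = None          # who made it open


def missing_fields(record: DatasetMetadata) -> List[str]:
    """Return the names of fields that are unset or contain only whitespace."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Illustrative values only; this is not a real data.gov.uk record.
record = DatasetMetadata(
    title="Example spending data",
    description="Illustrative description only",
    publisher="Example department",
)
print(missing_fields(record))  # ['production_method', 'date_opened']
```

A check of this kind could, in principle, prompt a publisher to fill in gaps before a dataset goes live, though which fields should be mandatory is a question for user research rather than something this sketch decides.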

Often this information isn’t published on data.gov.uk in a consistent way. The language we use in our publisher tool is quite technical. For some publishers, this can be a bit confusing.

How we can help

We now have an overview of some of the challenges faced by publishers and the workarounds they’ve been forced to develop to try to mitigate them. We want to transform data.gov.uk in order to reduce this friction. On the basis of this initial research, one of the first things we’ll do is improve the process of publishing and managing metadata so that it is easier to consistently publish higher quality data.

We’ve also started designing a prototype of a publishing workflow which is simpler and gives more contextual guidance. We will be testing this with publishers in the coming weeks.

How you can help

In the next phase of research we will look at the needs of the end users of data, which should help us shape the overall data.gov.uk service. If you haven’t yet shared your experience of using government data then please complete our short survey.
