https://data.blog.gov.uk/2015/03/24/progress-on-the-national-information-infrastructure-project/

Progress on the National Information Infrastructure Project

Data is the backdrop to our lives, and we are growing more dependent on it every day. In our personal lives, location data connects us with restaurants, parks and events, while our to-do lists and emails keep us on top of what we must do, and our fitness band tells us how many calories we have consumed and how far we need to walk to offset them.

In our collective lives, as a society, data provides us with traffic information, tells us the status of the bus and rail networks, where hospitals are, how our government is performing, how much air pollution we are experiencing, and if our home is at risk of flooding.

There are now around 20,000 datasets released on data.gov.uk, and our commitment to releasing data as early as we can has been one of the hallmarks of the UK Government’s approach. Wherever we have been able to, we have sought to release data even if imperfectly, knowing that making things open helps make them better.

But not all data is created equal. Organisations ranging from community charities and hackers to major corporations have all asked us to improve the quality and availability of the data we make public. This was one of the main findings from the Shakespeare Review of Public Sector Information. The National Information Infrastructure is a response to requests by you, the open data user-community, to:

  1. Make the most critical government data easier to find in one place
  2. Improve the quality and interoperability of the data through clarity about data standards
  3. Pay attention to the usability of the data, with API access alongside bulk download, and service level agreements about the persistence and timeliness of the data

The National Information Infrastructure pays particular attention to the fundamental layer of data most significant to the running of our nation. This core data forms a key component of our nation’s data assets. It helps society to function well, and it is the data that brings all the other data to life. Just as roads, sewers and railways were fundamental infrastructure innovations for the Victorians, today critical data assets are prominent in that list.

In identifying this core data - and documenting its availability, how it is managed, and how to access it - we both strengthen our ability to protect these important assets and enable conversations about the ways in which we may make better use of them.

The potential benefits are huge:

  • it can provide improved raw material for data-hungry businesses to help spur innovation and growth
  • it can facilitate benchmarking across government services, and identify opportunities for financial and social improvements
  • it can help government make savings by creating price transparency and driving up competitiveness
  • it can increase democratic engagement and trust in government by strengthening accountability
  • it can help us identify gaps in the data and prevent duplication of data collection
  • it can drive fresh insights that improve government policy and operations

If you wish to read a very good vision of the benefits of an NII, take a look at the work produced by the Open Data User Group here. You can also read the Open Data Institute’s Open Data Roadmap for the UK.

A collaborative approach

In September 2014 we began an active dialogue with the open data community about how we could strengthen the NII and maximise its impact (you can see the details here). We took an agile approach to the process, concentrating first on a discovery phase. We worked with over 100 representatives from government, business and civil society in order to progress through the alpha, beta and live stages (more on how government defines these stages).

Out of the discovery phase emerged the seed of an implementation document which over 30 experts and open data enthusiasts helped draft. In December 2014 we published this prototype document for people to comment on. We have now arrived at the beta implementation document that we are publishing today.

One thing that became clear from this process is that we don’t all agree on every aspect of the NII and how it should be managed - but we do see substantial consensus around the fundamentals. We have worked to incorporate these fundamental requirements, and we have made sure that the NII fits within the government’s wider approach to data. The NII will remain a living document, evolving and adapting over time as appropriate.

Exemplar departments

As part of the alpha process, we worked together with the Department of Health (and HSCIC, which includes NHS Choices), the Department for Transport and the Department for Environment, Food and Rural Affairs to test the NII implementation document. The results of that collaboration are available today. In many cases, you will be able to see not only a more carefully chosen list of datasets for the pilot departments, but also vocabularies, code lists and file structure information, where applicable. API availability will follow for these datasets in due course.

The work with the exemplar departments has provided great insight into the challenges and issues we face in bringing the management of our most important data assets into a coordinated framework. This will inform the work with other departments and the wider data agenda across government. It will be a process of continuous improvement, where we will evolve the NII until we achieve the right balance of user needs and data management.

New elements of the NII

One of the key points that surfaced from the discovery process was that, for the NII to work well, it needs to be a data management framework, not just a list of significant data. We have therefore made it a framework that includes:

  • a set of guiding principles
  • a curated list of the most strategically important data
  • a governance structure
  • a set of baseline quality criteria

and documents the following:

  • relevant legislation related to the data
  • vocabularies and code lists
  • licensing
  • standards applicable to the data and data services
  • guidance on use of the data
  • metadata

The NII provides a set of principles and a list of key components which together with good governance form the foundation for understanding and protecting our data.
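
As a rough illustration of how the documented items listed above might be held against a single dataset, the sketch below models one NII entry in Python. The class name, field names and example values are assumptions made purely for illustration; they mirror the six documented items but are not a schema published by the NII or data.gov.uk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NIIDatasetRecord:
    """Hypothetical record for one dataset on the NII list.

    The fields mirror the items the framework says it documents;
    the names themselves are illustrative, not an official schema.
    """
    title: str
    publisher: str
    legislation: List[str] = field(default_factory=list)
    vocabularies_and_code_lists: List[str] = field(default_factory=list)
    licence: str = ""
    standards: List[str] = field(default_factory=list)
    guidance: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)

    def missing_documentation(self) -> List[str]:
        """Return the names of documentation items that are still empty."""
        documented = (
            "legislation",
            "vocabularies_and_code_lists",
            "licence",
            "standards",
            "guidance",
            "metadata",
        )
        return [name for name in documented if not getattr(self, name)]


# Made-up example entry: only the licence and some metadata are filled in.
record = NIIDatasetRecord(
    title="Example bus timetable dataset",
    publisher="Example Department",
    licence="Open Government Licence",
    metadata={"update_frequency": "weekly"},
)
print(record.missing_documentation())
```

A check of this kind only reports whether the documentation fields are filled in. It says nothing about whether the underlying data is accurate, which would need to be assessed separately by the data's custodians.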

Further enhancements

New functionality is being developed on the data.gov.uk side to support the NII. Over the coming months, and as we bring other departments on board, we will be deploying: a fuller metadata record for NII datasets; a registry for vocabularies and code lists; data APIs for eligible datasets; and a reporting dashboard to track progress and compliance with the NII quality principles, taking into account work such as the Common Assessment Framework for Open Data and the information offered by the Open Data Certificates.

So, we are not so much launching something as continuing a journey. This is a very exciting opportunity to get the most out of our data and we hope many of you will join us on our onward travel.

1 comment

  1. Comment by exstat posted on

    There are vague references in the documents to quality.  I have commented on other documents on this site which equate "quality" to following a certain set of rules on data release, accessibility, API availability and so on.  I have a nasty feeling that these documents will bring us more of the same.  This hacks me off big time. What about the actual quality of the datasets themselves?  You could have a "Norwegian Blue" of a dataset which is absolute rubbish but it would be showing up as good quality on data.gov.uk because it has the right plumage. Quality should be regarded as the product of teamwork, between the statisticians or custodians of the data and the conduits via which the data are made available. Ignoring this is wrong per se but it could lead to too little resource going into the data themselves.

    When I have made this sort of point, someone usually responds to the effect that something will be done either to broaden the definition of quality or to state explicitly that your definition does not guarantee a reliable dataset. 

    But nothing ever actually happens...
