I gave a talk to the Government ICT 2.0 conference this week on the work of the data group here in GDS.
The main reason for this talk was to show how deeply data and digital concepts are intertwined, and how much progress we have made so far in improving the way people can find and access government data.
“Data is one of four core elements of GDS, and GDS is here to help departments transform government to meet user needs.
The evolved GDS is about broad service transformation, not just fixing websites. And we are here to support departments, not trying to do it all from the centre.
There is a long and proud tradition of analysis in public services, but technology is evolving.
Enabled by shifts in technology, data is transforming our lives: you only have to look at the machine in your pocket to see services we could only have dreamed about a decade ago.
And we are starting to see data-led transformation in our public services too, and it is our data programme’s mission to widen and deepen this work.
The UK has been at the forefront of the global movement to open up our data since 2009.
This data is used in services from travel apps on your phone through to companies wanting to perform due diligence checks.
But although we are proud of our successes, and our commitment to open data remains strong, in truth the reforms were too often surface level.
If we are to really harness the power of data for public service reform we need to roll up our sleeves and get stuck in deeper.
Our cross-government data programme, driven by a set of senior Data Leaders in departments, is designed to tackle the three underlying issues shown above.
Some of our reforms are not glamorous, but they are aimed at fixing the long-term problems we face with data in light of rapid technological change.
In going deep we often talk about data infrastructure, and in many ways we are embarking on our generation’s version of those great Victorian infrastructure projects.
I want to give you a flavour of what this transformation looks like, with an example from our work on open registers.
This image shows a common problem: 12 ways of spelling ‘Scotland’ in the Companies House database.
It’s a result of relying on free text entry on web forms, and adds all sorts of friction and problems when we come to analyse or use the data.
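To make the problem concrete, here is a minimal sketch of matching free-text entries against a single canonical list. The register contents, the codes "SCT" and "WLS", and the sample spellings are all hypothetical, invented for illustration; they are not taken from the Companies House data or from any real register.

```python
# Hypothetical register: one canonical record per country, keyed by a stable code.
COUNTRY_REGISTER = {
    "SCT": "Scotland",
    "WLS": "Wales",
}

# Hypothetical variants of the kind that free-text entry produces.
FREE_TEXT_ENTRIES = ["Scotland", "scotland ", "SCOTLAND", "Scot land", "Scotlnad", "Wales"]


def normalise(text):
    """Lower-case the text and drop every non-letter so trivial variants collapse."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


# Index the register by normalised name so lookups ignore case and spacing.
INDEX = {normalise(name): code for code, name in COUNTRY_REGISTER.items()}


def match_to_register(text):
    """Return the register code for a free-text entry, or None if it needs manual review."""
    return INDEX.get(normalise(text))


matched = {entry: match_to_register(entry) for entry in FREE_TEXT_ENTRIES}
unmatched = [entry for entry, code in matched.items() if code is None]
```

Case and spacing variants collapse to one code, but a genuine misspelling such as "Scotlnad" still falls through to manual review. That is the argument for offering users a choice from the register at the point of entry rather than cleaning up free text afterwards.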
At the moment every team across government holds duplicated data that is not their core business.
They have their own list of local authorities, of countries, of businesses. They know they’re out of date and it makes analysis hard.
We want them to concentrate on the things that they do best. For an organisation like the Food Standards Agency, for instance, that means a green 1-5 star rating that you see on a restaurant window.
In a world of open registers, they pull addresses, local authorities and so on from the canonical source. They publish their own 1-5 star ratings as a register, which feeds their own software and databases and is also available to others in government, and indeed beyond if necessary.
An example of what this looks like in practice is a register for local authorities in England. The alpha - created by Stephen McAllister in DCLG - is shown here:
It’s not very flash. But make no mistake, this is a giant leap forward, and credit is due to the DCLG team who made it happen.
The data is available in bulk download or in developer-ready JSON through an API, and when it moves into beta there will be a definitive list of English local authorities, ready to build straight into services or to use for analysis.
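As a rough sketch of what "developer-ready JSON" could mean for a service team, the example below parses a small response and looks up a canonical name by its code. The field names, codes and council names are assumptions made up for illustration, not the actual schema or contents of the local authorities register, and the response is an inline string rather than a real API call.

```python
import json

# Hypothetical API response: field names and values are illustrative only.
RESPONSE_BODY = """
[
  {"local-authority": "EXA", "name": "Example Borough Council"},
  {"local-authority": "SAM", "name": "Sample City Council"}
]
"""


def load_register(body):
    """Parse a JSON list of records into a dict keyed by the register's identifier."""
    records = json.loads(body)
    return {record["local-authority"]: record["name"] for record in records}


authorities = load_register(RESPONSE_BODY)


def name_for(code):
    """Return the canonical name for a code; raise if the code is not in the register."""
    if code not in authorities:
        raise KeyError(f"{code!r} is not in the local authority register")
    return authorities[code]
```

A service that stores only the code and resolves the name from the register never needs to maintain its own copy of the list.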
You can see that it is early days for this work on open registers, with a countries register in beta and a territories register in alpha.
We have also identified another important issue. As you can see if you look at our registers pages, countries breaks down into two registers: countries and territories; and as we have noted, the local authorities register we are working on is only for England.
That is because registers are about getting data right at source. Perhaps counter-intuitively in an age of big data, in our team we often talk about creating minimum viable datasets. This matters if other services are to trust this core reference data and end the practice of duplicating data within our silos.
There is a named custodian responsible for each register’s upkeep within the responsible department. A custodian can only be held accountable for the data they mint.
I’ve given you a taste of how we are helping to fix the plumbing. But of course data only really comes alive when it is used within a service, or when it is used for insight.
There has clearly been a big shift in the tools and techniques available to gain insight from data.
I like this diagram from Drew Conway that I think explains what is different about data science from the traditional analytics that we have been used to in government:
I have a small team of data scientists in GDS.
One example of our work, undertaken with No10, DfE and DCLG, is a prototype tool that removes barriers for people building new free schools by helping them find suitable land.
Note that this is built into a product, not just analysis for policy. While I’ve given an example from my team, I could have given many others from across government:
- Detecting fraud in HMRC
- Predicting financial issues in schools in DfE
- Using earth observation data from satellites in DEFRA
So yes, we build things with data in GDS, but our real aim is to create an understanding of, and demand for, data scientists and data science techniques across government.
We have an accelerator scheme to build new skills, and we help connect the 350+ data scientists across government through regular meet-ups, show and tells, and internal social media channels.
While it’s all very well creating the capability to use data in new ways, and taking steps to fix our underlying data infrastructure, we also need to ensure that our policy framework for using data keeps up to date.
As part of this it is essential that we have a vision for what this looks like from a citizen’s perspective.
At the heart of our vision is a future in which people have radically more visibility of, and control over, transactional services.
An example is the way you can now see, amend and transport the data behind your driving licence.
As we improve our digital services, we are seeing the opportunities of using APIs to query data across government where this is appropriate.
This will mean a shift over time from the reliance on bulk data transfers between departments, and the opportunity for a more efficient, consent-based, and privacy-aware way of managing personal data.
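The difference between the two approaches can be sketched in a few lines. Everything here is hypothetical: the record fields, the in-memory store and the function names are invented for illustration and do not describe any real government API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LicenceRecord:
    # Hypothetical fields held by the department that owns the data.
    holder_id: str
    date_of_birth: date
    address: str
    valid_until: date


# Stand-in for the owning department's data store (illustrative values only).
_RECORDS = {
    "A1": LicenceRecord("A1", date(1980, 3, 2), "1 Example Street", date(2031, 5, 1)),
    "B2": LicenceRecord("B2", date(1975, 7, 9), "2 Sample Road", date(2019, 1, 1)),
}


def bulk_export():
    """Bulk transfer: the receiving department gets every field of every record."""
    return list(_RECORDS.values())


def holds_valid_licence(holder_id, on_date, consent_given):
    """API query: answer one question and disclose nothing else about the person."""
    if not consent_given:
        raise PermissionError("no consent recorded for this query")
    record = _RECORDS.get(holder_id)
    return record is not None and record.valid_until >= on_date
```

The bulk export copies dates of birth and addresses that the asking service may never need, while the query returns a single yes or no. The consent flag models only the consent-based case; other safeguards would apply where consent is not the right basis.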
But we need to be careful of simply transplanting perfectly valid arguments about consent and ownership of personal data from retail sectors.
Government isn’t a vending machine, and we make decisions on behalf of collective interests: it’s not always about you as an individual.
Although there are voices towards the libertarian end of the political spectrum who are uncomfortable with the idea of the collective good and collective decisions, this is a principle well established in Parliament.
And you can see examples of these areas where consent is not on its own a viable protection in the measures being proposed through the Digital Economy Bill before Parliament.
The government’s view is that we should be wary of a purely consent-based approach that would see individuals able to withdraw data that refers to them from use in research, or in tackling fraud and crime. Or an approach that would step back and allow vulnerable older people to miss out on support for heating bills because they hadn’t proactively ticked the right box.
And no doubt finding the right balance between private and collective interests will be part of the discussions in Parliament as the Digital Economy Bill progresses.
Where consent is not on its own a suitable protection, we will need to rely on other safeguards to protect the public interest.
The Data Protection Act will remain a critical part of that environment, but while in some areas we are removing barriers for data access, elsewhere we will need to consider new protections for how we store, access and use data.
The latter is addressed through a data science code of ethics. But this is the first, not the last, word and it will no doubt need to iterate. We are seeking to build this into our emerging data science function in government.
So in summary, GDS is here to make it easy for government to be digital. And this would be impossible without good quality, reliable data.
Our data programme provides the connection between getting our underlying policy framework right, improving the core data infrastructure at source, and improving our capability so we can put this data to good use.
We have never pretended this is an easy or a quick job, but doing this hard work now will make it easier for others to reduce costs, and ultimately improve services, for citizens both now and over the next decade.”