The Local Public Data Panel is an independent panel that exists to promote and facilitate the release of public data. The Panel has played a key role in providing advice on the publication of fine-grained and timely local authority spend data.
This paper sets out the Panel’s views in response to the Department of Health’s White Paper on health service reform. Although the consultation is not specifically about open data, it includes proposals about local transparency for choice and accountability and looks forward to a further consultation on the NHS information strategy later this year. The White Paper proposes an ‘information revolution’ leading to increased transparency with clearer accountability for quality and results.
Our response is based on the principles published by the Transparency Board and seeks to apply them in the context of health data (see http://data.gov.uk/wiki/Public_Data_Principles). We trust it will be helpful in shaping the Department of Health’s further development of its information strategy.
- There should be a presumption that all data, other than personal, private data, will be published at the finest level of detail possible
- The forthcoming information strategy should include consideration of data about health that is held by bodies outside the NHS
- Data should be published locally and immediately, and linked to from data.gov.uk, rather than cleared through a central hub
- The priority should be to release data as soon as possible, rather than waiting until it is perfect.
Why local open data about health is important
Comprehensive, relevant and timely data is a key component of transparency, and can be a powerful driver of the Big Society, empowering people to take action locally to hold their services to account, make informed choices and decisions, and tackle local problems themselves. As the White Paper states, ‘information, combined with the right support, is the key to better care, better outcomes and reduced costs’.
It is important to emphasise that, as with other services, the release of local, detailed health data will be at least as useful to those involved in delivering health and related services as it will be to those who are seeking to use services and hold them to account.
The release of local health data alongside other data held by local authorities and other bodies will make it possible for new links to be made and new understandings and insights to be generated, just as place-based budgeting is expected to do in relation to spending in local areas.
The potential power of health data has been recognised in the US, where the Community Health Data Initiative, launched earlier in 2010, is aiming to ‘make health data as useful as weather data’ for patients, the public, medical and health professionals.
What data should be released
The presumption should be that all data other than personal, private data will be published
One of the key assumptions underlying the public data movement is that individuals and organisations both inside and outside of government will come up with a whole range of uses that will not have been imagined at the time the data is released. Policy-makers in Whitehall cannot possibly foresee the range of uses that will be found for health data, and should not try to do so. This is the main reason for our advocacy of the presumption that all the data that is held will be published unless there are genuine reasons of confidentiality that would prevent release, for example in order to protect individual patients’ privacy.
A potential conflict may arise between the policies of removing targets and increasing transparency through the publication of open data. The White Paper expresses a desire to scrap centrally-set targets and reduce the number of information returns to the Department of Health from the current volume of 260,000 per year, whilst moving towards an emphasis on outcomes rather than processes. The policy to remove targets and focus on outcomes should not form the basis of decisions about what data should be collected and published – these are two different questions which must be addressed from different perspectives.
It is important to recognise that whilst data and information about outcomes are useful to inform choice and accountability, they do not provide any insight into what caused these outcomes. For data to properly allow people to hold services to account, it must encompass the key inputs and processes for services, as well as outcomes. This in turn will highlight new data dependencies and new data items that should be considered for publication.
Some degree of consistency is required if public data is to be useful and meaningful. It will therefore be necessary to identify a core list of categories of data that should be published by health services in all local areas. The core list should include all the categories of data referred to in the White Paper in relation to use of resources (including financial and workforce resources), performance data, complaint and feedback data and outcome-related data, regardless of its primary policy purpose.
Data relating to health held by other services and bodies
The most profound benefits of public access to health data might well be those that come from the ability to make links between health service data and related data held by others such as the police, local authorities, social care and social services providers and third sector organisations. The health information strategy that is due to be published later this year should consider health data in the widest meaning of the word, rather than as it strictly relates to data held by the NHS.
How data should be released
Data should be published locally and immediately, rather than cleared through a central hub
One of the key principles of open public data is that data should be published at the finest level of detail possible. This will be crucial if the vision of local choice and accountability outlined in the White Paper is to be realised.
The Department of Health and other bodies already publish huge amounts of health data. However, much of it is at the level of PCTs or regions. Data should also be published at a much more local level, so that people can find data about their own local areas. For example, data about GP surgeries and primary healthcare centres should be published at the level of the centre rather than the new commissioning consortium.
The White Paper includes a proposal to ‘centralise all data returns in the information centre, which will have lead responsibility for data collection and assuring the data quality of those returns, working with other interested parties such as Monitor and the Care Quality Commission’. This approach could result in unnecessary bottlenecks in the publication of data, as demonstrated by the fact that the NHS Information Centre has in its publication schedule for October 2010 sets of data relating to the 2009/10 reporting year, which ended 6 months earlier. There may well be a useful role for the Information Centre in collating and analysing data over a longer time period, and in publishing raw data. However, this should not mean that local data is not published locally and more immediately. The aim should be to minimise the time taken between data being collected within each body and it being published.
The main obstacle to immediate and local publication of data is not technological. It is the fear of mistakes in and misinterpretations of the data. The Government has clearly stated its commitment to the principle that there is nothing wrong with datasets being published with caveats and imperfections – in fact publication can help to flush out inconsistencies, outliers, omissions and other deficiencies in the data. Publication also offers the possibility of patient-driven demand for richer, more comprehensive data.
We are not advocating and would not support the creation of another national IT system for health services in order to support the publication of local health data. There is no reason why local services should not be able to publish the data they hold relatively straightforwardly, immediately and at minimal cost, as has been shown to be the case for those local authorities that have started to publish their spending data.
Progressing from publishing in any format to linked data
The question of how data should be released is as important as questions about what data should be included. If data is not released in a standardised, re-usable, machine-readable form without restrictions on re-use or re-purposing, then it cannot be used to its full potential and cannot be described as genuinely open data.
The approach taken to releasing data should reflect the progression from the first step of publishing data in whatever form to the goal of providing linked data. Tim Berners-Lee, inventor of the Web, has outlined five stages of publishing public data:
- * - publish the available data on the Web in whatever format
- ** - make it available as structured data, for example in a spreadsheet rather than a PDF document
- *** - publish it in a non-proprietary format such as comma separated values (CSV)
- **** - use URLs to identify items so that people can 'point' to them
- ***** - link the data to other data to provide context.
The best must not become the enemy of the good – progress should be made to the next stage towards linked data at the earliest opportunity, as each step in this progression represents an improvement on the previous one. The ultimate ambition is the publication of linked data.
Stage ** should be a reasonable starting position, reflecting the principle that data should be published in a machine-readable, re-usable format. However, any difficulty in achieving stage ** is not a reason to delay or omit stage *.
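As a purely illustrative sketch of how a local body might move through stages *** to *****, the Python fragment below serialises a small table as CSV, then adds a URL for each item, then adds a link to related data. The surgery records, the example.org addresses and the function names are all invented for this example and do not describe any existing NHS dataset or system.

```python
import csv
import io

# Hypothetical identifiers and values, invented for illustration only.
BASE_URL = "http://example.org/id/surgery/"
LINKS = {"S001": "http://example.org/id/local-authority/LA01"}

records = [
    {"surgery_id": "S001", "name": "Example Surgery A", "registered_patients": 4200},
    {"surgery_id": "S002", "name": "Example Surgery B", "registered_patients": 6100},
]


def to_csv(rows):
    """Stage ***: serialise structured rows as non-proprietary CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def add_urls(rows, base_url):
    """Stage ****: give every item a URL so that others can point to it."""
    return [{"url": base_url + row["surgery_id"], **row} for row in rows]


def add_links(rows, links):
    """Stage *****: link each item to related data held elsewhere.

    Items with no known link get an empty value rather than being dropped.
    """
    return [
        {**row, "local_authority_url": links.get(row["surgery_id"], "")}
        for row in rows
    ]


csv_text = to_csv(records)                # stage ***
with_urls = add_urls(records, BASE_URL)   # stage ****
linked = add_links(with_urls, LINKS)      # stage *****
print(to_csv(linked))
```

Each step builds on the output of the one before, so a body that can only reach stage *** today has still produced something that can be enriched later without republishing from scratch.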
The Government has clearly indicated that its priority is to promote the early release of data even if it is not perfect. The move is from a presumption of not publishing data in case risks might materialise to a presumption of publishing it whilst managing the risks and trying to minimise imperfections. This should be reflected in the approach taken to releasing data and the assessment and management of risks.
Licensing to enable free re-use of the data
There must be a standard licence, building on the Open Government Licence that has been developed for local authority spend data and general public sector re-use. This would enable people to re-use the data without unnecessary copyright restrictions.
The importance of leadership
For the proposed ‘information revolution’ to be really effective, the Department of Health will need to provide a clear vision of what the revolution is intended to achieve, and what success will look like. There must also be leadership at the highest level to encourage and support the release of data.
The Web launched an information revolution twenty years ago. Publishing data in line with the Public Data Principles will continue this revolution. In the original Web the author of a document could have no idea how that content would be used and repurposed, exploited and enriched by others. So it is with public data available on the Web. Open health data has the opportunity to bring forth new applications, new levels of transparency, new forms of patient empowerment, and to deliver new insights and better services throughout the health sector.