data.gov.uk is announcing the launch of new 'harvesters' to help public bodies add their datasets to data.gov.uk in bulk. In particular, this helps local authorities that already publish data on their own websites to list those datasets automatically on the national portal as well.
Public bodies have long been able to add datasets one at a time to data.gov.uk by using a web form. In addition, bulk addition has been possible for location/INSPIRE data, by using the data.gov.uk harvester to collect datasets stored in the public body's GIS system. Now the data.gov.uk harvesters have been extended to work for non-location datasets. This means that sets of data records (sometimes called 'inventories') can be added to data.gov.uk in bulk.
To work with the harvester, a public body's set of records must be published on the internet in any one of the recognized formats. These formats are the ones most commonly produced by 'open data websites' such as CKAN, DKAN, Socrata or DataShare.
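As a rough illustration of what reading one of these formats can involve, here is a minimal sketch of turning a CKAN-style API search response into simple dataset records. It is not the data.gov.uk harvester's actual code. The sample payload, the dataset name, the example.org URL and the function name `extract_records` are all hypothetical; only the general response shape (a `success` flag and a `result` object holding a `results` list) follows the CKAN action API.

```python
import json

# Hypothetical sample, shaped like a CKAN `package_search` API response.
SAMPLE_RESPONSE = json.loads("""
{
  "success": true,
  "result": {
    "count": 1,
    "results": [
      {
        "name": "example-spending-over-500",
        "title": "Example spending over 500",
        "resources": [
          {"url": "http://example.org/spend.csv", "format": "CSV"}
        ]
      }
    ]
  }
}
""")


def extract_records(response):
    """Reduce a CKAN-style search response to simple dataset records.

    Illustrative helper only (an assumption, not part of any harvester).
    """
    if not response.get("success"):
        raise ValueError("CKAN API reported failure")
    records = []
    for dataset in response["result"]["results"]:
        records.append({
            "name": dataset["name"],
            "title": dataset.get("title", ""),
            # Keep only resources that actually carry a URL.
            "resource_urls": [
                r["url"] for r in dataset.get("resources", []) if r.get("url")
            ],
        })
    return records


if __name__ == "__main__":
    for record in extract_records(SAMPLE_RESPONSE):
        print(record["name"], record["resource_urls"])
```

A real harvester would also need to fetch the response over HTTP, page through large result sets and map further metadata fields; those steps are left out here.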
The launch of the new harvesters is timed to support local authorities in publishing metadata on data.gov.uk about the datasets that they are required to publish locally under the Local Government (Transparency Requirements) Regulations. Local authorities are also encouraged to publish their open datasets on data.gov.uk through the Local Open Data Incentive Scheme. All of these harvesters import additional fields that are specific to local authorities: function and service categories, and data schemas. data.gov.uk is being enhanced to make good use of this extra metadata, to aid navigation by dataset type and to validate data automatically against schemas.
NB The new harvesters and this guide are only for datasets that are not covered by the INSPIRE legislation.
For full details, consult the technical guide: Harvesting dataset records into data.gov.uk
2 comments
Comment by Adrian Marsden posted on
Any hints about easy ways to create the metadata? It's all very well if we can automate it all apart from this part.
Comment by peelm posted on
Hi,
I've read the technical guide for more information, and this looks fairly straightforward.
We publish some of our data via DataGM (see http://datagm.org.uk/), a CKAN-based regional datastore. If I want to use the harvester to upload all of our data onto data.gov.uk, would I just give the URL to our page on DataGM, or would this still pull all datasets from the whole site?
Thanks.