New harvesting

Categories: Open Data

New 'harvesters' are being launched to help public bodies add their datasets to the national portal in bulk. In particular, this helps Local Authorities that publish data on their local websites to also list those datasets automatically on the national portal.

Public bodies have long been able to add datasets to the portal one at a time using a web form. Bulk addition has also been possible for location/INSPIRE data, by using the harvester to collect datasets stored in the public body's GIS system. Now the harvesters have been extended to work for non-location datasets, so sets of data records (sometimes called 'inventories') can be added to the portal in bulk.

To work with the harvester, a public body's set of records is published on the internet in any one of the recognized formats. These formats are the ones most commonly produced by 'open data websites' such as CKAN, DKAN, Socrata or DataShare.
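To give a feel for what "harvesting" a set of records involves, here is a minimal Python sketch of paging through the dataset records exposed by a CKAN site. It is an illustration only, not the portal's harvester: it assumes the source site offers the standard CKAN Action API `package_search` endpoint, and the function names (`build_search_url`, `fetch_json`, `collect_records`) and the `example.org` address are hypothetical. Which formats and options the real harvesters accept is set out in the technical guide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, organization=None, start=0, rows=100):
    """Build a CKAN Action API package_search URL.

    If `organization` is given, the search is limited to that publisher
    by way of CKAN's `fq` filter-query parameter.
    """
    params = {"start": start, "rows": rows}
    if organization:
        params["fq"] = f"organization:{organization}"
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def collect_records(base_url, organization=None, rows=100, fetch=fetch_json):
    """Page through package_search results and return every dataset record.

    `fetch` is injectable so the paging logic can be exercised offline.
    """
    records = []
    start = 0
    while True:
        payload = fetch(build_search_url(base_url, organization, start, rows))
        if not payload.get("success"):
            raise RuntimeError("CKAN API reported a failure")
        result = payload["result"]
        page = result["results"]
        records.extend(page)
        start += len(page)
        # Stop on an empty page or once every matching record has been seen.
        if not page or start >= result["count"]:
            break
    return records
```

A call such as `collect_records("https://example.org", organization="my-council")` (both values are placeholders) would return only that publisher's records from a shared CKAN site. Whether the portal's own harvesters can be limited in this way is not stated here.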

The launch of the new harvesters is timed to help Local Authorities publish metadata on the portal about the datasets they are required to publish locally under the Local Government (Transparency Requirements) Regulations. Local authorities are also encouraged to publish their open datasets on the portal through the Local Open Data Incentive Scheme. Additional fields that are specific to local authorities are imported by all of these harvesters: function and service categories, and data schemas. The portal is being enhanced to make good use of this extra metadata, to aid navigation by dataset type and to validate data automatically against schemas.

NB The new harvesters and this guide are only for datasets that are not covered by the INSPIRE legislation.

For full details, consult the technical guide: Harvesting dataset records into

Sharing and comments


  1. Comment by Adrian Marsden posted on

    ...any hints about easy ways to create the metadata? All very well if we can automate it all apart from this part.

  2. Comment by peelm posted on


    I've read the technical guide for more information, and this looks fairly straightforward.

    We publish some of our data via DataGM, a CKAN-based regional datastore. If I want to use the harvester to upload all of our data onto the portal, would I just give the URL to our page on DataGM, or would this still pull all datasets from the whole site?