Archive for February, 2020

Geospatial Commission: Best practice guide for data publishers

February 24, 2020 2 comments

The UK’s Geospatial Commission and its six partner bodies, British Geological Survey, Coal Authority, HM Land Registry, Ordnance Survey, UK Hydrographic Office and Valuation Office Agency, have published a new guide with practical advice on how to optimise access to geospatial datasets across all search engines. The main recommendations highlighted in the guide are:

  • Complete all the metadata fields on your data portal.
  • Restrict page titles and URLs to 50-60 characters; longer titles tend to be truncated and longer URLs tend to be given a lower ranking.
  • Optimise your data abstracts; write clearly, write concisely.
  • Use keywords appropriately in the abstract and avoid keyword lists.
  • Avoid the use of special characters that may not display correctly on a webpage.
  • Page longevity improves search engine ranking; try to keep your URL the same even if your data changes.
  • Remove out-of-date pages and pages that are no longer maintained; these are less likely to be shown in search results than more recently updated pages.
  • Avoid metadata duplication; the same information available from multiple publishers or available in more than one place makes it likely the search results will have a lower ranking than one good metadata record.
  • Consider using user surveys, interviews and some of the analytic tools available, such as Google Analytics and Search Console, to understand what geospatial information the target audiences are looking for and finding. If necessary, adjust data portal abstracts and keywords to better reflect those search parameters.
  • Apply all abstract and keyword changes to all pages on your portal to ensure a consistent search ranking.
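
Some of these recommendations lend themselves to an automated check. The following is a minimal Python sketch, not something taken from the guide: the function name `check_page`, the 60-character limit (the upper end of the guide's 50-60 range) and the decision to treat any non-ASCII character as a "special character" are all assumptions made for illustration.

```python
import re

# Assumption: use the upper end of the 50-60 character range from the guide.
MAX_LEN = 60


def check_page(title: str, url: str) -> list[str]:
    """Return a list of warnings for a dataset page's title and URL.

    check_page is a hypothetical helper, not part of any portal software.
    """
    warnings = []
    if len(title) > MAX_LEN:
        warnings.append(f"title is {len(title)} characters (over {MAX_LEN})")
    if len(url) > MAX_LEN:
        warnings.append(f"URL is {len(url)} characters (over {MAX_LEN})")
    # Assumption: anything outside printable ASCII counts as a special
    # character that may not display correctly.
    if re.search(r"[^\x20-\x7E]", title):
        warnings.append("title contains non-ASCII characters")
    return warnings


if __name__ == "__main__":
    # Made-up example page, for illustration only.
    for warning in check_page("Flood risk zones", "https://example.org/data/flood-risk"):
        print(warning)
```

A check like this could be run over an export of a portal's page titles and URLs before publishing. It does not cover the recommendations about abstract quality, duplication or page longevity, which need human review.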

It would be interesting to know whether the Geospatial Commission and its partners will follow up with the agencies and organisations that implement these recommendations, to see how much improvement is possible in finding and accessing online geospatial datasets.

Creating fake data on web mapping services

February 16, 2020 2 comments

In keeping with this blog's theme of “be critical of the data,” consider the following fascinating recent story:  An artist wheeled 99 smartphones around in a wagon to create fake traffic jams on Google Maps.  He pulled the phones around Berlin in a little red wagon to track how they affected Google Maps’ traffic interface.  On each phone, he opened Google Maps.  As we discuss in our book, traffic and other real-time layers depend in large part on data contributed by the citizen science network: ordinary people who contribute data to the cloud, in this and other cases unintentionally.  Wherever the phones traveled, Google Maps for a while showed a traffic jam, displaying a red line and routing users around the area.

It wasn’t difficult to do, and it shows several things: (1) that the Google Maps traffic layer (in this case) was doing what it was supposed to do, reflecting what it perceived as true local conditions; (2) that it may sometimes be easy to create fake data using web mapping tools; hence, be critical of data, including maps, as we have been stating on this blog for 8 years; (3) that the IoT includes people, and at 7.5 billion strong, people have a great influence over the sensor network and the Internet of Things.

The URL of his amusing video showing him toting the red wagon around is here,  and the full URL of the story is below:

I just wonder how he was able to obtain permission from 99 people to use their smartphones.  Or did he buy 99 phones on sale somewhere?

–Joseph Kerski




Key Global Biodiversity and Conservation Data Sources

February 2, 2020 Leave a comment

Advances in the following two resources, and the sheer volume and diversity of data they contain, merit mention in this data blog, and I recommend investigating them as part of your own work.

  1. The Global Biodiversity Information Facility (GBIF) contains point data on an amazing number and diversity of species.  It also includes over 12 million research-grade observations from the iNaturalist citizen science community.
  2. IUCN:  The International Union for Conservation of Nature:  You can filter and use the data with the IUCN spatial data downloads for polygon boundary layers from their data portal.  The IUCN Red List of Threatened Species™ contains global assessments for 112,432 species. More than 78% of these (>88,500 species) have spatial data.  The spatial data provided on the site are for comprehensively assessed taxonomic groups and selected freshwater groups.  The site indicates that some species (such as those listed as Data Deficient) are not mapped and that subspecies, varieties and subpopulations are mapped within the parental species. The data are in Esri shapefile format and contain the known range of each species, although sometimes the range is incomplete. Ranges are depicted as polygons, except for the freshwater HydroBASIN tables.

To use either resource, all you need is a free account.  The data sets can be combined, after which you can examine potential outliers, perform hot spot analysis, use the data in space time cubes, create habitat suitability models and risk models, and much more.
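
As one illustration of combining the two, point observations can be compared against a range polygon to flag records that fall outside the known range. The sketch below is a simplified assumption, not a GBIF or IUCN tool: the function names are hypothetical, the coordinates are made up, and the test is a plain planar ray-casting check on a single ring that ignores holes, multipart polygons and map projections. Real work would normally be done in GIS software or a geometry library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon ring.

    polygon is a list of (lon, lat) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def flag_outliers(observations, range_polygon):
    """Return the observations that fall outside the range polygon."""
    return [
        (lon, lat)
        for lon, lat in observations
        if not point_in_polygon(lon, lat, range_polygon)
    ]


if __name__ == "__main__":
    # Made-up square "range" and made-up observation points.
    species_range = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    sightings = [(5.0, 5.0), (15.0, 5.0), (2.0, 8.0)]
    print(flag_outliers(sightings, species_range))
```

A point flagged this way is only a candidate outlier. Because the published ranges are sometimes incomplete, a record outside the polygon may be a valid observation rather than an error.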

Joseph Kerski


Some of the resources available from the Global Biodiversity Information Facility (GBIF).