Earlier this year, I discussed the CRAAP test for spatial data quality, focusing on measures of Currency, Relevance, Authority, Accuracy, and Purpose. Since then, data quality has been a topic of discussion more frequently than ever before, not just in GIS circles but in the general daily news. Why is data quality important, and how can it be measured? I therefore thought it appropriate to create a new video reflecting upon some of these considerations.
We can download a wide variety of data; we can also stream data from a variety of sources that Jill Clark and I describe in this blog and in our book The GIS Guide to Public Domain Data. As data become easier to use, they become easier to misuse. It is easy to pull data from a variety of different sources, scales, dates, organizations, and lineages without a second thought, and then use those disparate data sources to make a key decision.
Don’t get me wrong: I don’t pine for the days when simply getting any data set into a GIS environment was a long, laborious process. I still vividly recall, for example, the month-long effort I went through in spring 1993 to get one county’s worth of census tract demographic data, plus streets and the census tract polygons, into ArcInfo version 4. I love the ability we have today to quickly gather and analyze data, and more and more of that is possible in a cloud-based environment. I just want people to be more mindful than ever about the implications of making decisions with GIS. All of those decisions are ultimately based on the data that were used as inputs. And the above test is one way to assess whether those data are any good.
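One way to make the CRAAP test routine is to keep a short written review alongside each dataset before it goes into an analysis. The following is only a minimal sketch in Python, not an established tool: the `CraapReview` class, the `unanswered` helper, and the sample notes are all hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CraapReview:
    """Free-text notes for one dataset (hypothetical structure).

    A value of None means the question has not been answered yet.
    """
    currency: Optional[str] = None   # How recent are the data?
    relevance: Optional[str] = None  # Do they fit the question and scale?
    authority: Optional[str] = None  # Who produced them?
    accuracy: Optional[str] = None   # How were they collected and checked?
    purpose: Optional[str] = None    # Why were they created?


def unanswered(review: CraapReview) -> List[str]:
    """Return the names of the criteria that still have no notes."""
    return [f.name for f in fields(review) if not getattr(review, f.name)]


# Illustrative, made-up notes for a census tract layer.
tracts = CraapReview(
    currency="boundaries from the most recent census",
    authority="national statistics agency",
)
print(unanswered(tracts))  # ['relevance', 'accuracy', 'purpose']
```

A list that comes back non-empty is simply a reminder that some questions about the data remain open before a decision rests on them.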
An interesting review of open data portals appeared on the MangoMap blog site last week. As I read the article I couldn’t help but agree with just about every point the author was making. In particular:
- frustrations with the overall experience
- being ‘designed by committee and crippled by bureaucracy’
- lack of access to the raw data
We commented on similar frustration last year in the post Data data everywhere, not any point to map, bemoaning the amount of time that was wasted on fruitless data searches.
However, there are some data portals that are emerging as the exceptions to this norm. The MangoMap blog directs readers to the portal hosted by the Province of New Brunswick, where a data catalogue is provided in the form of a simple grid listing of dataset names in alphabetic order, formats and pricing. No obscure folder structures to navigate, no in-house naming conventions to translate and no proprietary formats to confound.
Given the number of datasets that are currently available from this site, the grid works well, but if and when more datasets are made available, additional search options may be useful to quickly refine the selection. That said, the New Brunswick portal is refreshingly easy to use and demonstrates what is possible with some thought.
Another data portal that merits a mention is the beta release of the European Union Open Data Portal. Launched early in 2013, the EU portal provides a single access point for a variety of data from the institutions and organizations in the EU.
Two clicks, one search and I had a short list of candidates ... and the data are available to download in a variety of open formats. It isn’t rocket science.
The mission of the Western Wildland Environmental Threat Assessment Center is to generate and integrate knowledge and information to provide credible prediction, early detection, and quantitative assessment of environmental threats in the western United States. The Center hosts a geospatial search engine that is remarkably detailed and deserves a review in this blog.
To use the geospatial search engine, go to http://www.wwetac.us/GSE/GSE.aspx. After entering a search term such as “rivers”, you will receive results by data type and service type, such as WMS, ArcGIS Server, ArcIMS, and Shapefile. These terms and the interface indicate that the site is obviously designed for GIS users, and some knowledge of data formats and services is necessary before using the site. Resources are returned for more than the region of the western USA, despite the site’s name.
Sadly, this is another site where the user’s expectations are higher than the results obtained from the site. The site is somewhat unresponsive. A little persistence, and experimenting with different browsers, yields some useful results. But the main challenges with the site are that (1) there does not seem to be a browse function, so search terms need to be in mind beforehand; and (2) so much data is returned that the user needs to wade carefully through the stack, and choose search terms carefully. Still, the resource might be useful as a backup or “last resort.” Choices in data and how to obtain it are usually good things to know about.
During the recent Open Knowledge Conference in Geneva, the Swiss Federal Government announced the launch of a pilot release of its new open data portal. The pilot project will initially run from September 2013 for at least six months, with a possibility of extension after the trial period ends. Working in conjunction with a number of public agencies, such as the Federal Office of Meteorology and the Federal Office of Topography, the project aims to test the feasibility of providing a single interface to government data and to establish the foundation for a national Open Government Data Portal.
The currently available datasets are grouped into six main categories – population, education and science, legislation, policy, territory and environment, and management. All the data is available free of charge and may be downloaded as required. Download formats include .xls files, georeferenced TIF files, .pdf and .txt files. The geographic datasets available include national borders, municipal and district boundaries, cantonal boundaries, and postcodes and place names.
The pilot portal also provides a set of pre-built applications and visualizations for certain datasets.