Despite the growing volume of geospatial data available, and the ease of use of much of this data, finding and using data remains a challenge. To help data users meet these ongoing challenges, I have written a new activity entitled “Key Strategies for Finding and Using Spatial Data.” The goal of the activity is to enable GIS data users to understand what spatial analysis is, effectively find and use spatial data, and become familiar with the ArcGIS platform in the process. I tested the activity with a group of GIS educators and would now like to share it with the broader GIS community.
The document makes it clear that we are still in a hybrid world: still needing to download some data for our work in GIS, but increasingly able to stream data from online data services such as those in ArcGIS Online. These concepts don’t fully make sense, however, until one actually practices them, hence the activity.
In the activity, I ask the user to first practice search strategies in ArcGIS Online, using tags and keywords. Then, I guide the user through the process of downloading and using a CSV file containing real-time data. After a brief review of data types and resources, I guide the user through the process of downloading data from a local government agency to solve a problem about flood hazards. The next step asks users to compare this download process with streaming the same data from the same local government’s site (in this case, Boulder County, Colorado) into ArcGIS Online. The activity concludes with resources for discovering more about these methods of accessing data.
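To see why a CSV with latitude and longitude columns is so easy to bring into a web map, consider what a portal does behind the scenes: it reads each row and builds a point feature from the coordinate columns. The sketch below illustrates that step with a made-up sample feed; the column names and values are assumptions for illustration only, not the schema of any actual real-time service.

```python
import csv
import io

# Hypothetical sample of a real-time CSV feed (e.g., event locations).
# Column names here are illustrative assumptions, not an actual feed schema.
sample_csv = """id,latitude,longitude,magnitude
eq1,40.0150,-105.2705,2.1
eq2,39.7392,-104.9903,3.4
"""

def csv_to_features(text):
    """Turn each CSV row into a simple point-feature dictionary,
    roughly the structure a GIS portal builds when a CSV with
    lat/lon columns is added to a web map."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({
            "geometry": {
                "type": "Point",
                # GeoJSON convention: [longitude, latitude]
                "coordinates": [float(row["longitude"]), float(row["latitude"])],
            },
            "properties": {"id": row["id"], "magnitude": float(row["magnitude"])},
        })
    return features

features = csv_to_features(sample_csv)
print(len(features))  # 2
```

The same logic explains why a CSV without clearly named coordinate columns maps poorly: the portal has nothing to build geometry from.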
Jill Clark and I have created other hands-on activities on this theme of finding and understanding data as well, available here. We look forward to hearing your comments, and we hope this new activity is useful.
One of the exercises in our book involves accessing Boulder County, Colorado’s GIS site to make decisions about flood hazards. We chose Boulder County for this activity in large part because its data cover a wide variety of themes, are quite detailed, and are easy to download and use. Recently, Boulder County went even further with the launch of its new geospatial open data platform. This development follows other open data efforts we have written about in this blog, such as ENERGIC OD, ArcGIS Open Data, EPA flood risk data, the Australian national map initiative, and the Open Data Institute nodes. Other open data nodes are linked to a live web map on the ArcGIS Open Data site.
Accessible here, Boulder County’s open data platform expands the usability of the data, for example by providing previews of the data in mapped and tabular form. The new platform allows additional data themes to be accessed, such as lakes and reservoirs, the 2013 flood channel, the floodplain, and streams and ditches, all discoverable through a search on “hydrography.” Subsets of large data sets can also be accessed. In addition, services for each data set are now provided, such as in GeoJSON and GeoService formats, which allows the data to be streamed directly into portals such as ArcGIS Online, avoiding the need to download the data sets altogether.
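Streaming rather than downloading usually comes down to a single request against a feature service’s query endpoint, asking the server to return GeoJSON. The sketch below shows the shape of such a request; the service URL and layer are placeholders, not Boulder County’s actual endpoint.

```python
from urllib.parse import urlencode

# Sketch of a client requesting a hosted layer as GeoJSON through an
# ArcGIS REST API query endpoint. This URL is a placeholder for
# illustration, not an actual county endpoint.
service_url = ("https://services.example.com/arcgis/rest/services/"
               "Hydrography/FeatureServer/0/query")

params = {
    "where": "1=1",       # no attribute filter: return all features
    "outFields": "*",     # return all attribute fields
    "f": "geojson",       # ask the server for GeoJSON output
}

request_url = service_url + "?" + urlencode(params)
print(request_url)
```

A web map or desktop client issuing this kind of request gets the current features each time, which is exactly why the streamed layer stays up to date while a downloaded copy does not.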
Why did the county do this? Boulder County says they are “committed to ensuring that geospatial data is as open, discoverable and usable as possible in order to promote community engagement, stimulate innovation and increase productivity.” The county is providing an incredibly useful service to the community through their newest innovative efforts, and I congratulate them. I also hope that more government agencies follow their lead.
The UK Government’s Department for Environment, Food and Rural Affairs (Defra) recently announced the release of a LIDAR point cloud, the raw data used to generate a number of digital terrain models (DTMs) released last year. In addition to providing terrain models for flood modelling and coastline management, the LIDAR data have also revealed much about long-buried Roman roads and buildings, such as the Vindolanda fort just south of Hadrian’s Wall in northern England.
Environment Agency/Defra LIDAR data
The point cloud data have been released as part of the #OpenDefra project, which aims to make 8,000 datasets publicly available by mid-2016. The first release of point cloud data contains over 16,000 km² of survey data and is available to download from:
The data are licensed under version 3.0 of the Open Government Licence.
Data discoverability, accessibility, and integration are frequent barriers for scientists and a major obstacle to successful environmental research. To tackle this issue, one raised in our book and in this blog, the Group on Earth Observations (GEO) is leading the development of the Global Earth Observation System of Systems (GEOSS), a voluntary effort that connects Earth observation resources worldwide, acting as a gateway between producers and users of environmental data.
Barbara Ryan, Director, GEO Secretariat, says that, “The primary goal is the assurance of Earth observations so that we can address society’s environmental problems. While many of our activities are targeted toward monitoring global change, we’re actually more concerned about the assurance, continuity, sustainability and interoperability of observing systems, so that monitoring across multiple domains can be done. Governments, research organizations and others actually do the monitoring. We just want to make sure that the assets are in place, and that the data from these monitoring efforts is shared broadly. One of GEO’s primary objectives is to advocate broad, open data sharing, particularly if the data was collected at taxpayer expense—the citizens of the world should have access to that information.”
“In this regard, during the first part of GEO, 2004-2009, we looked at the GEO mission as a massive cataloging effort. Then, about two years ago, we changed strategies. We transitioned to a brokering approach whereby interoperability agreements were established with institutions that have datasets and/or databases, rather than us seeking out individual datasets. An example of this approach is illustrated by our agreement with the World Meteorological Organization (WMO). WMO members have generally registered their data in the WMO Information System (WIS). So we worked on an interoperability arrangement between GEOSS and the WIS, resulting in data from one system being discovered by the other. We are now hearing, particularly from some members in the developing world, that they are getting access to information that they didn’t know existed.”
“WMO members are getting biodiversity and ecosystem information that wouldn’t normally be delivered through the WIS, which focuses on weather, climate and water, and GEO members are gaining increased visibility into information in the WIS. It’s a win-win story, and we’d like to have interoperability brokering agreements with any institution that wants its environmental information broadly viewed and accessible throughout the world.”
“Many of the 25 countries that produce 80% of the world’s crops have global forecasting capabilities. GEO is advocating that information from these countries be shared more broadly and openly, and that algorithms be harmonized so that forecasts are improved around the world. Global transparency will help create more stability and a more food-secure world. A related aspect of the security issue is that governments do not want another government having easy access to what is happening over their domain with the fear that this information will be used against them. While this concern is recognized, most of the information that GEO is interested in transcends national boundaries. Atmospheric, oceanic and many terrestrial processes do not respect national boundaries, and actions in one part of the world often have widespread consequences. The benefits of broader data sharing almost always outweigh the risks associated with not sharing data.”
These are welcome words to us here as authors of Spatial Reserves, and will most likely be welcome words for the entire geospatial community as well. I look forward to someday soon being able to search for and use data through GEOSS.
Over the last three years we’ve written about a few of the problems associated with some data portals, which, although well-intentioned, haven’t always provided the level of access to geospatial information that they promised. Interoperability issues, interface design and a lack of ongoing support have contributed to many such initiatives failing to deliver. With the experience gained from those earlier efforts, and perhaps the benefit of hindsight, new initiatives are being developed to provide better access to the plethora of public domain and open data geospatial information available online.
Among those new initiatives is the ENERGIC OD project (European NEtwork for Redistributing Geospatial Information to user Communities – Open Data). Launched at the end of 2014, the project aims to address some of the problems that have resulted from the evolution of disparate and heterogeneous GI systems and technologies by providing what are referred to as Virtual Hubs. These hubs will provide a single point of access to geospatial datasets, including access to INSPIRE-compliant systems and Copernicus satellite and sensor data (Copernicus was previously known as GMES). The brokering framework at the centre of the solution will allow the hubs to connect to a wide range of European data sources, making it easier for end users, public authorities, private organisations and developers alike to access the data without having to resolve the interoperability and standardisation issues themselves.
The ENERGIC OD project will run for three years and deploy five national virtual hubs in France, Germany, Italy, Poland and Spain.
According to Esri’s 2014 Open Data year in review, over 763 organizations around the world have joined ArcGIS Open Data, publishing 391 public sites and sharing 15,848 open data sets. These organizations include over 99 cities, 43 countries, and 35 US states. At the beginning of 2015, 390 of the organizations were from North America, 157 from Europe, 121 from Africa, 39 from Asia, and 22 from Oceania. Over 42,000 shapefiles, KML files, and CSV files have been downloaded from these sites since July 2014. Recently, we wrote about one of these sites, the Maryland Open Data Portal, in this blog. Another is the set of layers from the city of Launceston, in Tasmania, Australia.
While these initiatives use one specific set of methods and tools to share data, that of ArcGIS Open Data, the implications for the data user community are profound. First, the adoption of ArcGIS Open Data increases availability for the entire user community, not just Esri users. This is because of the increased number of portals that result, and also because the data sets shared, such as raster and vector data services, KMLs, shapefiles, and CSVs, are in formats that can be consumed by many types of online and desktop GIS tools. Second, as we have expressed in our book and in this blog, while regional, national, and international government organizations have made noble attempts for 30 years to establish standards, to share data, and to encourage a climate of sharing, and while many of those attempts were and will continue to be successful, the involvement of private industry (in this case, Esri), nonprofit organizations, and academia will lend an enormous boost to government efforts.
Third, the advent of cloud-based GIS enables these portals to be fairly easily established, curated, and improved. Using the ArcGIS Open Data platform, organizations can leave their data where it is, whether on ArcGIS for Server or in ArcGIS Online, and simply share it as Open Data. Esri uses Koop to transform data into different formats, to access APIs, and to get data ready for discovery and exploration. Organizations add their nodes to the Open Data list, and their data can then be accessed, explored, and downloaded in multiple formats without “extraneous exports or transformations.” Specifically, organizations using ArcGIS Open Data first enable the open data capabilities, then specify the groups for open data, then configure their open data site, and finally make the site public.
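The idea behind a broker like Koop can be illustrated in a few lines: the same features held in one format can be translated on the fly into another, without the publisher exporting anything by hand. The toy sketch below flattens a GeoJSON feature collection into CSV; it is a simplified illustration of the concept, not Koop’s actual code or API.

```python
import csv
import io
import json

# A small GeoJSON feature collection, inlined for illustration.
geojson = json.loads("""{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-105.27, 40.01]},
     "properties": {"name": "Site A"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-104.99, 39.74]},
     "properties": {"name": "Site B"}}
  ]
}""")

def features_to_csv(fc):
    """Flatten point features into CSV rows: one attribute column
    plus the coordinates, the kind of on-the-fly translation a
    broker performs when a user requests a different format."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "longitude", "latitude"])
    for feat in fc["features"]:
        lon, lat = feat["geometry"]["coordinates"]
        writer.writerow([feat["properties"]["name"], lon, lat])
    return buf.getvalue()

csv_text = features_to_csv(geojson)
print(csv_text)
```

Because the translation happens at request time, the publisher maintains one authoritative copy while users pick whichever format their tools consume.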
One of the chief ways I see tools like ArcGIS Open Data advancing the open data movement is by providing tools that are easy to use and that will evolve over time. Nobody has an infinite amount of time to figure out how best to serve their organization’s data and then to construct the tools for doing so. The ability of data-producing organizations to use these common tools and methods represents, I believe, an enormous advantage in time savings. As more organizations realize and adopt this, all of us in the GIS community, and beyond, will benefit.
The signing of the Open Data Charter by G8 leaders in 2013 promised to make public sector data open, free of charge and available to all in re-usable formats. However, despite the attention open data subsequently received, a recent report by the World Wide Web Foundation (featured in a BBC article) highlighted some ongoing problems in making the pledges enshrined in the Open Data Charter a reality. Many countries have failed to deliver what the report referred to as a policy framework for open data.
Although the UK and USA were at the top of the global rankings for countries providing access to open data, they and many other countries still have a lot of work to do before they can claim to have fully open government. Of particular note in the UK is the ongoing debate over access to the Royal Mail’s Postcode Address File (PAF). Although the PAF dataset is cited as the ‘definitive source of postal address information’ in the UK and used in many digital mapping applications, the current charges and licensing arrangements deter many potential users. Many commentators have argued that the PAF dataset could become the standard address resource for commercial and non-commercial uses in the UK if it were made available in an easy-to-use, open format. This would encourage much wider adoption of the dataset and prevent the further proliferation of alternative sources of address information. With the spotlight back on open access to address data, will 2015 be the year the PAF joins the growing list of open, and free of charge, spatial datasets?
The Environment Agency announced at the end of 2014 that it was releasing the Risk of Flooding from Rivers and Sea dataset (formerly known as the National Flood Risk Assessment dataset, NAFRA) as an open data resource. The flood data are to be made available under the Open Government Licence (OGL) and provide an indication of the likely flood risk (low, moderate or significant) from rivers and sea. The data are available to download from the data.gov.uk site, and the Environment Agency also plans to publish the data on their DataShare site as soon as possible.
Important as this latest open data resource is, especially given the extent and severity of flooding in many parts of England last winter, the usefulness of this type of flood data is often best illustrated in combination with other datasets such as flood outlines and waste site boundaries; the factors contributing to flood risk are both complex and varied. However, many of these other datasets are not available under the same open licence agreement and are subject to restrictions on commercial use and re-sharing. This variation in licensing poses a number of issues for data analysts working to provide holistic interpretations of past trends and recent events, and potentially limits both the scope of the analysis and the audience for the results.
The release of the flood risk data under the OGL is a significant move for the Environment Agency; will this prompt the release of other related environmental datasets under the same open access licence?
The Open Data Institute (ODI), founded by Sir Tim Berners-Lee and Prof. Nigel Shadbolt, has been working collaboratively with many partners around the globe to develop a network of open data ‘Nodes’. Nodes, which aim to bring individuals and organisations together to collaborate on and promote the use of open data in business, government and education, are split into three levels:
- Country: Independent NGOs building national centres of excellence, working across public and private sectors, NGOs, educational institutions and other Nodes within a country.
- City or Regional: Deliver projects, and can provide training, research, and development. For example, ODI Dubai, ODI Chicago, ODI North Carolina, ODI Paris, ODI Trento, ODI Brighton, ODI Manchester, and ODI Leeds.
- Communications: Promoting global open data case studies. For example, ODI Moscow, ODI Buenos Aires and ODI Gothenburg.
Although not a data portal, the ODI provides a variety of resources for those who work with open data, including research into how open data is used, how it is published and how to certify open data. Given the current plethora of data sites and portals, not all of which are well thought out and useful, as we have commented before on this blog, this invaluable resource of data trends and issues provides many useful references for those working with various types of open data, including location-based data. For example, a recent blog post from ODI North Carolina discussed how important quality is for open data.
It is always helpful for those who are considering working with open data, or who are in the process of collecting and publishing open data, to benefit from the experiences of others. Given the ease with which data can be published online these days, the next challenges are to provide data that are easy to find, well documented, current, accurate and, ultimately, useful. As Charlie Ewen (UK Met Office) remarked, ‘Digital isn’t done once you have a website’.
Following on from last week’s post on the National Atlas and changes to the National Map in the USA, the Australian Government has recently announced the National Map Open Data Initiative to provide improved access to publicly available government datasets. A beta version of the National Map website, hosted by NICTA (Australia’s Information and Communications Technology (ICT) Research Centre of Excellence), is now available and provides map-based access to a variety of Australian spatial data from government agencies including the Bureau of Meteorology, the Bureau of Statistics and data.gov.au. The currently available data themes include:
- Broadband – availability and quality across Australia
- Land – including cover, geology and earthquake hazards
- Transport – roads, railways, foot tracks
- Infrastructure – waste management, wind pumps, mines and wells
- Groundwater – aquifers, salinity
Developed on an open source platform, the National Map site will ultimately be hosted by Geoscience Australia when it goes into full production later in 2014.
Visitors to the site can load their own data, either as a data file or WMS/WFS service, or download data (supported formats include GeoJSON, KML, KMZ and CSV) subject to the licensing arrangements of the data providers.
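When a WMS layer is added to a viewer like the National Map, the client ultimately issues a standard GetMap request for a rendered image of the layer over a bounding box. The sketch below builds such a request URL; the server address and layer name are placeholders invented for illustration, not the initiative's actual endpoints.

```python
from urllib.parse import urlencode

# Placeholder WMS endpoint and layer name, assumed for illustration.
base_url = "https://data.example.gov.au/geoserver/wms"

params = {
    "service": "WMS",
    "version": "1.3.0",
    "request": "GetMap",
    "layers": "broadband_availability",   # hypothetical layer name
    "crs": "EPSG:4326",
    # In WMS 1.3.0 with EPSG:4326, the axis order is lat,lon:
    # min lat, min lon, max lat, max lon (roughly covering Australia).
    "bbox": "-44.0,112.0,-10.0,154.0",
    "width": "800",
    "height": "600",
    "format": "image/png",
}

getmap_url = base_url + "?" + urlencode(params)
print(getmap_url)
```

A WFS request works the same way but asks for the features themselves (e.g., as GeoJSON) rather than a rendered image, which is why WFS layers can be restyled and queried client-side while WMS layers cannot.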