When the GIS Guide to Public Domain Data was published in 2012, we produced an accompanying set of exercises to help illustrate some of the issues that could be encountered when locating, manipulating and analysing public domain spatial data. Among the issues we discussed were the problems of data sources disappearing or data portals that were no longer maintained.
As a number of the online resources we used for the original exercises have not been immune to such changes, we have updated the exercises to provide modified or alternate data resources for the activities. The new exercises and the answer key (.doc format) are available to download from Google Drive (no password required).
A theme running throughout our book The GIS Guide to Public Domain Data is to be critical of the data that you are using, even data that you are creating. Thanks to mobile technologies and the evolution of GIS to a Software as a Service (SaaS) model, anyone can create spatial data, even from a smartphone, and upload it to the GIS cloud for anyone to use. This has led to incredibly useful collaborations such as OpenStreetMap, but this ease of data creation means that caution must be employed more than ever before, as I explain in this video.
For example, examine a map that I created using Motion X GPS on an iPhone and mapped using ArcGIS Online. It is shown below, or you can interact with the original map if you prefer. To do so, access www.arcgis.com/home (ArcGIS Online) and search for the map entitled “Kendrick Reservoir Motion X GPS Track” or go directly to http://bit.ly/Rx2qVp. Open the map. The map shows a track that I collected around Kendrick Reservoir in Colorado, USA, symbolized by the time of GPS collection: the dots grade from yellow to progressively darker blue as time passes.
Note the components of the track to the northwest of the reservoir. These pieces, indicated by their yellow color, were generated when the smartphone had just been turned on and the track first began. They are erroneous segments and track points: notice how the track cuts across the terrain rather than following city streets or sidewalks. Change the base map to a satellite image; cutting across lots would not have been possible on foot, given the fences and houses obstructing the path. When I first turned on the smartphone, few GPS satellites were in view. As I kept walking and remained outside, the phone acquired more GPS satellites, and as the number of satellites increased, the trilateration (often loosely called triangulation) improved, and the positional accuracy improved until the mapped track points closely represented my true position on the Earth’s surface.
Use the distance tool in ArcGIS Online to answer the following question: How far were the farthest erroneous pieces from the lake? Although it depends on where you measure from, some of the farthest erroneous pieces were about 600 meters from the lake. Click on each dot to access the date and time each track point was collected. How long did the erroneous collection continue? Again, it depends on which points you select, but the erroneous components lasted about 10 minutes. At what time did the track begin correctly following my walk around the lake? This occurred at 11:12 a.m. on the day of the walk. [Take note of the letters I drew along the southwest shore of the reservoir!]
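The distance and timing checks above can also be reproduced offline with a short script once you have read coordinates and timestamps from the track points' pop-ups. The sketch below is only an illustration: the coordinates and timestamps are hypothetical stand-ins, not values taken from the actual map.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


# Illustrative (not actual) coordinates: a reference point on the lake
# shore and one of the erroneous track points to the northwest.
lake = (39.7100, -105.0800)
stray = (39.7145, -105.0840)
print(round(haversine_m(*lake, *stray)), "metres")

# Elapsed time between the first and last erroneous track points
# (hypothetical timestamps for illustration).
start = datetime(2012, 7, 14, 11, 2)
end = datetime(2012, 7, 14, 11, 12)
print((end - start).total_seconds() / 60, "minutes")
```

A point-by-point version of the same calculation (distance and time between consecutive fixes) is also a quick way to flag stray segments automatically: any pair of fixes implying faster-than-walking speed is suspect.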
This simple example points to the serious concern about the consequences of using data without being critical of its source, spatial accuracy, precision, lineage, date, collection scale, methods of collection, and other considerations. Be critical of the data, even when it is your own!
In the past, we have written about Robin Smith’s free geospatial data listing. Dr Karen Payne at the University of Georgia has published a geospatial data list which is also quite useful. Her list of geodata links is published on Google Spreadsheets and contains over 1,000 links to different portals, data types, and services. Because the list is organised as a spreadsheet, be sure to investigate each of its tabs. Categories include scale (global, regional, country), theme (disaster, imagery, physical, conservation), data type (web apps, tabular, live services), and more. The list focuses on freely available data sets used in international humanitarian work, which is the major concentration of Dr Payne’s own work. The challenge with all GIS data listings, as we point out in our book, is the updating and curation of such lists, but Dr Payne is committed to maintaining this one, as is evident in the breadth and scope of the listing and in my conversations with her.
Dr Payne is also working with the United Nations, converting their Common Operational Datasets into services: specifically, populated places and administrative boundaries, together with their names and codes. This effort could prove to be very helpful to all of us in the geospatial technology community.
The Open Data Institute (ODI), founded by Sir Tim Berners-Lee and Prof. Nigel Shadbolt, has been working collaboratively with many partners around the globe to develop a network of open data ‘Nodes’. Nodes, which aim to bring individuals and organisations together to collaborate and promote the use of open data in business, government and education, are split into three levels:
- Country: Independent NGOs building national centres of excellence, working across public and private sectors, NGOs, educational institutions and other Nodes within a country.
- City or Regional: Nodes that deliver projects and can provide training, research, and development. For example, ODI Dubai, ODI Chicago, ODI North Carolina, ODI Paris, ODI Trento, ODI Brighton, ODI Manchester, and ODI Leeds.
- Communications: Nodes that promote global open data case studies. For example, ODI Moscow, ODI Buenos Aires and ODI Gothenburg.
Although not a data portal, the ODI provides a variety of resources for those who work with open data, including research into how open data is used, how it is published and how to certify open data. Given the current plethora of data sites and portals, not all of which are well thought out or useful, as we have commented before on this blog, this invaluable resource on data trends and issues provides many useful references for those working with the various types of open data, including location-based data. For example, a recent blog post from ODI North Carolina discussed how important quality is for open data.
It is always helpful for those who are considering working with open data, or who are in the process of collecting and publishing open data, to benefit from the experiences of others. Given the ease with which data can be published online these days, the next challenges are to provide data that are easy to find, well documented, current, accurate and, ultimately, useful. As Charlie Ewen (UK Met Office) remarked, ‘Digital isn’t done once you have a website’.
A new web resource from Texas Tech University covering playas and wetlands in the southern High Plains region of Texas, Oklahoma and New Mexico offers a wide variety of spatial data on this key resource and region. The playa and wetlands GIS data are available for download here, in shapefile, geodatabase, and layer package formats. The data include 64,726 wetland features, of which 21,893 are identified as playas and another 14,455 as unclassified wetlands; in other words, features that appear to be playas but show no evidence of a hydric soil. The remaining features include impoundments, riparian features, lakes, and other wetlands.
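Once the attribute data are downloaded, the published breakdown is easy to sanity-check with a few lines of code. The sketch below is hypothetical: the `wetland_class` field name and the sample records are stand-ins, since the actual schema of the geodatabase may use different names.

```python
from collections import Counter

# Hypothetical records standing in for the shapefile's attribute table;
# the real field names in the downloaded data may differ.
features = [
    {"id": 1, "wetland_class": "playa"},
    {"id": 2, "wetland_class": "unclassified"},
    {"id": 3, "wetland_class": "playa"},
    {"id": 4, "wetland_class": "impoundment"},
]

# Tally features by class, as you would for the full attribute table.
counts = Counter(f["wetland_class"] for f in features)
print(counts)

# The published totals imply the number of remaining features
# (impoundments, riparian features, lakes, and other wetlands):
remaining = 64726 - 21893 - 14455
print(remaining)
```

The same tally, run against the real attribute table, is a quick way to confirm that a download arrived intact before starting any analysis.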
As we discuss in our book, (1) many spatial data repositories seem to have been created without the GIS user in mind. Not this one: careful attention has been paid to the needs of the data analyst. That’s good news! (2) Resources such as this don’t appear without a great deal of time and expertise invested. Here, approximately 5,000 person-hours were dedicated to creating the geodatabase and website. This project was made possible by Texas Tech University with funding from the USDA Agricultural Research Service – Ogallala Aquifer Program.
For users who only wish to view playas and other wetlands, a web map application can be launched via the playa viewer. In a “citizen science” touch, the viewer allows interactive comments to be added to the map for future consideration.
Southern Ogallala Aquifer Playa and Wetlands Geodatabase.
Just as the open government data and free public access movement continues to go from strength to strength, it seems that personal data could soon be a new currency in the digital information markets, where companies and other interested parties bid for the right to use that data for their own purposes.
Jacopo Staiano at the University of Trento in Italy recently conducted an experiment to gauge the perceived value of personal location information. The study, reported in the MIT Technology Review, involved 60 participants using smartphones that collected a variety of information, including the number of calls made, the applications used, the participant’s location throughout the day and the number of photographs taken. Using an auction system, the participants were given the opportunity to sell either the raw data or the data after it had been processed in some way to add value. Of all the information collected during the experiment, personal location data emerged as the most highly valued, and, perhaps not surprisingly, those who travelled more each day generally placed a higher value on their location data than those who didn’t.
The valuable insights into personal behaviour and preferences provided by such information are what compel marketers to find ever more pervasive ways to tap into that resource. Mobile location-aware applications and services are now commonplace, and for many of them recording location data is the default setting; users have to proactively opt out to avoid being tracked. During the course of the experiment the participants were also asked whom they trusted most to manage their personal location data; the responses indicated concerns about the trustworthiness of financial institutions, telecom companies and insurance companies when it came to collecting and using this information.
The research suggests the emergence of ‘…a decentralised and user-centric architecture for personal data management’, one that gives users more control over what data is collected, how it is stored and who has access to it. The study also reports that several research groups are already starting to design and build such personal data repositories, and it is increasingly likely that some type of market for personal location information will soon emerge.
One of the most useful sites of the past 15 years for GIS users, in my judgment, has been the National Atlas of the United States. It contains a “map maker” that allows you to create online maps of climate, ecoregions, population, crime, geology, and many other layers, and a “map layers” repository that houses all of the raster and vector data layers that are displayable in the map maker. All of those hundreds of layers are downloadable in standard formats that are easy to use with GIS.
Sadly, the National Atlas is scheduled to disappear on 30 September 2014. According to the transition FAQ, “the National Atlas and The National Map will transition into a combined single source for geospatial and cartographic information. This transformation is projected to streamline access to maps, data and information from the USGS National Geospatial Program (NGP). This action will prioritize our civilian mapping role and consolidate core investments while maintaining top-quality customer service.” Thus, the National Map is scheduled to be the content delivery mechanism for the National Atlas content.
But data users take note: not all of the National Atlas content is migrating to The National Map. To the FAQ question “Will I still be able to find everything from the National Atlas on The National Map web site?”, the answer is, “No. Most National Atlas products and services that were primarily intended for a broad public audience as well as thematic data contributions from outside the National Geospatial Program (NGP) will not be available from nationalmap.gov.”
I think this is most unfortunate news. In my opinion, and in that of many students, educators, and other data users I have worked with in courses and institutes over the years, The National Map is almost as clunky and difficult to use as it was 10 years ago. I use it frequently because it is still one of the richest sources of data, but it is by no means easy to obtain that data. Equally importantly, it serves a different audience than the National Atlas does. Yes, the National Atlas viewer is dated, but it requires little bandwidth, making it accessible to schools and other institutions contending with poor connectivity. How much effort would be required simply to leave the National Atlas alone and keep it online, with an understanding that it will not be updated?
In an era when more geospatial data are needed, not fewer, and improved geographic literacy is increasingly critical to education and society, the disappearance of the National Atlas seems like a giant step backward.