This installment of Spatial Reserves is authored by Shelley James and Molly Phillips of iDigBio, Florida Museum of Natural History. We thank them very much for their contribution!
If you’ve ever needed to document where a plant or animal species occurs today, or occurred 100 years ago, perhaps the 1 billion biological specimens housed in natural history collections across the USA, and 5 billion around the world, can help! Each of these specimens imparts knowledge about its existence at a specific place and time. Fish, fossils, birds, skeletons, mushrooms, skins – all with a date and location of collection. The data, found on the labels attached to the specimens and in field notebooks and catalogues, is being transcribed by museum professionals and citizen scientists alike, revealing information about the world’s living organisms dating back to the 1600s – some with very accurate spatial data, others much less so, depending on the geographic knowledge of the collector at the time. iDigBio – Integrated Digitized Biocollections – a project supported by the US National Science Foundation – is collaborating with biological collections across the globe to help combine and mobilize voucher specimen data for research, education, and environmental management uses.
All of this biodiversity data is in a format known as Darwin Core, a standardized set of descriptors enabling biological data from different sources to be combined, indexed, and shared. The iDigBio Data Portal provides open access to this aggregated data, with filtering by type of organism, by spatial region (using latitude-longitude coordinates, polygons, or place descriptions), and many other options. The data is delivered dynamically and can be downloaded for use. Currently about 50% of the biological records in iDigBio (over 30 million records) have a geopoint and error estimate, and georeferencing is something the collections community continues to work on in order to improve this valuable dataset. Any tools or improvements to the data that the geospatial community can provide would be a great help as iDigBio expands beyond 65 million specimen records, and we invite you to join the conversation by participating in the iDigBio Georeferencing Working Group.
Pigeons and doves from around the world. The iDigBio Portal maps the distribution of species and provides specimen record details “on the fly” as filters are applied by the user. The dataset can be downloaded, or data can be accessed through the iDigBio API.
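For programmatic access of the kind the Portal offers interactively, iDigBio exposes a public search API. The sketch below builds a records query for a taxonomic family within a latitude-longitude bounding box; the endpoint and the `rq` query syntax follow iDigBio's public API, but the specific field names and values used here are illustrative assumptions – check the API documentation before relying on them.

```python
import json
from urllib.parse import urlencode

# iDigBio search API endpoint for specimen records.
IDIGBIO_SEARCH = "https://search.idigbio.org/v2/search/records/"

def build_records_query(family, bounding_box, limit=100):
    """Build a query URL for specimen records of one taxonomic family,
    restricted to a lat/lon bounding box (keys: north, south, east, west)."""
    rq = {
        "family": family,
        "geopoint": {
            "type": "geo_bounding_box",
            "top_left": {"lat": bounding_box["north"], "lon": bounding_box["west"]},
            "bottom_right": {"lat": bounding_box["south"], "lon": bounding_box["east"]},
        },
    }
    # The API accepts the record query ("rq") as a JSON-encoded parameter.
    return IDIGBIO_SEARCH + "?" + urlencode({"rq": json.dumps(rq), "limit": limit})

# Example: pigeons and doves (family Columbidae) within a box over Australia.
url = build_records_query(
    "columbidae",
    {"north": -10.0, "south": -44.0, "west": 112.0, "east": 154.0},
)
```

Fetching that URL (for example with `urllib.request` or `requests`) returns JSON records that can be mapped or exported, much as the Portal does on the fly.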
The UN-SPIDER Knowledge Portal (United Nations Platform for Space-based Information for Disaster Management and Emergency Response) recently reported the launch of the Bhuvan Ganga web portal and the Bhuvan Ganga mobile application. This new monitoring initiative will use existing geospatial information and crowd-sourced reporting to monitor pollution levels in the River Ganga (Ganges). The data portal already provides access to a variety of geospatial information, including flood hazard zones and environmental data, and visitors to the site will be able to contribute to the project by uploading shapefiles and WMS layers. The accompanying mobile app will also allow users to collect and report information on pollution sources affecting water quality in the River Ganga basin.
The host geospatial platform, Bhuvan, was one of the projects we discussed in The GIS Guide to Public Domain Data. Impressed by geospatial resources such as Google Earth but concerned about potential misuses of the information following the terrorist attacks in Mumbai in 2008, the Indian Government launched its own version, describing Bhuvan as a gateway to the geospatial world. The benefits of providing open access to national, regional and local geospatial information outweighed lingering concerns over potential future attacks. Over the last seven years the site has developed into a comprehensive resource of geospatial datasets and services.
One of the most robust data portals is The Open Geoportal (OGP). It is a collaboratively developed, open source, federated web application framework to rapidly discover, preview and retrieve geospatial data from multiple curated repositories. The Open Geoportal Federation is a community of geospatial professionals, developers, information architects, librarians, metadata specialists and enthusiasts working together to make geospatial data and maps available on the web and contribute to global spatial data infrastructure. Patrick Florance at Tufts University and others have been diligently working to make this resource one that will be valued and useful for the GIS community for years to come. The project’s code repository is hosted on GitHub, and documentation can be found here. To search the repository, you can enter information using the “where” and/or “what” search fields, or zoom in on a location using the map.
Like any large data repository, this one takes some getting used to – but I found it to be straightforward: You enter where you are interested in searching, and what you are interested in searching for. Where and What: It doesn’t get much more straightforward than that. The only thing I could not get to work was the “Help” link on the page. After selecting and viewing your data on the map, you add it to a Cart. The Cart acts like something you would see on Amazon, and you can add to it and delete from it as you are searching, which I found to be quite convenient. Another nice touch is that you can adjust the symbology of the data that you are examining on the map before you download it. Even better, you can stream web services directly to your desktop, web, or mobile applications from the Cart. After you have made your selections, you access your Cart, whereupon you are presented with download options. If a layer is restricted by a licensing agreement, you can still add it to the Cart, but you must log in to preview or download it. Spending time with the OpenGeoportal will be well worth it given its ease of use, but more so for the thousands of international data layers accessible here.
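The web services you can stream from the Cart typically follow OGC standards such as WMS. As a rough illustration of what a client does under the hood, the sketch below assembles a standard WMS 1.1.1 GetMap request; the endpoint and layer name are placeholders, not real OGP services.

```python
from urllib.parse import urlencode

def wms_getmap_url(endpoint, layer, bbox, width=512, height=512):
    """Assemble an OGC WMS 1.1.1 GetMap request URL for one layer.

    bbox is (minx, miny, maxx, maxy) in the coordinate system given by SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                 # default styling
        "SRS": "EPSG:4326",           # plain lat/lon
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)

url = wms_getmap_url(
    "https://example.org/geoserver/wms",   # placeholder endpoint
    "example:roads",                       # placeholder layer name
    (-71.2, 42.3, -71.0, 42.45),           # Boston-area bounding box
)
```

Any desktop GIS or web map client that speaks WMS can consume such a URL directly, which is what makes the "stream to your applications" feature so convenient.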
Additional tools that the OpenGeoPortal community is in the process of building include a Harvester – an open source web application that automates customized harvesting from partner metadata nodes and from XML metadata files within a web or local directory. Also in progress are a Metadata Toolkit – a publicly available website that provides tools to easily create guided geospatial metadata – and a Dashboard to analyze and visualize massive spatial data collections.
NASA recently announced the launch of a new data portal, hosting a data catalog of publicly available terrestrial and space-based datasets, APIs and data visualizations.
NASA’s Open Innovation team has been established to meet government mandates to make the agency’s data publicly available. The datasets, posted in a number of categories including applied and earth science, will be available to download in a variety of formats, although at present not all formats are available for all categories. However, the data portal is a work in progress, so it is worth checking back as new datasets are posted.
From a quick search for some earth science data I found a sea surface temperature dataset acquired by the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments aboard NASA’s Terra and Aqua satellites that I could download in a number of image formats, as a Google Earth file, or in CSV format. One feature of the data portal I found useful was the accompanying basic, intermediate or advanced dataset descriptions, helping portal users identify the right datasets for their requirements.
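Once downloaded in CSV format, a dataset like that is easy to work with in any scripting language. The sketch below parses a few rows of the kind you might find in such an export; the column names and values are assumptions for illustration – the actual schema is given in the dataset's own description on the portal.

```python
import csv
import io

# A stand-in for a few rows of a downloaded sea surface temperature CSV.
# Real column names will differ; consult the dataset description.
sample = """date,lat,lon,sst_celsius
2014-07-01,10.5,-140.0,27.3
2014-07-01,10.5,-139.0,27.1
2014-07-01,11.5,-140.0,26.8
"""

# DictReader keys each row by the header, so columns can be used by name.
rows = list(csv.DictReader(io.StringIO(sample)))

# A simple summary statistic: mean temperature across the sampled points.
mean_sst = sum(float(r["sst_celsius"]) for r in rows) / len(rows)
```

The same pattern applies to any CSV export from the portal: read it with the standard library, then filter or summarize by the columns the dataset description documents.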
One of my students recently shared something that I considered to be a thought-provoking analogy in the “fee vs. free” geospatial data debate that we included in our book and discuss on this blog. The debate, in sum, revolves around the issue, “Should government data providers charge a fee for their geospatial data, or should they provide the data for free?”
The student commented, “I tend toward the “at cost” position of the debate for local governments and free side of the debate for federal data. For me, the “tax dollars are used to create the data so it has already been paid for argument” does not hold water. Taxpayers have no expectation (or shouldn’t have) of walking into the local parks department to borrow a shovel that in theory their tax dollars paid for. The same logic could be applied to spatial assets.” The student went on to say that the above argument should be applied to local and regional government data, because “federal level data […] tends to be more directly reflective of the population and the federal government more directly benefits from the economic opportunities created by free data.”
While I have tended to advocate on the side that geospatial data should be freely available, I believe that the student’s snow shovel analogy for local governments has merit. Following this argument, a small fee for requested data over and above what that government agency provides on its website seems reasonable. But I am still firmly on the side of that government providing at least some geospatial data for free on its website, citing the numerous benefits documented in case studies in this blog and in our book. These benefits range from positive public relations to saving lives and property in emergency situations and saving time in processing requests from data users. Consider what one person can do with the snow shovel versus what one person could do with a geospatial dataset such as a flood dataset. The shovel might help dredge a small section to help a few neighbors get out of their houses, but the flood dataset could help identify hundreds of houses at risk and provide a permanent, effectively managed solution. There is an order of magnitude difference in the benefit to be gained from making geospatial data easily and freely available.
What are your thoughts on this important issue? We invite you to share your thoughts below.
Billed as a stop-gap solution on the path towards emulating some of the larger data portals (such as data.gov.au and open-data.europa.eu), GovPond is an Australian public sector data portal providing access to over 3,600 hand-curated datasets and 11 Government catalogues, including:
- Landgate SLIP
- Australian Ocean Data Network
The motivation to develop the site stemmed from a previous exercise to collate public sector data sets after the site hosts discovered ‘an enormous number of tables and tools and maps and spreadsheets that were tucked away in dark, dusty corners of the internet, near-impossible to find with a quick search.’
For all the recent advances in liberating public sector data, it seems there’s still a niche for initiatives like these to get to those corners of the Internet and provide access to data resources that might otherwise elude all but the most determined data tracker.
According to Esri’s 2014 Open Data year in review, over 763 organizations around the world have joined ArcGIS Open Data, publishing 391 public sites and sharing 15,848 open data sets. These organizations include over 99 cities, 43 countries, and 35 US states. At the beginning of 2015, the organizations represented 390 from North America, 157 from Europe, 121 from Africa, 39 from Asia, and 22 from Oceania. Over 42,000 shapefiles, KML files, and CSV files have been downloaded from these sites since July 2014. Recently, we wrote about one of these sites, the Maryland Open Data Portal, in this blog. Another is the set of layers from the city of Launceston, in Tasmania, Australia.
While these initiatives all use one set of methods and tools to share – ArcGIS Open Data – the implications for the data user community are profound: First, the adoption of ArcGIS Open Data increases availability for the entire user community, not just Esri users. This is because of the increased number of portals that result, and also because the data sets shared, such as raster and vector data services, KMLs, shapefiles, and CSVs, are formats that can be consumed by many types of online and desktop GIS tools. Second, as we have expressed in our book and in this blog, while there have been noble attempts for 30 years on behalf of regional, national, and international government organizations to establish standards, to share data, and to encourage a climate of sharing, and while many of those attempts were and will continue to be successful, the involvement of private industry (in this case, Esri), nonprofit organizations, and academia will lend an enormous boost to government efforts.
Third, the advent of cloud-based GIS enables these portals to be fairly easily established, curated, and improved. Using the ArcGIS Open Data platform, organizations can leave their data where it is–whether on ArcGIS for Server or in ArcGIS Online–and simply share it as Open Data. Esri uses Koop to transform data into different formats, to access APIs, and to get data ready for discovery and exploration. Organizations add their nodes to the Open Data list and their data can then be accessed, explored, and downloaded in multiple formats without “extraneous exports or transformations.” Specifically, organizations using ArcGIS Open Data first enable the open data capabilities, then specify the groups for open data, then configure their open data site, and then make the site public.
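From the data user's perspective, the "multiple formats without extraneous exports" idea means each dataset on an Open Data site is reachable in several formats behind predictable URLs. The sketch below builds such download URLs; the URL pattern is an assumption based on how these sites have commonly behaved, and the site and dataset ID are placeholders, not real items.

```python
# Export formats an ArcGIS Open Data site typically offers
# ("zip" here standing in for a zipped shapefile).
FORMATS = ("geojson", "csv", "kml", "zip")

def download_urls(site, dataset_id):
    """Build one download URL per supported export format for a dataset.

    Assumes the common /datasets/<id>.<format> pattern; verify against
    the actual site before use.
    """
    return {fmt: f"{site}/datasets/{dataset_id}.{fmt}" for fmt in FORMATS}

# Placeholder site and dataset ID, for illustration only.
urls = download_urls("https://opendata.arcgis.com", "abc123")
```

This predictability is part of what Koop provides behind the scenes: one hosted service, many consumable formats, no manual export step for the publishing organization.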
One of the chief ways platforms like ArcGIS Open Data will advance the open data movement, I believe, is by providing tools that are easy to use and that will evolve over time. Nobody has an infinite amount of time to figure out how to best serve their organization’s data, and then to construct the tools with which to do so. The ability for data-producing organizations to use these common tools and methods represents an enormous advantage in time savings. As more organizations realize and adopt this, all of us in the GIS community, and beyond, will benefit.