Despite the growing volume of geospatial data available, and the ease of use of much of this data, finding and using data remains a challenge. To assist data users with these ongoing challenges, I have written a new activity entitled “Key Strategies for Finding Content and Understanding What You’ve Found.” The goal of this activity, “Key Strategies for Finding and Using Spatial Data,” is to enable GIS data users to understand what spatial analysis is, effectively find spatial data, use spatial data, and become familiar with the ArcGIS platform in the process. I tested the activity with a group of GIS educators and now would like to share it with the broader GIS community.
The document makes it clear that we are still in a hybrid world–still needing to download some data for our work in GIS, but increasingly able to stream data from online data services such as those in ArcGIS Online. But these concepts don’t make as much sense unless one actually practices doing this–hence the activity.
In the activity, I ask the user to first practice search strategies in ArcGIS Online, using tags and keywords. Then, I guide the user through the process of downloading and using a CSV file with real-time data. After a brief review of data types and resources, I guide the user of the activity through the process of downloading data from a local government agency to solve a problem about flood hazards. The next step asks users to compare this process of downloading data with streaming the same data from the same local government’s site (in this case, using data from Boulder County, Colorado) into ArcGIS Online. The activity concludes with resources to discover more about these methods of accessing data.
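The CSV step described above can be illustrated with a short sketch. It is not taken from the activity itself: the sample rows, the column names (latitude, longitude, magnitude) and the helper csv_to_points are all assumptions made for illustration. The idea is simply that a CSV becomes mappable once each row yields a valid coordinate pair, and that rows without one should be counted rather than silently dropped.

```python
import csv
import io

# Hypothetical sample data: column names and values are invented for illustration.
SAMPLE_CSV = """name,latitude,longitude,magnitude
Site A,40.01,-105.27,2.1
Site B,,-105.10,1.4
Site C,39.95,-105.20,3.3
"""


def csv_to_points(text, lat_field="latitude", lon_field="longitude"):
    """Return (points, skipped): rows with usable coordinates, and a count of the rest."""
    points, skipped = [], 0
    for row in csv.DictReader(io.StringIO(text)):
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, ValueError, TypeError):
            # Missing column, empty cell, or non-numeric value.
            skipped += 1
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            # Coordinates outside the valid geographic range.
            skipped += 1
            continue
        attributes = {k: v for k, v in row.items() if k not in (lat_field, lon_field)}
        points.append({"lat": lat, "lon": lon, "attributes": attributes})
    return points, skipped


points, skipped = csv_to_points(SAMPLE_CSV)
print(len(points), "mappable rows;", skipped, "skipped")
```

With a real download, the text would come from the saved file rather than an inline string, and the field names would need to match that file's header row.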
Jill Clark and I have created other hands-on activities on this theme of finding and understanding data as well, available here. We look forward to hearing your comments and I hope this new activity is useful.
In a white paper entitled Transforming Our World: Geospatial Information Key to Achieving the 2030 Agenda for Sustainable Development, DigitalGlobe and Geospatial Media and Communications tie the need for geospatial data to meeting the UN Sustainable Development Goals.
On related topics, we have written about the UN resolution on geospatial data and the UN Future Trends in geospatial information management, and in our book we wrote about the 8 Millennium Development Goals adopted by UN member states. The white paper brings together some key connections between the Sustainable Development Goals (SDGs) and GIS. The 17 goals are: no poverty; zero hunger; good health and well-being; quality education; gender equality; clean water and sanitation; affordable and clean energy; decent work and economic growth; industry, innovation, and infrastructure; reduced inequalities; sustainable cities and communities; responsible consumption and production; climate action; life below water; life on land; peace and justice/strong institutions; and partnerships to achieve the goals. The 17 SDGs and the 169 associated targets seek to achieve sustainable development balanced in three dimensions: economic, social, and environmental. The white paper focuses on a topic that is central to this blog and our book: the need for data, specifically geospatial data, not only to monitor progress in meeting these goals but also to enable those goals to be achieved.
The report ties the success of the SDGs to the availability of geospatial data. One finding of the report was that many countries had not implemented any sort of open data initiatives or portals, which is an issue we have discussed here and in our book. The main focus of the report is to identify ways that countries and organizations can work on addressing the data gap, such as creating new data avenues, open access, mainstreaming Earth observation, expanding capacities, collaborations and partnerships, and making NSDIs (National Spatial Data Infrastructures) relevant. For more information on the authors of the paper, see this press release by Geospatial World.
I especially like the report because it doesn’t just rest upon the past achievements of the geospatial community in making its data holdings available to decision makers. To be sure, there have been many achievements. But one thing we have been critical of in this blog in our reviews of some data portals is that many sound fine in press releases, but when a data user actually tries to use them, there are many significant challenges, including site sluggishness, limited data formats, insufficient resolution, and the lack of metadata about field names, to name a few. The report also doesn’t mince words–there have been advancements, but the advancements are not coming fast enough for the decisions that need to be made.
The report’s main message is that the lack of available geospatial data is not just a challenge to people in the geospatial industry doing their everyday work, but that the lack of available geospatial data will hinder the achievement of the SDGs if not addressed fully and soon.
White paper connecting the UN Sustainable Development Goals (SDGs) to geospatial information, from DigitalGlobe and Geospatial Media and Communications.
One of the exercises in our book involves accessing the GIS site of Boulder County, Colorado, to make decisions about flood hazards. We chose Boulder County for this activity in large part because their data covers a wide variety of themes, is quite detailed, and is easy to download and use. Recently, Boulder County went even further, with the launch of their new geospatial open data platform. This development follows other open data efforts we have written about in this blog, such as the ENERGIC OD, ArcGIS Open Data, EPA flood risk, Australian national map initiative, and open data institute nodes. Other open data nodes are linked to a live web map on the ArcGIS Open Data site.
Accessible here, Boulder County’s open data platform expands the usability of the data, for example by providing previews of the data in mapped form and in tabular form. The new platform allows additional data themes to be accessed, such as the lakes and reservoirs, 2013 flood channel, floodplain, and streams and ditches, all of which are returned by a search on “hydrography” below. Subsets of large data sets can also be accessed. In addition, services for each data set are now provided in formats such as GeoJSON and GeoService, which allows the data to be streamed directly to portals such as ArcGIS Online, thus avoiding the need to download the data sets altogether.
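To make the streaming idea concrete, here is a minimal sketch of working with a GeoJSON response directly, without saving a file first. The sample text below only stands in for what such a service might return: the property names NAME and TYPE, the feature values, and the helper count_by_property are assumptions for illustration, not the county's actual schema. In practice the text would be read from the data set's GeoJSON URL, which is not reproduced here.

```python
import json
from collections import Counter

# Hypothetical GeoJSON text shaped like a FeatureCollection; all values are invented.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"NAME": "Example Lake", "TYPE": "lake"},
     "geometry": {"type": "Point", "coordinates": [-105.20, 40.00]}},
    {"type": "Feature",
     "properties": {"NAME": "Example Reservoir", "TYPE": "lake"},
     "geometry": {"type": "Point", "coordinates": [-105.10, 40.10]}},
    {"type": "Feature",
     "properties": {"NAME": "Example Creek", "TYPE": "stream"},
     "geometry": {"type": "LineString",
                  "coordinates": [[-105.30, 40.00], [-105.25, 40.02]]}}
  ]
}
"""


def count_by_property(geojson_text, field="TYPE"):
    """Count features in a GeoJSON FeatureCollection by one property value."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    counts = Counter()
    for feature in collection.get("features", []):
        # "properties" may be null in valid GeoJSON, so fall back to an empty dict.
        props = feature.get("properties") or {}
        counts[props.get(field, "unknown")] += 1
    return dict(counts)


summary = count_by_property(SAMPLE_GEOJSON)
print(summary)
```

The same function works whether the text came from a downloaded file or a live service, which is the practical difference the paragraph above describes: the data stays at its source and is read on demand.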
Why did the county do this? Boulder County says they are “committed to ensuring that geospatial data is as open, discoverable and usable as possible in order to promote community engagement, stimulate innovation and increase productivity.” The county is providing an incredibly useful service to the community through their newest innovative efforts, and I congratulate them. I also hope that more government agencies follow their lead.
The Census Business Builder app and the Opportunity Project are two new tools from the US Census Bureau that make accessing and using data, and, we hope, making decisions from it, easier for the data analyst. Both of these applications are good representatives of the trend we noted in our book and in this blog: the effort by government agencies to make their data more user-friendly. While I would still like to see the Census Bureau address what I consider to be the still-cumbersome process of downloading and merging data from the American Community Survey and the Decennial Census with the TIGER GIS files, these two efforts represent a significant step in the right direction. While GIS users may still not be fully satisfied by these tools, the tools should expand the use of demographic, community, and business data by non-GIS users, which seems to be the sites’ goal.
The Opportunity Project uses open data from the Census Bureau and from communities along with a Software Development Kit (SDK) to place information in the hands of decision makers. Because these decision makers are not likely to be familiar with how to conduct spatial analysis within a GIS, the appeal of this effort is for wiser decision making with the geographic perspective. A variety of projects are already on the site to spark ideas, including Streetwyze, GreatSchools, and Transit Analyst.
The Census Business Builder is a set of web-based mapping services that provides selected demographic and economic data from the Census Bureau. You can use it to create customized maps and county- and city-level reports and charts. A small business edition presents data for a single type of business and geography at a time, while the regional analyst edition presents data for all sectors of the economy and for one or more counties at a time. These tools are based on Esri’s online mapping capabilities and offer some of the functionality of Esri’s Business Analyst Online. Give them a try; we look forward to your comments below.
This week’s guest post is courtesy of Brian Goldin, CEO of Voyager Search.
The Needle in the Haystack
Every subculture of the GIS industry is preaching the gospel of open data initiatives. Open data promises to result in operational efficiencies and new innovation. In fact, the depth and breadth of geo-based content available rivals snowflakes in a blizzard. There are open data portals and FTP sites to deliver content from the public sector. There are proprietary solutions with fancy mapping and charting applications from the private sector. There are open source and crowd sourced offerings that grow daily in terms of volume of data and effectiveness of their solutions. There are standards for metadata. There are laws to enforce that it all be made available. Even security stalwarts in the US and global intelligence communities are making the transition. It should be easier than ever to lay your hands on the content you need. But now, we struggle to find the needle in a zillion proverbial haystacks.
Ironically, GIS users and data consumers need to be explorers and researchers to find what they need. We remain fractured about how to reach the nirvana where not only is the data open, but also it is accurate, well documented, and available in any form. We can do better, and perhaps we can learn some lessons from consumer applications that changed the way we find songs, buy a book, or discover any piece of information on the web.
Lesson one: Spotify for data.
In 1999, Napster landed a punch, knocking the wind out of the mighty music publishing industry. When the dust settled, the music industry prevailed, but it did so in a weakened state with their market fundamentally changed. Consumers’ appetite for listening to whatever they wanted for free made going back to business as usual impossible. Spotify ultimately translated that demand into an all-you-can-eat music model. The result is that in 2014 The New Yorker reported that Spotify’s user base was more than 50 million worldwide with 12.5 million subscribers. By June 2015, it was reportedly 20 million subscribers. Instead of gutting the music publishers, Spotify helped them to rebound.
Commercial geospatial and satellite data providers should take heed. Content may well be king, but expensive, complicated pricing models are targets for disruption. It is not sustainable to charge a handful of customers exorbitant fees for content, or to park vast libraries of historical data on the sidelines, while smaller players like Skybox gather more than 1 terabyte of data a day and open source projects gather road maps of the world. Ultimately, we need a business model that gives users an all-you-can-eat price that is reasonable rather than a complex model based on how much the publisher thinks you can pay.
Lesson two: Google for GIS.
We have many options for finding the data, which means that we have a zillion stovepipes to search. What we need is unification across those stovepipes so that we can compare and contrast their resources to find the best content available.
This does not mean that we need one solution for storing the data and content. It just means we need one place for searching and finding all of the content no matter where it exists, what it is, what software created it or how it is stored. Google does not house every bit of data in a proprietary solution, nor does it insist on a specific standard of complex metadata in order for a page to be found. If it did, Internet search would resemble the balkanised GIS search experience we have today. But when I want GIS content, I have to look through many different potential sources to discover what might be the right one.
What is required is the ability to crawl all of the data, content, and services and return a search page that shows the content on a readable, well-formatted page with some normalised presentation of metadata that includes the location, the author, a brief description and perhaps the date it was created, no matter where it resides. We need to enable people to compare content with a quick scan and then dig deeper into whatever repository houses it. We need to use their search results to inform the next round of relevancy and even to anticipate the answers to their questions. We need to enable sharing and commenting and rating on those pages to show how users feel about that content. This path is well-worn in the consumer space, but for the GIS industry these developments lag years behind as limited initiatives sputter and burn out.
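As a rough illustration of what a "normalised presentation of metadata" could look like, the sketch below maps records from two differently structured catalogs onto the same handful of fields (location, author, description, creation date) and runs one keyword search across both. Everything in it is hypothetical: the portal names, the native field names, the sample records, and the helpers normalise and search are invented for illustration and do not describe any particular product or standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SearchResult:
    """One normalised row on a unified results page."""
    title: str
    author: Optional[str]
    description: Optional[str]
    created: Optional[str]
    location: Optional[str]
    source: str


# Hypothetical mappings from each repository's native field names to the
# common fields. Real catalogs will differ; these names are illustrative only.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "portal_a": {"title": "name", "author": "owner", "description": "snippet",
                 "created": "created_on", "location": "extent"},
    "portal_b": {"title": "Title", "author": "Publisher", "description": "Abstract",
                 "created": "Date", "location": "Place"},
}


def normalise(record: dict, source: str) -> SearchResult:
    """Map one native record onto the common fields; missing values stay None."""
    mapping = FIELD_MAPS[source]
    values = {common: record.get(native) for common, native in mapping.items()}
    return SearchResult(
        title=values["title"] or "(untitled)",
        author=values["author"],
        description=values["description"],
        created=values["created"],
        location=values["location"],
        source=source,
    )


def search(results: List[SearchResult], keyword: str) -> List[SearchResult]:
    """Case-insensitive keyword match against title and description."""
    needle = keyword.lower()
    return [r for r in results
            if needle in r.title.lower() or needle in (r.description or "").lower()]


# Invented sample records in each portal's native shape.
raw = [
    ("portal_a", {"name": "Floodplain", "owner": "County GIS",
                  "snippet": "Regulatory floodplain polygons",
                  "created_on": "2015-06-01", "extent": "Example County"}),
    ("portal_b", {"Title": "Streams and Ditches", "Abstract": "Hydrography lines",
                  "Place": "Example County"}),
]
results = [normalise(record, source) for source, record in raw]
hits = search(results, "hydrography")
print([h.title for h in hits])
```

The design point is that the content stays in its own repository with its own schema; only a thin mapping per source is needed to present and search everything in one place, and gaps in the metadata are tolerated rather than used as a reason to exclude a record.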
Lesson three: Amazon for geospatial.
I can find anything I want to buy on Amazon, but it doesn’t all come from an Amazon warehouse nor does Amazon manufacture it. All of the content doesn’t need to be in one place, one solution or one format; so long as it is discoverable in and deliverable from one place. Magically, anything I buy can be delivered through a handy one-click delivery mechanism! Sure, sometimes it costs money to deliver it, other times it’s free, but consumers aren’t challenged to learn a new checkout system each and every time they buy from a new vendor. They don’t have to call a help desk for assistance with delivery.
Today, getting your hands on content frequently requires a visit to an overburdened GIS government staffer who will deliver the content to you. Since you might not be able to see exactly what they have, you almost always ask for more than you need. You’ll have no way of knowing when or how that data was updated. What should be as easy as clip-zip-and-ship delivery (the equivalent of gift-wrapping a package on Amazon) seems a distant dream. But why is this?
While agency leadership extols the virtues of open government initiatives, if their content is essentially inaccessible, the risk of being punished for causing frustration is minimal compared with that of exposing bad data or classified tidbits. So why bother when your agency’s first mandate is to accomplish some other goal entirely and your budget is limited? Government’s heart is certainly behind this initiative, but is easily outweighed by legitimate short-term risks and the real world constraints on human and financial resources.
The work of making public content discoverable in an open data site as bulletproof as Amazon’s seemingly limitless store can and should be done by industry with the support of the government so that everyone may benefit. In the private sector, we will find a business model to support this important work. But here’s the catch. This task will never be perceived as being truly open if it is done by a company that builds GIS software. The dream of making all GIS content discoverable and open requires that everyone’s products be equally discoverable. That’s a huge marketing challenge all by itself. Consider that Amazon’s vision of being the world’s largest store does not include making all of the stuff sold there. There really is a place for a company to play this neutral role between the vendors, the creators of the content and the public that needs it.
On the horizon
We have come so far in terms of making content open and available. The data are out there in a fractured world. What’s needed now isn’t another proprietary system or another set of standards from an open source committee. What’s really needed is a network of networks that makes single search across all of this content, data and services possible whether it’s free or for a fee. We should stop concerning ourselves with standards for this or that, and let the market drive us toward those inevitable best practices that help our content to be found. I have no doubt that the brilliant and creative minds in this space will conquer this challenge.
Brian Goldin, CEO of Voyager Search.
After three months in beta, the European Data Portal has been launched. The portal is set to replace the publicdata.eu site and hosts over 400,000 datasets in a variety of formats including shape, csv, xls and ogc:wms. For the spatial datasets, the site provides a dataset extent visualisation and filter by location option, against a basemap of OpenStreetMap data.
The portal also provides a metadata quality assessment section, with reports on a variety of metrics including popular formats and the top source data catalogs.
One of the innovative inclusions in the new site is the accompanying training companion and e-learning programme, with sessions covering licensing, platforms, formats, linked data and data quality. The European Commission and partners involved in the development of the site have recognised that it’s no longer just about providing access to data, it’s also about providing the necessary information to support the best use of the data.
In an article in Elementa–Science of the Anthropocene, Dr. Dawn Wright states how “digital tools can help make communities resilient by providing data, evidence-based advice on community decisions”. However, “the resilience of the tools themselves can also be an issue.” For Dr. Wright, “digital resilience means that to the greatest extent possible, data and tools should be freely accessible, interchangeable, operational, of high quality, and up-to-date so that they can help give rise to the resilience of communities or other entities using them.” She cites Pitroda (2013), who predicts that the future of democratic governance lies not only in the pillars of the executive, legislative and judicial, but also in a fourth pillar of information.
In another article, in Ensia, Dr. Wright says that “If we want a resilient world, we need to start with resilient data”, and that “it’s not just data for data’s sake. The same digital technologies we use to understand how the Earth works are also helping communities in very practical ways.” Two of her recommendations in the article are to be open to partnerships and to tell stories.
I couldn’t agree more. Many of these themes are what motivated us to write our GIS and Public Domain Data book and to write in this blog over these past four years. I also salute Dr. Wright’s recommendation that we must share not only our data, but workflows and use cases. In my own field of GIS education, I encounter this situation daily–educators, for example, need not just the data, but need to know how to use that data for teaching, learning, research, and campus administration.
Pitroda, S. 2013. Series Esri E380 Videos, ed. Esri International User Conference Plenary.