A Map to Access the Open Data Portals of the World

September 13, 2021 14 comments

My colleague Nicolas Holm has created a very useful map showing the locations of the open data portals of the world. In the geospatial data-rich environment in which we all now work, this map is valuable because it allows data users to zoom in on a specific location and find the portals likely to serve data for that location or region. It helps fill the need for a ‘central repository or library’ for geospatial data and acts as “data on data”. Nicolas created the map from the database from Open Knowledge International and OpenDataSoft, which he gratefully acknowledges.

The resource is simple yet powerful: zoom in on an area of interest, find the points that represent data portals, and click on the portal where you suspect your desired data set(s) will be. Most of the portals featured on this map are what we have described in this blog as “modern data portals”: ones that offer streaming and/or downloading options, many formats to choose from, and the ability to view the data before accessing it, typically using ArcGIS Hub or other open data sharing tools.

After testing the map, I was pleased to find many of my favorite international and also local and regional data portals included. We have reviewed many of these in this blog over the past decade.

In keeping with the theme of this blog, be critical and closely examine each portal to determine whether it meets your needs. Also keep in mind that the data you are seeking for a specific area might not necessarily be served by a data portal located in that area. For example, wetlands data for Area X may actually be served by a portal in Area Y, which may be on the other side of the state/province or even in the national capital by the national mapping or science agency. Still, for many applications, the local data portal might be the most suitable starting point. For example, if I needed data on a specific county’s floodplains, buildings, geologic hazards, and other layers, with this map I can zoom in to my area of interest and find the local and regional data portals from which I could stream and/or download the data that I need.

The map of the open data portals of the world.

–Joseph Kerski

Categories: Public Domain Data

A local cycling example of GIS as a system of engagement

August 30, 2021 2 comments

For many years I and others have been speaking about two intertwining forces in geotechnologies: (1) that GIS has moved from a system of records to a system of engagement, and (2) the connection of mapping to the citizen science community. On (1), to be sure, GIS still is fundamentally tied to records, and indeed, without the spatial and attribute data, you have no GIS. However, GIS is not simply about recording natural and physical objects on, above, or below the surface. I salute the many people involved in encoding this volume of information, as I used to do at the USGS, and as thousands of dedicated individuals do on a daily basis today.

But these records are not collected just to populate a database or even just to map things: They are created to serve a higher purpose, namely to enable organizations to make smarter decisions about what is there and what should be there; to forecast, to model, to plan. In addition, data-as-services and software-as-a-service, together with field tools, allow the public to be engaged in their community as never before. Coupled with that is (2), the citizen science or community science movement, which is nearly 150 years old but is now seeing rapid expansion given the community’s newfound ability to map the data that they are collecting. Recently I experienced both of these at a personal level.

My city of Lakewood, Colorado USA, has a great web GIS map of trails, and also a request portal through which community members can report about and request things that are in need of repair or otherwise of concern. My community also has a number of data services through the efforts of its excellent GIS staff, including cycling, walking, and hiking trails. Why not, I thought, put my interest in my community, my interest in GIS, and my interest in cycling to the test and try out this request portal? I went cycling and identified a few places of concern to me and surely to other cyclists. I then went to the citizen portal and identified those areas with the photos I had taken. To my surprise, the parks and recreation people called me and asked for further clarification! I provided it, and as you can see in the photographs below, two days later, the places were marked, and two days after that, they were repaired! I called the parks people back to express my gratitude and also emailed and called the GIS staff as well.

Once again, the power of GIS at work! Call me a “satisfied community member”.

This crack in this trail is not a big deal if you are just walking, but to a cyclist, it is just wide enough and at a steep part of the trail to cause significant jarring of your front tire when you hit it.
The crack has been filled! Hooray.
Due in large part to some pretty active roots, the asphalt along this section of trail is full of bumps and cracks. Again not that big of a deal to a hiker, but a big deal to a cyclist.
Just two days later, I noticed these spots had been marked for repair!
Two days after that, the cracks had been filled (see right side) and the broken parts of the fence had also been repaired!

Categories: Public Domain Data

What is the value of location data?

August 16, 2021

My colleagues here at Esri wrote what I consider to be an excellent essay on the business value of location data and why it matters. In their essay (https://www.esri.com/about/newsroom/publications/wherenext/what-is-the-business-value-of-location-data/), they make it plain that while this value is more difficult to quantify than, say, an HVAC unit or the revenue of a store, the effort of pinpointing the value is definitely worth making.

For example, Highways England pegged the value of its physical road infrastructure at £115 billion, a staggering amount, yet the intangible value it delivered to the country was calculated to be even more, at £200 billion! The latter figure was not a simple estimate, but was arrived at only after nearly a year of extensive data gathering, including interviews with those touched in some way by the work of the highway agencies.

Why does all this matter? The article points out that only 16% of business assets are tangible assets nowadays. The implications of this percentage are clear: Businesses need to be able to understand the value that location data brings, and articulate this value to their stakeholders, customers, and CEOs. In addition to building this shared understanding, purpose, and lexicon, doing so yields additional benefits. For example, through examining the value that data brings to an organization and to the greater society, the business featured in the article put together a case for new investment, as well.

For a related article, see our Spatial Reserves essay on Putting a Value on Geospatial Data and this recent Forbes article on “how much is your data worth?”

Joseph Kerski

Categories: Public Domain Data

AirTags: Who’s watching who?

August 9, 2021 3 comments

When I saw the announcement about Apple’s new AirTags, my first thought was: forget tracking the location of my personal possessions, I could use one of these to keep track of my elderly, and occasionally forgetful, mother. Attach an AirTag to her bag, subject to her consent, and I’d be able to keep an eye on her whereabouts when she heads out to walk her dog.

However, not long after the initial release, and despite reassurances that location privacy was an integral part of the design, a software update for AirTags was made available to counter unintended or surreptitious tracking by other suitably enabled devices in the vicinity. Safety alerts were meant to cover two cases: an AirTag separated from its paired iPhone, and an AirTag not owned by you but in some way tied to your location (nearby, or slipped into a pocket?) and tracked by others. Under the initial configuration, those alerts were not triggered for three days if you didn’t have an iPhone with iOS 14.5 or had an Android phone. Given the number of iPhones in circulation and the extent of Apple’s Find My network, millions of people could be tracked unwittingly through AirTags and be none the wiser for three days. Even after an upgrade to iOS 14.5 and the AirTags software update, it could still take a couple of hours to alert an iPhone owner to the presence of a so-called stalker AirTag. Chances are nothing would happen, but is broadcasting your location like this worth the risk?

In this day and age of heightened awareness of creepy apps, issues related to location tracking, and so on, it seems odd this particular scenario hadn’t been considered as a potential security threat. As Brenda Stolyar noted in her Mashable article …

AirTags are easy to use and effective, but their extensive location tracking and ability to go beyond Bluetooth range is also what makes them dangerous for the rest of us.

What makes AirTags potentially dangerous to use is the lack of detailed information describing how they work and a lack of transparency in how location information is, or could be, collected.

50-60 year old Spy Imagery as a source of historic data

August 2, 2021

Throughout the 1960s and 1970s a top-secret US program dubbed Corona by the Central Intelligence Agency (CIA) captured 800,000 images via satellite of many places around the planet. In an interesting piece of GIS and remote sensing history, the film canisters from these satellites were periodically jettisoned and physically retrieved in midair via aircraft. The logistics involved in such operations were amazing for their time, and remain so even now. The imagery was declassified in 1995. At that time, I was working as a cartographer for the USGS, and we started offering prints of the Corona imagery then. I remember poring over these prints at our large mapping facility in Denver, considering the vast amount of change evident in the relatively short time (30 years) since they had been collected.

New developments in GIS technology are breathing new life into this set of imagery, making it increasingly available and applied to a wider variety of uses. For example, my colleague Mariah Petrovic wrote a fascinating article about how this imagery is being used to address climate change, here. The applications she describes include studying past habitats, water scarcity, and shoreline change. In addition, the imagery is being used to identify archaeological sites and much more.

Now some of this imagery is available in digital form as part of the growing array of truly Big Data sets. One way to access it is via the USGS Earth Explorer portal. Another is via the Corona Atlas and Referencing System at the University of Arkansas. Some of the data accessible via the university’s data portal is from satellites that were equipped with two panoramic cameras, one facing forward and another aft, with a 30° angle of separation, producing an approximate ground resolution of 6 feet (1.8 m) at nadir. The paired cameras also offer the capability for stereo viewing and the extraction of topographic data. Images were originally recorded on black-and-white film. The USGS scanned the images at 7-micron (approximately 3,600 dpi) resolution. Additional technical details regarding the CORONA program and image characteristics can be read here.

Begin by exploring the data, available here. The atlas allows you to measure, identify, search, swipe, and perform other visualizations on the data. Map layers can be toggled using the Map Contents menu. Some map layers are expandable, allowing sub-layers to be turned on or off. Use the plus sign (+) next to a layer to access sub-layers. Use the blue down-arrow next to an image to download the source data, as shown below:

Downloading the Corona imagery.

The data formats include: (1) GeoTIFF, orthorectified and reprojected to use the “Web Mercator” projection. These files should be ready for use in any GIS package that can read GeoTIFFs. (2) NITF. The NITF version of an image requires the user to obtain appropriate elevation data for the orthorectification process necessary to display the image in the correct position on the earth’s surface. More information about this process can be found here. Unlike the GeoTIFF, the NITF image has not been resampled, so it is closer to the original imagery. The website helpfully provides coordinates that can be manually input as the data selection coordinates to download SRTM elevation data from OpenTopography.org; for example, for the image above: X-min: 32.428226 Y-min: 35.190579 X-max: 35.394672 Y-max: 35.827378.
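For readers who would rather script the elevation request than type the coordinates into a form, here is a minimal Python sketch. It parses the four values quoted above into a bounding box, checks that it is well formed, and assembles a request URL. The endpoint path, the `demtype` and `outputFormat` values, and the parameter names reflect my understanding of OpenTopography's global DEM API and are assumptions to verify against its current documentation; the API key is a placeholder, and nothing is actually downloaded.

```python
from urllib.parse import urlencode

# Coordinates quoted in the post (decimal degrees; X = longitude, Y = latitude).
bbox_text = "X-min: 32.428226 Y-min: 35.190579 X-max: 35.394672 Y-max: 35.827378"


def parse_bbox(text):
    """Turn 'X-min: 1.0 Y-min: 2.0 ...' into a dict of floats keyed by label."""
    tokens = text.split()
    # Tokens alternate between a label ending in ':' and a number.
    return {label.rstrip(":"): float(value)
            for label, value in zip(tokens[0::2], tokens[1::2])}


def build_dem_request(bbox, api_key="YOUR_API_KEY"):
    """Assemble a request URL. Endpoint and parameter names are ASSUMPTIONS."""
    if not (bbox["X-min"] < bbox["X-max"] and bbox["Y-min"] < bbox["Y-max"]):
        raise ValueError("bounding box minimums must be less than maximums")
    params = {
        "demtype": "SRTMGL1",      # assumed dataset identifier
        "west": bbox["X-min"],
        "south": bbox["Y-min"],
        "east": bbox["X-max"],
        "north": bbox["Y-max"],
        "outputFormat": "GTiff",   # assumed format keyword
        "API_Key": api_key,        # placeholder, not a real key
    }
    return "https://portal.opentopography.org/API/globaldem?" + urlencode(params)


bbox = parse_bbox(bbox_text)
url = build_dem_request(bbox)
print(url)
```

The validation step is the part most worth keeping: swapping the minimum and maximum values is an easy mistake when copying coordinates by hand.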

Corona imagery for a section of Missouri USA.
Corona imagery for the Great Pyramids at Giza, Egypt.

I encourage you to investigate this amazing resource.

Joseph Kerski

Categories: Public Domain Data

A maintained archive of Open Geospatial Datasets for Education

A data science professor at the University of London, Dr Andrea Ballatore, maintains a wonderfully vast archive of open geospatial datasets for education. The archive resides on GitHub; Dr Ballatore adds to it and organizes it frequently. The datasets are beautifully eclectic, ranging from ancient Roman roads to global air temperature, earthquakes, political boundaries, historical shipwrecks, and much more.

The datasets are in open formats including CSV and TSV (tab separated), GeoJSON, GeoPackage, GeoTIFF, and R datasets (.rds). Dr Ballatore has made extra efforts for the data users, keeping each dataset smaller than 50 MB so that it is suitable for slow connections or low-end computers. Some files are compressed with gzip (.gz extension), which is available natively on Mac and Linux. On Windows, I have used 7-Zip to work with gzip files hundreds of times over the years.
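If you work in Python, the gzip-compressed CSV files can also be read directly, without unzipping them first. The sketch below uses only the standard library. To stay self-contained it first writes a tiny compressed file; the file name, column names, and values are hypothetical and are not taken from the archive.

```python
import csv
import gzip
import os
import tempfile

# Build a tiny gzip-compressed CSV so the example is self-contained.
# The file name and columns are hypothetical, not taken from the archive.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_points.csv.gz")
with gzip.open(sample_path, "wt", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "lon", "lat"])
    writer.writerow(["site_a", "-105.08", "39.70"])
    writer.writerow(["site_b", "-0.13", "51.52"])


def read_gzipped_csv(path):
    """Read a .csv.gz file into a list of dicts without unzipping it first."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


rows = read_gzipped_csv(sample_path)
for row in rows:
    # CSV values arrive as text, so convert coordinates to numbers explicitly.
    print(row["name"], float(row["lon"]), float(row["lat"]))
```

To read a real file from the archive, pass its downloaded path to `read_gzipped_csv` and adjust the column names to match that dataset.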

Just as we always advocate on this blog, you should seek permission if you want to use these datasets outside of education or research.

Dr Ballatore also provides links to additional eclectic archives, such as “awesome public datasets” (climate, neuroscience, and more), Automatic Knowledge (buildings, populated places, indices of deprivation, and many more), and FiveThirtyEight (librarian employment, and more).

As an example, I worked with the site and was able to access a historical shipwrecks GeoJSON file in Dr Ballatore’s collection; I brought it into ArcGIS Online, below. Fascinating!

Shipwrecks from the Open Geospatial Datasets archive, shown in ArcGIS Online.
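Before loading a GeoJSON file into a GIS, it can be worth a quick look at how many features it holds and what area they cover. Here is a minimal standard-library Python sketch of that check. The small FeatureCollection is a made-up stand-in (the names and coordinates are not records from the shipwrecks dataset), and the function only handles Point geometries.

```python
import json

# A tiny stand-in FeatureCollection; the names and coordinates are made up
# for illustration and are not records from the shipwrecks dataset.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Wreck A"},
     "geometry": {"type": "Point", "coordinates": [-5.0, 50.0]}},
    {"type": "Feature",
     "properties": {"name": "Wreck B"},
     "geometry": {"type": "Point", "coordinates": [12.5, 41.9]}},
    {"type": "Feature",
     "properties": {"name": "No location"},
     "geometry": null}
  ]
}
"""


def summarize_points(collection):
    """Return (count, (min_lon, min_lat, max_lon, max_lat)) for Point features."""
    coords = [
        feature["geometry"]["coordinates"]
        for feature in collection["features"]
        # Skip features with a null geometry or a non-Point geometry.
        if feature.get("geometry") and feature["geometry"]["type"] == "Point"
    ]
    if not coords:
        return 0, None
    lons = [c[0] for c in coords]
    lats = [c[1] for c in coords]
    return len(coords), (min(lons), min(lats), max(lons), max(lats))


collection = json.loads(geojson_text)
count, extent = summarize_points(collection)
print(count, extent)
```

A feature count that is lower than expected, or an extent that spans the wrong part of the world, is an early sign that the data deserves a closer look before you map it.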

I highly recommend browsing and using the above archives!

–Joseph Kerski

Categories: Public Domain Data

More Big Data: New York State building footprints online

The New York State GIS Program Office (NYSGPO) announced a short while back the release of an amazing web service that hosts millions of building footprints across 30 counties, made available by the New York State Energy Research and Development Authority (NYSERDA). NYSERDA has worked closely with Columbia University’s Center for International Earth Science Information Network (CIESIN) to generate these building footprints. They recently incorporated the Microsoft version 2 building footprints as well. This service also uses data made available by local governments in the state as part of their own funded photogrammetric and planimetric mapping missions.

We have written extensively about CIESIN in this blog and in our book:  CIESIN is one of our very favorite organizations in terms of useful data and visualizations.  I know many of the fine researchers there personally and have enormous respect for them.  For more information on the program that made the building footprints available, see:  http://fidss.ciesin.columbia.edu/home and to directly download the footprints by county visit http://fidss.ciesin.columbia.edu/building_data_adaptation.   Note that the work is in progress; as of this writing, not all counties are finished yet, but keep checking the links in this essay.

The NYSGPO created a Map Service and a Feature Service so you can load these building footprints directly into a GIS such as ArcGIS Pro and ArcGIS Online. Each footprint contains attributes including the source, date, 100-year and 500-year flood impact, and county building footprint download links. The site includes the statement that “NYS ITS is not responsible for the data quality that feed the services, users should contact the sources of the data with questions or comments.”
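A feature service can also be queried over the web without opening a GIS at all. The Python sketch below builds a query URL that asks one layer for the features intersecting a bounding box and returns them as GeoJSON. The parameter names follow the ArcGIS REST API query operation as I understand it. The layer URL is a hypothetical placeholder, not the NYSGPO address, and the bounding box is only illustrative. No request is sent.

```python
from urllib.parse import urlencode

# HYPOTHETICAL layer URL: substitute the address published by the data provider.
LAYER_URL = "https://example.com/arcgis/rest/services/Buildings/FeatureServer/0"


def build_query_url(layer_url, west, south, east, north, max_records=100):
    """Build a query URL for features intersecting a lon/lat bounding box."""
    if not (west < east and south < north):
        raise ValueError("bounding box minimums must be less than maximums")
    params = {
        "where": "1=1",                          # no attribute filter
        "geometry": f"{west},{south},{east},{north}",
        "geometryType": "esriGeometryEnvelope",  # the geometry is a bounding box
        "inSR": 4326,                            # bounding box is in WGS84 lon/lat
        "spatialRel": "esriSpatialRelIntersects",
        "outFields": "*",                        # return all attributes
        "resultRecordCount": max_records,        # keep the response small
        "f": "geojson",
    }
    return layer_url + "/query?" + urlencode(params)


# Illustrative bounding box only; pick coordinates for your own area of interest.
query_url = build_query_url(LAYER_URL, -73.77, 42.64, -73.74, 42.66)
print(query_url)
```

Limiting the request by area and record count matters with a service of this size, since a layer holding millions of footprints will not return everything in one response.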

My test of adding this data from the feature service and examining it (below, using ArcGIS Online) was easy and straightforward. The downloading links above worked without a problem, as well. I salute this initiative and look forward to more data and further initiatives. NYSGPO will continue to work with NYSERDA and CIESIN as more footprints become available, as well as incorporating the publicly available Microsoft building footprints. This is another amazing manifestation of a trend we have written about in additional essays in this Spatial Reserves blog: Big Data. For more, see all of our posts on this topic: Big Data.

On a related data access note, many of us have grappled for years with sites that use outdated FTP services, which are difficult to access in a modern web browser. This same office, the NY State GIS Program Office, reported recently that they have added HTTPS access to download files. This replaces the FTP links used in the Discover GIS NY applications (orthos.dhses.ny.gov) and in the data index services listed below. The change means the download links will work in Chrome, Edge, Firefox, and Internet Explorer (IE); users will no longer need to use IE alone to get the download links to work. Good news for data users!

The data index services from this organization are wonderful and are listed below:

Joseph Kerski 

Be critical of and aware of default settings in GIS software

June 7, 2021 2 comments

We recently wrote about another reason to be critical of the data–especially imagery–when it can be misinterpreted and when it can be deliberately faked. Included in that essay was a brief paragraph encouraging the community to also be aware and critical of default settings in GIS software when rendering and analyzing imagery. Why? Default settings are there to accommodate a wide variety of users, but can lead to conclusions that are at best, not as rigorous or as accurate as they could be, or, at worst, in error.

Below are images from my Esri colleague and one of my favorite people in all of geospatial, cartographer John Nelson, that represent a set of 8 NASA images. Says John, “They are designed to mosaic together and the native images match each other perfectly. But because of the image appearance default settings assigned by, in this case, ArcPro from Esri, you can easily see seams between them. The default “percent clip” stretch type eliminates 5% of each end of the image histogram, throwing out 10% of the data. Because each image histogram is slightly different, this inherently introduces variability between them. The default “gamma” setting is dynamic based on image and is different for each, in an attempt to find an ideal visual contrast. A gamma of 1 renders the image in its native value. Most of the eight images in this map were given different gamma values (ranging from 1.4 to 1.6) so the visual variability between images is especially stark.” See one of John’s videos illustrating the benefits of and how to quickly and powerfully override the defaults in one imagery example, here.

If a GIS user were to manually reset all of these overrides, which in this case are not ideal, the eight images would render cohesively, as designed. Says John, “There is no way to opt out of default rendering overrides, and there is no way to multi-select the image layers and re-set their parameters all at once. If (the GIS user) wants to now adjust their stretch and gamma settings in unison, they have to do that individually or create a new mosaic.”
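To see why per-image defaults create seams, consider a minimal Python sketch of a percent clip stretch. It is an illustration of the general technique, not the algorithm any particular GIS package uses. The two tiles are hypothetical lists of pixel values, the 5% clip follows the figure in John's description, the percentile uses a simple nearest-rank rule, and the gamma convention (output = input raised to 1/gamma) is an assumption that varies between software packages.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (pct in 0..100)."""
    index = round((pct / 100) * (len(sorted_values) - 1))
    return sorted_values[index]


def clip_limits(pixels, clip_pct=5.0):
    """Return the (low, high) values left after clipping each histogram tail."""
    ordered = sorted(pixels)
    return percentile(ordered, clip_pct), percentile(ordered, 100 - clip_pct)


def stretch(value, low, high, gamma=1.0):
    """Map a pixel value to 0..255 using the given limits and gamma.

    ASSUMED convention: output = input ** (1 / gamma), so gamma = 1 leaves the
    linear stretch unchanged and gamma > 1 brightens the midtones.
    """
    clamped = min(max(value, low), high)
    scaled = (clamped - low) / (high - low)
    return round(255 * scaled ** (1 / gamma))


# Two hypothetical adjacent tiles whose histograms differ slightly.
tile_a = list(range(40, 141))   # pixel values 40..140
tile_b = list(range(60, 161))   # pixel values 60..160
shared_value = 100              # the same ground brightness in both tiles

# Per-tile defaults: each tile gets limits from its own histogram.
per_tile_a = stretch(shared_value, *clip_limits(tile_a))
per_tile_b = stretch(shared_value, *clip_limits(tile_b))

# One set of limits computed across both tiles and applied to each.
low, high = clip_limits(tile_a + tile_b)
shared_a = stretch(shared_value, low, high)
shared_b = stretch(shared_value, low, high)

print("per-tile stretch:", per_tile_a, per_tile_b)
print("shared stretch:  ", shared_a, shared_b)
```

With these made-up tiles, the same pixel value of 100 is displayed as 156 in one tile and 99 in the other when each tile is stretched on its own, which is exactly the kind of seam John describes. Applying one set of limits to both tiles gives the same display value in each. Different gamma values per image would widen the gap further.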

This obviously applies to any GIS and remote sensing software: all software has default settings, and those settings need to be understood. Certainly, in many cases, the defaults are useful, time saving, and appropriate. But knowing what they are, from smart mapping symbology and imagery rendering to many more GIS workflows, is a critical part of the central message of our book and this blog: be critical of the data. I would also argue that since part of the misinterpretation of imagery results from a lack of knowledge of the electromagnetic spectrum and of images rendered in specific bands, there is all the more reason to include remote sensing in educational curricula!

–Joseph Kerski

Categories: Public Domain Data

Faked Satellite Imagery: Another opportunity to be critical of the data

As we have written about frequently in this blog, all geospatial data should be viewed critically. The user needs to carefully assess the attributes, resolution, date, source, and other characteristics before deciding whether that data is fit for use. The same is true with satellite imagery, for reasons we have described here (Be critical of the data–imagery too!) and here (Imagery–It is what it is. Well, not always).

But a new and disturbing reason for critical thinking has appeared more recently, and that is faked imagery. One of a growing number of articles about this issue is entitled A Growing Problem of ‘deepfake geography’: How AI (Artificial Intelligence) Falsifies Satellite Images. In the research article referred to there, entitled Deep fake geography? When geospatial data encounter Artificial Intelligence, by Bo Zhao, Shaozeng Zhang, Chunxue Xu, Yifan Sun, and Chengbin Deng in Cartography and Geographic Information Science, the authors describe their study. The goal of the study was not to show that geospatial data can be falsified, but rather, “the authors hoped to learn how to detect fake images so that geographers can begin to develop the data literacy tools, similar to today’s fact-checking services, for public benefit.” They suggest timely detections of deep fakes in geospatial data and proper coping strategies when necessary, with a goal to cultivate critical geospatial data literacy and “understand the multi-faceted impacts of deep fake geography on individuals and human society.”

Fake satellite images of a neighborhood in Tacoma with landscape features of other cities. (a) The original CartoDB basemap tile; (b) the corresponding satellite image tile. The fake satellite image in the visual patterns of (c) Seattle and (d) Beijing, from Zhao et al. article in Cartography and Geographic Information Science.

Situating the issue of images that have been purposefully falsified in a broader context is this very useful article by Pierre Markuse, who advocates that a user needs to differentiate between three different ways an image could be understood (or really debunked) as being a fake: 1. Perceived as fake but in fact just a different representation of the data, 2. Perceived as fake but just a misrepresentation of facts, and 3. Actually faked satellite images. Pierre provides excellent illustrations of each of these three ways, including a supposed fire in Central Park in New York City and a “pollution plume” spilling from a river into a sea. Pierre’s very helpful concluding section on how to determine if an image is faked or out of context focuses on the themes of this blog, providing practical advice on what questions to ask as you examine and work with images. I highly recommend both of these articles for students, instructors, and researchers.

Along these lines, I would also encourage any user of GIS or remote sensing software to pay close attention to the defaults applied when images are brought into the software, displayed, and rendered. These defaults are not nefarious, to be sure, but they are created to encompass the needs of a wide variety of users. Your needs might very well be different, so make sure you understand what the defaults are and how to change them, so that you are not misunderstanding your data or inadvertently leading others into misunderstanding.

These developments are not unexpected, and while the deliberately faked images are unfortunate, they provide more opportunity to assist students and colleagues around us to always be vigilant and critical of the data–including and perhaps especially geospatial data.

Joseph Kerski

Categories: Public Domain Data