Archive for the ‘Public Domain Data’ Category

An introduction to Ethics in GIS

November 10, 2019

One of the objectives of this blog and our book is to not only help you gain technical knowledge about GIS and data, but also to help you understand the societal issues surrounding data.  Ethics is central to many of these societal issues.  We have written about ethics in geospatial decision making, ethics in using images in mapping projects, company ethics vs. technical reputation, and ethics surrounding data quality issues.  But here let us discuss one way of introducing ethics to co-workers and to students with an example of how I have integrated ethics into one of my own courses on cartography and geo-visualization.  The following is the actual text and readings that I use in this course.  I look forward to your reactions.

Ethics in GIS.  Ethics in science is an expansive topic; it is introduced here, but you will have the opportunity to explore it further later in this course.  Ethics matter in GIS because:  (1) Knowing that maps are powerful means of communication, you should take that responsibility as map author seriously.  (2) Knowing from our brief discussion on crowdsourcing and citizen science that everyone is now a potential map producer, and no longer just a map consumer, there are more maps in existence than ever before–with a wide variety of quality and purposes–some well documented, some not so.  That said, maps still have an aura of authenticity–they tend to be believed.  Again, take that responsibility seriously, and do not intentionally mislead your audience.

The social implications of GIS began to be examined during the mid-1990s with books such as Ground Truth.  Another oft-cited book on this topic is How to Lie with Maps by geographer Mark Monmonier, which examined the ways that maps are distortions–intentionally and unintentionally–of reality.

Code of Ethics.  There are several key items that are generally thought to be included in a code of ethics for people working in the field of GIS.  The first is to have a straightforward agenda, ensuring that the purpose of your map is evident to the map reader.  It should not be deceiving or confusing, but rather, transparent in its purpose.  The second code is to get to know your intended audience as much as you can, so you can effectively communicate through maps.  The third code is to not intentionally lie with data–do not symbolize or classify the data with the intent to deceive.  The fourth code is that a map should show all relevant data as completely as possible–do not intentionally leave things or context out that could help the reader understand the phenomenon, again, balancing this with the guidelines about abstracting and generalizing.

The fifth code is that a map should not discard contrary data just because it is contrary.  Rather, your map should be as much as possible a neutral representation of reality, just as your research often should be.  The sixth code is that the map should strive for an accurate portrayal of the data, where the data is not diminished or exaggerated.  The seventh code is to avoid plagiarizing.  Just like your research, you should always properly cite your sources of information.  You can cite sources via the map’s metadata.  The eighth code is to select symbols that will not bias the map.  The symbols should be neutral representations of features.  The classification and projection, too, should be chosen so that potential bias is minimized.  The ninth code is that the map should be repeatable, such that another GIS professional should be able to independently create a similar map using the same data and focusing on the same message.  The tenth code is to be sensitive to different cultural values and principles when making choices for your map, such as color and symbols.  In summary, when creating a map, you should strive to provide a truthful, neutral representation of reality targeted specifically for your audience’s level of knowledge so that your map can effectively convey your intended message.

(The 10 cartographer’s codes of ethics in this document are drawn from an external source, with modifications by Joseph Kerski.)
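To see why the choice of classification matters for the third and eighth codes, here is a minimal sketch in Python that classifies the same values two ways.  The district rates are hypothetical numbers invented for illustration, and the function names are my own rather than part of any GIS package.  Neither method is "the" correct one; the point is that the same data can tell quite different stories, so the choice should be deliberate and documented.

```python
from statistics import quantiles

def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width between the min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]

def quantile_breaks(values, k):
    """Upper bounds of k classes holding roughly equal numbers of values."""
    cuts = quantiles(values, n=k, method="inclusive")  # k - 1 interior cut points
    return cuts + [max(values)]

def class_counts(values, breaks):
    """Count how many values fall in each class (value <= class upper bound)."""
    counts = [0] * len(breaks)
    for v in values:
        for i, upper in enumerate(breaks):
            if v <= upper:
                counts[i] += 1
                break
    return counts

# Hypothetical rates for 12 districts, including one extreme outlier.
rates = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9, 10, 60]

ei = equal_interval_breaks(rates, 4)
qu = quantile_breaks(rates, 4)
print("equal interval:", ei, class_counts(rates, ei))
print("quantile:      ", qu, class_counts(rates, qu))
```

With these made-up values, the equal-interval scheme places eleven of the twelve districts in the lowest class, so the map would look nearly uniform, while the quantile scheme spreads the districts evenly across four classes and hides how extreme the outlier is.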

For more on geospatial ethics, (1) see the GIS Certification Institute’s Code of Ethics, and (2) see these articles:

(1) The GIS Professional Ethics Project:  Practical Ethics Education for GIS Pros, by David DiBiase et al. (2009).

(2) A new National Academy of Sciences report:  National Academy of Sciences.  2018.  Data Matters.  Ethics, Data, and International Research Collaboration in a Changing World: Proceedings of a Workshop.

Joseph Kerski


–Photograph by Joseph Kerski at a high school that is active in preparing students for business careers.  It is my hope that ethics is included in the discussion here and in all other science, business, GIS, and academic programs.

Categories: Public Domain Data

Research tying spatial data to resiliency and development goals

October 27, 2019

One goal of this blog and our book is to raise awareness of, and spur action on, the development of data sets, data standards, and data portals so that decisions will be increasingly made with geospatial information.  One of the chief challenges to this is the persistent lack of geospatial information.  It isn’t just “us” as GIS practitioners talking with each other about this.  As far back as 1992, Goodchild, Haining and others were pointing out this very thing in their article in the International Journal of GIS.

More recently, research studies have appeared that tie spatial data to much broader resiliency and development initiatives–specifically, that the lack of data is hindering some much broader planet-wide goals.   A white paper entitled Transforming Our World:  Geospatial Information: Key to Achieving the 2030 Agenda for Sustainable Development ties the need for geospatial data to meeting the UN Sustainable Development Goals.  In yet another example, both of the following studies indicate that the lack of data is one of the biggest obstacles to progress toward the UN development goals.

  • United Nations Independent Expert Advisory Group (UN). 2014. A World That Counts: Mobilising the Data Revolution for Sustainable Development. A Report to the UN Secretary General. New York, NY: United Nations, p. 28.
  • Stuart, E., E. Samman, W. Avis, and T. Berliner. 2015. The data revolution: finding the missing millions. ODI Research Report 03. London: Overseas Development Institute, p. 51.

The following study states that advances in research on resilience and vulnerability are hampered by limited access to reliable data.

  • Barrett, C. B. and D.D. Headey. 2014. Measuring resilience in a risky world: Why, where, how, and who? 2020 Conference Brief, 1. May 17-19, Addis Ababa, Ethiopia. Washington, D.C: International Food Policy Research Institute.

The biological and conservation community has been particularly active in this area, pointing out the unequal distribution of biodiversity data across the globe, by region, over time, and also in the coverage of certain taxa and ecosystems, such as in the following articles.

  • Amano, T., Sutherland, WJ.  2013.  Four barriers to the global understanding of biodiversity conservation: Wealth, language, geographical location, and security.  Proceedings of the Royal Society B  280.  (article 20122649)
  • Gaiji, S., Chavan, V., Ariño, AH., Otegui, J., Hobern, D., Sood, R., Robles, E.  2013.  Content assessment of the primary biodiversity data published through GBIF network: Status, challenges, and potentials.  Biodiversity Informatics 8(94):  172.
  • Osawa, T., Jinbo, U, Iwasaki, N.  2014.   Current status and future perspective on “Open Data” in biodiversity science, Japan.  Japanese Journal of Ecology 64:  153-162.
If you have others to add to this list, please comment and share!

The lack of geospatial information hinders the ability to meet the UN Sustainable Development Goals and other major global initiatives.  –Photograph by Joseph Kerski.

Teaching Location Privacy and Resolution with a Big Pixel Image

October 13, 2019

Ever since those ultra-high-resolution “gigapan” images began appearing from Microsoft and other sources a decade ago, I have been fascinated by their potential for use in education.  Today, I frequently use the following image taken from the Oriental Pearl Tower in China (at 468 m, the tallest tower in China from 1994 to 2007).  This image, compiled from billions of pixels, is amazing in its resolution.  A video on how I teach with it is here.


Big pixel image from Oriental Pearl Tower in China–initial view.

I have, for example, included this image in a university cartography and geo-visualization course that I teach online.  I first ask the students to examine the cultural geography, assessing the land use, zoning, traffic, and other aspects.  Then, I ask them to examine the physical geography–the terrain, the vegetation, the river winding through the city, and so on.

Third, I ask them to consider the resolution, reflecting on what we have discussed thus far in the course.  I ask them: Can you see inside office buildings and residential windows? Can you read license plates on cars?  Can you determine what pedestrians look like?  I ask them to think about:  Do your answers and the resolution of this image bring up any ethical concerns?


Big pixel image from the Oriental Pearl Tower in China–detailed view. 

Fourth, I ask them to consider another topic we have discussed:  The Internet of Things and our connected world.  Where does information come from?  Increasingly, it is from webcams, sensors, and humans.  We have a chat about face recognition software and how none of the faces in this image (as of this writing) are blurred.  What are the implications of blurring and not blurring?  Finally, I ask them to take a random sample of 10 people in the gigapixel image.  How many people are holding a tablet or smartphone?  What implications does this have for information, and for society?

–Joseph Kerski

Faces to Places: Location tracking and Facial Recognition Technology

October 7, 2019

We have written many times over the years about insidious and invasive location tracking practices; the apps and devices we use that capture our location information until an outcry forces a rethink about personal rights and institutional ‘transparency’. Just when we start to think it’s all under control, another reason to be concerned emerges. Facial Recognition Technology (FRT) is now widely used in many countries, with live tracking via CCTV infrastructure now common practice. By comparing a database of existing photos with live images of crowds and individuals, possible matches at specific locations are flagged for further investigation.

Source: Skitterphoto

Previously, mobile devices – phones, tablets, activity trackers – could identify an individual at a particular location. That information could prove the device was at a location but not necessarily that the person owning the device was also present. With facial recognition, large organisations and public authorities can now link a face to a place without the need to rely on the device-in-the-middle.

However, once again the widespread adoption of this technology has raced ahead of the legal safeguards governing its use. With frequent claims of the misuse of personal information (such as the recently reported case at London’s King’s Cross station), bias, and the potential for misidentification (American Bar Association report), many groups are now calling for a review of FRT. This year, San Francisco became the first US city to ban the use of FRT by its government, although private companies are exempt from the regulation. The Chinese government recently announced plans to regulate the use of facial recognition technology in schools. Both the European Commission (EC) and the United Nations are also currently investigating how best to restrict the use of such technology. The EC is seeking to introduce additional regulations that will safeguard citizen rights over the use of their facial recognition data.

Is FRT the ultimate personal location metric for the trackers? 2019 has seen an increase in awareness of the issues surrounding the use of this technology. Will 2020 see the introduction of additional regulation governing that use?

Detailed Political Boundaries and Thematic Data from the William and Mary geoLab

September 29, 2019

Obtaining detailed, accurate political boundaries of countries and administrative districts within those countries has long been a challenge.  The William and Mary geoLab, run by geographer Dan Runfola, provides a novel boundary dataset (geoBoundaries) designed to improve on the Database of Global Administrative Areas (GADM) by providing meticulous metadata on boundary sources, including detailed license information.  This dataset is accessible through either the geoLab site or GeoQuery, an online tool designed to enable easy access to GIS data.  Funded by a combination of USAID, the Cloudera Foundation, and a variety of other non-profits, GeoQuery allows users to generate a CSV based on selected administrative boundaries, satellite data, and survey data (including dollar amounts of international aid sent to each administrative zone).  Behind the scenes, a large-scale cluster computing environment (SciClone) processes raster and vector boundaries dynamically to respond to each user query, leveraging a combination of open source Python (rasterIO) and Hadoop (CDH) tools.

I began my test of the resource by requesting Armenia’s internal administrative boundaries.  It was easy to select what I needed because of the text-based folder structure of the site.  Oftentimes, despite the wonderful map-based selection tools we now have at our fingertips, this text method is the fastest and most straightforward.


Folder for Armenia boundary data.

I unzipped the resulting data set and brought it into ArcGIS Pro, shown below in light red atop the topographic basemap from ArcGIS Online.


Boundaries (in light red) downloaded from the site vs. the ArcGIS topographic basemap.

As we have advocated throughout this blog and our book, be critical of the data.  For example, on the map above, my newly downloaded boundaries do not completely match either this basemap or other basemaps.  Why?  I am tempted to say that, given the focus and rigor of the WM GeoLab data, their boundaries are the more accurate.  But again, dig deep into the data sets you are using, make sure you understand how they were obtained, and make sure they meet your needs.

With the geoLab’s detailed boundary information, you can also obtain a selected set (approximately 50 layers) of curated thematic data within specific regions in specific countries.  You can also generate your own thematic data based on flows of international aid to each country on the fly (for example, all environmental aid to each region, or infrastructure aid from the World Bank).   In the example below, my goal was to obtain mean temperature for provinces within Armenia.


Downloading the thematic data from the William and Mary research lab.

Once I submitted my request, within a few minutes, I received an email about the data’s availability and was able to download a zip file containing everything I needed.  The email contained three links: one to review my request and download the results and documentation; one to download the results directly (a nice feature is that this link is permanent, so you can go back to it later); and one to review all of my current and previous requests.

The site also helpfully provided a citation reminder and information:  “Don’t forget to cite both AidData’s GeoQuery tool as well as each dataset you selected within GeoQuery. All citations can be found in the Documentation PDF at the link above. Here’s the correct citation for GeoQuery:   Goodman, S., BenYishay, A., Lv, Z., & Runfola, D. (2019).   GeoQuery:   Integrating HPC systems and public web-based geospatial data tools. Computers & Geosciences, 122, 103-112. ”

I brought the CSV into ArcGIS Pro and joined it on the province name to the boundaries layer that I previously downloaded.  Within a few minutes, I had the data joined, classified, and symbolized, and was ready to start my analysis.  The only thing that was less than ideal was that I had to join on province name, as I could not find a common code between the boundaries and the thematic data.  This worked fine in my case, but prior experience with joining makes me always want to join on a code, if possible.  The lack of a common code arose because I used GADM for the boundaries.  If I had selected “GeoBoundaries” on the first page of GeoQuery (where I chose my boundary), then I would have been provided a unique ID for joining across the entire world.
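To illustrate why joining on a name is riskier than joining on a code, here is a minimal sketch in plain Python, outside of ArcGIS Pro.  It is not how GeoQuery or ArcGIS Pro performs a join; the field names (province, mean_temp), the temperature values, and the spelling variants are all hypothetical, chosen only to show how small differences in names can silently drop records.

```python
import csv
import io

def normalize(name):
    """Lowercase and collapse whitespace so trivial differences still match."""
    return " ".join(name.lower().split())

def join_on_name(boundaries, table_rows, name_field):
    """Attach table attributes to boundary records by normalized name.

    Returns (joined, unmatched_boundary_names, unmatched_table_names).
    """
    lookup = {normalize(row[name_field]): row for row in table_rows}
    joined, missing = [], []
    for rec in boundaries:
        key = normalize(rec[name_field])
        if key in lookup:
            # Keep the boundary layer's spelling of the name.
            joined.append({**lookup.pop(key), **rec})
        else:
            missing.append(rec[name_field])
    leftover = [row[name_field] for row in lookup.values()]
    return joined, missing, leftover

# Hypothetical boundary attribute records (geometry omitted).
boundaries = [
    {"province": "Ararat"},
    {"province": "Kotayk"},
    {"province": "Syunik"},
    {"province": "Vayots Dzor"},
]

# Hypothetical thematic CSV with inconsistent spellings and made-up values.
csv_text = """province,mean_temp
ararat,12.1
Kotayk ,8.4
Syunik',7.9
Vayots  Dzor,9.3
"""
rows = list(csv.DictReader(io.StringIO(csv_text)))

joined, missing, leftover = join_on_name(boundaries, rows, "province")
print("joined:", [r["province"] for r in joined])
print("boundaries with no data:", missing)
print("table rows with no boundary:", leftover)
```

Case and spacing differences are handled by the normalization, but the row with a trailing apostrophe still fails to match, which is exactly the kind of problem a shared unique ID avoids.  Whatever tool you use, it is worth checking the unmatched records on both sides after any name-based join.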


Mapping the thematic data with the province data. 

According to the site manager, they are moving towards a system that simply sends the boundary data to users when they make a request, rather than requiring them to download the boundary information separately.  Stay tuned, as the site continues to evolve and improve.

I highly encourage you to give this resource a try!

–Joseph Kerski with gratitude to Dan Runfola at the College of William and Mary.

Categories: Public Domain Data

A review of the Oak Hill, West Virginia Open Data Site

September 1, 2019

Recently at the Esri User Conference, I met the amazing and innovative GIS coordinator for the city of Oak Hill, West Virginia.  The open data portal that this coordinator created is an excellent example of what we have been describing in this blog–the open data movement combining with tools that enable GIS administrators to create and maintain the resources that will serve their internal and external data users.

Oak Hill Open Data works alongside its main website to provide information and enhance transparency to constituents through the power of GIS. Oak Hill Open Data is the repository for data, maps, and apps being generated by the City of Oak Hill. It features the city’s zoning map viewer, “Oak Hill CPR” for submitting citizen complaints directly to the city, an Operations Dashboard for dilapidated structures, and Story Maps about Needleseye Park and the city’s monthly city council meetings.

The site is easy to use, and includes web applications, pages, and even social media feeds and videos.  A search in the top search box on zoning, structures, transportation, and other words and phrases immediately netted me exactly what I was looking for, as streaming data services or as downloadable files in a variety of formats.  I have already used it to create several GIS-based lessons, such as investigating traffic accidents, and intend to create more in the future.  I salute those involved with putting this resource together and I encourage other governmental, nonprofit, academic, and private organizations to make use of tools such as ArcGIS Hub to do something similar.  Fortunately, many are doing just that!

A section of the open data portal for Oak Hill, West Virginia.

–Joseph Kerski