Aqua People? Reflections on Data Collection and Quality

November 20, 2016

Data quality is a central theme of this blog and our book.  Here, we focus on quality of geospatial information, which is most often in the form of maps.  One of my favorite maps in terms of the richness of information and the choice of symbology is this “simple map of future population growth and decline” from my colleague at Esri, cartographer Jim Herries.  Jim symbolized this map with red points indicating areas that are losing population and green points indicating areas that are gaining population.  This map can be used to learn where population change is occurring, down to the local scale, and, with additional maps and resources, help people understand why it is changing and the implications of growth or decline.

But the map can also be an effective tool to help people understand issues of data collection and data quality.  Pan and zoom the map until you see some rivers, lakes, or reservoirs, such as Marston Reservoir in Littleton, Colorado, shown on the map below.  If you zoom in to a larger scale, you will see points of “population” in this and nearby bodies of water.  Why are these points shown in certain lakes and rivers?  Do these points represent “aqua people” who live on houseboats or who are perpetually on water skis, or could the points be something else?


The points are there not because people are living in or on the reservoir, but because the dots are randomly placed within the statistical area that was used.  In this case, the statistical areas are census tracts or block groups, depending on the scale that is being examined.  The same phenomenon can be seen with dot density maps at the county, state, or country level.  And this phenomenon is not confined to population data.  For example, on a dot density map showing soybean bushels harvested by county, dots could also appear in the water, as could dots representing the number of cows or pigs, or even soil chemistry from sample boreholes.  In each case, the dots do not represent the actual location where people live, or animals graze, or soil was tested.  They are randomly distributed within the data collection unit.  On this map, at the largest scale, the unit is the census block group, and randomly distributing the points means that some points fall “inside” the water polygons.
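To make the random placement concrete, here is a minimal Python sketch.  It is not how Jim's map or any particular GIS package draws its dots; it simply assumes a made-up square “block group” with a made-up square “reservoir” inside it, scatters dots uniformly across the block group, and counts how many land in the water.  The coordinates, the dot count, and the function names are all hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_dots(polygon, count, rng):
    """Place `count` dots uniformly at random inside `polygon`
    by sampling its bounding box and rejecting misses."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots

# Hypothetical block group (a 10 x 10 square) containing a hypothetical reservoir.
block_group = [(0, 0), (10, 0), (10, 10), (0, 10)]
reservoir = [(3, 3), (7, 3), (7, 7), (3, 7)]

rng = random.Random(42)  # fixed seed so the run is repeatable
dots = random_dots(block_group, 500, rng)
in_water = [d for d in dots if point_in_polygon(d[0], d[1], reservoir)]
print(f"{len(in_water)} of {len(dots)} dots fall inside the reservoir")
```

Because the placement knows nothing about where the water is, the share of dots in the reservoir should land near the share of the block group's area that the reservoir covers.  Keeping dots out of the water would require an extra step, such as rejecting any dot that falls inside a water polygon.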

Helping your colleagues, clients, students, or any other audience you are working with understand concepts such as these may seem insignificant, but it is an important part of map and data interpretation.  It can help them to better understand the web maps that they encounter on a daily basis.  It can help people understand issues and phenomena, and better enable them to think critically and spatially.  Issues of data collection, quality, and the geographic unit by which the data was collected–all of these matter.  What other examples could you use from GIS and/or web-based maps such as these?

National and Subnational Population Data and Maps from USAID and US Census Bureau

March 30, 2014

As readers of this blog and our book are aware, when a geodata portal is confusing or inadequate, we are not afraid to say so.  And conversely, when a resource comes along that contains a wealth of content and is actually intuitive to use at the same time, we share that as well.  An example of a new, useful, and intuitive resource comes from the Demographic and Health Surveys of the USAID program and the US Census Bureau.  The site provides detailed demographic data primarily for countries that receive assistance via the President’s Emergency Plan for AIDS Relief (PEPFAR).  The data are available for single countries and also for multiple countries through a data package, all of which the user chooses and customizes.  Through the site, the US Census Bureau has added to and updated the online collection of subnational population data linked to maps.

To access the maps and data, begin at the main website for the project, select Data, select countries, select indicators (variables), select the format (shapefile or geodatabase), and indicate whether you want to download it now in a browser or receive an email when the package is ready.  You can choose up to 25 variables at a time to be included in the package. I tested it and it worked marvelously.  Also, in the near future, the US Census Bureau will release a seamless global map containing population estimates for tens of thousands of subnational administrative areas globally. Wouldn’t it be grand if all sites were this simple to use?  

The Spatial Data Repository provides health and demographic data from The Demographic and Health Surveys Program and the U.S. Census Bureau.

WorldPop – Population distribution data

November 18, 2013

Last year we blogged about the AsiaPop and AfriPop projects that had been established to produce detailed and freely available population distribution maps for Asia and Africa. Last month the WorldPop project was launched, combining the AfriPop, AsiaPop and AmeriPop population mapping projects into ‘a single open access archive of spatial demographic data for Central & South America, Africa and Asia for development and health application’.

The following free datasets can be downloaded (as a zipped GeoTIFF file) from the WorldPop site:

– Population distribution datasets for African and Asian countries
– Age/sex structured population distribution datasets for Africa 2000-2015
– New 2000-2010 Asia population distribution datasets incorporating satellite-derived urban growth maps
– Births/pregnancies distribution datasets for eleven countries
– Multidimensional and consumption-based poverty rate datasets for five countries

WorldPop poverty data

Metadata is provided with each data download, and for those interested in finding out how the data were produced, the methodological details are available on the WorldPop website. In the coming months additional datasets will be made available, including:

– Population distribution datasets for Central & South America
– Age/sex structured datasets for Central & South America and Asia
– Births/pregnancies datasets for 75 countries
– Multiple updates on existing population rate datasets through new input data and methods