Aqua People? Reflections on Data Collection and Quality

November 20, 2016

Data quality is a central theme of this blog and our book.  Here, we focus on the quality of geospatial information, which most often takes the form of maps.  One of my favorite maps, both for its richness of information and for its choice of symbology, is this “simple map of future population growth and decline” from my colleague at Esri, cartographer Jim Herries.  Jim symbolized this map with red points indicating areas that are losing population and green points indicating areas that are gaining population.  This map can be used to learn where population change is occurring, down to the local scale, and, with additional maps and resources, to help people understand why it is changing and what the implications of growth or decline are.

But the map can also be an effective tool to help people understand issues of data collection and data quality.  Pan and zoom the map until you see some rivers, lakes, or reservoirs, such as Marston Reservoir in Littleton, Colorado, shown on the map below.  If you zoom in to a larger scale, you will see points of “population” in this and nearby bodies of water.  Why are these points shown in certain lakes and rivers?  Do these points represent “aqua people” who live on houseboats or who are perpetually on water skis, or could the points be something else?


The points are there not because people are living in or on the reservoir, but because the dots are randomly placed within the statistical area that was used.  In this case, the statistical areas are census tracts or block groups, depending on the scale that is being examined.  The same phenomenon can be seen with dot density maps at the county, state, or country level.  And it is not confined to population data.  For example, on dot density maps showing soybean bushels harvested by county, the number of cows or pigs, or even soil chemistry from sample boreholes, dots could also appear in the water.  In each case, the dots do not represent the actual location where people live, or animals graze, or soil was tested.  They are randomly distributed within the data collection unit.  In this case, at the largest scale, the unit is the census block group, and randomly distributing the points means that some points fall “inside” the water polygons.
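
To make the random placement described above concrete, here is a minimal Python sketch.  Everything in it is hypothetical: the “tract” is a 10 x 10 square, the “reservoir” is a circle at its center, and the function names are invented for illustration.  It is not the method used to build the Esri map; it only shows why uniformly random dots inside a collection unit will land in the water roughly in proportion to the water’s share of the unit’s area, and how a simple water mask (an assumption here, not something the map is known to apply) would prevent that.

```python
import random

def random_dots(n, bounds, rng):
    """Place n dots uniformly at random inside a rectangular unit."""
    xmin, ymin, xmax, ymax = bounds
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def in_reservoir(point, center, radius):
    """Return True if the point falls inside the circular 'reservoir'."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

def dots_avoiding_water(n, bounds, center, radius, rng):
    """Rejection sampling: redraw any dot that lands in the water."""
    dots = []
    while len(dots) < n:
        candidate = random_dots(1, bounds, rng)[0]
        if not in_reservoir(candidate, center, radius):
            dots.append(candidate)
    return dots

# Hypothetical tract: a 10 x 10 square with a reservoir of radius 2 at its center.
TRACT = (0.0, 0.0, 10.0, 10.0)
RESERVOIR_CENTER = (5.0, 5.0)
RESERVOIR_RADIUS = 2.0

rng = random.Random(42)  # fixed seed so the run is repeatable

# Naive placement: the dots know nothing about the water polygon.
naive = random_dots(1000, TRACT, rng)
wet = sum(in_reservoir(p, RESERVOIR_CENTER, RESERVOIR_RADIUS) for p in naive)

# Masked placement: dots that fall in the water are redrawn.
masked = dots_avoiding_water(1000, TRACT, RESERVOIR_CENTER, RESERVOIR_RADIUS, rng)
wet_masked = sum(in_reservoir(p, RESERVOIR_CENTER, RESERVOIR_RADIUS) for p in masked)

print(f"naive placement:  {wet} of {len(naive)} dots in the water")
print(f"masked placement: {wet_masked} of {len(masked)} dots in the water")
```

In this toy setup the reservoir covers about 12.6 percent of the tract (π × 2² / 100), so roughly that share of the naive dots should end up as “aqua people,” while the masked placement puts none in the water.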

Helping your colleagues, clients, students, or any other audience you work with understand concepts such as these may seem insignificant, but it is an important part of map and data interpretation.  It can help them to better understand the web maps that they encounter on a daily basis.  It can help people understand issues and phenomena, and better enable them to think critically and spatially.  Issues of data collection, quality, and the geographic unit by which the data was collected all matter.  What other examples could you use from GIS and/or web-based maps such as these?