Posts Tagged ‘web GIS’

Best Available Data: “BAD” Data?

August 14, 2017 3 comments

You may have heard the saying that the “Best Available Data” is sometimes “BAD” data. Why?  As the acronym implies, BAD data is often used “just because it is right at your fingertips,” and it is often of lower quality than the data that could be obtained with more time, planning, and effort.  We have made the case in our book and in this blog for 5 years now that data quality matters, not just as a theoretical concept, but in day-to-day decision-making.  Data quality is particularly important in the field of GIS, where so many decisions are made based on analyzing mapped information.

All of this information, used every day, hinges on the quality of the original data.  Compounding the issue, the temptation to settle for the easily obtained grows as the web GIS paradigm, with its ease of use and plethora of data sets, makes it easier and easier to quickly add data layers and be on your way.  To be sure, there are times when the easily obtained is also of acceptable or even high quality.  Judging whether it is acceptable depends on the data user and that user’s needs and goals: in other words, on “fitness for use.”

One intriguing and important resource for determining the quality of your data is The Bad Data Handbook, published by O’Reilly Media, by Q. Ethan McCallum and 18 contributing authors.  They wrote about their experiences, their methods, and their successes and challenges in dealing with datasets that are “bad” in some key ways.  The resulting 19 chapters and 250-ish pages may make you want to put this on your “would love to but don’t have time” pile, but I urge you to consider reading it.  The book is written in an engaging manner; many parts are even funny, as is evident in phrases such as “When Databases Attack” and “Is It Just Me or Does This Data Smell Funny?”

Despite the lively and often humorous approach, there is much practical wisdom here.  For example, many of us in the GIS field can relate to being somewhat perfectionist, so the chapter “Don’t Let the Perfect be the Enemy of the Good” is quite pertinent.  In another example, the authors provide a helpful “Four Cs of Data Quality Analysis.”  These are:
1. Complete: Is everything here that’s supposed to be here?
2. Coherent: Does all of the data “add up?”
3. Correct: Are these, in fact, the right values?
4. aCcountable: Can we trace the data?
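
The Four Cs can be turned into simple automated checks. The following is only a minimal sketch in Python, not the handbook's own method: the sample records, the field names, and the specific tests chosen for each "C" are all assumptions made for illustration. In particular, a coordinate range check can flag impossible values but cannot prove that a value is right.

```python
# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Station A", "lat": 39.74, "lon": -104.99, "source": "agency survey"},
    {"id": 2, "name": "Station B", "lat": 95.00, "lon": -105.10, "source": "agency survey"},
    {"id": 3, "name": "Station C", "lat": None, "lon": -105.20, "source": ""},
]

# Assumed schema: the fields every record is expected to carry.
REQUIRED_FIELDS = ("id", "name", "lat", "lon", "source")


def is_complete(rec):
    """Complete: every required field is present and not None."""
    return all(rec.get(field) is not None for field in REQUIRED_FIELDS)


def is_coherent(recs):
    """Coherent: ids are unique, so the set "adds up" to one record per feature."""
    ids = [rec["id"] for rec in recs]
    return len(ids) == len(set(ids))


def is_correct(rec):
    """Correct (plausibility only): coordinates fall in valid lat/lon ranges."""
    lat, lon = rec.get("lat"), rec.get("lon")
    if lat is None or lon is None:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def is_accountable(rec):
    """aCcountable: a non-empty source lets us trace where the record came from."""
    return bool(rec.get("source"))


def four_cs_report(recs):
    """Return the ids of records failing each per-record check, plus coherence."""
    return {
        "incomplete": [r["id"] for r in recs if not is_complete(r)],
        "coherent": is_coherent(recs),
        "implausible": [r["id"] for r in recs if not is_correct(r)],
        "untraceable": [r["id"] for r in recs if not is_accountable(r)],
    }


report = four_cs_report(records)
print(report)
```

With the sample above, record 2 is flagged for a latitude outside the valid range, and record 3 is flagged for a missing latitude and an empty source. Real checks would depend on your own data and on your definition of fitness for use.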

Unix administrator Sandra Henry-Stocker wrote a review of the book.  An online version of the book is also available, but in keeping with the themes of this blog, you might wish to make sure that it is fair to the author to read it from that site rather than purchasing the book.  I think that purchasing the book would be well worth the investment.  Don’t let the 2012 publication date, the fact that it is not GIS-focused per se, and the frequent inclusion of code put you off; this really is essential reading, or at least skimming, for all who are in the field of geotechnology.


Bad Data book by Q. Ethan McCallum and others. 



Data Quality on Live Web Maps

June 19, 2017 3 comments

Modern web maps and the cloud-based GIS tools and services upon which they are built continue to improve in richness of content and in data quality.  But as we have emphasized many times in this blog and in our book, maps are representations of reality.  They are extremely useful representations, to be sure, particularly so in the cloud, but they are still representations.  These representations depend upon the data sources, accuracy standards, map projections, completeness, processing and rendering procedures used, regulations and policies in place, and much more.  A case in point is the offset between street data and satellite image data that I noticed in mid-2017 in Chengdu in south-central China.  The streets are about 369 meters southeast of where they appear on the satellite image (below):


Puzzled, I panned the map to other locations in China.  The offsets varied, but they appeared everywhere in the country; for example, note the offset of 557 meters where a highway crosses the river at Dongyang, again to the southeast:


As of this writing, the offset appears in the same compass direction and only in China; indeed, when I examined border towns near North Korea, Vietnam, and other countries, the offset appeared to stop along those borders.  No offsets exist in Hong Kong or in Macao.  Yahoo Maps and Bing Maps both show the same types of offsets in China (Bing Maps example, below):


MapQuest, which uses an OpenStreetMap base, showed no offset.  I then tested ArcGIS Online with a satellite image base and the OpenStreetMap base, and there was no offset there, either (below).  This offset is a datum issue related to national security that is documented in this Wikipedia article.  The same data restriction issues that we discuss in our book and in our blog touch on other aspects of geospatial data, such as fines for unauthorized surveys, lack of geotagging information on many cameras when the GPS chip detects a location within China, and seeming unlawfulness of crowdsourced mapping efforts such as OpenStreetMap.
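
One way to quantify offsets like those described above is to read the coordinates of the same feature from the street layer and from the image layer, then compute the distance and direction between the two points. The sketch below uses the haversine formula on a spherical Earth, an approximation that is adequate for offsets of a few hundred meters. The function names and the sample coordinates are hypothetical; they are not the points measured for this post.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def offset_between(lat1, lon1, lat2, lon2):
    """Return (distance in meters, initial bearing in degrees clockwise
    from north) going from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine great-circle distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing, normalized to the range [0, 360).
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance, bearing


def compass_point(bearing):
    """Map a bearing in degrees to one of eight compass points."""
    names = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return names[int((bearing + 22.5) // 45) % 8]


# Hypothetical coordinates of one intersection: as seen on the satellite
# image, and as drawn in the street layer. Made-up values for illustration.
image_lat, image_lon = 30.6600, 104.0600
street_lat, street_lon = 30.6580, 104.0630

distance_m, bearing_deg = offset_between(image_lat, image_lon, street_lat, street_lon)
print(f"Streets are about {distance_m:.0f} m {compass_point(bearing_deg)} of the image")
```

Repeating the measurement at several locations would show whether the offset is constant or, as observed here, varies from place to place.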

Furthermore, as we have noted, the satellite images are processed and tiled data sets, and like other data sets, they need to be critically scrutinized as well.  They should not be considered “reality” despite their appearance of being the “actual” Earth’s surface.  They too contain error, may have been taken on different dates or in different seasons, and may have been reprojected onto a different datum; these and other aspects of data quality need to be considered.


Another difference between these maps is the wide variation in the amount of detail in the streets data in China.  OpenStreetMap was the most complete; the other web mapping platforms offered varying levels of detail.  Some were seriously lacking, which is especially surprising in the year 2017, in almost every type of street except major freeways.  The streets content was much more complete in other countries.

It all comes back to identifying your end goals in using any sort of GIS or mapping package.  Being critical of the data can and should be part of your decision-making process and of your choice of tools and maps.  By the time you read this, the image offset problem could have been resolved.  Great!  But are there now new issues of concern?  Data sources, methods, and quality vary considerably among different countries.  Furthermore, the tools and data change frequently, along with the processing methods, so being critical of the data is not something to practice just once, but rather is fundamental to everyday work with GIS.

Copyright in Today’s Web Mapping World

September 27, 2015 1 comment

GIS professional Nicholas Duggan has written an excellent article and a flowchart to help mapmakers and GIS analysts decide if they can legally use specific data sets in their work.

As Duggan points out, “anyone can make maps”, and as we emphasize in our book, today’s web mapping environment makes accessing data easier than ever.  But even though it is possible to use a specific data set, does that mean that we legally have the right to do so?  Duggan’s flowchart can help make these decisions.  About his flowchart, Duggan says that “it does not cover the plethora of data or map license types, this chart provides an easy reference as to whether you may or may not use the material you intend to use. Of course this may vary from country to country and on a case by case basis; also this does not serve as a legal document; legal advice should be obtained in case of dispute.”  I find the flowchart to be very useful and applaud Mr Duggan for creating and sharing it.

And yes, following his own advice and the advice we gave in the book, I did ask his permission to refer to his resource in this blog!  Thank you, Nicholas.

Nicholas Duggan’s helpful article and flowchart to help GIS users decide if they can use a data set.