Crowd Sourcing | Spatial Reserves

Georeferencer Project: Crowdsourcing location data for historic maps

June 22, 2015 Jill Clark 1 comment

In 2011 the British Library set up the Georeferencer project to crowdsource the georeferencing of its collections of scanned historic maps. Once georeference (coordinate) data has been added to the old maps, they can be viewed alongside modern maps via the Old Maps Online data portal and the catalog of georeferenced maps.

Georeferencer Project

Working from illustrations extracted from digital books and public domain images posted on Flickr, a team of volunteers identified and geo-tagged many of the maps as part of a Maps Tag-a-thon event that ran from November 2014 to January 2015. Among the collections of maps released so far are the Ordnance Surveyors’ Drawings (one inch to a mile maps for England and Wales 1780 – 1840) and the American Civil War collection.

American Civil War Maps

To date, over 8000 maps have been successfully georeferenced and quality checked by a panel of reviewers.

Categories: Public Domain Data Tags: British Library, Crowd Sourcing, Georeferencer, Public domain

Track on Track: Reflections on GPS Accuracy on a Running Track

November 23, 2014 josephkerski 2 comments

Recently, while at the Applied Geography Conference in Atlanta, I decided to test the spatial accuracy of my smartphone’s GPS in a challenging environment: a rooftop running track. Although the track was on a roof, it was surrounded by far taller buildings in downtown Atlanta, where many other buildings impede signals from GPS, wi-fi hotspots, and cell phone towers. A further challenge to the GPS positional accuracy was that each lap on the track was only 0.10 miles (0.16 km), and therefore, I would not travel very far across the Earth’s surface.

After an hour of walking, and collecting the track on my smartphone with a fitness app (Runkeeper), I uploaded my track as a GPX file and created a web map of it in ArcGIS Online. As I expected, the track’s position was compromised by the tall buildings–I only had a view of about half the sky during my time on the roof. As you can measure for yourself on the map linked above, the track lines formed a band about 15 meters wide, but interestingly, were more spatially precise along the eastern side of the track, where the signal was better, as you can see in my video that I recorded at the same time.

Also, as I have encountered numerous times in the past, a line about 100 meters long stretches to the north. Rest assured that I did not leap off the building, but rather, the first point that the GPS app laid down as I opened the doors to walk outside was about a block away. Then, as I remained outside, the points became more accurate. When you collect data, the more time you spend on the point you are collecting, typically the more accurate that point is spatially.
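If you export a track like this one as a list of latitude/longitude points, a stray first fix of this kind can be spotted by measuring the distance between consecutive points. The following is a minimal sketch, not the method Runkeeper or ArcGIS Online uses: the coordinates are made up for illustration, and the function names and the 50-meter threshold are assumptions chosen only to show the idea.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def flag_jumps(points, max_step_m=50.0):
    """Return each index i where the step from point i to point i+1 exceeds max_step_m."""
    return [
        i for i in range(len(points) - 1)
        if haversine_m(*points[i], *points[i + 1]) > max_step_m
    ]


# Hypothetical track: the first point is a poor initial fix roughly 100 m
# north of the rest, which sit within a few meters of one another.
track = [
    (33.7609, -84.3880),
    (33.7600, -84.3880),
    (33.7600, -84.3879),
    (33.7601, -84.3879),
]

jumps = flag_jumps(track)
print(jumps)
```

A sensible threshold depends on how often the app records points and how fast you are moving; for a walker logging a point every few seconds, a step of tens of meters is a reasonable sign of a bad fix rather than real movement.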

Track on Track: Reflections of GPS Accuracy on a Running Track.

Another interesting aspect of this study is that if the basemap is changed to satellite imagery, it appears that the track overlaps the tall building to the west. Try it, using the map link above. However, a closer investigation reveals that this is a result of the orthorectification that was performed on the imagery; the buildings do not appear from “straight overhead”, but rather, they “fall away” to the east. Turn this into another teachable moment: Images, like maps, are not perfect, but they are very useful. We can learn to manage error and imperfection through critical thinking and through the use of geotechnologies. This is a central topic of our book and of this blog.

To dig deeper into issues of GPS track accuracy, see my related post on errors and teachable moments in collecting data, and on comparing the accuracy of GPS receivers and smartphones and mapping field collected data in ArcGIS Online here and here.

Despite these challenges, overall, I was quite pleased with my track’s spatial accuracy, even more so considering that I had the phone in my pocket most of the time I was walking.

Categories: Public Domain Data Tags: Crowd Sourcing, data quality, GPS, satellite imagery

Inexpensive and crowdsourced remote sensing

September 1, 2013 josephkerski 4 comments

In an article entitled “The Watchers”, David Samuels discusses a company seeking to deploy small satellites into orbit 500 miles (805 km) above the Earth. This company, Skybox, founded by ex-Stanford University students, seeks to shake up the commercial space imaging industry by doing two things: (1) deploying satellites about the size of a dormitory room refrigerator, smaller and less expensive than those the commercial space imaging industry currently uses, and (2) using crowdsourcing for data classification. They seek to have ordinary citizens classify the incoming data, as well as do some classification themselves, even from images that the company has collected but does not sell. Examples could include the number of cars in every WalMart parking lot in the USA, the size of slag heaps outside the world’s largest gold mines in South Africa, and the rate at which the wattage along key stretches of the Ganges River is growing. These bits of information, they reason, are clues about the economic health of countries, industries, and individual businesses. Therefore, this information will be so valuable to investors, environmentalists, activists, and journalists, to name a few, that they will be willing to pay for it. The company is working with the government of Russia for a launch vehicle and hopes to launch its first satellite, SkySat-1, this month.

The future: More cameras overhead? Photograph by Joseph Kerski.

This story connects well with issues we raise in the book The GIS Guide to Public Domain Data, including data quality and resolution, military vs. civilian uses of data, crowdsourcing, and privacy. The resolution of the images returned from Skybox’s satellites (less than 1 meter) will be comparable to that of images from large commercial satellite imaging companies such as DigitalGlobe. However, the cost of constructing the satellites should be considerably less and the size of each satellite considerably smaller. Skybox has added numerous advisers with connections in the defense industry “to avoid any military-industrial squelching of its technology before launch.” Relying on crowdsourcing to classify images is not a new concept, but what is new here is the scale at which it could be employed, and that it is embedded in the company’s business model. How standards will be established to assure data quality to potential purchasers of the derived information will be very interesting indeed. Lastly, the idea of inexpensive, high resolution, easy-to-deploy satellites imaging the planet has enormous privacy implications for those of us on the ground, whether the satellites come from Skybox or from others who are sure to follow.

Categories: Public Domain Data Tags: Crowd Sourcing, data quality, imagery, privacy, remote sensing, satellite imagery

Reflections on Spatial Data from the 2013 Esri International User Conference

July 22, 2013 josephkerski 1 comment

After a week spent with 14,000 people at the annual Esri GIS Education Conference and the Esri International User Conference, it was evident to me that the themes that we examine in The GIS Guide to Public Domain Data not only are relevant to the conversations that the GIS community is having, but actually grow in importance each year. My observations from this year’s conference include, first, that despite the plethora of spatial data now available, the need for authoritative data remains paramount. Data provides the foundation for everything we do in GIS. This data needs to be curated and provided with sufficient metadata. Curation is particularly important in this era of cloud-based GIS. Second, every data consumer is now also a potential data producer. However, even though citizen science and volunteered geographic information are starting to provide a wealth of data at scales and with details the community only recently dreamed about, the need for data continues to outpace its production.

Joseph Kerski at the Esri UC: Where and Why are Important!

Third, with this avalanche of data and citizen science capabilities comes increased responsibility to use and produce data wisely. Fourth, GIS software and tools have become ubiquitous on just about every electronic device that we use for work and for play. This familiarity helps the community’s efforts in explaining to stakeholders why GIS is necessary. However, at the same time it presents a challenge, because administrators and policymakers can be lulled into thinking that consumer-facing mapping tools equate to a full GIS implementation, and hence may not understand the need to invest in a GIS. Fifth and most importantly, the world is changing, with pressing issues of biodiversity loss, climate and population change, food security, water quality and quantity, natural hazards, and many others that need to be solved. We won’t be able to effectively make decisions about these issues and effectively plan for the future unless we understand the spatial data that is behind the GIS tools and the resulting decisions that can be made with these tools.

After being with thousands of people from all over the world gathered at the Esri International User Conference, I am confident that these skilled, energetic, and dedicated people can grapple with and solve many of these issues. But again, much of it depends upon the data that we are producing and using.

Categories: Public Domain Data Tags: Crowd Sourcing, data quality, Metadata, Open Data, Software as a Service

Geo-Wiki.org: Crowdsourcing to improve global land cover data

March 17, 2013 josephkerski Leave a comment

In our book The GIS Guide to Public Domain Data, we spend quite a bit of time discussing crowdsourcing, and rightly so: Over the past few years, crowdsourcing has become a viable way not only to collect data, but also to verify and update existing data. Reasons include budget constraints in those agencies that provide data and the subsequent need for field verification, a growing recognition that decisions based on spatial data are only as beneficial as the accuracy of the data sets themselves, the rapid expansion of citizen science, and growth in the number and variety of mobile and web-GIS tools that enable citizen scientists to contribute to the global community.

Examples of verifying and updating existing data are numerous, and a noteworthy one is from a group of researchers at the International Institute for Applied Systems Analysis (IIASA) in Austria who lead an effort to improve global land cover/land use data. This effort, http://www.geo-wiki.org, verifies three land cover data sets (GlobCover from the ESA, MODIS from NASA, and GLC 2000 from the IES Global Environment Monitoring Unit) through knowledge and photographs from people local to specific areas.

Geo-Wiki.org site.

Besides an improvement of the data and, it is hoped, in the decisions based on those data, some of these efforts feature innovative projects that provide benefit to local people. For example, Geo-Wiki users were asked to identify the presence of cultivated land and settlements in samples in Ethiopia in a “hackathon” associated with USAID in an effort to improve local food security.

More information can be found on the Geo-Wiki site and in an article describing the project.

Categories: Public Domain Data Tags: Crowd Sourcing, land cover, volunteered geographic information

Cucumbers, E. coli and Open Data

July 23, 2012 Jill Clark Leave a comment

Cucumbers, E. coli and open data: This unlikely trio appeared in a recent report entitled Science as an Open Enterprise, which looked into the issues surrounding the huge volumes of public domain data that are currently available, what will be required to exploit that data and how the principles of openness can be preserved. The 2011 outbreak of E. coli poisoning in Germany illustrated the changes in attitudes to sharing scientific research and data; within weeks of the outbreak, the genome of the bacterium was identified, and given the seriousness of the outbreak, the results were published on the Internet as soon as they were available.

The working group that produced the report, led by Geoffrey Boulton FRS FRSE (Regius Professor of Geology Emeritus, University of Edinburgh), set out to ‘identify the principles, opportunities and problems of sharing and disclosing scientific information’ and the measures required to create ‘.. a socially responsible open data regime’. The report goes on to state that curated open data is imperative for science and scientists to meet the increasing demands for public access to scientific data and to address some of the issues caused by the sheer volume of data that is now available for analysis. Sharing, compiling and integrating data sources will increasingly provide the basis for academic and scientific investigations.

Although the report focused on the academic and scientific communities, it touches on many of the issues we discuss in The GIS Guide to Public Domain Data: the value of open and unrestricted access to data, the importance of being able to establish the provenance of data, metadata, data curation to establish a measure of quality, citizen science, crowd sourcing, and the question of what constitutes good data. What may be considered good for one application may be wholly unsuitable for another. Vast amounts of public domain data are now available from a diverse range of sources; it is up to the individual or organisation to assess what is good enough for their requirements.

Categories: Public Domain Data Tags: Citizen science, Crowd Sourcing, E. coli, Metadata, Open Data

Welcome

April 16, 2012 Jill Clark 7 comments

Welcome to the Spatial Reserves blog.

The GIS Guide to Public Domain Data was written to provide GIS practitioners and instructors with the essential skills to find, acquire, format, and analyze public domain spatial data. Some of the themes discussed in the book include open data access and spatial law, the importance of metadata, the fee vs. free debate, data and national security, the efficacy of spatial data infrastructures, the impact of cloud computing and the emergence of the GIS-as-a-Service (GaaS) business model. Recent technological innovations have radically altered how both data users and data providers work with spatial information to help address a diverse range of social, economic and environmental issues.

This blog was established to follow up on some of these themes, promote a current discussion of the issues raised, and host a copy of the exercises that accompany the book. This story map provides a brief description of the exercises.
