Posts Tagged ‘Open Data’

A review of Google’s search engine for Open Data

September 17, 2018 4 comments

An article in Nature described Google’s new search engine for open data.  Geospatial data is a fundamental part of open data and, after all these years, is still challenging to find at times, so I was immediately interested in testing it.

The tool, called Google Dataset Search, is accessible at this link.  Like Google Scholar and Google Books, both of which I use heavily, it is a “specialized search engine.”

The utility of this tool will depend on metadata tagging. Indeed, as the article points out, “those who own the data sets should ‘tag’ them, using a standardized vocabulary called Schema.org, an initiative founded by Google and three other search-engine giants (Microsoft, Yahoo and Yandex)”. The schema.org Dataset markup is the standard used here, and what it describes can take many forms, such as CSV tables, imagery, and files in proprietary formats.  I had to laugh at the openness of the last line in the list of what could qualify as a dataset: “Anything that looks like a dataset to you.”
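To make the tagging idea concrete, here is a minimal sketch in Python of what a schema.org Dataset description might look like for a geospatial data set.  The property names (name, description, url, spatialCoverage, distribution) come from the schema.org vocabulary; the function name, the stream gauge data set, the URLs, and the bounding box are all hypothetical values invented for illustration, not a real published data set.

```python
import json


def build_dataset_markup(name, description, url, bbox, download_url, encoding_format):
    """Return a schema.org Dataset description as a JSON-LD string.

    bbox is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bbox
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # schema.org expresses a box as two space-separated points:
                # the lower corner first, then the upper corner, each "lat lon".
                "box": f"{south} {west} {north} {east}",
            },
        },
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }
    return json.dumps(markup, indent=2)


# Hypothetical example values; none of these point at a real data set.
jsonld = build_dataset_markup(
    name="Example stream gauge observations",
    description="Daily discharge readings from a hypothetical gauging network.",
    url="https://data.example.org/stream-gauges",
    bbox=(-44.0, 112.0, -10.0, 154.0),
    download_url="https://data.example.org/stream-gauges.csv",
    encoding_format="text/csv",
)
print(jsonld)
```

A data owner would typically embed the resulting JSON-LD in the data set’s landing page inside a script element of type application/ld+json, which is where a crawler looks for it.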

Many data search engines and portals have a vast amount of data but very little geospatial data.  But with this tool, in my test searches, I found many useful geospatial data sets, some of which I knew and some that were new to me.  I have had challenges finding stream gauging data services for Australia recently, and with this tool found some new leads to investigate.  Being Google-based, the tool returned results quickly, with what I considered enough information to decide whether or not to investigate more fully (see screenshot below).  The data format was featured prominently, as was the coverage, both of which I appreciated.  NOAA was an early adopter of the indexing, and so it makes sense that I could find many NOAA data sets using this search engine.

I wonder if data in GitHub, or in Esri’s Living Atlas, or on state, national, and international portals will be findable.  I also wonder how the sheer importance of Google will influence how organizations tag their data in the future, and what influence this will have on agencies that did not put as much time into metadata as they perhaps should have.  Time will tell, but if Google Scholar and Google Books are any indication, the Google Dataset Search could indeed prove to be extremely useful for many of us in GIS research and education.

Result of stream gauge search in the new Google data search engine. 

Ordnance Survey GB to provide OS MasterMap data for free

June 19, 2018 1 comment

The UK government announced last week that key parts of the OS MasterMap dataset (OSMM) are to be made available free of charge (see full announcement from OS). The following two datasets are due to be released under the Open Government Licence (OGL):

  • OS MasterMap Topography Layer property extents
  • OSMM Topography Layer TOIDs (TOpographic IDentifiers), built into the features in the OS OpenMap-Local dataset.

In addition, a number of datasets will be made available (through an API) for free, up to a threshold, including:

  • OS MasterMap Topography Layer, including building heights and functional sites
  • OS MasterMap Greenspace Layer
  • OS MasterMap Highways Network
  • OS MasterMap Water Network Layer
  • OS Detailed Path Network

The announcement didn’t include any information on what the threshold for free access would be, but no doubt details will start to filter out shortly as organisations start making use of these new data assets.

An update on the World Bank’s Spatial Agent

May 28, 2018 5 comments

It sounds like a modern detective novel, but the Spatial Agent is actually a new, free app from the World Bank that offers one-stop access to interactive maps and charts of national, regional, and global datasets.  Jill Clark reviewed this site on our blog here. As we have written about data sources on this blog for nearly six years, and covered this topic in our book, the phrase “one-stop access” naturally caught my attention.  Could the Spatial Agent truly be all that it claims to be?

 

To find out, I began by watching a webinar that Nagaraja Harshadeep of the World Bank recently conducted, which, as of this writing, is still available online here after a short registration process.  In the webinar, and in my subsequent investigations, I was amazed at how the Spatial Agent as an app could bring together, on demand, thousands of free, public-domain spatial data and analytical services (from in-situ and earth observation sources and also live cloud computing services).  It presents data from sources such as the UN, NASA, NOAA, ESA, the World Bank, many universities, and thousands of others, covering social (poverty, water supply), environmental (land use, biodiversity), economic (GDP, energy), and climate (snow cover, precipitation) themes.

The goal of the Spatial Agent is to offer solutions to many of the development challenges faced across the globe, where progress is often hampered by the poor availability of spatial data. For example, the app can be used to determine the areas in Madagascar that are susceptible to cyclones, identify the areas in India that have high child malnutrition, discover the major exports of Vietnam, or determine how fast the population of Lagos is rising.  As these examples show, the Spatial Agent’s data cross boundaries and disciplines and cover many different scales.  The Spatial Agent is the creation of Nagaraja Harshadeep, the lead environmental specialist and global lead for watersheds at the World Bank.  Mr Harshadeep has decades of experience working with spatial data, and the application reflects his knowledge and passion.  There is much more than maps and imagery here: the app also offers rich tabular databases and other services, and the metadata for each of the data sets is quite robust.

I have been a longtime fan of the spatial data from the World Bank, and use their data in several systems, including many layers available in ArcGIS Online.  The major limitation with the Spatial Agent at this point, as far as I can see, is that it is just that: an app.  Therefore it works only on mobile phones and tablets.  I understand in part why it is focused on these devices, since these are what many people use day to day in their work.  Still, to bring the data sets into a GIS and use them more fully, I would love to see its capabilities inside a series of user-driven interfaces that could be run in a standard web browser on a computer where I also have GIS and statistics tools available to me.  I was glad to see this note about this very thing on the project’s site:  “The web version is being developed with the Bank’s Global Reach effort for launch later this year.”  Since the data and documentation are so rich on this site, I look forward to finding out how we will be able to use the services in a GIS.  Even without a GIS, the Spatial Agent is already very useful, because it is helping to bring data into daily decision making.

 

Two views of the hundreds of data layers and statistics available via the Spatial Agent.

For more information, including the links to access the apps, and the tutorials, see this page.

New working lists of US Federal and State GIS portals

January 15, 2018 2 comments

Joseph Elfelt of MappingSupport.com has compiled a very helpful working list of addresses for over 40 federal ArcGIS servers with open MapServer and ImageServer data:

https://mappingsupport.com/p/surf_gis/list-federal-GIS-servers.pdf

And a list of over 50 state server addresses:

https://mappingsupport.com/p/surf_gis/list-state-GIS-servers.pdf

The lists also contain some key caveats and tips for finding local GIS data.  Joseph welcomes suggestions from the community for additional federal or state servers to add; his contact information is at the top of the lists.  That these already excellent resources will continue to be updated is very good news.

A section of the very helpful federal and state lists of servers with open MapServer or ImageServer data, compiled by Joseph Elfelt. 
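
For readers who want to explore one of the listed servers programmatically, the following is a minimal Python sketch.  It assumes the server exposes the standard ArcGIS REST services directory, which returns a JSON catalog with "folders" and "services" entries when ?f=json is appended to the address.  The base address, the service names, and the helper function names below are hypothetical; substitute an address from the lists above.

```python
import json
from urllib.request import urlopen

# Service types that the lists above focus on.
OPEN_TYPES = {"MapServer", "ImageServer"}


def list_open_services(catalog, base_url):
    """Return (name, type, url) tuples for MapServer/ImageServer entries."""
    results = []
    for service in catalog.get("services", []):
        if service.get("type") in OPEN_TYPES:
            url = f"{base_url.rstrip('/')}/{service['name']}/{service['type']}"
            results.append((service["name"], service["type"], url))
    return results


def fetch_catalog(base_url, timeout=30):
    """Request the services directory as JSON (a network call; not run here)."""
    with urlopen(f"{base_url.rstrip('/')}?f=json", timeout=timeout) as response:
        return json.load(response)


# Hypothetical catalog response, shaped like a services directory reply.
sample_catalog = {
    "folders": ["Hydrology"],
    "services": [
        {"name": "Boundaries", "type": "MapServer"},
        {"name": "Imagery2017", "type": "ImageServer"},
        {"name": "Geocoder", "type": "GeocodeServer"},
    ],
}

# Hypothetical base address; a real one would come from the lists above.
base = "https://gis.example.gov/arcgis/rest/services"
services = list_open_services(sample_catalog, base)
for name, kind, url in services:
    print(name, kind, url)
```

Against a live server, one would call fetch_catalog(base) in place of the sample dictionary and repeat the request for each name under "folders" to reach services nested one level down.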

Possible Changes to NAIP Imagery Licensing Model

November 27, 2017 Leave a comment

As this blog and our book make clear, the world of geospatial data is in a continual state of change.  Much of this change has been toward more data in the public domain, but sometimes, the change may move in the opposite direction. The National Agriculture Imagery Program (NAIP) has been a source for aerial imagery in the USA since 2003 and has been in the public domain, available here.  But recently, the Farm Service Agency (FSA) has proposed to move the data from the public domain to a licensing model.  The collection of this imagery has been under an innovative model wherein state governments and the federal government share the costs.

One reason for the proposed change is that the states have been $3.1 million short over the past several years, and FSA cannot continue “picking up the tab.”  Furthermore, delays in releasing funding from cost-share partners force contract awards past the “peak agriculture growth” season, which thwarts one key reason why the imagery is collected in the first place: to assess agricultural health and practices.  We have frequently discussed this aspect of geospatial data in this blog: it comes at a cost.  Someone has to pay, and sometimes, those payment models need to be reconsidered as funding and priorities change.  In this case, agencies and data analysts that rely on NAIP imagery would suffer adverse consequences, but with the expansion of the types and means by which imagery can be acquired nowadays, perhaps these developments will prompt those other sources to be explored more fully.  And, possibly, the model could be adjusted so that the data could be paid for and all could benefit from it.

For more information, see the report by our colleagues at GIS Lounge, and the presentation housed on the FGDC site, here.

Two samples of NAIP imagery, for Texas, left, and North Dakota, right.

A review of the Los Angeles GeoHub

April 23, 2017 2 comments

The Los Angeles GeoHub represents, in many ways, the next-generation GIS data portal. It is, in my view, what a data portal should be, and given the population and areal size of Los Angeles, the portal’s robustness is all the more impressive.  The data user can search the city’s open data site, and also do something that not all sites allow:  “Explore all data”.  At the time of this writing, “exploring all data” returned 554 results, which one can then add to “my favorites” for later investigation and download.  One can also explore the data by category, including business, boundaries, health, infrastructure, planning, recreation and parks, safety, schools, and transportation.  Most data sets can be downloaded as a spreadsheet, a KML file, or a shapefile.  These layers include grasslands, fire stations, cell phone towers, road work projects, traffic, parcels, and dozens and dozens more, even bus stop benches and other treasures.  Each download is quick and painless.

A unique and very useful characteristic of the GeoHub is that each layer lists the number of attributes, which are easily displayed on the site.  Another wonderful feature is that each layer is displayed above its metadata listing as a web service inside ArcGIS Online, which can be opened immediately in ArcMap or ArcGIS Pro or streamed as a GeoJSON or GeoService as a full or filtered data set. Applications based on the data can also be accessed on the site, such as the CleanStat clean streets index and the social determinants of health app.  And yet there is even more–charts can be generated straight from the data, and a whole set of ArcGIS Online mapping applications that the city has generated are displayed in a gallery here.  Because of these applications, the site can be used effectively even by someone who is not familiar with how to run a GIS to understand Los Angeles better and to make smarter decisions.
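
To illustrate what working with a layer streamed as GeoJSON can look like, here is a small Python sketch that counts the features in a GeoJSON FeatureCollection and lists its attribute names, much as the GeoHub does on each layer’s page.  The sample cell phone tower records and their field names (tower_id, owner, height_ft) are invented for illustration and do not reflect the GeoHub’s actual schema.

```python
import json


def summarize_geojson(text):
    """Summarize a GeoJSON FeatureCollection given as a JSON string."""
    collection = json.loads(text)
    features = collection.get("features", [])

    # Collect attribute names in first-seen order across all features.
    attribute_names = []
    for feature in features:
        for key in (feature.get("properties") or {}):
            if key not in attribute_names:
                attribute_names.append(key)

    # Features may legally have a null geometry, so skip those here.
    geometry_types = sorted(
        {f["geometry"]["type"] for f in features if f.get("geometry")}
    )
    return {
        "feature_count": len(features),
        "attributes": attribute_names,
        "geometry_types": geometry_types,
    }


# Hypothetical sample standing in for a streamed layer.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-118.25, 34.05]},
            "properties": {"tower_id": 1, "owner": "Example Wireless"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-118.40, 34.02]},
            "properties": {"tower_id": 2, "owner": "Sample Telecom", "height_ft": 120},
        },
    ],
})

summary = summarize_geojson(sample)
print(summary)
```

In practice the string passed to summarize_geojson would be the body of a response from the layer’s GeoJSON address rather than an inline sample.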

If you are a data user, explore the data on the GeoHub today.  If you are a data administrator, consider using the GeoHub as a model for what you might develop and serve for your own data users in your own location.

Los Angeles GeoHub results from examining cell phone towers.  Note the many data-user-friendly items and choices to stream and download.

An Open Letter to the Open Data Community: Reaction

April 9, 2017 3 comments

A group of people at the Civic Analytics Network recently wrote “An Open Letter to the Open Data Community” that focuses on topics central to this blog and to our book. The Civic Analytics Network is “a consortium of Chief Data Officers and analytics principals in large cities and counties throughout the United States.”  They state that their purpose is to “work together to advance how local governments use data to be more efficient, innovative, and in particular, transparent.”

The letter contained eight guidelines that, the group believed, would if followed “advance the capabilities of government data portals across the board and help deliver upon the promise of a transparent government.”  The guidelines were the following:

  1. Improve accessibility and usability to engage a wider audience.
  2. Move away from a single dataset centric view.
  3. Treat geospatial data as a first class data type.
  4. Improve management and usability of metadata. 
  5. Decrease the cost and work required to publish data. 
  6. Introduce revision history.
  7. Improve management of large datasets.
  8. Set clear transparent pricing based on memory, not number of datasets.

It is difficult to imagine a letter that is more germane to what we have been advocating on the Spatial Reserves blog.  Over the past five years, we have been open about our praise of data portals that are user friendly, and critical of those that miss the mark.  We have noted the impact that the open data movement has had on the data portals themselves, which have in many cases become more user friendly and have encouraged adoption of GIS beyond its traditional departmental boundaries.  The principles we have adhered to are also mentioned in this letter, such as portals being intuitive, data-driven, and supported by metrics.  The letter highlights a continued need, the ability to tie together and compare related data sets, which is at times challenging given “data silos.”

One of my favorite points in the letter is the authors’ admonition to “treat geospatial data as a first class data type.”  The authors claim that geospatial data is an underdeveloped and undervalued asset that “needs to be an integral part of any open data program”, citing Chicago’s OpenGrid and Los Angeles’ GeoHub as forward-thinking models.

On the topic of metadata, the authors call for portals and managers to allow “custom metadata schemes, API methods to define and update the schema and content, and user interfaces that surface and support end-user use of the metadata.”  Hear, hear!  Equally welcome is the authors’ call to decrease the cost and work required to publish data. Through their point #6 about revision history, they advocate that these data sets need to be curated and updated but also allow historical versions to be accessed.

What are your reactions to this letter?  What do we need to do as the geospatial community to realize these aims?

Categories: Public Domain Data Tags: