Archive

Posts Tagged ‘big data’

Data Drives Everything (But the Bridges Need a Lot of Work)

September 14, 2014

A new article in Earthzine entitled “Data Drives Everything, but the Bridges Need a Lot of Work” by Osha Gray Davidson seems to encapsulate one of the main themes of this blog and our book.

Dr. Francine Berman directs the Center for a Digital Society at Rensselaer Polytechnic Institute in Troy, New York, and, as the article states, “has always been drawn to ambitious ‘big picture’ issues” at the “intersection of math, philosophy, and computers.” Her project, the Research Data Alliance (RDA), has the goal of changing the way in which data are collected, used, and shared to solve specific problems around the globe. Those large and important tasks should sound familiar to most GIS professionals.

And the project has resonated with others, too: 1,600 members from 70 countries have joined the RDA. Reaching across boundaries and breaking down barriers that make data sharing difficult or impossible is one of the RDA’s chief goals. Finding solutions to real-world problems is accomplished through Interest Groups, which then create more focused Working Groups. I was pleased to see Interest Groups such as Big Data Analytics, Data In Context, and Geospatial, but at this point a Working Group for Geospatial is still needed. Perhaps someone from the geospatial community needs to step up and lead that effort. I read the charter for the Geospatial Interest Group, and though brief, it seems solid, identifying some of the chief challenges and the major organizations to work with in the future to make its vision a reality.

I wish the group well, but simple wishing isn’t going to achieve data sharing for better decision making. As we point out in our book with regard to this issue, geospatial goals for an organization like this are not going to be realized without the GIS community stepping forward. Please investigate the RDA and consider how you might help its important effort.

Research Data Alliance

Always on: The analysts are watching …

August 25, 2014

We recently came across the Moves App, an always-on data logger that records walking, cycling, and running, with the option to monitor over 60 other activities that can be configured manually. By keeping track of calorie burn during both activity and idle time, the app provides ‘an automatic diary of your life’ and, by implication (assuming location tracking is always enabled as well), an automatic log of your location throughout each day. While this highlights a number of privacy concerns we have written about in the past (including Location Privacy: Cellphones vs. GPS, and Location Data Privacy Guidelines Released), it also opens up the possibility of some insightful, real-time or near-real-time analytical investigations into what wearers of a particular device or users of a particular app are doing at any given time.

Gizmodo reported today on the activity chart released by Jawbone, makers of the Jawbone UP wristband tracking device, which showed a spike in activity for UP users at the time a 6.0-magnitude earthquake struck the San Francisco Bay Area in the early hours of Sunday, 24th August 2014. Analysis of the users’ data revealed some insight into the geographic extent of the quake’s impact, with the number of UP wearers active at the time of the quake decreasing with increasing distance from the epicentre.

How the Napa earthquake affected Bay Area sleepers.

Source: The Jawbone Blog 

This example provides another timely illustration of just how much personal location data is being collected and how that data may be used in ways never really anticipated by the end users. However, it also shows the potential for using devices and apps like these to provide real-time monitoring of what’s going on at any given location, information that could be used to help save lives and property. As with all new innovations, there are pros and cons to consider; getting the right balance between respecting the privacy of users and reusing some of the location data will help ensure that data mining initiatives such as this will be seen as positive and beneficial and not invasive and creepy.
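Jawbone has not published its methodology, but the basic shape of such an analysis is easy to sketch: bucket wearers by distance from the epicentre and compare the share who were active just after the quake. A minimal illustration in Python follows; the wearer records, distance bands, and epicentre coordinates are invented for the example, not Jawbone’s actual data or code:

```python
import math

EPICENTRE = (38.22, -122.31)  # approximate epicentre (lat, lon), for illustration

def haversine_km(p1, p2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def awake_by_distance(wearers, bands=(25, 75, 150)):
    """Fraction of wearers active at quake time, bucketed by distance (km)
    from the epicentre. `wearers` holds (lat, lon, was_active) tuples;
    wearers beyond the last band are ignored."""
    counts = {b: [0, 0] for b in bands}  # band -> [active, total]
    for lat, lon, active in wearers:
        d = haversine_km(EPICENTRE, (lat, lon))
        for b in bands:  # bands must be in ascending order
            if d <= b:
                counts[b][0] += int(active)
                counts[b][1] += 1
                break
    return {b: (a / t if t else 0.0) for b, (a, t) in counts.items()}

# Hypothetical records: (lat, lon, active at 3:20 a.m.)
sample = [
    (38.30, -122.29, True),  (38.25, -122.40, True),   # near the epicentre
    (37.77, -122.42, True),  (37.80, -122.27, False),  # roughly 50 km away
    (37.34, -121.89, False), (37.30, -121.90, False),  # roughly 100 km away
]
print(awake_by_distance(sample))  # fraction active falls with distance
```

With real device data, the same aggregation run minute by minute is what would produce a chart like Jawbone’s.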

Geospatial Data Integration Challenges and Considerations

January 20, 2014

A recent article in Sensors & Systems: Making Sense of Global Change raised key issues regarding challenges and considerations in geospatial data integration. Author Robert Pitts of New Light Technologies recognizes that the increased availability of data presents opportunities for improving our understanding of the world, but that combining diverse data remains a challenge for several reasons. I like the way he cuts through the noise and captures the key analytical considerations, which we address in our book, The GIS Guide to Public Domain Data. These include coverage, quality, compatibility, geometry type and complexity, spatial and temporal resolution, confidentiality, and update frequency.

In today’s world of increasingly available data, and ways to access that data, integrating data sets to create decision-making dashboards for policymakers may seem like a daunting task–much worse than that term paper you were putting off writing until the last minute.  However, breaking down integration tasks into the operational considerations that Mr. Pitts identifies may help the geospatial and policymaking communities make progress toward the overall goal.  These operational considerations include access method, format and size of data, data model and schema, update frequency, speed and performance, and stability and reliability.
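None of these operational considerations require exotic tooling to act on. Even a simple pre-integration check that compares each layer’s metadata against a dashboard’s requirements can catch mismatches early. The sketch below is illustrative only; the metadata field names, layer records, and thresholds are invented, not drawn from Mr. Pitts’s article or any standard:

```python
# Hypothetical metadata records for layers to be integrated.
LAYERS = [
    {"name": "parcels", "crs": "EPSG:4326", "resolution_m": 10,
     "updated_days_ago": 30, "format": "geojson"},
    {"name": "imagery", "crs": "EPSG:3857", "resolution_m": 30,
     "updated_days_ago": 400, "format": "geotiff"},
]

def integration_issues(layers, target_crs="EPSG:4326",
                       max_resolution_m=50, max_age_days=90):
    """Flag obvious incompatibilities across a few operational
    considerations: coordinate reference system, spatial resolution,
    and update frequency."""
    issues = []
    for layer in layers:
        if layer["crs"] != target_crs:
            issues.append(f"{layer['name']}: reproject {layer['crs']} -> {target_crs}")
        if layer["resolution_m"] > max_resolution_m:
            issues.append(f"{layer['name']}: resolution too coarse")
        if layer["updated_days_ago"] > max_age_days:
            issues.append(f"{layer['name']}: stale ({layer['updated_days_ago']} days old)")
    return issues

for issue in integration_issues(LAYERS):
    print(issue)  # flags the imagery layer's CRS mismatch and staleness
```

A checklist like this does not solve integration, but it forces the operational questions to be asked before layers reach a decision maker’s screen.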

Fortunately, as Mr. Pitts points out, “operational dashboards” are appearing that help decision makers work with geospatial data in diverse contexts and scales. These include the US Census Bureau’s “On the Map for Emergency Management,” based on Google tools, and the Florida State Emergency Response Team’s Geospatial Assessment Tool for Operations and Response (GATOR), based on ArcGIS Online technology, shown here.

Florida’s GATOR disaster assessment dashboard.

As we discuss in our book and in this blog, portals or operational dashboards will not by themselves ensure that better decisions will be made. I see two chief challenges with these dashboards and make the following recommendations: (1) Make sure that those who create them are not simply putting something up quickly to satisfy an agency mandate. Rather, dashboard creators need to understand the integration challenges listed above as they build. Furthermore, since decision makers are likely not geospatial professionals who understand scale, accuracy, and so on, creators need to communicate these considerations in an understandable way to those using the dashboards. (2) Make sure that the dashboards are maintained and updated. If you are a regular reader of this blog, you know that we are blunt in our criticism of portals that may be well-intentioned but are out of date and/or extremely difficult to use. For example, the US Census dashboard mentioned above listed emergencies that were three months old, despite the fact that I had checked the current-date box for my analysis.

Take a look around at our world. We need to incorporate geospatial technologies in decision making across the private sector, nonprofit organizations, and government, at all levels and scales. It is absolutely critical that geospatial tools and data are placed into the hands of decision makers for the benefit of all. Progress is being made, but it needs to happen at a faster pace, through the efforts of the geospatial community and key decision makers working together.

Free versus fee

In The GIS Guide to Public Domain Data we devoted one chapter to a discussion of the Free versus Fee debate: Should spatial data be made available for free, or should individuals, companies and government organisations charge for their data? In a recently published article, Sell your data to save the economy and your future, author Jaron Lanier argues that a ‘monetised information economy’, where information is a commodity traded to the advantage of both the information provider and the information collector, is the best way forward.

Lanier argues that although the current movement for making data available for free has become well established, with many arguing that it has the potential to democratise the digital economy through access to open software, open spatial data, open education resources and the like, insisting that data be available for free will ultimately mean a small digital elite thrives at the expense of the majority. Data, and the information products derived from them, are the new currency in the digital age, and those who don’t have the opportunity to take advantage of this source of remuneration will lose out. Large IT companies with the best computing facilities, who collect and re-use our information, will be the winners, with their ‘big data’-crunching computers ‘... guarded like oilfields’.

In one vision of an alternative information economy, people would be paid when data they made available via a network were accessed by someone else. Could selling the data that are collected by us, and about us, be a viable option and would it give us more control over how the data are used? Or is the open approach to data access and sharing the best way forward?

Geospatial Advances Drive the Big Data Problem but Also its Solution

In a recent essay (http://www.sensysmag.com/article/features/27558-geospatial-advances-drive-big-data-problem,-solution.html), Erik Shepard claims that geospatial advances drive the big data problem but also its solution. The expansion of geospatial data is estimated at 1 exabyte per day, according to Dr. Dan Sui. Land use data, satellite and aerial imagery, transportation data, and crowd-sourced data all contribute to this expansion, but GIS also offers the tools to manage the very data it helps produce.

We discuss these issues in our book, The GIS Guide to Public Domain Data.  These statements from Shepard are particularly relevant to the reflections we offer in our book:  “Today there is a dawning appreciation of the assumptions that drive spatial analysis, and how those assumptions affect results.  Questions such as what map projection is selected – does it preserve distance, direction or area? Considerations of factors such as the modifiable areal unit problem, or spatial autocorrelation.”

Indeed!  Today’s data users have more data at their fingertips than ever before.  But with that data comes choices about what to use, how, and why.  And those choices must be made carefully.
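The projection question Shepard raises is easy to make concrete. Measuring the same east-west step on an unprojected latitude/longitude grid (plate carrée, which preserves neither distance nor area away from the equator) versus along the great circle gives matching distances at the equator but increasingly divergent ones toward the poles. A short sketch using only the Python standard library:

```python
import math

R_KM = 6371.0  # mean Earth radius in km

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance: the true surface distance between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return R_KM * 2 * math.asin(math.sqrt(a))

def plate_carree_km(lat1, lon1, lat2, lon2):
    """Naive distance on an unprojected lat/lon grid, treating
    degrees as if they were planar coordinates."""
    dlat, dlon = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    return R_KM * math.hypot(dlat, dlon)

# The same 10-degree east-west step, measured at three latitudes.
for lat in (0, 30, 60):
    true_d = great_circle_km(lat, 0, lat, 10)
    naive_d = plate_carree_km(lat, 0, lat, 10)
    print(f"lat {lat:2d}: true {true_d:7.1f} km, naive {naive_d:7.1f} km")
```

At latitude 60 the naive grid distance is roughly double the true surface distance, which is exactly the kind of unexamined assumption Shepard warns can silently skew an analysis.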

Categories: Public Domain Data