Thursday, February 24, 2011

Great Data, Byzantine Retrieval System

I am working on a series of maps for the Sierra Club's Resilient Habitats campaign. The maps have been great fun and cover geography that I haven't mapped in a while. The most recent map covers the Greater Everglades ecosystem, and since the habitat extent includes a substantial portion of the Atlantic Ocean and Gulf of Mexico, we thought it might add visual interest to show bathymetry.

After doing a bit of research, I found a great source for integrated terrain and bathymetry at the NOAA coastal relief site. I found my area of interest, clicked on the link to get the data, and was somewhat surprised to see no obvious way to download it--no 'Download' button, no FTP link, nothing. I did notice a 'Create Custom Grid' button, but since I wanted all of the data shown in the map, that seemed like a hassle. So I called NOAA and, to my utter astonishment, reached an extremely helpful human being on my first attempt. She didn't know how to obtain the data either, but transferred me to someone who did.



Turns out, the ONLY way to obtain the data is via the 'Create Custom Grid' button, even if you want the whole enchilada. But it gets better--you can't specify the whole extent in the 'Create Custom Grid' dialog, because the extractor limits your request to about 8,000 cells in either direction. So I had to split the grid into fourths. Here are my notes outlining how to do that:



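For anyone who would rather script the bookkeeping than scribble it on paper, here is a rough Python sketch of the same idea: split the full bounding box into four quadrants and check that each request stays under the cell cap. The extent, cell size, and 8,000-cell limit below are illustrative assumptions, not the extractor's exact parameters.

# Rough sketch only: split a lon/lat extent into four quadrants so each
# request stays under the extractor's per-request cell limit.
CELL_SIZE = 3 / 3600.0      # assume 3 arc-second cells, in decimal degrees
MAX_CELLS = 8000            # approximate limit in either direction

def split_extent(xmin, ymin, xmax, ymax):
    """Return four (xmin, ymin, xmax, ymax) quadrants of the full extent."""
    xmid = (xmin + xmax) / 2.0
    ymid = (ymin + ymax) / 2.0
    return [
        (xmin, ymid, xmid, ymax),   # northwest
        (xmid, ymid, xmax, ymax),   # northeast
        (xmin, ymin, xmid, ymid),   # southwest
        (xmid, ymin, xmax, ymid),   # southeast
    ]

# hypothetical extent roughly covering South Florida and adjacent waters
full = (-85.0, 23.0, -78.0, 31.0)

for quad in split_extent(*full):
    cols = int(round((quad[2] - quad[0]) / CELL_SIZE))
    rows = int(round((quad[3] - quad[1]) / CELL_SIZE))
    assert cols <= MAX_CELLS and rows <= MAX_CELLS, "quadrant still too big -- split again"
    print("%s: %d x %d cells" % (quad, cols, rows))
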
The grids are extracted and exported as ASCII text files, which I imported into ArcGIS GRIDs. But there's another catch: the upper value of each grid showed up as 65,535. This did not seem like a plausible elevation value (unless the Z values were in centimeters?!), so after a bit of Googling I found this helpful post by Thomas Ballatore on the ESRI Support website. It was written in response to someone working with SRTM data, but the same procedure worked equally well on the NOAA coastal relief data:

The SRTM files contain values from 0 to 65535. In a given SRTM tile, most of the values will be "typical" elevation values like the 1,133m and below you mention. However, areas where there is no data (voids) are assigned a value of 32768. The setnull command mentioned above would indeed set those values to "nodata".

HOWEVER, the SRTM data also correctly contains areas below sea level. These are reported as decreasing values from 65536. For example, an elevation of -2m would have a value of 65534, an elevation of -10m would be 65526, and so on.

For every country I have worked with that has a coastline, there will be a number of these along the coastline. Whether they are truly below sea level or if that is just an inaccuracy of the SRTM data would require further investigation.

Anyway, if you set all values greater than 1133 to nodata, then you will incorrectly set these negative values also to nodata. To avoid that, I use the following two steps in Raster Calculator to correctly prepare the SRTM data:

Step 1. Execute the following:

setnull([N30E119.bil] == 32768,[N30E119.bil])

where N30E119.bil is an SRTM file I was recently working with...change this to your tile's name. This will set the voids to nodata.

Step 2. Then execute this:

con([N30E119.bil] > 32768,[N30E119.bil] - 65536,[N30E119.bil])

This will correctly convert the wrapped below-sea-level values back to negative values. Remember that spaces are important in raster calculator!

After running this on each of my four grids, I mosaicked them into a single grid, derived a hillshade, and was good to go.
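
For the record, here is a minimal sketch of how the same sequence--ASCII import, void/wrap-around fix, mosaic, hillshade--might be scripted with arcpy and Spatial Analyst instead of clicked through by hand. The workspace, file names, and grid names are placeholders rather than the actual NOAA deliverables, and it assumes a Spatial Analyst license is available.

import arcpy
from arcpy.sa import Raster, SetNull, Con, Hillshade

arcpy.CheckOutExtension("Spatial")            # needs a Spatial Analyst license
arcpy.env.workspace = r"C:\gis\crm"           # hypothetical folder holding the ASCII files

fixed = []
for name in ["crm_q1", "crm_q2", "crm_q3", "crm_q4"]:   # placeholder quadrant names
    # import the extracted ASCII grid into an ArcGIS raster
    arcpy.ASCIIToRaster_conversion(name + ".asc", name, "INTEGER")

    r = Raster(name)
    r = SetNull(r == 32768, r)                # voids (32768) become NoData
    r = Con(r > 32768, r - 65536, r)          # wrapped values become true negatives
    r.save(name + "_fix")
    fixed.append(name + "_fix")

# mosaic the four corrected quadrants into a single grid, then derive a hillshade
arcpy.MosaicToNewRaster_management(fixed, arcpy.env.workspace, "crm_mosaic",
                                   pixel_type="16_BIT_SIGNED", number_of_bands=1)
Hillshade("crm_mosaic", 315, 45).save("crm_hs")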



Once the Florida Resilient Habitats map is finalized and Sierra Club has signed off on it, I will post it here.

Saturday, February 5, 2011

The Ephemeral Nature of Maps in our Modern Age

I've been thinking a lot about history these past several weeks, primarily because of the historic mapping work I described in an earlier post, but also because I am reading The Pirate Hunter by Richard Zacks. This excellent book is the true-life account of Captain Kidd, based on painstaking research Zacks undertook over nearly three years, poring over historical documents, letters, diaries, newspapers, and more. Yes, it was time-consuming and occasionally difficult work, but it was possible for Zacks to do the research because once he located a document, he could actually read it--or have it translated--and did not need to rely on any technology other than his eyes and a grasp of archaic phraseology.


Captain William Kidd's 315-year-old commission from the King of England



Similarly, but on a much more localized scale, I have been working with a client to recreate the land ownership pattern of portions of southwest Washington based on a digital scan of a hard-copy map produced in 1891. This 120-year-old map was created by hand with pen and ink, but it is remarkably accurate given the limitations of the surveying and cartographic technologies available at the time. And, as with Zacks's research, all we need to do is use our eyes to interpret the map and determine what was happening on the landscape back in the 19th century.

Let's contrast that with the current situation. Most of the maps we create are transmitted as PDFs, JPGs, or some other electronic format, and only a small share--maybe 5-10%--are ever printed. Even when they are printed, most non-profit organizations and government agencies have no clear system for cataloging and archiving the maps (or the data used to create them) for posterity. Compounding the problem, more and more maps are completely web-based: the base layers might come from a variety of sources--Google, USGS, NOAA, NRCS, you name it--and the 'value-added' content might be coming from any number of other servers. The various layers are widely distributed across dozens of servers, and the underlying information is updated at varying intervals. How do you even capture a "snapshot" of an interactive map like that? What are the chances that the full functionality of any web map can be reproduced 120 years from now? Very slim indeed, I think, but perhaps I am being overly pessimistic.

The great irony seems to be that we now have access to more information than any other generation in history, but future historians will be able to access only a tiny fraction of it, because we are not doing a sufficient job of preserving it.

Obviously I am not the first person to surf this particular brainwave--for example, see here, here, here, here, and especially here.