Beyond Mapping III
Map Analysis book with companion CD-ROM for hands-on exercises and further reading
Grids and Lattices Build Visualizations — describes
Lattice and Grid forms of map surface display
Maps Are Numbers First, Pictures Later — discusses
the numeric and geographic characteristics of map values
Normalizing Maps for Data Analysis — describes
map normalization and data exchange with other software packages
Multiple Methods Help Organize Raster
Data — discusses
different approaches to storing raster data
Use Mapping “Art” to Visualize Values — describes
procedures for generating contour maps
What’s Missing in Mapping? — discusses
the need for identifying data dispersion as well as average in Thematic Mapping
Author’s Notes: The figures in this topic use MapCalc™ software. An educational CD with online text, exercises and databases for “hands-on” experience in these and other grid-based analysis procedures is available for US$21.95 plus shipping and handling (www.farmgis.com/products/software/mapcalc/).
______________________________
Grids and Lattices
Build Visualizations
(GeoWorld, July 2002, pg. 26-27)
For thousands of years, points, lines and polygons have been used to depict map features. With the stroke of a pen a cartographer could outline a continent, delineate a highway or identify a specific building’s location. With the advent of the computer, manual drafting of these data has been replaced by the cold steel of the plotter.
In digital form these spatial data have been linked to attribute tables that describe characteristics and conditions of the map features. Desktop mapping exploits this linkage to provide tremendously useful database management procedures, such as address matching, geo-query and routing. Vector-based data forms the foundation of these techniques and directly builds on our historical perspective of maps and map analysis.
Grid-based data, on the other hand, is a relatively new way to describe geographic space and its relationships. Weather maps, identifying temperature and barometric pressure gradients, were an early application of this new data form. In the 1950s computers began analyzing weather station data to automatically draft maps of areas of specific temperature and pressure conditions. At the heart of this procedure is a new map feature that extends traditional points, lines and polygons (discrete objects) to continuous surfaces.
Figure 1. Grid-based data can be
displayed in 2D/3D lattice or grid forms.
The rolling hills and valleys in our everyday world are a good example of a geographic surface. The elevation values constantly change as you move from one place to another forming a continuous spatial gradient. The left-side of figure 1 shows the grid data structure and a sub-set of values used to depict the terrain surface shown on the right-side.
Grid data are stored as an organized set of values in a matrix that is geo-registered over the terrain. Each grid cell identifies a specific location and contains a map value representing its average elevation. For example, the grid cell in the lower-right corner of the map is 1800 feet above sea level. The relative heights of surrounding elevation values characterize the undulating terrain of the area.
Two basic approaches can be used to display this
information—lattice and grid. The lattice
display form uses lines to convey surface configuration. The contour lines in the 2D version identify
the breakpoints for equal intervals of increasing elevation. In the 3D version the intersections of the
lines are “pushed-up” to the relative height of the elevation value stored for
each location. The grid display form uses
cells to convey surface configuration.
The 2D version simply fills each cell with the contour interval color,
while the 3D version pushes up each cell to its relative height.
The right-side of figure 2 shows a close-up of the data matrix of the project area. The elevation values are tied to specific X,Y coordinates (shown as yellow dots). Grid display techniques assume the elevation values are centered within each grid space defining the data matrix (solid black lines). A 2D grid display checks the elevation at each cell then assigns the color of the appropriate contour interval.
Figure 2. Contour lines are
delineated by connecting interpolated points of constant elevation along the
lattice frame.
Lattice display techniques, on the other hand, assume the values are positioned at the intersection of the lines defining the reference frame (dotted red lines). Note that the “extent” (outside edge of the entire area) of the two reference frames is off by a half-cell*. Contour lines are delineated by calculating where each line crosses the reference frame (red X’s) then these points are connected by straight lines and smoothed. In the left-inset of the figure note that the intersection for the 1900 contour line is about half-way between the 1843 and 1943 values and nearly on top of the 1894 value.
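The crossing calculation described above can be sketched as a simple linear interpolation along one line of the lattice frame. This is only an illustration: the function name is hypothetical, and straight-line interpolation between the two nodes is an assumption about how the crossing point is located. The 1843 and 1943 node values and the 1900 contour level come from the figure discussion.

```python
def contour_crossing(value_a, value_b, contour_level):
    """Fractional distance from node A toward node B where the contour
    level is crossed, or None if the level is not between the two nodes."""
    low, high = min(value_a, value_b), max(value_a, value_b)
    if value_a == value_b or not (low <= contour_level <= high):
        return None
    # Linear interpolation: how far along the edge the level is reached
    return (contour_level - value_a) / (value_b - value_a)

fraction = contour_crossing(1843, 1943, 1900)
print(round(fraction, 2))  # 0.57, a bit more than half-way along the edge
```

Repeating the calculation for every lattice edge that straddles the contour level yields the set of crossing points (the red X’s) that are then connected and smoothed.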
Figure 3. 3D display “pushes-up”
the grid or lattice reference frame to the relative height of the stored map
values.
Figure 3 shows how 3D plots are generated. Placing the viewpoint at different look-angles and distances creates different perspectives of the reference frame. For a 3D grid display entire cells are pushed to the relative height of their map values. The grid cells retain their projected shape forming blocky extruded columns.
3D lattice display pushes up each intersection node to its relative height. In doing so the four lines connected to it are stretched proportionally. The result is a smooth wireframe that expands and contracts with the rolling hills and valleys. Generally speaking, lattice displays create more pleasing maps and knock-your-socks-off graphics when you spin and twist the plots. However, grid displays provide a more honest picture of the underlying mapped data—a chunky matrix of stored values.
____________________
Author's Note: Be forewarned that the alignment difference
between grid and lattice reference frames is a frequent source of registration
error when one “blindly” imports a set of grid layers from a variety of sources.
Maps Are Numbers First, Pictures
Later
(GeoWorld, August 2002, pg. 20-21)
The unconventional view that “maps are numbers first, pictures later” forms the backbone for taking maps beyond mapping. Historically maps involved “precise placement of physical features for navigation.” More recently, however, map analysis has become an important ingredient in how we perceive spatial relationships and form decisions.
Understanding that a digital map is first and foremost an organized set of numbers is fundamental to analyzing mapped data. But exactly what are the characteristics defining a digital map? What do the numbers mean? Are there different types of numbers? Does their organization affect what you can do with them? If you have seen one digital map have you seen them all?
In an introductory
However this geo-centric view rarely explains the full nature of digital maps. For example consider the numbers themselves that comprise the X,Y coordinates: how do number type and size affect precision? A general feel for the precision ingrained in a “single precision floating point” representation of Latitude/Longitude in decimal degrees is*…
1.31477E+08 ft = equatorial circumference of the earth
1.31477E+08 ft / 360 degrees = 365214 ft/degree (length of one degree of Longitude)
A single precision number carries six decimal places, so:
365214 ft/degree * 0.000001 degree = 0.365214 ft
0.365214 ft * 12 in/ft = 4.38257 inch precision
Imagine if “double-precision” numbers (eleven decimal places) were used for storage: you likely could distinguish a dust particle on the left from one on the right.
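For readers who want to rerun the numbers, here is a minimal sketch of the same arithmetic. It takes the six-decimal-place figure as given and is an order-of-magnitude estimate rather than a geodetic calculation; the variable names are illustrative only.

```python
EQUATORIAL_CIRCUMFERENCE_FT = 1.31477e8  # value used in the text

# Length of one degree of longitude at the equator
feet_per_degree = EQUATORIAL_CIRCUMFERENCE_FT / 360

# Smallest step if six decimal places of a degree are carried (assumed, per the text)
smallest_step_degrees = 1e-6

precision_ft = feet_per_degree * smallest_step_degrees
precision_in = precision_ft * 12

print(f"{feet_per_degree:.0f} ft/degree")                    # 365214 ft/degree
print(f"{precision_ft:.6f} ft = {precision_in:.5f} inches")  # 0.365214 ft = 4.38257 inches
```

Changing `smallest_step_degrees` to `1e-11` repeats the estimate for the eleven-decimal-place case.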
In analyzing mapped data, however, the characteristics of the attribute values are even more critical. While textual descriptions can be stored with map features they can only be used in geo-query. For example if you attempted to add Longbrake Lane to Shortthrottle Way all you would get is an error, as text-based descriptors preclude any of the mathematical/statistical operations.
So what are the numerical characteristics of mapped data? Figure 1 lists the data types by two important categories—numeric and geographic. You should have encountered the basic numeric data types in several classes since junior high school. Recall that nominal numbers do not imply ordering. A 3 isn’t bigger, tastier or smellier than a 1, it’s just not a 1. In the figure these data are schematically represented as scattered and independent pieces of wood.
Ordinal numbers, on the other hand, do imply a definite ordering and can be conceptualized as a ladder, however with varying spaces between rungs. The numbers form a progression, such as smallest to largest, but there isn’t a consistent step. For example you might rank five different soil types by their relative crop productivity (1= worst to 5= best) but it doesn’t mean that soil 5 is exactly five times more productive than soil 1.
Figure 1. Map values are characterized from two broad
perspectives—numeric and geographic—then further refined by specific data
types.
When a constant step is applied, interval numbers result. For example, a 60°F spring day is consistently/incrementally warmer than a 30°F winter day. In this case one “degree” forms a consistent reference step analogous to a typical ladder with uniform spacing between rungs.
A ratio number introduces yet another condition, an absolute reference, that is analogous to a consistent footing or starting point for the ladder, such as zero degrees “Kelvin,” defined as when all molecular movement ceases. A final type of numeric data is termed “binary.” In this instance the value range is constrained to just two states, such as forested/non-forested or suitable/not-suitable.
So what does all of this have to do with analyzing digital maps? The type of number dictates the variety of analytical procedures that can be applied. Nominal data, for example, do not support direct mathematical or statistical analysis. Ordinal data support only a limited set of statistical procedures, such as maximum and minimum. Interval and ratio data, on the other hand, support a full set of mathematics and statistics. Binary maps support special mathematical operators, such as the logical AND and OR.
Even more interesting (this is interesting, right?) are the geographic characteristics of the numbers. From this perspective there are two types of numbers. “Choropleth” numbers form sharp and unpredictable boundaries in space such as the values on a road or cover type map. “Isopleth” numbers, on the other hand, form continuous and often predictable gradients in geographic space, such as the values on an elevation or temperature surface.
Figure 2 puts it all together. Discrete
maps identify mapped data with independent numbers (nominal) forming sharp
abrupt boundaries (choropleth), such as a covertype map. Continuous maps contain a range of
values (ratio) that form spatial gradients (isopleth), such as an elevation
surface. This clean dichotomy is muddled
by cross-over data such as speed limits (ratio) assigned to the features on a
road map (choropleth).
Figure 2. Discrete and Continuous
map types combine the numeric and geographic characteristics of mapped data.
Discrete maps are best handled in 2D form. The 3D plot in the top-right inset is ridiculous and misleading because it implies numeric/geographic relationships among the stored values. What isn’t as obvious is that a 2D form of continuous data (lower-right inset) is equally absurd.
While a contour map might be as familiar and comfortable as
a pair of old blue jeans, the generalized intervals treat the data as discrete
(ordinal, choropleth). The artificially
imposed sharp boundaries become the focus for visual analysis. Map-ematical
analysis of the actual data, on the other hand, incorporates all of the detail
contained in the numeric/geographic patterns of the numbers ...where the rubber
meets the spatial analysis road.
Normalizing Maps for Data Analysis
(GeoWorld, September 2002, pg. 22-23)
The last couple of sections have dealt with the numerical nature
of digital maps. Two fundamental
considerations remain—data normalization and exchange. Normalization involves
standardizing a data set, usually for comparison among different types of
data. In a sense, normalization
techniques allow you to “compare apples and oranges” using a standard “mixed
fruit scale of numbers.”
The most basic normalization procedure uses a “goal” to adjust map values. For example, a farmer might set a goal of 250
bushels per acre to be used in normalizing a yield map for corn. The equation, Norm_GOAL = (mapValue / 250) * 100, derives the percentage of the goal
achieved by each location in a field. In
evaluating the equation, the computer substitutes a map value for a field
location, completes the calculation, stores the result, and then repeats the
process for all of the other map locations.
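A minimal sketch of that cell-by-cell evaluation, using a plain list of rows to stand in for a map layer. The tiny `yield_map` grid and the function name are hypothetical; only the equation and the 250 bushel goal come from the text.

```python
def normalize_to_goal(map_values, goal):
    """Apply Norm_GOAL = (mapValue / goal) * 100 to every grid cell."""
    return [[(value / goal) * 100 for value in row] for row in map_values]

# Hypothetical 2 x 2 yield map in bushels per acre
yield_map = [
    [125.0, 158.0],
    [250.0, 295.0],
]

percent_of_goal = normalize_to_goal(yield_map, goal=250)
for row in percent_of_goal:
    print([round(value, 1) for value in row])
```

A location yielding exactly 250 bushels comes out as 100 percent of the goal, and anything above the goal exceeds 100.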
Figure 1. Comparison of original and goal normalized
data.
Figure 1 shows the results of goal normalization. Note the differences in the descriptive statistics between the original (top) and normalized data (bottom)—a data range of 2.33 to 295 with an average of 158 bushels per acre for the original data versus .934 to 118 with an average of 63.3 percent for the normalized data.
However, the histogram and map patterns are identical (slight differences in the maps are an artifact of rounding the discrete display intervals). While the descriptive statistics are different, the relationships (patterns) in the normalized histogram and map are the same as the original data.
That’s an important point— both the numeric and
spatial relationships in the data are preserved during normalization. In effect, normalization simply “rescales”
the values like changing from one set of units to another (e.g., switching from
feet to meters doesn’t change your height).
The significance of the goal normalization is that the new scale allows
comparison among different fields and even crop types based on their individual
goals— the “mixed fruit” expression of apples and oranges. Same holds for normalizing environmental,
business, health or any other kind of mapped data.
An alternative “0-100” normalization forces a consistent range of values by spatially evaluating the equation Norm_0-100 = ((mapValue – min) * 100) / (max – min). The result is a rescaling of the data to a range of 0 to 100 while retaining the same relative numeric and spatial patterns of the original data. While goal normalization benchmarks a standard value, the 0-100 procedure rescales the original data range to a fixed, standard range.
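The 0-100 equation can be sketched the same way. The sample grid and function name are hypothetical, and the guard for a constant-valued map is an added assumption, since the equation is undefined when max equals min.

```python
def normalize_0_100(map_values):
    """Apply Norm_0-100 = ((mapValue - min) * 100) / (max - min) to every cell."""
    flat = [value for row in map_values for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        raise ValueError("all map values are identical; the range is zero")
    return [[(value - lowest) * 100 / (highest - lowest) for value in row]
            for row in map_values]

# Hypothetical 2 x 2 map layer
sample_map = [
    [10.0, 20.0],
    [30.0, 50.0],
]

rescaled = normalize_0_100(sample_map)
print(rescaled)  # [[0.0, 25.0], [50.0, 100.0]]
```

The smallest value always lands on 0 and the largest on 100, with everything else keeping its relative position in between.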
Figure 2. The map values at each grid location form a single record in
the exported table.
A third normalization procedure, standard normal variable (
Map normalization is often a forgotten step in the rush to
make a map, but is critical to a host of subsequent analyses from visual map
comparison to advanced data analysis.
The ability to easily export the data in a universal format is just as
critical. Instead of a “do-it-all”
Figure 2 shows the process for grid-based data. Recall that a consistent analysis frame is used to organize the data into map layers. The map values at each cell location for selected layers are reformatted into a single record and stored in a standard export table that, in turn, can be imported into other data analysis software.
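As a rough illustration of that reformatting, the sketch below turns a few small layers into one record per cell and writes them as comma-separated text. The layer values, the `row`/`col` field names and the choice of CSV as the "standard export table" are assumptions made for the example, not a description of any particular package's export format.

```python
import csv
import io

def layers_to_records(layers):
    """Build one record per grid cell holding that cell's value from every layer.

    `layers` maps a layer name to a list of rows; all layers are assumed to
    share the same analysis frame (same number of rows and columns).
    """
    names = list(layers)
    first = layers[names[0]]
    records = []
    for r in range(len(first)):
        for c in range(len(first[0])):
            record = {"row": r + 1, "col": c + 1}
            for name in names:
                record[name] = layers[name][r][c]
            records.append(record)
    return records

def records_to_csv(records):
    """Write the records as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical 2 x 2 soil nutrient layers
layers = {
    "P": [[12, 15], [9, 22]],
    "K": [[140, 152], [133, 160]],
    "N": [[31, 28], [35, 30]],
}

records = layers_to_records(layers)
csv_text = records_to_csv(records)
print(csv_text)
```

Each line of the output is a single grid location, which is the shape most statistical packages expect for import.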
Figure 3. Mapped data can be imported
into standard statistical packages for further analysis.
Figure 3 shows the agricultural data imported into the JMP
statistical package (by SAS). Area (1)
shows the histograms and descriptive statistics for the P, K and N map layers
shown in figure 2. Area (2) is a
“spinning 3D plot” of the data that you can rotate to graphically visualize
relationships among the map layers. Area
(3) shows the results of applying a multiple linear regression model to predict
crop yield from the soil nutrient maps.
These are but a few of the tools beyond mapping that are available
through data exchange between
Modern statistical packages like JMP “aren’t your father’s”
stat experience and are fully interactive with point-n-click graphical
interfaces and wizards to guide appropriate analyses. The analytical tools, tables and displays
provide a whole new view of traditional mapped data. While a map picture might be worth a thousand
words, a gigabyte or so of digital map data is a whole revelation and foothold
for site-specific decisions.
Multiple Methods Help Organize Raster Data
(GeoWorld, April 2003, pg. 22-23)
Map features in a vector-based mapping system identify discrete, irregular spatial objects with sharp abrupt boundaries. Other data types (raster images, pseudo grids and raster grids) treat space in an entirely different manner, forming a spatially continuous data structure.
For example, a raster image is composed of thousands of “pixels” (picture elements) that are analogous to the dots on a computer screen. In a geo-registered B&W aerial photo, the dots are assigned a grayscale color from black (no reflected light) to white (lots of reflected light). The eye interprets the patterns of gray as forming the forests, fields, buildings and roads of the actual landscape. While raster maps contain tremendous amounts of information that are easily “seen,” the data values simply reference color codes that afford some quantitative analysis but are far too limited for the full suite of map analysis operations.
Figure 1. A vector-based system
can store continuous geographic space as a pseudo-grid.
Pseudo grids and raster grids are similar to raster images as they treat geographic space as a continuum. However, the organization and nature of the data are radically different.
A pseudo grid is formed by a series of uniform, square polygons covering an analysis area (figure 1). In practice, each grid element is treated as a separate polygon—it’s just that every polygon is the same shape/size and they all abut each other—with spatial and attribute tables defining the set of little polygons. For example, in the upper-right portion of the figure a set of discrete point measurements are stored as twelve individual “polygonal cells.” The interpolated surface from the point data (lower-right) is stored as 625 contiguous cells.
While pseudo grids store full numeric data in their attribute tables and are subject to the same vector analysis operations, the explicit organization of the data is both inefficient and too limited for advanced spatial analysis as each polygonal cell is treated as an independent spatial object. A raster grid, on the other hand, organizes the data as a listing of map values like you read a book—left to right (columns), top to bottom (rows). This implicit configuration identifies a grid cell’s location by simply referencing its position in the list of all map values.
In practice, the list of map values is read into a matrix with the appropriate number of columns and rows of an analysis frame superimposed over an area of interest. Geo-registration of the analysis frame requires an X,Y coordinate for one of the grid corners and the length of a side of a cell. To establish the geographic extent of the frame the computer simply starts at the reference location and calculates the total X, Y length by multiplying the number of columns/rows times the cell size.
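The extent calculation can be sketched in a few lines. The coordinates and cell size below are made up, and treating the reference location as the lower-left corner is an assumption, since the text only says "one of the grid corners."

```python
def frame_extent(x_origin, y_origin, n_cols, n_rows, cell_size):
    """Return (x_min, y_min, x_max, y_max) of the analysis frame.

    The reference location is assumed to be the lower-left corner.
    """
    x_length = n_cols * cell_size  # total X length of the frame
    y_length = n_rows * cell_size  # total Y length of the frame
    return (x_origin, y_origin, x_origin + x_length, y_origin + y_length)

# Hypothetical frame: 100 columns by 100 rows of 30-unit cells
extent = frame_extent(x_origin=500000.0, y_origin=4400000.0,
                      n_cols=100, n_rows=100, cell_size=30.0)
print(extent)  # (500000.0, 4400000.0, 503000.0, 4403000.0)
```

Only four numbers plus the cell size are needed to geo-register the whole matrix, which is what makes the raster structure so compact.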
Figure 2. A grid-based system stores a long list of map values that are implicitly linked to an analysis frame superimposed over an area.
Figure 2 shows a 100 column by 100 row analysis frame geo-registered over a subdued vector backdrop. The list of map values is read into the 100x100 matrix with their column/row positions corresponding to their geographic locations. For example, the maximum map value of 92 (customers within a quarter of a mile) is positioned at column 67, row 71 in the matrix— the 7,167th value in the list ((71 * 100) + 67 = 7167). The 3D plot of the surface shows the spatial distribution of the stored values by “pushing” up each of the 10,000 cells to its relative height.
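The implicit link between list position and grid location can be sketched as follows. The function names are hypothetical. The arithmetic in the text, (71 * 100) + 67, corresponds to treating 71 as the number of complete rows that come before the cell; if rows were instead numbered starting at 1, row 71 would have 70 rows ahead of it and the position would be 7,067.

```python
def position_in_list(rows_before, col, n_cols):
    """1-based position of a cell in the flat list of map values.

    rows_before: number of complete rows stored ahead of the cell's row
    col:         1-based column position within the row
    """
    return rows_before * n_cols + col

def row_col_from_position(position, n_cols):
    """Inverse lookup: recover (rows_before, col) from a 1-based list position."""
    rows_before, col_index = divmod(position - 1, n_cols)
    return rows_before, col_index + 1

print(position_in_list(71, 67, 100))     # 7167, matching the text
print(row_col_from_position(7167, 100))  # (71, 67)
print(position_in_list(70, 67, 100))     # 7067 if rows are numbered from 1
```

Either convention works as long as it is applied consistently when the list is written and when it is read back.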
In a grid-based dataset, the matrices containing the map values automatically align as each value list corresponds to the same analysis frame (#columns, #rows, cell size and geo-reference point). As depicted on the left side of figure 3, this organization enables the computer to identify any or all of the data for a particular location by simply accessing the values for a given column/row position (spatial coincidence used in point-by-point overlay operations). Similarly, the immediate or extended neighborhood around a point can be readily accessed by selecting the values at neighboring column/row positions (zonal groupings used in region-wide overlay operations). The relative proximity of one location to any other location is calculated by considering the respective column/row positions of two or more locations (proximal relationships used in distance and connectivity operations).
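A minimal sketch of those three kinds of access, using zero-based row/column indexes into small lists of rows. The layer names, values and function names are hypothetical, and straight-line distance between cell centers is just one simple way to express proximity.

```python
import math

def values_at(layers, row, col):
    """Spatial coincidence: the value from every layer at one row/column position."""
    return {name: grid[row][col] for name, grid in layers.items()}

def neighborhood(grid, row, col, radius=1):
    """Values in the square window around a cell, clipped at the frame edge."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [grid[r][c]
            for r in range(max(0, row - radius), min(n_rows, row + radius + 1))
            for c in range(max(0, col - radius), min(n_cols, col + radius + 1))]

def cell_distance(cell_a, cell_b, cell_size=1.0):
    """Straight-line distance between two cell centers from their row/column positions."""
    (row_a, col_a), (row_b, col_b) = cell_a, cell_b
    return math.hypot(row_b - row_a, col_b - col_a) * cell_size

# Hypothetical 3 x 3 map stack
stack = {
    "elevation": [[500, 520, 540], [510, 530, 560], [515, 545, 580]],
    "slope":     [[2, 4, 6], [3, 5, 8], [3, 6, 9]],
}

print(values_at(stack, 1, 1))                 # {'elevation': 530, 'slope': 5}
print(neighborhood(stack["elevation"], 0, 0)) # [500, 520, 510, 530]
print(cell_distance((0, 0), (3, 4), 30.0))    # 150.0
```

None of the functions needs a coordinate lookup; the row/column position alone is enough because every layer shares the same analysis frame.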
Figure 3. A map stack of
individual grid layers can be stored as separate files or in a multi-grid
table.
There are two fundamental approaches in storing grid-based data: individual “flat” files and “multiple-grid” tables (right side of figure 3). Flat files store map values as one long list, most often starting with the upper-left cell, then sequenced left to right along rows ordered from top to bottom. Multi-grid tables have a similar ordering of values but contain the data for many maps as separate fields in a single table.
Generally speaking the flat file organization is best for applications that create and delete a lot of maps during processing as table maintenance can affect performance. However, a multi-grid table structure has inherent efficiencies useful in relatively non-dynamic applications. In either case, the implicit ordering of the grid cells over continuous geographic space provides the topological structure required for advanced map analysis.
_________________
Author's Note: * Let me apologize in advance to the
“geode-ists” readership—yep it’s a lot more complex than these simple equations
but the order of magnitude ought to be about right …thanks to Ken Burgess, VP
R&D, Red Hen Systems for getting me this far.
Use Mapping “Art” to Visualize Values
(GeoWorld, June 2003, pg. 20-21)
The digital map has revolutionized how we collect, store and perceive mapped data. Our paper map legacy has well-established cartographic standards for viewing these data. However, in many respects the display of mapped data is a very different beast.
In a
The display tools are both a boon and a bane as they require minimal skills to use but considerable thought and experience to use correctly. The interplay among map projection, scale, resolution, shading and symbols can dramatically change a map’s appearance and thereby the information it graphically conveys to the viewer.
While this is true for the points, lines and areas comprising traditional maps, the potential for cartographic effects is even more pronounced for contour maps of surface data. For example, consider the mapped data of phosphorus levels in a farmer’s field shown in figure 1. The inset on the left is a histogram of the 3288 grid values over the field ranging from 4.2 to 53.2 parts per million (ppm). The table describes the individual data ranges used to generalize the data into seven contour intervals.
Figure 1. An Equal Ranges contour map of surface data.
In this case, the contour intervals were calculated by dividing the data range into seven Equal Ranges. The procedure involves: 1] calculating the interval step as (max – min) / #intervals= (53.2 – 4.2) / 7 = 7.0 step, 2] assigning the first contour interval’s breakpoint as min + step = 4.2 + 7.0 = 11.2, 3] assigning the second contour interval’s breakpoint as previous breakpoint + step = 11.2 + 7.0 = 18.2, 4] repeating the breakpoint calculations for the remaining contour intervals (25.2, 32.2, 39.2, 46.2, 53.2).
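The Equal Ranges steps above can be sketched as a short function. The function name is illustrative; the minimum, maximum and number of intervals are the ones quoted in the text.

```python
def equal_range_breakpoints(min_value, max_value, n_intervals):
    """Upper breakpoint of each contour interval for Equal Ranges contouring."""
    step = (max_value - min_value) / n_intervals
    # Each breakpoint is the minimum plus a whole number of steps
    return [min_value + step * i for i in range(1, n_intervals + 1)]

breakpoints = equal_range_breakpoints(4.2, 53.2, 7)
print([round(b, 1) for b in breakpoints])
# [11.2, 18.2, 25.2, 32.2, 39.2, 46.2, 53.2]
```

The breakpoints depend only on the data range, not on how the values are distributed within it, which is why a skewed histogram crowds most of the map into the first couple of intervals.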
The equally spaced red bars in the plot show the contour interval breakpoints superimposed on the histogram. Since the data distribution is skewed toward lower values, significantly more map locations are displayed in red tones: 41 + 44 = 85% of the map area assigned to contour intervals one and two. The 2D and 3D displays on the right side of figure 1 show the results of “equal ranges contouring” of the mapped data.
Figure 2 shows the results of applying other strategies for contouring the same data. The top inset uses Equal Count calculations to divide the data range into intervals that represent equal amounts of the total map area. This procedure first calculates the interval step as total #cells / #intervals= 3288 / 7 = 470 cells then starts at the minimum map value and assigns progressively larger map values until 470 cells have been assigned. The calculations are repeated to successively capture groups of approximately 470 cells of increasing values, or about 14.3 percent of the total map area.
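A simplified sketch of Equal Count contouring follows. It sorts the map values and reads off the value at the end of each equal-sized group; the function name and sample data are hypothetical, and the handling of tied values (which a production routine would need to keep in the same interval) is deliberately ignored.

```python
def equal_count_breakpoints(values, n_intervals):
    """Upper breakpoint of each interval so that each holds about the same number of cells."""
    ordered = sorted(values)
    n_cells = len(ordered)
    breakpoints = []
    for i in range(1, n_intervals + 1):
        # Index of the last cell assigned to interval i
        last = round(i * n_cells / n_intervals) - 1
        breakpoints.append(ordered[last])
    return breakpoints

# Interval step for the field in the text: about 470 cells per interval
print(round(3288 / 7))  # 470

# Hypothetical skewed data: many low values, few high ones
sample = [1, 1, 2, 2, 3, 3, 4, 5, 6, 8, 11, 15, 22, 40]
sample_breaks = equal_count_breakpoints(sample, 7)
print(sample_breaks)  # [1, 2, 3, 5, 8, 15, 40]
```

Note how the breakpoints bunch together where the sample values are dense and spread apart where they are sparse.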
Figure 2. Equal Count and +/- 1 Standard Deviation
contour maps.
Notice the unequal spacing of the breakpoints (red bars) in the histogram plot for the equal count contours. Sometimes a contour interval only needs a small data step to capture enough cells (e.g., peaks in the histogram); whereas others require significantly larger steps (flatter portions of the histogram). The result is a more complex contour map with fairly equal amounts of colored polygons.
The bottom inset in figure 2 depicts yet another procedure for assigning contour breaks. This approach divides the data into groups based on the calculated mean and Standard Deviation. The standard deviation is added to the mean to identify the breakpoint for the upper contour interval (contour seven = 13.4 + 5.21= 18.61 to max) and subtracted to set the lower interval (contour one = 13.4 - 5.21= 8.19 to min).
In statistical terms the low and high contours are termed the “tails” of the distribution and locate data values that are outside the bulk of the data— sort of unusually lower and higher values than you normally might expect. In the 2D and 3D map displays on the right side of the figure these locations are shown as blue and pink areas.
The other five contour intervals are assigned by forming equal ranges within the lower and upper contours ((18.61 - 8.19) / 5 = 10.42 / 5 = 2.1 interval step) and assigned colors red through green with a yellow inflection point. The result is a map display that highlights areas of unusually low and high values and shows the bulk of the data as a gradient of increasing values.
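The +/- 1 Standard Deviation procedure can be sketched with the mean and standard deviation quoted in the text. The function name is illustrative, and returning the six breakpoints that separate the seven intervals is simply a convenient way to package the result.

```python
def stdev_breakpoints(mean, stdev, n_inner=5):
    """Breakpoints separating the two tails and the equal-range inner intervals.

    Returns (breakpoints, step): the first breakpoint is mean - stdev, the
    last is mean + stdev, and the span between them is split into n_inner
    equal ranges.
    """
    lower = mean - stdev
    upper = mean + stdev
    step = (upper - lower) / n_inner
    breakpoints = [lower + step * i for i in range(n_inner + 1)]
    return breakpoints, step

sd_breaks, sd_step = stdev_breakpoints(13.4, 5.21)
print([round(b, 2) for b in sd_breaks])  # [8.19, 10.27, 12.36, 14.44, 16.53, 18.61]
print(round(sd_step, 1))                 # 2.1
```

Values below the first breakpoint fall in the low tail (contour one) and values above the last fall in the high tail (contour seven).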
Figure 3.
Comparison of different 2D contour displays.
The
bottom line is that the same surface data generated dramatically different 2D
contour maps (figure 3). All three
displays contain seven intervals but the methods of assigning the breakpoints
to the contours employ radically different approaches. So which one is right? Actually all three are right, they just
reflect different perspectives of the same data distribution …a bit of the art
in the “art and science” of
What’s Missing in Mapping?
(GeoWorld, April 2009)
We have known the purpose of maps for thousands of years—precise placement of physical features for navigation. Without them historical heroes might have sailed off the edge of the earth, lost their way along the Silk Route or missed the turn to Waterloo. Or more recently, you might have lost life and limb hiking the Devil’s Backbone or dug up the telephone trunk line in your neighborhood.
Maps have always told us where we are, and as best possible, what is there. For the most part, the historical focus of mapping has been successfully automated. It is the “What” component of mapping that has expanded exponentially through derived and modeled maps that characterize geographic space in entirely new ways. Digital maps form the building blocks and map-ematical tools provide the cement in constructing more accurate maps, as well as wholly new spatial expressions.
For example, consider the left-side of figure 1 that shows both discrete (Contour) and continuous (Surface) renderings of the Elevation gradient for a project area. Not so long ago the only practical way of mapping a continuous surface was to force the unremitting undulations into a set of polygons defined by a progression of contour interval bands. The descriptor for each of the polygons is an interval range, such as 500-700 feet for the lowest contour band in the figure. If you had to assign a single attribute value to the interval, it likely would be the middle of the range (600).
Figure 1. Visual assessment of the spatial
coincidence between a continuous Elevation surface and a discrete map of
Districts.
But does that really make sense? Wouldn’t some sort of a statistical summary of the actual elevations occurring within a contour polygon be a more appropriate representation? The average of all of the values within the contour interval would seem to better characterize the “typical elevation.” For the 500-700 foot interval in the example, the average is only 531.4 feet which is considerably less than the assumed 600 foot midpoint of the range.
Our paper map legacy has conditioned us to
the traditional contour map’s interpretation of fixed interval steps but that
really muddles the “What” information.
The right side of figure 1 tells a different story. In this case the polygons represent seven Districts that are oriented every-which-way and have minimal to no relationship to the elevation surface. It’s sort of like a surrealist Salvador Dali painting with the Districts melted onto the Elevation surface identifying the coincident elevation values. Note that with the exception of District #1, there are numerous different elevations occurring within each district’s boundary.
One summary attribute would be simply noting the Minimum/Maximum values in a manner analogous to contour intervals. Another more appropriate metric would be to assign the Median of the values, the middle value that divides the total frequency into two halves. However the most commonly used statistic for characterizing the “typical condition” is a simple Average of all the elevation numbers occurring within each district. The “Thematic Mapping” procedure of assigning a single value/color to characterize individual map features (lower left-side of figure 2) is fundamental to many GIS applications, letting decision-makers “see” the spatial pattern of the data.
The discrete pattern, however, is a
generalization of the actual data that reduces the continuous surface to a
series of stepped mesas (right-side of figure 2). In some instances, such as District #1 where
all of the values are 500, the summary to a typical value is right on. On the other hand, the summaries for other
districts contain sets of radically differing values suggesting that the
“typical value” might not be very typical.
For example, the data in District #2 ranges from 500 to 2499 (valley
floor to the top of the mountain) and the average of 1539 is hardly anywhere,
and certainly not a value supporting good decision-making.
Figure 2. Characterizing the average Elevation for
each District and reporting how typical the typical Elevation value is.
So what’s the alternative? What’s better at depicting the “What component” in thematic mapping? Simply stated, an honest map is better. Basic statistics uses the Standard Deviation (StDev) to characterize the amount of dispersion in a data set and the Coefficient of Variation (Coffvar= [StDev/Average] *100) as a relative index. Generally speaking, an increasing Coffvar index indicates increasing data dispersion and a less “typical” Average: 0 to 10, not much data dispersion; 10-20, a fair amount; 20-30, a whole lot; and >30, probably too much dispersion to be useful (apologies to statisticians among us for the simplified treatise and the generalized but practical rule of thumb). In the example, the thematic mapping results are good for Districts #1, #3 and #6, but marginal for Districts #5, #7 and #4 and dysfunctional for District #2, as its average is hardly anywhere.
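Here is a minimal sketch of the Coffvar index and the rule-of-thumb classes. The function names and the second set of elevations are hypothetical; using the population standard deviation and placing boundary values (exactly 10, 20 or 30) in the lower class are assumptions, since the text does not specify either.

```python
import statistics

def coffvar(values):
    """Coefficient of Variation: (StDev / Average) * 100."""
    average = statistics.mean(values)
    if average == 0:
        raise ValueError("Coffvar is undefined when the average is zero")
    # Population standard deviation is an assumption; the text does not say which form
    return statistics.pstdev(values) / average * 100

def dispersion_label(index):
    """Rule-of-thumb interpretation of a Coffvar index, following the text."""
    if index <= 10:
        return "not much data dispersion"
    if index <= 20:
        return "a fair amount"
    if index <= 30:
        return "a whole lot"
    return "probably too much dispersion to be useful"

uniform_district = [500, 500, 500, 500]    # like District #1, all values 500
varied_district = [500, 1200, 1900, 2499]  # hypothetical valley-to-peak values

for name, cells in [("uniform", uniform_district), ("varied", varied_district)]:
    index = coffvar(cells)
    print(name, round(statistics.mean(cells), 1), round(index, 1), dispersion_label(index))
```

The uniform district scores 0, so its average is a perfectly typical value, while the varied district lands well above 30 and its average says little about any particular location.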
So what’s the bottom line? What’s missing in traditional thematic mapping? I submit that a reasonable and effective measure of a map’s accuracy has been missing (see Author’s Notes). In the paper map world one can simply include the Coffvar index within the label as shown on the left side of figure 2. In the digital map world a host of additional mechanisms can be used to report the dispersion, such as mouse-over pop-ups of short summary tables like the ones on the right side of figure 2.
Another possibility could be to use the brightness gun to track the Coffvar, with the display color indicating the average value and the relative brightness becoming more washed out toward white for higher Coffvar indices. The effect could be toggled on/off or animated to cycle so the user sees the assumed “typical” condition, then the Coffvar rendering of how typical the typical really is. For areas with a Coffvar greater than 30, the rendering would go to white. Now that’s an honest map that shows the best guess of typical value then a visual assessment of how typical the typical is, sort of a warning that use of shaky information may be hazardous to your professional health.
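One way to sketch the wash-out idea is a blend between the district's display color and white, driven by the Coffvar index. The linear blend, the RGB representation and the function name are all assumptions made for illustration; the only detail taken from the text is that a Coffvar above 30 renders as white.

```python
def wash_toward_white(rgb, coffvar_index, white_at=30.0):
    """Blend a display color toward white as the Coffvar index increases.

    rgb:           (red, green, blue) with each channel in 0..255
    coffvar_index: Coefficient of Variation for the map feature
    white_at:      index at or above which the color becomes fully white
    """
    # Fraction of the way toward white, clamped to the 0..1 range
    fraction = min(max(coffvar_index / white_at, 0.0), 1.0)
    return tuple(round(channel + (255 - channel) * fraction) for channel in rgb)

district_color = (200, 60, 0)  # hypothetical color for a district's average value

print(wash_toward_white(district_color, 0))   # unchanged color, very typical average
print(wash_toward_white(district_color, 15))  # partly washed out
print(wash_toward_white(district_color, 45))  # (255, 255, 255), too dispersed to trust
```

Toggling between the original color and the washed-out version gives the on/off effect described above.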
As
Geotechnology moves beyond our historical focus on “precise placement of physical features for navigation” the ability
to discern the good portions of a map from less useful areas is critical. While few readers are interested in
characterizing average elevation for districts, the increasing wealth of mapped
data surfaces is apparent— from a
realtor wanting to view average home prices for communities, to a natural
resource manager wanting to see relative habitat suitability for various
management units, to a retailer wanting to determine the average probability of
product sales by zip codes, to policemen wanting to appraise typical levels of
crime in neighborhoods, or to public health officials wanting to assess air
pollution levels for jurisdictions within a county. It is important that they all “see” the
relative accuracy of the “What component” of the results in addition to the
assumed average condition.
_____________________________
Author’s
Notes: see http://www.innovativegis.com/basis/MapAnalysis/Topic18/Topic18.htm
for an online discussion of the related concepts, structures and considerations
of grid-based mapped data. Discussion of
the differences between map Precision and Accuracy is at http://www.innovativegis.com/basis/MapAnalysis/MA_Intro/MA_Intro.htm,
“Determining Exactly Where is What.”