…an introduction to
grid-based map analysis and modeling
GEOG 3110,
University of Denver, Geography, Winter Term 2011
Thursdays
6:00-8:50 pm,
…<click here> to review the Report Writing Tips
Keep in mind that for all the lab exercises you have several “life lines” if you need them:
1) send me an email with a specific question,
2) arrange via email for a phone call for a tutorial walk-thru (you need to be at a computer with MapCalc or Surfer),
3) arrange an eyeball meeting in the GIS Lab on Thursdays between 10:00 am and 3:00 pm, or
4) open door office hours 3:00 to 5:00 pm (or as specially arranged for Friday mornings).
___________________________
Dear Dr. Berry, I have a question regarding Question
#9. Is there a way to look at the R square or Adjusted R square for the
multiple linear regression model to see statistically how it fits the data in
addition to looking at the error surface? Best,
Qing
3/7 Qing— alas, the old stat tool we used for Regress doesn’t
provide for a traditional “R square or Adjusted R square” evaluation option
and we didn’t extend beyond the basic tool …possibly room for future
enhancement.
However, you can get the correlation matrix information using
the Correlate command. Another way is to Export the set of maps as a CSV
file and do the statistical analysis (regression and R-square) in a “grown-up”
stat package like JMP or SAS.
One interesting feature of Regress, however, is the option…
For <newMap>: The resulting map contains predicted values for the dependent map using the regression equation.
…that evaluates the equation for the set of map data, which you can
compare (subtract) to the actual dependent map for an error map that gives you
insight into both the spatial pattern and overall levels of error (Shading
Manager table summary). While it is not a “traditional non-spatial”
evaluation of regression fit, the “biased performance evaluation” with error
map/summary can be useful. Possibly it could be argued to be more useful
(except with traditional statisticians), as R-squared is an aggregate,
non-spatial evaluator and this is spatial statistics …making R-squared sort
of off the mark as it ignores the spatial pattern of the relationship. -Joe
Regress
Regress performs linear regression analysis
by using the "least squares" method to fit a line through a set of
data points in multiple maps. Each grid location identifies a series of values.
You can analyze how a single map (the dependent variable) is affected by the
values of one or more other maps (independent variables). For example, you can
determine how crop yield is affected by such factors as phosphorus, potassium
and pH levels.
Regression is used for developing a
prediction model based on a set of sampled data. The relationship between the
dependent and independent variables is determined by fitting a line to the data
that minimizes the deviations between the line and the data. The mathematical
equation for the line is used to estimate the dependent variable for any given
set of values contained in the independent variables.
Note: Regress does not work with maps
containing categorical information, such as a soil classification map.
Regress <dependentMap>
With <independentMap>: If using more than one independent map, select as needed from the drop-down list. Click Add after each selection.
Add: Click to add the independent map to the command line.
Del: Click to delete a highlighted independent map from the command line.
To <newTextFile>
For <newMap>: The resulting map contains predicted values for the dependent map using the regression equation.
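For anyone who exports the maps as a CSV and wants the R-square numbers alongside the error map, the following is a minimal Python/NumPy sketch of the idea, not MapCalc's own Regress code. The function name `regress_maps` and the tiny P, K and yield grids are made up for illustration; the error map is taken as predicted minus actual.

```python
import numpy as np

def regress_maps(dependent, independents):
    """Least-squares fit of one dependent grid on one or more independent grids."""
    dep = np.asarray(dependent, dtype=float)
    y = dep.ravel()
    # Design matrix: a column of ones (intercept) plus one column per independent map.
    columns = [np.ones(y.size)] + [np.asarray(m, dtype=float).ravel() for m in independents]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted = (X @ coef).reshape(dep.shape)   # the "For <newMap>" style prediction grid
    error = predicted - dep                     # error map: predicted minus actual
    ss_res = float(np.sum(error ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    n, p = y.size, len(independents)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef, predicted, error, r2, adj_r2

# Hypothetical 2x3 grids; the yield values follow 2 + 3*P + 0.5*K exactly.
p_map = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
k_map = np.array([[10.0, 8.0, 12.0], [7.0, 15.0, 9.0]])
yield_map = 2.0 + 3.0 * p_map + 0.5 * k_map

coef, predicted, error, r2, adj_r2 = regress_maps(yield_map, [p_map, k_map])
print(coef, r2, adj_r2)
```

R-square and Adjusted R-square summarize the fit in one number each, while the `error` grid keeps the cell-by-cell pattern that the error map discussion above is about.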
___________________________
Joe, I am working on
number 3, and when I get the change in percent yield map, the "percents" range from -80
to 4160. Is there really a 4000 percent change? …my faith in all things computer suggests
there really is a 4160% change, so check it out …but be sure to
comment on whether you believe it is real or just something to do with a data
collection artifact involving small numbers. I put
the formula in exactly as you had it in the homework. Thanks, Eric
Eric-- yep, that’s the dilemma when working with small numbers
as percents. A 4000 percent change says there is roughly a 40-fold increase in yield,
which is likely from about 1 to 41 bushels …which I suspect is occurring along the
edge of the field and is likely “data collection noise” as the harvester moves
in and out of the crop it is harvesting.
An interesting extended discussion (captures your interest,
right?) would be to use your pointer to find out where this unusually large
change occurred, note the two values and manually solve the
percent change equation to confirm the calculation. A follow-on
extended discussion could comment on why one might want to get rid of whacko areas
in a data set (termed “eliminating outliers” in stat-speak) before developing any
statistical models. -Joe
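The manual check can be sketched in a few lines of Python. This assumes the usual percent change form, (new - old) / old * 100, since the homework formula itself is not quoted here; the two small yield grids and the 5-bushel cutoff for a "small" base value are made up for illustration.

```python
import numpy as np

def percent_change(old, new):
    """Cell-by-cell percent change; cells with a zero base value are left as NaN."""
    old = np.asarray(old, dtype=float)
    new = np.asarray(new, dtype=float)
    out = np.full(old.shape, np.nan)
    valid = old != 0
    out[valid] = (new[valid] - old[valid]) / old[valid] * 100.0
    return out

# Hypothetical yields (bushels) for two years on a 2x2 patch of the field.
yield_before = np.array([[1.0, 120.0], [150.0, 0.0]])
yield_after = np.array([[41.0, 132.0], [30.0, 10.0]])

change = percent_change(yield_before, yield_after)

# Flag cells whose base value is so small that the percent is suspect (assumed cutoff).
small_base = yield_before < 5.0
print(change)
print(small_base)
```

The cell that goes from 1 to 41 bushels shows up as a 4000 percent change, and it is also the kind of cell the small-base flag picks out as a candidate outlier.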
___________________________
Dear Dr. Berry, I have a question about Exercise 9
Question #1. By saying "use the same legend"
for the two maps (1997_Fall_P & 1997_Fall_K), do you mean use the same
intervals for the color ranges for the two maps? …Yes. I noticed that the 1997_Fall_P
map has values ranging from 5-102, and 1997_Fall_K has values ranging from
88-310. The common value range (88-102) for the two maps is very small. …No, the combined range to
use is 5 to 310 but you might adjust to 0 to 320 and use 16 User Defined
intervals of 20 ppm. In 1997_Fall_P only
17 out of 3288 cells are within the 88-102 interval, and 1997_Fall_K has only 12
cells that fall into the 88-102 range. So, I am not sure whether it would be helpful
to use the same ranges for the two maps to make a comparison.
If you're trying to get us to "use the same legend" for the two maps,
how do I define the range intervals? Thanks, Qing
3/6 Qing-- when visually comparing two maps the legends should
be the same. That means you need to determine the minimum and maximum
for both maps then set up a legend that covers the combined range, from the
“minimum” of the two minimums to the “maximum” of the two maximums: a data
range that encompasses the individual data ranges on both maps. In
practice it helps to make the range a bit bigger such that the ranges can be
set up in sensible steps. The result is a legend that has the same
“color-coding” for the same data steps enabling the viewer to easily “walk”
between both maps.
For example, if one map had values from 10 to 75 and the other
had values 45 to 90 the combined range would be 10 to 90 but it would make
sense to set up the legend from 0 to 100 with steps of 5 (into 20) or 10 (into
10) and set a color inflection at 50 for the color ramp.
Keep in mind that this technique only works for map surfaces
that have the same units. If the maps are of two different variables
(apples and oranges) you would need to normalize the mapped data to a common
scale and then set a common legend …reasonable “extended discussion” fodder. -Joe
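The combined-range idea can be sketched in Python as follows. The helper name `common_legend` is made up, and rounding outward to a multiple of the step is just one reasonable way to get "sensible steps"; the two ranges are the 1997_Fall_P (5-102) and 1997_Fall_K (88-310) values quoted in the question above.

```python
import math

def common_legend(ranges, step):
    """Return shared interval breaks covering every (min, max) pair in `ranges`."""
    lowest = min(r[0] for r in ranges)    # minimum of the minimums
    highest = max(r[1] for r in ranges)   # maximum of the maximums
    start = math.floor(lowest / step) * step   # round down to a step boundary
    stop = math.ceil(highest / step) * step    # round up to a step boundary
    count = round((stop - start) / step)
    return [start + i * step for i in range(count + 1)]

# Ranges quoted in the question: P runs 5-102 ppm, K runs 88-310 ppm.
breaks = common_legend([(5, 102), (88, 310)], step=20)
print(breaks[0], breaks[-1], len(breaks) - 1)
```

With a step of 20 the combined range of 5 to 310 rounds out to 0 to 320 in 16 intervals; a different step simply gives a finer or coarser version of the same shared legend.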
___________________________
Joe, Sarah and I are having some difficulty converting
our point data (towns) into raster data. ArcMap
doesn't complete the operation, and spits out a generic error that effectively
says that something unknown is wrong. Do
you have any ideas as to why the "point to raster" tool is not
working for us? Can you potentially give us any other raster conversion tool to
use? Thanks. Hope all is well. Cheers
2/28 Mark and Sarah-- yep, the “base map” data preparation is
always the difficult step. By copy of this email, I am asking any of you
“ArcGIS-sperts” with vector ↔ raster skills
to contact Mark (Mark.Janko@du.edu) or
Sarah (smiller07@gmail.com) with your
advice.
It sounds simple …simply convert X,Y
point locations (in vector Lat/Lon WGS 84, I believe) into a raster map of
specified cell size, row/column configuration and geo-positioning. My
ancient experience used the PolyGrid command
in AML but I am not sure what the command and specifications are in the current
ArcGIS GUI tools.
Thanks, Joe
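For what the conversion boils down to, here is a minimal Python/NumPy sketch of the arithmetic. It is not the ArcGIS tool and says nothing about why that tool failed. The function name, the grid origin and the sample coordinates are all made up, and the X,Y values are treated as simple planar coordinates.

```python
import numpy as np

def points_to_raster(xs, ys, x_min, y_max, cell_size, n_rows, n_cols):
    """Count the points falling in each cell of a grid anchored at its upper-left corner."""
    grid = np.zeros((n_rows, n_cols), dtype=int)
    for x, y in zip(xs, ys):
        col = int((x - x_min) // cell_size)   # columns count eastward from x_min
        row = int((y_max - y) // cell_size)   # rows count southward from y_max
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row, col] += 1               # points outside the frame are skipped
    return grid

# Hypothetical towns on a 4x4 grid of 10-unit cells with upper-left corner (0, 40).
town_x = [5.0, 15.0, 5.0, 39.0, 50.0]
town_y = [35.0, 35.0, 38.0, 1.0, 50.0]
towns = points_to_raster(town_x, town_y, x_min=0.0, y_max=40.0,
                         cell_size=10.0, n_rows=4, n_cols=4)
print(towns)
```

The three specifications mentioned above appear as the arguments: cell size, row/column configuration, and the geo-positioning of the grid (here the upper-left corner).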
___________________________
Hi Joe, I am looking at question 2b that says: “Embed
a screen grab of your “color-filled” 5-foot contour map with data posting below”
…What does with "with data posting" mean? Thanks!
Eric
Eric—in Surfer you can “post” the original point data in a map
display. In Surfer select Help from the main menu items and
search on “Post” to get help on how to post the sample points’ data
values to a map surface display. -Joe
___________________________
2/23 Folks— on the possible Optional Paper front, a popular
topic in the past is to compare some of the commands in Grid/Spatial Analyst to
corresponding MapCalc commands, such as Costdistance
and Spread and Pathdistance and Stream. Folks
in the past were most interested in Spatial Analysis (vs. Spatial Statistics)
and had prior experience with ArcGIS. A
cross-reference of the commands is posted at http://www.innovativegis.com/basis/MapCalc/MCcross_ref.htm
(provided the class website is up and running). Joe
___________________________
2/23 Folks— last class I noted that several
of you were interested in the Standard Normal Variable (SNV) and other
normalization techniques that are useful in pre-processing mapped data before
Descriptive and Predictive statistics are employed. As warm-up for the
next two lectures/exercises you might want to add the following to your
“readings”…
Normalizing
Maps for Data Analysis — describes map
normalization and data exchange with other software packages
Comparing
Apples and Oranges
— describes a
Standard Normal Variable (SNV) procedure for normalizing maps for comparison
In addition, the “Compare” command
in MapCalc calculates a bunch of comparison statistics between two maps (need
to normalize if the data are not in the same units …apples and
oranges).
Compare creates a summary table of various comparison statistics between two maps. The
comparison table summarizes the percent difference between the two specified
maps on a cell-by-cell basis. The statistical indices test for significant
differences between the two sets of data.
Example: COMPARE Slope WITH Slope_max TO SSm_compare.txt
Compare <existingMap>
With <anotherMap>
To <newTextFile>
Example Output (Note: SCAN is used first for this example)
SCAN Elevation Average WITHIN 5 FOR Elevation_smoothed
COMPARE Elevation WITH Elevation_smoothed TO Compare_table.txt
…see page 76 of the MapCalc User’s Manual
for explanation/interpretation of the statistics generated. -Joe
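As a rough illustration of normalizing before comparing, here is a minimal Python/NumPy sketch. It assumes the common z-score form, (value - mean) / standard deviation; the readings listed above give the exact SNV form used in class, which may be scaled differently. The two 2x2 grids are made up.

```python
import numpy as np

def standard_normal_variable(grid):
    """Rescale a map so its values have mean 0 and standard deviation 1."""
    grid = np.asarray(grid, dtype=float)
    return (grid - grid.mean()) / grid.std()

# Hypothetical phosphorus and potassium grids (different value ranges).
p_map = np.array([[5.0, 20.0], [50.0, 102.0]])
k_map = np.array([[88.0, 150.0], [200.0, 310.0]])

p_snv = standard_normal_variable(p_map)
k_snv = standard_normal_variable(k_map)

# Once both maps are on the same unitless scale, a cell-by-cell difference is meaningful.
difference = p_snv - k_snv
print(difference)
```

After normalization both maps are expressed relative to their own typical value and spread, which is what makes an apples and oranges comparison defensible.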
___________________________
2/21 – Folks, I am cheerfully reading through the midterms
…mostly good so far. However, there were two of the first-part questions
that showed a bit of general confusion—
1) Question comparing Traditional GIS vs. Spatial Analysis and
Traditional Statistics vs. Spatial Statistics:
Traditional GIS: involves discrete spatial
objects (points, lines, polygons) primarily for geo-query and mapping
(inventory focus)
Spatial Analysis: involves continuous map
surfaces primarily for analysis of “contextual” spatial patterns and
relationships (analysis focus)
Traditional Statistics: involves characterizing
non-spatial data to determine the “typical” response (Mean and Stdev) considering the data’s numerical distribution
alone
Spatial Statistics: involves characterizing spatial
data to determine both the numeric and geographic distributions (maps
the Variance) to analyze “numerical” spatial patterns and relationships
2) Question to identify and briefly describe the differences in
information contained in the following types of visibility maps:
Net-Weighted Visual Exposure Density Surface—
…the viewer map values are assigned positive weights for “pretty”
things (beautiful Profile Rock) and negative weights for “ugly” things
(unsightly Joe’s Junkyard) such that the sum of the weights indicates the
net-weighted arithmetic total. A negative sign of the net-weighted
value identifies locations connected to mostly ugly things; positive, mostly
pretty things. The magnitude of the Net Weighted VEDS indicates how
pretty or ugly the overall visual connections are at every map location.
Joe
___________________________
Joe- in reference to Question #4
on Exercise 5, below: Are you suggesting one
selects the Square
option within SCAN when you talk about the 3x3 roving window?
Use Scan and the Covertype map to
identify the proportion of a roving window (3x3) that has the same cover type (Covertype_proportion map). Thanks, Pete
2/9 Pete— …yep,
3x3 square window. That results in 9 cells (center cell and eight
surrounding cells). The Proportion calculations will “note” the number of
similar cover type category (map value) cells in the roving window …expressed
as a proportion of the total # of cells. For extended discussion (A
stuff), what do you think are the minimum and maximum values that could result
within a 3x3 window? What about the min/max for a 4x4 window? -Joe
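The proportion calculation can be sketched in Python/NumPy as below. This is an illustration rather than MapCalc's Scan code: the function name and the small cover type grid are made up, the result is a fraction rather than a percent, and the window is simply clipped at the map edge, which is an assumption about edge handling.

```python
import numpy as np

def proportion_same(grid, reach=1):
    """Fraction of cells in a square roving window sharing the center cell's value."""
    grid = np.asarray(grid)
    rows, cols = grid.shape
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            # reach=1 gives a 3x3 window; the window is clipped at the map edge.
            window = grid[max(r - reach, 0):r + reach + 1,
                          max(c - reach, 0):c + reach + 1]
            out[r, c] = np.mean(window == grid[r, c])
    return out

# Hypothetical cover type map with three categories.
covertype = np.array([[1, 1, 2],
                      [1, 2, 2],
                      [3, 2, 2]])
proportion = proportion_same(covertype)
print(proportion)
```

For the center cell the full nine-cell window applies; along the edges fewer cells are available, so the denominator changes.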
___________________________
Joe-- I am working on question 5 and am getting a little
confused with the first bit. I completed the first command:
Use Scan and the Covertype
map to create and display a Covertype_diversity map within 100 meter reach (one cell
radius).
Which resulted in the covertype
diversity map which has continuous data. However, when using the reassignment values
for the following command:
Use Renumber to isolate the areas of high
cover type diversity (Assigning 1 to 3 and 0 to 1 thru 2).
The
map looks really weird with multiple colors and I think 9 or 10 ranges. Using these assignments, it appears to me
that anything valued between two and three hasn't been incorporated and
that is where all the weird colors are coming from. However, if I "Assign 1
to 2 thru 3, and Assign 0 to 0 thru 2" I get a nice binary map that makes me
really happy. Is it okay to adjust the assignment
values or am I missing something important and embarrassingly
obvious? Thanks, Eric
2/9 Eric— I
think you are the victim of “default map” display. In coding Scan, we had to set a default map
display when the operation is completed.
Since there are lots of “quantitative” options for summarizing the data
in the window (e.g., average, StDev, etc.) we decided
to set the default display to “continuous.”
Since the diversity summary is simply a count of the number of different
types in the window, you have to press the display Data Type button to switch
to “discrete.”
Default display Correct display
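A small Python/NumPy sketch shows why the diversity result is discrete: it is a count of distinct values, so only whole numbers can occur. The function name and the cover type grid are made up, and clipping the window at the map edge is an assumption, not necessarily what Scan does.

```python
import numpy as np

def diversity(grid, reach=1):
    """Number of different map values in a square roving window around each cell."""
    grid = np.asarray(grid)
    rows, cols = grid.shape
    out = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            window = grid[max(r - reach, 0):r + reach + 1,
                          max(c - reach, 0):c + reach + 1]
            out[r, c] = np.unique(window).size   # a count, so always a whole number
    return out

# Hypothetical cover type map with three categories.
covertype = np.array([[1, 1, 2],
                      [1, 2, 2],
                      [3, 2, 2]])
covertype_diversity = diversity(covertype)

# Renumber-style step: assign 1 to a diversity of 3 and 0 to 1 thru 2.
high_diversity = np.where(covertype_diversity == 3, 1, 0)
print(covertype_diversity)
print(high_diversity)
```

Because the counts are only 1, 2 or 3 here, there is nothing "between two and three" to reassign; the in-between colors come from a continuous display of discrete data.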
___________________________
2/8 Pete-- you need to view
and reply to my emails in “HTML”, NOT “Plain Text,” because I
embed graphics and other stuff requiring the retention of formatting and special
fonts/characters/text/links. When you reply, however, the formatting is
automatically stripped and you send only “Plain text” in default Times
New Roman 10 font that isn’t very exciting (Arial, Calibri or Verdana might be a good “new look” for you).
Also,
it is professional to set Spell Checking on. I am not sure how to set the default to HTML
and Spell Checking in other email readers (e.g., gmail,
DU direct access reader, etc.) but there ought to be a way. -Joe
From Outlook’s main menu bar,
select Tools → Options → Mail Format tab →
and set the “Message format” to HTML…
___________________________
Hi Dr. Berry- We have a question about the 'Orient'
function and the map it displays. From our perspective, it looks like the
areas in pink should be relatively flat; however, as you can see from our
included image where azimuth degrees are draped over elevation, that's not the
case. Can you explain why that is?
Thanks! Kylee + friends
2/8 Kylee—I am not sure what went wrong
with your display. When I entered “ORIENT Elevation Precisely FOR Elev_azimuth” and then displayed the result using 9 User Defined Ranges as shown …
Joe
___________________________
Ok, Joe, then with reference to
question 23 on the midterm study guide.
My answer to the question “describe how an accumulation surface is used
to determine an optimal path between two locations” is: accumulation surface
values increase continuously as they move from a given starting location out
and away toward a destination. This
pattern results in a bowl-like surface where each "steepest downhill line over the surface"
represents estimated travel time for every location from the given starting
point. Am I close to answering this
question correctly? Thanks, Pete
2/7
Pete—you need to mention the three basic steps to Least Cost Path (LCP)
routing—1) create a Discrete Cost Map that contains absolute barriers
(avoidance) and relative barriers (preferences) to movement; 2) create an Accumulation
Cost Surface from a starting location(s) to everywhere; and, 3) identify
the steepest downhill path from a desired end point(s) over the
Accumulated Cost Surface to delineate the “best” (most preferred) route.
Mentioning
the three steps, plus the “continuously increasing” configuration of the
Accumulation Cost Surface (“bowl-like” with varying steepness as a function of
the intervening absolute/relative barriers) approaches a complete answer. You also should mention that the steepest
downhill path retraces the movements of the wave-front that got to the end point
first. -Joe
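The three steps can be sketched in plain Python as below. This is an illustration under stated assumptions, not MapCalc's SPREAD/STREAM code: moves are orthogonal only, the cost of a step is taken as the average of the two cells' costs, all costs are positive, and `None` stands in for an absolute barrier. The path is found by retracing the first-arriving wave-front (the neighbor whose accumulated cost plus step cost gives the current cell's value), which may differ in detail from a steepest downhill search on the surface itself. All names and the small cost map are made up.

```python
import heapq
import math

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # orthogonal moves only, for simplicity

def step_cost(cost, a, b):
    """Cost of moving between adjacent cells: average of the two cell costs (assumed)."""
    return (cost[a[0]][a[1]] + cost[b[0]][b[1]]) / 2.0

def open_neighbors(cost, cell):
    """Adjacent cells inside the map that are not absolute barriers (None)."""
    rows, cols = len(cost), len(cost[0])
    for dr, dc in STEPS:
        r, c = cell[0] + dr, cell[1] + dc
        if 0 <= r < rows and 0 <= c < cols and cost[r][c] is not None:
            yield (r, c)

def accumulate_cost(cost, start):
    """Step 2: accumulated cost from the start to everywhere (Dijkstra-style waves)."""
    acc = [[math.inf] * len(cost[0]) for _ in cost]
    acc[start[0]][start[1]] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if d > acc[cell[0]][cell[1]]:
            continue  # a cheaper wave already reached this cell
        for nxt in open_neighbors(cost, cell):
            nd = d + step_cost(cost, cell, nxt)
            if nd < acc[nxt[0]][nxt[1]]:
                acc[nxt[0]][nxt[1]] = nd
                heapq.heappush(heap, (nd, nxt))
    return acc

def retrace_path(cost, acc, end):
    """Step 3: walk back from the end point along the wave-front that arrived first."""
    if math.isinf(acc[end[0]][end[1]]):
        raise ValueError("end point was never reached")
    path = [end]
    cell = end
    while acc[cell[0]][cell[1]] > 0.0:
        cell = min(open_neighbors(cost, cell),
                   key=lambda n: acc[n[0]][n[1]] + step_cost(cost, n, cell))
        path.append(cell)
    return path

# Step 1: hypothetical discrete cost map; 9 = relative barrier, None = absolute barrier.
cost_map = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [1, None, 9, 1],
    [1, 1, 1, 1],
]
accumulated = accumulate_cost(cost_map, start=(0, 0))
route = retrace_path(cost_map, accumulated, end=(3, 3))
print(accumulated[3][3], route)
```

The accumulated values grow away from the start, faster through the high-cost cells, which is the bowl-like surface with varying steepness described above; the route skirts the barriers rather than crossing them.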
___________________________
Joe- in reference to question 23
from the midterm study questions, is the term accumulation surface
synonymous with the term accumulation cost surface? Thanks, Pete
2/7
Pete—while the terms Accumulation Surface
and Accumulation Cost Surface are
often interchanged in practice, there is a big distinction …an Accumulation Cost Surface is
technically reserved for “Least Cost Path (LCP)” analysis for routing that uses
a Discrete Cost Surface to guide the
effective distance waves (absolute and relative barriers). The more general term is simply Accumulation Surface that
describes any effective distance map, regardless of whether it is used for
routing.
For
example in Exercise #4, Question 1, the command “SPREAD Roads TO 20
Simply FOR Roads_simpleprox” just created an Accumulation
Surface (simple proximity) and didn’t take the analysis any further. On the other hand, the “Ranch_prox” map you created in Question 2 and coupled with “STREAM
Cabin OVER Ranch_prox Simply Steepest Downhill Only
FOR Cabin_route” used an Accumulation Cost Surface (Ranch_prox) to route the quickest path between the Ranch and the
Cabin. -Joe
___________________________
Hi Joe— for the Map Analysis "Mini
Exercises" are you looking for map output and descriptions of that output
or more of a narrative of the steps you have provided? Thanks! Eric
2/7
Eric—yep, sort of mini-exercises where you “solve” the problem to include your
commands and screen grabs of important maps.
For example, check out the A-
solution below …all of the solution’s elements are clearly identified and well
presented; however, there is ample room for extended discussion.
Joe
__________
Given the base
map of Total_customers
(smallville.rgs database) create a map that identifies “pockets of high
customer density” with over 35 customers
within a quarter of a mile (6 cell reach). Note: use MapCalc to implement and
SnagIt to capture your solution and embed below. Be sure to identify the input maps,
processing procedure, and output map with an interpretation of the map values.
Figure 1a. 3D Grid Input map identifying the number of customers at each grid location forming discrete quantitative data.
Figure 1b. Scan command that summarizes map values within a roving window.
Figure 1c. 2D Grid Output map determining the total number of customers within .25 miles of each grid location forming continuous quantitative data.
SCAN Total_customers Total IGNORE 0.0 WITHIN 6
CIRCLE FOR Total_customers_within6
The Scan
command is a Neighborhood class of
map analysis operators that summarizes map values within a “roving window” and
assigns the summary value to the center cell of the window. In this case, the total number of customers
within a .25 mile (6-cell) radius is calculated. The warmer tones in the Output map indicate
increasing number of customers within reach from 0 to 92.
Figure 2a. 3D Lattice Input map identifying the total number of customers within .25 miles of each grid location forming continuous quantitative data (see Figure 1c above).
Figure 2b. Renumber command that isolates areas of interest.
Figure 2c. 2D Grid Output map that locates areas with more than 35 customers within .25 miles of each grid location forming discrete binary data.
RENUMBER Total_customers_within6 ASSIGNING 0 TO 1 THRU 35 ASSIGNING 1 TO 35
THRU 92 FOR High_pockets
The Renumber
command is a Reclassify class of map
analysis operators that enables a user to specify new map values for old values
or range of values on an existing map.
In this case, a binary map is produced that identifies 0 with low
customer levels from 0 to 35, and 1 identifying “pockets of high customer
density” from 35 to 92 customers within a .25 mile reach.
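The two-step logic (total within a circular window, then isolate the high areas) can be sketched in Python/NumPy as follows. This is an illustration, not the MapCalc implementation: the function name and the small 5x5 customer grid are made up, a 1-cell reach is used instead of 6 to keep the example tiny, and "over 35 customers" is read as strictly greater than 35.

```python
import numpy as np

def scan_total_circle(grid, reach):
    """Total of the map values within a circular roving window of the given cell reach."""
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    # Cell offsets whose centers fall within `reach` cells of the window center.
    offsets = [(dr, dc)
               for dr in range(-reach, reach + 1)
               for dc in range(-reach, reach + 1)
               if dr * dr + dc * dc <= reach * reach]
    out = np.zeros_like(grid)
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += grid[rr, cc]
            out[r, c] = total   # summary value assigned to the window's center cell
    return out

# Hypothetical customer counts on a 5x5 grid.
customers = np.zeros((5, 5))
customers[2, 2] = 20
customers[2, 3] = 10
customers[1, 2] = 10
customers[0, 0] = 5

customers_within = scan_total_circle(customers, reach=1)

# Renumber-style step: 1 where the neighborhood total is over 35, otherwise 0.
high_pockets = np.where(customers_within > 35, 1, 0)
print(customers_within)
print(high_pockets)
```

Zero-valued cells add nothing to a total, so ignoring them or not makes no difference for this particular summary.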
___________________________
Joe - I don't get a slope_fitted option under MAP > OVERLAY after I have
performed the slope_fitted function.
2/7
Pete— slope isn’t an Overlay operator …the “Neighbors” drop-down contains all of the neighborhood
operations. Select the Slope command to pop-up its dialog box
and choose the “fitted” mode for
calculating slope. But I included a few more
“Helpful Hints” below that might be useful in preparing your report. -Joe
Use the Slope
command under the “Neighbors”
menu button to create and capture 2D displays of the Slope_fitted, Slope_max, Slope_min and Slope_avg maps by
using the appropriate option button.
If you
intend to “visually compare” maps (as directed in this question) you must use a consistent legend.
Once you have calculated the four slope maps determine the maximum
range of values considering all four calculation techniques and then make
the best 2D display using User Defined for the display
calculation mode like above …map analysis “rule:” rarely can you use default
displays for report figures.
Be sure
your discussion explains the similarities/differences in the four maps of
slope and why it is necessary to use a common legend when visually comparing
map displays. -Joe
___________________________
Hi Joe— I have a question about
one of my assigned study-guide questions (#37).
What I think you're trying to get us to do is to explain how the spread
command calculates simple proximity for point, line, and polygon data. However, I can distinguish the difference
between the point and polygon data (ranches vs. housing). Where is the ranch data so that I can
actually do this command? -Nashwa
2/6
Nashwa-- the question in question is...
37. Using the analogy to tossing an
object(s) into a pond, describe how a simple proximity map is created for the
following MapCalc commands…
SPREAD RanchMap
TO 100 for Ranch_Prox
SPREAD HousingMap
TO 100 for Housing_Prox
SPREAD RoadMap
TO 100 for Road_Prox
…the
ranch is on the Locations map (need
to Renumber to isolate it for the RanchMap). Note that the three different Starter(s) maps
contain different map features—a single Point, a set of Points and a set of
Lines.
I think
your proposed answer has most of the required elements. But keep in mind, the waves from a single
point simply propagate outward; waves from a set of points or lines propagate
outward but interact, such that the distance to the closest Starter location is
retained. The discussions in…
Beyond
Mapping III, Topic 25: Calculating
Effective Distance and Connectivity
www.innovativegis.com/basis/MapAnalysis/Topic25/Topic25.htm
Measuring
Distance Is Neither Here nor There — discusses the basic
concepts of distance and proximity
Use
Cells and Rings to Calculate Simple Proximity — describes
how simple proximity is calculated
…ought
to help in formulating your answer. -Joe
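The "closest Starter wins" idea can be sketched in Python/NumPy as below. Note the assumption: this uses straight-line distance in cell units to every starter cell and keeps the smallest, which is not the cells-and-rings wave propagation that SPREAD uses, so the values can differ slightly. The function name and the 3x3 starter maps are made up.

```python
import numpy as np

def simple_proximity(starters):
    """Distance (in cells) from every grid location to its closest starter cell."""
    starters = np.asarray(starters).astype(bool)
    if not starters.any():
        raise ValueError("need at least one starter cell")
    rows, cols = starters.shape
    start_r, start_c = np.nonzero(starters)
    r = np.arange(rows)[:, None, None]
    c = np.arange(cols)[None, :, None]
    # Distance from every cell to every starter; shape is (rows, cols, n_starters).
    dist = np.sqrt((r - start_r) ** 2 + (c - start_c) ** 2)
    return dist.min(axis=2)   # where "waves" overlap, the closest starter is retained

# One rock in the pond versus two rocks.
one_point = np.zeros((3, 3), dtype=int)
one_point[0, 0] = 1
two_points = one_point.copy()
two_points[2, 2] = 1

prox_one = simple_proximity(one_point)
prox_two = simple_proximity(two_points)
print(prox_one)
print(prox_two)
```

A set of line cells works the same way as a set of points here: every starter cell sends out its own waves and each location keeps the shortest distance.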
___________________________
Hi Joe— I have a question about
exercise 4, question 4. When draping the
Elev_Smoothed_Difference Map over the 3D surface, I
used the Cover function to Cover Elev_Smoothed_Difference
over Elevation. My question is do I want
to ignore zero or not.
By ignoring zero the original
elevation values are visible depicting a 3D map that is not very
distinguishable from the elevation map.
However, maybe that is a good thing since this map is supposed to show areas
that have noticeable changes in elevation as determined by the difference
between actual values and the average value computed by neighboring cells. So overall, there aren't too many areas that
have major differences between actual and average.
On the other hand, if you want to
see where these changes actually occur, it's easier to see in a 3D map that
does not ignore zeros. Am I off base
here? Thanks, Nashwa
2/6
Nashwa-- Draping is a graphical overlay (just makes a cool map), whereas Cover
creates a new map (that one could use in further Map Analysis). The exercise reads:
Question 4. Capture, embed, clearly
label the 2D and 3D (draped
over Elevation) displays of the maps you created then briefly discuss
the procedure you used to create the “Smoothed Difference” and “Coffvar” maps and interpret the meaning of the output map
values.
To drape
a map, first display the 3D map you want to drape over (e.g., Elevation), then select Map from the main menu, Overlay from the drop-down, and choose
the map you want to drape (e.g., Slope
in the example below; Elev_ElevSmoothed_difference in the Exercise Q4).
Once you
have created the enhanced displays, visually interpret what you see …does the
result make sense? -Joe
___________________________
Joe - re ex4: when entering the
spread operation in MapCalc from part 1, question 1, the null value is
PMAP_NULL. However, in the command line
from the exercise 4 document, PMAP_NULL is omitted. Will the map results be impacted due to this
discrepancy? Thanks, Pete
1/30
Pete-- PMAP_Null
is set to "infinity" in MapCalc and is used if the user wants to
exclude an area from processing. For
example, if one only wants to process an irregularly shaped town boundary you
could create a discrete binary Null_mask (assigning 1 to all town cells and PMAP_Null to the outside cells) that identifies all areas
outside of the boundary to the full extent of the rectangular analysis
frame. The PMAP_NULL cells will be
ignored during processing and displayed as a blank in the result…
In this
case we want to process everything within the rectangular project area so either
PMAP_NULL or no value in the ignoring phrase will cause the computer to
consider all locations in its processing—what we want. -Joe
___________________________
1/29
Folks— while grading Exercise # 3, I
see where I led you astray about the differences in Display Type and Display Data Type…
- 2D/3D Toggle
- Use Cells
- Layer Mesh
- Data Type
- Shading Manager (you must set on your own)
…hopefully the above revisions are
helpful. Keep in mind that mapped Data
Type always has two parts to its specification—the nature of both its Numeric
distribution and its
Geographic distribution. -Joe
___________________________
1/28 Folks—below is a simple flowchart
identifying the Hugag Habitat binary suitability
model that was created in PowerPoint
(for the cheap ones among us without Visio,
or who want flowcharts a bit more interesting).
The version on the right “soups” it up a bit with SnagIt screen grabs of the maps generated in the map analysis
processing. For added effect, the map
graphics are grouped then assigned “animation” settings so they appear as you
press the down arrow to advance through the model logic steps. You can download and view the two-slide
animated PowerPoint from…
http://www.innovativegis.com/basis/Courses/GMcourse11/Email_dialog/HugagHabitat_Flowchart.ppt
If any of you are interested in a practicum
on creating “fancy” graphics like these using just SnagIt and PowerPoint, I
would be delighted to hold a short workshop on techniques I have learned before
class 5:00-6:00pm. Drop by class early
if you are interested …particularly useful for those who are getting ready for
thesis or dissertation defense as the procedures are generic to making
effective graphics, regardless of whether you use GIS modeling or not. -Joe