Bridging GIS and Map Analysis:

Identifying and Utilizing Spatial relationships

 

Joseph K. Berry

Keck Scholar in Geosciences, University of Denver, Denver

Principal, Berry & Associates // Spatial Information Systems (BASIS)

2000 South College Avenue, Suite 300, Fort Collins, Colorado 80525

jberry@innovativegis.com

 

 

ABSTRACT

 

Most human endeavors are inherently spatial.  The world we live in surrounds us with opportunities and challenges that are spatially dependent on “Where is What” tempered by “Why and So What” within cognitive contexts.  In just three decades GIS technology has revolutionized our perspective on what constitutes a map and the information it can project.  The underlying data are complex, as two descriptors are required— precise location, as well as a clear description.  Manually drafted maps emphasized accurate location of physical features.  Today, maps have evolved from guides of physical space into management tools for exploring spatial relationships and perceptions.  The journey from the map room to the conference room has transformed maps from static wall hangings into interactive mapped data that address complex spatial issues.  It also has sparked an entirely new analytical tool set that provides needed insight for effective decision-making.  This new perspective marks a turning point in the use of maps— from one emphasizing physical descriptions of geographic space, to one of interpreting mapped data and successfully communicating spatially based decision factors.  This paper investigates the context, conditions and forces forming the bridge from maps to mapped data, spatial analysis and beyond.

 

INTRODUCTION

 

Map analysis tools might at first seem uncomfortable, but they are simply extensions of traditional analysis procedures brought on by the digital nature of modern maps.  Since maps are “number first, pictures later,” a map-ematical framework can be can be used to organize the analytical operations.  Like basic math, this approach uses sequential processing of mathematical operations to perform a wide variety of complex map analyses.  By controlling the order which the operations are executed, and using a common database to store the intermediate results, a mathematical-like processing structure is developed.

 

This “map algebra” was first suggested in the late 1970s in a doctoral dissertation by Dana Tomlin while at Yale University, and then compiled into a landmark book (Tomlin, 1990).  The approach has been expanded and modified over the years (Haklay, 2004; Berry, 2004; Berry, 2003; Berry, 1999).  It is similar to traditional algebra where basic operations, such as addition, subtraction and exponentiation, are logically sequenced for specific variables to form equations—however, in map algebra the variables represent entire maps consisting of thousands of individual grid values.  Most of traditional mathematical capabilities, plus extensive set of advanced map processing operations, comprise the map analysis toolbox. 

 

In grid-based map analysis, the spatial coincidence and juxtapositioning of values among and within maps create new analytical operations, such as coincidence, proximity, visual exposure and optimal routes.  These operators are accessed through general purpose map analysis software available in many GIS systems, such as MapCalc, GRASS, ERDAS or the Spatial Analyst extension to ArcGIS.  While the specific command syntax and mechanics differs among software brands, the basic analytical capabilities and spatial reasoning skills used in map analysis form a common foundation. 

 

FUNDAMENTAL CONDITIONS FOR MAP ANALYSIS

 

There are two fundamental conditions required by any map analysis package—a consistent data structure and an iterative processing environment.  Topic 18 in the online book Map Analysis (Berry, 2004) describes the characteristics of a grid-based data structure by introducing the concepts of an analysis frame, map stack and data types.  The traditional discrete set of map features (points, lines and polygons) are extended to map surfaces that characterize geographic space as a continuum of uniformly-spaced grid cells.  This structure forms a framework for the map-ematics underlying map analysis and modeling.

 

The second condition of map analysis provides an iterative processing environment by logically sequencing map analysis operations and serves as the focus of this paper.  This involves:

¾      retrieval of one or more map layers from the database,

¾      processing that data as specified by the user,

¾      creation of a new map containing the processing results, and

¾      storage of the new map for subsequent processing.  

 

Each new map derived as processing continues aligns with the analysis frame so it is automatically geo-registered to the other maps in the database.  The values comprising the derived maps are a function of the processing specified for the “input maps.”

 

Figure 1.  An iterative processing environment, analogous to basic math, is used to derive new map variables.

 

This cyclical processing provides an extremely flexible structure similar to “evaluating nested parentheticals” in traditional math.  Within this structure, one first defines the values for each variable and then solves the equation by performing the mathematical operations on those numbers in the order prescribed by the equation.  For example, the equation for calculating percent change in your investment portfolio—

 

     %Change =  A = ( B - C ) / C ) * 100

                                = ( 100,000 – 90,000 ) / 90,000 ) * 100 …define variables

                                = ( 10,000 ) / 90,000 ) *100                   …intermediate solution #1

                                = ( .111 ) * 100                                                       …intermediate solution #2

                                = 11.1 %                                                 …final solution

 

—identifies that the variables B and C are first defined, then subtracted and the difference stored as an intermediate solution.  The intermediate solution is divided by variable C to generate another intermediate solution that, in turn is multiplied by 100 to calculate the solution variable A. 

 

This same basic mathematical structure provides the framework for computer-assisted map analysis.  The only difference is that the variables are represented by mapped data composed of thousands of values organized into a grid.  Figure 1 shows a similar solution for calculating percent change in animal activity based on mapped data. 

 

The processing steps shown in the figure are identical to the traditional solution except the calculations are performed for each grid cell in the study area and the result is a map that identifies the percent change at each map location.  Map analysis identifies what kind of change (termed the thematic attribute) occurred where (termed the spatial attribute).  The characterization of what and where provides information needed for further GIS modeling, such as determining if areas of large increases in animal activity are correlated with particular cover types or near areas of low human activity.

 

FUNDAMENTAL MAP ANALYSIS OPERATIONS

 

Within this iterative processing structure, four fundamental classes of map analysis operations can be identified.  These include:

¾      Reclassifying Maps that involve the reassignment of the values of an existing map as a function of its initial value, position, size, shape or contiguity of the spatial configuration associated with each map category.

¾      Overlaying Maps that result in the creation of a new map where the value assigned to each location is computed as a function of the independent values associated with that location on two or more maps.

¾      Measuring Distance and Connectivity that involve the creation of a new map expressing the distance and route between locations as straight-line length (simple proximity) or as a function of absolute or relative barriers (effective proximity).

¾      Characterizing and Summarizing Neighborhoods that result in the creation of a new map based on the consideration of values within the general vicinity of target locations.

 

Reclassification operations merely repackage existing information on a single map without creating new spatial patterns.  Overlay operations, on the other hand, involve two or more maps and result in the creation of new spatial patterns.  Distance and connectivity operations are more advanced techniques that generate entirely new information by characterizing the relative positioning of map features.  Neighborhood operations summarize the conditions occurring in the general vicinity of a location.  The online book, Map Analysis (Berry, 2004) contains several links to more detailed discussions of the types of map analysis operations. 

 

The reclassifying and overlaying operations based on point processing are the backbone of current GIS applications, allowing rapid updating and examination of mapped data.  However, other than the significant advantage of speed and ability to handle tremendous volumes of data, these capabilities are similar to those of manual map processing.  Map-wide overlays, distance and neighborhood operations, on the other hand, identify more advanced analytic capabilities and most often do not have paper-map legacy procedures. 

 

The mathematical structure and classification scheme of Reclassify, Overlay, Distance and Neighbors form a conceptual framework that is easily adapted to modeling spatial relationships in both physical and abstract systems.  A major advantage is flexibility.  For example, a model for siting a new highway could be developed as a series of processing steps.  The analysis likely would consider economic and social concerns (e.g., proximity to high housing density, visual exposure to houses), as well as purely engineering ones (e.g., steep slopes, water bodies).  The combined expression of both physical and non-physical concerns within a quantified spatial context is a major benefit. 

 

The ability to simulate various scenarios (e.g., steepness is twice as important as visual exposure and proximity to housing is four times more important than all other considerations) provides an opportunity to fully integrate spatial information into the decision-making process.  By noting how often and where the proposed route changes as successive runs are made under varying assumptions, information on the unique sensitivity to siting a highway in a particular locale is described.

 

In addition to flexibility, there are several other advantages in developing a generalized analytical structure for map analysis.  The systematic rigor of a mathematical approach forces both theorist and user to carefully consider the nature of the data being processed.  Also it provides a comprehensive format for learning that is independent of specific disciplines or applications.  Furthermore the flowchart of processing succinctly describes the components and weightings capsulated in an analysis.

 

This communication enables decision-makers to more fully understand the analytic process and actually interact with weightings, incomplete considerations and/or erroneous assumptions.  These comments, in most cases, can be easily incorporated and new results generated in a timely manner.  From a decision-maker’s point of view, traditional manual techniques for analyzing maps are a distinct and separate task from the decision itself.  They require considerable time to perform and many of the considerations are subjective in their evaluation. 

 

In the old environment, decision-makers attempt to interpret results, bounded by vague assumptions and system expressions of the technician.  Computer-assisted map analysis, on the other hand, engages decision-makers in the analytic process.  In a sense, it both documents the thought process and encourages interaction—sort of like a “spatial spreadsheet.”

 

MAPPING VERSUS ANALYSIS

 

Vector-based desktop mapping is rapidly becoming part of the modern business environment.  The close link between these systems and traditional spreadsheet and database management programs has fueled the adoption.  In many ways, “a database is just picture waiting to happen.”  The direct link between attributes described in a database record and their spatial characterization is conceptually easy.  Geo-query by clicking on a map to pop-up the attribute record or searching a database then plotting the selected records is an extremely useful extension of contemporary procedures.  Increasing data availability and Internet access coupled with decreasing desktop mapping system costs and complexity make adoption of spatial database technology a practical reality. 

 

Maps in their traditional form of point, lines and polygons identifying discrete spatial objects align with manual mapping concepts and experiences learned as early as girl and boy scouts.  Grid-based maps, on the other hand, represent a different paradigm of geographic space.  Whereas traditional vector maps emphasize “precise placement of physical features,” grid maps seek to “statistically characterize continuous space in both real and cognitive terms.”  The tools for mapping of database attributes are extended to analysis of spatial relationships.  This paper focuses the basic concepts, considerations and procedures in map analysis operations as they apply to many disciplines.  Three broad capabilities are discussed—1) surface modeling, 2) spatial data mining and 3) spatial analysis.

 

SURFACE MODELING

 

Surface modeling involves the translation of discrete point data into a continuous surface that represents the geographic distribution of that data.  Traditional non-spatial statistics involves an analogous process when a numerical distribution (e.g., standard normal curve) is used to generalize the central tendency of a data set.  The derived mean (average) and standard deviation reflects the typical response and provides a measure of how typical it is.  This characterization seeks to explain data variation in terms of the numerical distribution of measurements without any reference to the data’s spatial distribution. 

 

In fact, an underlying assumption in most traditional statistical analyses is that the data is randomly distributed in space.  If the data exhibits spatial autocorrelation, many of the analysis techniques are less valid.  Spatial statistics, on the other hand, utilizes geographic patterns in the data to further explain the variance.  There are numerous techniques for characterizing the spatial distribution inherent in a data set but they can be characterized by three basic approaches:

¾      Point Density mapping that aggregates the number of points within a specified distance (number per acre),

¾      Spatial Interpolation that weight-averages measurements within a localized area (e.g., kriging), and 

¾      Map Generalization that fits a functional form to the entire data set (e.g., polynomial surface fitting). 

 

For example, consider figure 2 showing a point density map derived from customer addresses.  The project area is divided into an analysis frame of 250-foot grid cells (100c x 100r = 10,000 cells).  The number of customers for each grid space is determined by street addresses in a desktop mapping system (“spikes” in the 3D map on the left).

 

Figure 2.  Point density map aggregating the number of customers within a specified distance.

 

A neighborhood summary operation is used to pass a “roving window” over the project area calculating the total customers within a half-mile of each map location.  The result is a continuous map surface indicating the relative density of customers—peaks where there is a lot of nearby customers and valleys where there aren’t many.  In essence, the map surface quantifies what your eye sees in the spiked map—some areas with lots of customers and others with very few.

 

SPATIAL DATA MINING

 

Spatial data mining seeks to uncover relationships within and among mapped data.  Some of the techniques include coincidence summary, proximal alignment, statistical tests, percent difference, surface configuration, level-slicing, map similarity, and clustering that are used in comparing maps and assessing similarities in data patterns. 

 

Another group of spatial data mining techniques focuses on developing predictive models.  For example, an early use of predictive modeling was in extending a test market project for a phone company (figure 3).  Customers’ address were used to “geo-code” map coordinates for sales of a new product enabling distinctly different rings to be assigned to a single phone—one for the kids and one for the parents.  Like pushpins on a map, the pattern of sales throughout test market emerged with some areas doing very well, while in other areas sales were few and far between. 

 

The demographic data for the city was analyzed to calculate a prediction equation between product sales (dependent variable) and census block data (independent variables).  The regression equation that was developed is similar to one derived using non-spatial statistics using a discrete set of samples.  However in the spatial statistics solution entire map surfaces are considered that account for the spatial autocorrelation within each map variable.  In addition, the solution is based on thousands of spatially dependent cases instead of just a few spatially independent samples.

 

 

 

Figure 3.  Spatial data mining can be used to derive predictive models of the relationships among mapped data.

 

The prediction equation derived from the test market sales was then applied to another city by evaluating existing demographics to “solve the equation” for a predicted sales map.  In turn, the predicted map was combined with a wire-exchange map to identify switching facilities that would require upgrading before release of the product in the new city. 

 

The variables used in the analysis consider geographic space as a continuum of demographic characteristics and the resultant map as a continuum characterizing the spatial propensity to purchase a product.  Similar analyses can relate crop yield to soil nutrient levels throughout a field or animal activity to habitat conditions.  The spatial data mining process is independent of application characteristics and is valid for mapped data that form continuous distributions in both numerical and geographical space.  New statistical techniques, such as CART technology, that can utilize nominal and ordinal data promise to revolutionize geoscience, as much as they are revolutionizing traditional statistics.

 

SPATIAL ANALYSIS

 

Whereas spatial data mining responds to “numerical” relationships in mapped data, spatial analysis investigates the “contextual” relationships.  Tools such as slope/aspect, buffers, effective proximity, optimal path, visual exposure and shape analysis, fall into this class of spatial operators.  Rather than statistical analysis of mapped data, these techniques examine geographic patterns, vicinity characteristics and connectivity among features.  

 

The example shown in figures 4 and 5 builds on two specific map analysis capabilities—effective proximity and accumulation surface analysis.  The following discussion focuses on the application of these tools to competition analysis between two stores. 

 

The top-left side of figure 4 shows the travel-time surface from Kent’s Emporium.  It is calculated by starting at the store then moving out along the road network like waves propagating through a canal system.  As the wave front moves, it adds the time to cross each successive road segment to the accumulated time up to that point

 

The result is the estimated travel-time from Kent’s to every location in the city.  The surface starts at 0.0 and extends to 24.4 minutes away (red peaks).  Note that it is shaped like a bowl with the bottom at the store’s location (south central portion of the city).  In the its accompanying 2D display, travel-time appears as a series of rings that form increasing distance zones.  The critical points to conceptualize are 1) that the surface is analogous to a football stadium (continually increasing) and 2) that every road location is assigned a distance value (minutes away) from the store. 

 

The inset below Kent’s travel-time surface shows the travel-time surface for another store (Colossal Mart) with its origin in the northeast portion of the city.  While Kent’s travel-time surface appears to “grow” away from you, Colossal’s surface seems to grow toward you.

 

Figure 4.  Two travel-time surfaces can be combined to identify the relative advantage of each store.

 

Simply subtracting the two surfaces derives the relative travel-time advantage for the stores (right-side figure 4).  Keep in mind that the surfaces actually contain geo-registered values and a new value (difference) is computed for each map location.  The inset on the left side of the figure shows a computed Colossal Mart advantage of 6.1 minutes (22.5 – 16.4= 6.1) for the indicated location in the extreme northeast corner of the city.

 

Locations that are the same travel distance from both stores result in zero difference and are displayed as black.  The green tones on the difference map identify positive values where Kent’s travel-time is larger than its competitor’s—advantage to Colossal Mart.  Negative values (red tones) indicate the opposite—advantage to Kent’s Emporium.  The yellow tone indicates the “combat zone” where potential customers are about the same distance from either store—advantage to no one. 

 

Figure 5 displays the same information as a 3D surface.  The combat zone is shown as a yellow valley dividing the city into two marketing regions—peaks of strong travel-time advantage.  Targeted marketing efforts, such as leaflets, advertising inserts and telemarketing might best be focused on the combat zone. 

 

At a minimum the travel-time advantage map enables retailers to visualize the lay of the competitive landscape.  However the information is in quantitative form and can be readily integrated with other customer data.   Knowing the relative travel-time advantage (or disadvantage) of every street address in a city can be a valuable piece of the marketing puzzle.  Like age, gender, education, and income, relative travel-time advantage is part of the soup that determines where one shops. 

 

There are numerous other map analysis operations in the grid-based “toolbox.”  The examples of customer density surface, sales prediction map and travel-time/competition analysis were used to illustrate a few of geo-business applications capitalizing on the new tools.  Keep in mind that the tools are generic and can be applied to a wide variety of spatial problems within most disciplines.  Like traditional mathematics, the tools are not application specific.  They arise from the digital nature of modern maps to form a generalized map-ematics that catapults GIS technology beyond mapping and geo-query.

 

 

Figure 5.  A transformed display of the difference map shows travel-time advantage as peaks (red) and locations with minimal advantage as an intervening valley (yellow).

 

Early information systems relied on physical storage of data and manual processing.  With the advent of the computer, most of these data and procedures have been automated during the past three decades.  Commensurate with the digital map, geoscience and resource information processing increasingly has become more quantitative. Systems analysis techniques developed links between descriptive data to the mix of management actions that maximize a set of objectives.  This mathematical approach to geoscience investigation has been both stimulated and facilitated by modern information systems technology.  The digital nature of mapped data in these systems provides a wealth of new analysis operations and an unprecedented ability to model complex spatial issues.  The full impact of the new data form and analytical capabilities is yet to be realized.

 

CONCLUSION

 

It is certain, however, that tomorrow's GIS will build on the cognitive basis, as well as the spatial databases and analytical operations of the technology.  This contemporary view pushes GIS beyond simply mapping and spatial database management to map analysis and spatial reasoning that focuses on the solution and communication of complex geographic phenomena and decision contexts.  In a sense, the bridge between GIS and map analysis is formed by the map-ematical structure and fundamental processing operations contained in grid-based analytical packages.  The bridge takes us to a better understanding of spatial relationships and their application in solving environmental, economic and social concerns facing an increasingly complex world.

 

REFERENCES

 

Berry, J. K. (2004). Map Analysis: Procedures and Applications in GIS Modeling, online book of compiled Beyond Mapping columns in GeoWorld from 1996 to the present is posted at http://www.innovativegis.com/. 

 

Berry, J. K. (2003). GIS in transition: moving maps to mapped data, spatial analysis and beyond. Keynote address for NW ESRI Users’ Group Conference, Stevenson, Washington, September 16-18, 2003. PowerPoint slide set is posted online at http://www.innovativegis.com/basis/present/NW_ESRI_03/Keynote.ppt.

Berry, J. K. (2000). Geographic information systems (GIS) technology: a brief history, trends and probable future, book chapter in Geographic Information Systems in Petroleum Exploration and Development, Coburn, T.C. and J.M. Yarus, editors, American Association of Petroleum Geologists, Tulsa, Oklahoma.

Berry, J. K. (1999). GIS technology in environmental management: a brief history, trends and probable future, book chapter in Global Environmental Policy and Administration, Soden and Steel, editors, Marcel Dekker Publishers, New York.

Haklay, M. (2004), Map Calculus in GIS: a proposal and demonstration. International Journal of Geographical Information Science, 18 (2):107-125.

Tomlin, D. (1990). Geographic Information Systems and Cartographic Modelling, Prentice Hall, New Jersey.