**Supplement for GIS Toolbox column
December, 1998, "Mapping Spatial Dependency"**

*The following further discussion and Excel worksheet are
posted at www.innovativegis.com/basis
under Column Supplements…*

*_______________________________*

Using the F-test to Evaluate Localized Spatial Autocorrelation

**by John Gins, HyperParallel, Inc. (jgins@earthlink.com)**

**Note: The designation of "Berry
IT" for this procedure in the Beyond Mapping column, GIS World, 1/99, was made in
jest. Although the "handle" is in jest, the approach is genuine. As explained in
the article, the procedure uses a "doughnut neighborhood" formed by a center
cell and its adjacent cells (inside) and the surrounding distant cells (outside). This
"roving window" is passed over the grid data, successively evaluating each map
location. John Gins suggests using the F-test to directly compare the variances.**

**An easy way to assess localized correlation is to compare
the variances… formulate the question (hypothesis) this way:**

"Is the variance of the adjacent neighborhood A, Var(A), the same as
the variance of the doughnut neighborhood D, Var(D)?"

**Define the sample size of the adjacent neighborhood A as n(A) and
the sample size of the doughnut neighborhood D as n(D). If we can
assume that the populations of A and D are both normally distributed and, under our
hypothesis, have equal variances, then Var(A)/Var(D) follows the Fisher distribution, F,
with (n(A)-1) and (n(D)-1) degrees of freedom. The result of
evaluating the function F(Var(A)/Var(D), n(A)-1, n(D)-1) is
the upper-tail probability.**
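As a rough illustration of the two variances being compared, here is a minimal Python sketch (it is not part of the original worksheet). It assumes a 5x5 roving window in which A is the center cell plus its four edge-sharing neighbors and D is the remaining ring of cells, which is one layout consistent with the n(A)=5 and n(D)=20 figures quoted in this note. The function name `variance_ratio` and the list-of-lists grid are illustrative choices only.

```python
from statistics import variance  # sample variance (divides by n-1)


def variance_ratio(grid, row, col):
    """Return (Var(A)/Var(D), n(A), n(D)) for the window centered on (row, col).

    Assumption: 5x5 window; A = center cell + 4 edge-sharing neighbors,
    D = every other cell of the window. Cells falling off the map are
    skipped, so n(A) and n(D) shrink near the edges.
    """
    inner, outer = [], []
    n_rows, n_cols = len(grid), len(grid[0])
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            r, c = row + dr, col + dc
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                continue  # window clipped at the map edge
            if abs(dr) + abs(dc) <= 1:
                inner.append(grid[r][c])   # adjacent neighborhood A
            else:
                outer.append(grid[r][c])   # doughnut neighborhood D
    # Note: raises ZeroDivisionError if the doughnut cells are all identical.
    return variance(inner) / variance(outer), len(inner), len(outer)


# Hypothetical 5x5 map with values 0..24, used only to exercise the function.
demo_grid = [[r * 5 + c for c in range(5)] for r in range(5)]
ratio, n_a, n_d = variance_ratio(demo_grid, 2, 2)
print(ratio, n_a, n_d)
```

The ratio and the two sample sizes are exactly the three inputs the F-test needs at each map location.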

**Note that because this is an upper-tail probability, larger ratios give smaller values:
if Var(A) > Var(D) then F falls toward 0%, if Var(A) < Var(D) then F rises toward 100%,
and if Var(A) = Var(D) then F is roughly 50%.**

This value of F can be obtained in Excel using the following function:

FDIST (x,degrees_freedom1,degrees_freedom2)

x=Var(A)/Var(D)

degrees_freedom1 =n(A)-1

degrees_freedom2 =n(D)-1
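For readers working outside Excel, the same upper-tail probability can be sketched in plain Python. This is an illustrative stand-in, not Excel's own implementation: it uses the standard identity that the upper tail of the F distribution equals the regularized incomplete beta function I_x(d2/2, d1/2) with x = d2/(d2 + d1·F), evaluated with a textbook continued fraction. The names `fdist`, `_betacf` and `_betai` are hypothetical.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def fdist(x, df1, df2):
    """Upper-tail probability P(F > x) for an F(df1, df2) variable."""
    if x <= 0.0:
        return 1.0
    return _betai(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * x))


print(fdist(1.0, 4, 19))  # tail probability when the two variances are equal
```

With x = Var(A)/Var(D), df1 = n(A)-1 and df2 = n(D)-1, `fdist` plays the role of the FDIST call above.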

**Typically in the middle of the map n(A)=5 and n(D)=20. If we are
willing to accept a 5% chance of being wrong, then we would reject our hypothesis that
Variance of the adjacent neighborhood is equal to the Variance of the doughnut
neighborhood if the ratio of the variances, Var(A)/Var(D), is less than 0.1166 or
greater than 3.5587.**

**This range can be obtained in Excel using the following function:**

FINV (probability,degrees_freedom1,degrees_freedom2)

probability = 0.975 (for the lower limit) and 0.025 (for the upper limit)

degrees_freedom1 =n(A)-1

degrees_freedom2 =n(D)-1
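The resulting decision rule can be sketched in a few lines of Python. The two limits are copied from the text and therefore apply only to the interior case n(A)=5, n(D)=20 at the 5% level; edge windows with fewer cells would need their own limits. The function name `classify_ratio` and the label strings are illustrative.

```python
# Critical ratios quoted in the text for n(A)=5, n(D)=20 (4 and 19 degrees
# of freedom) at a 5% two-sided chance of being wrong.
LOWER_LIMIT = 0.1166
UPPER_LIMIT = 3.5587


def classify_ratio(ratio, lower=LOWER_LIMIT, upper=UPPER_LIMIT):
    """Label a Var(A)/Var(D) ratio against the two-sided rejection limits."""
    if ratio < lower:
        return "reject: A less variable than D"
    if ratio > upper:
        return "reject: A more variable than D"
    return "do not reject: variances similar"


for r in (0.05, 1.0, 4.2):  # made-up ratios for illustration
    print(r, classify_ratio(r))
```

Applied at every map location, the three labels give a simple map of where the local neighborhood is unusually smooth or unusually rough relative to its surroundings.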

**Using the data from Joe’s example, I have supplied a
"downloadable" Excel spreadsheet (FDIST.XLS; very big file... be patient) with all of the
computational steps delineated. Each worksheet tab contains a portion of the calculation.**

**Another thought… Heteroscedasticity, the lack of consistent variance over
groups, is usually measured via analysis of variance over combinations of data from
independent groups or cells. The usual methods would not apply to the doughnut versus
local neighborhoods because of the way cell data is reused via the roving window. We
can do some analysis of the Fdist values that are derived from the ratios as follows.
If the variances are random then I would expect to see the Fdist values uniformly
distributed from 0 to 1 (5% of the values should be observed in each 0.05 interval).
If the variances are the same over the map then all of the Fdist values of the
ratios would be clustered around 0.5. If there are lots of features then I would
expect to see values less than 0.05 or greater than 0.95.
A good measure of the difference of the Fdist values from uniform is the Kolmogorov-Smirnov
goodness-of-fit test. I would want to use intervals of <=0.01, 0.01-<=0.025,
0.025-<=0.075, ... , 0.475-<=0.525, ... , 0.925-<=0.975, 0.975-<=0.99, >0.99.
This would give us a single measure over an entire map. If this is of interest I
will work up an example.**
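Pending that worked example, the single map-wide measure might be sketched as follows. This is an assumption-laden illustration rather than the author's procedure: it computes the ordinary (unbinned) Kolmogorov-Smirnov distance between the observed Fdist values and the uniform distribution on 0 to 1, instead of using the intervals listed above. Because the roving window reuses cells, the values are not independent, so only the distance itself is computed and no significance level is attached. The function name `ks_uniform_statistic` is hypothetical.

```python
def ks_uniform_statistic(values):
    """Largest gap between the empirical CDF of `values` and Uniform(0, 1)."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (i-1)/n to i/n at x; the uniform CDF is x.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


# Made-up Fdist values: evenly spread versus all clustered at 0.5.
print(ks_uniform_statistic([0.125, 0.375, 0.625, 0.875]))
print(ks_uniform_statistic([0.5, 0.5, 0.5, 0.5]))
```

A distance near 0 means the Fdist values look uniform, while larger distances indicate clustering, whether around 0.5 or in the tails.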