::p_load(sf,tidyverse,spdep, tmap) pacman
Hands On Exercise 2.1 - Global and Local Measures of Spatial Autocorrelation
Overview
Hands On Exercise 3 - Global and Local Measures of Spatial Autocorrelation
In this hands-on exercise, we explore how to compute Global and Local Measure of Spatial Autocorrelation (GLSA) by using spdep package.
Getting Started
The code chunk below install & load sf, spdep, tmap & tidyverse packages into the R env
Importing Hunan Geospatial sf
= st_read(dsn="data/geospatial", layer="Hunan") hunan_sf
Reading layer `Hunan' from data source
`D:\Allanckw\ISSS624\Hands-on_Ex3\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 88 features and 7 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 108.7831 ymin: 24.6342 xmax: 114.2544 ymax: 30.12812
Geodetic CRS: WGS 84
Loading Hunan 2012 Aspatial File in CSV
= read_csv("data/aspatial/hunan_2012.csv") hunan_GDP
Rows: 88 Columns: 29
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): County, City
dbl (27): avg_wage, deposite, FAI, Gov_Rev, Gov_Exp, GDP, GDPPC, GIO, Loan, ...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Joining attribute data to the simple feature files
Next, left_join() of dplyr is used to join the geographical data and attribute table
= left_join(hunan_sf, hunan_GDP) hunan
Joining, by = "County"
Visualizing Regional Development Indicator
We will visualize a choropleth map that displays the distribution of GDPPC 2012 using the tmap package
= tm_shape(hunan) +
equal tm_fill("GDPPC", n = 5, style="equal") +
tm_borders(alpha=0.5) +
tm_layout(main.title = "Equal interval categorization")
= tm_shape(hunan) +
quantile tm_fill("GDPPC", n = 5, style="quantile") +
tm_borders(alpha=0.5) +
tm_layout(main.title = "Equal quantile interval categorization")
tmap_arrange(equal, quantile)
Computing Spatial Autocorrelation
We learn how to compute GLOBAL spatial autocorrelation statistics and to perform spatial complete randomness test for global spatial correlation
Computing Spatial Weights
We need to find the spatial weights first before we can compute global spatial correlation statistics. The spatial weights is used to define the neighbourhood relationships between the geographical units
We use poly2nb() of spdep package to compute the contiguity weight matrix. The function builds a neighbour list based on regions with contiguous boundaries. Using queen’s contiguity weight matrix, we have
= poly2nb(hunan)
wm_q summary(wm_q)
Neighbour list object:
Number of regions: 88
Number of nonzero links: 448
Percentage nonzero weights: 5.785124
Average number of links: 5.090909
Link number distribution:
1 2 3 4 5 6 7 8 9 11
2 2 12 16 24 14 11 4 2 1
2 least connected regions:
30 65 with 1 link
1 most connected region:
85 with 11 links
From the results, there are 88 regions in Hunan,
Using the Queen’s method, 85 of them has 11 neighbours, while only 2 of them has 1 neighbour
Building the Row-standardised weights matrix
After computing the spatial weights, we will need to build the row standardized weights matrix. “W” Style will be used such that each neighbouring polygon will be assigned equal weight. This is done by taking the 1/(no. of neighbours) to each neighbouring county and then summing up the weighted income values.
Although this is the most logical way to summarize the neighbours’ values, there is a disadvantage in that polygons at the study area’s boundaries will base their lagged values on fewer polygons, which could lead to an over- or underestimation of the true degree of spatial autocorrelation in the data.
= nb2listw(wm_q, style="W", zero.policy = TRUE)
rs_wm_q rs_wm_q
Characteristics of weights list object:
Neighbour list object:
Number of regions: 88
Number of nonzero links: 448
Percentage nonzero weights: 5.785124
Average number of links: 5.090909
Weights style: W
Weights constants summary:
n nn S0 S1 S2
W 88 7744 88 37.86334 365.9147
The Null Hypothesis
The null hypothesis is to assume that GDPPC is randomly distributed between the different counties.
Computing Spatial Autocorrelation: Moran’s I
We will perform Moran’s I statistical test with moran.test()
of the spdep package.
moran.test(hunan$GDPPC, listw = rs_wm_q, zero.policy = TRUE, na.action = na.omit)
Moran I test under randomisation
data: hunan$GDPPC
weights: rs_wm_q
Moran I statistic standard deviate = 4.7351, p-value = 1.095e-06
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.300749970 -0.011494253 0.004348351
Based on the result, we will reject the null hypothesis as the p-value is less than 0.05. In fact as the p-value is less than 0.01, we can consider that as highly significant.
Therefore, we can conclude that the GDPPC is not randomly distributed based on Moran’s I statistics
Computing Spatial Autocorrelation: Moran’s I with Monte Carlo simulation
In order to further confirm that the null hypothesis is false, we could use Monte Carlo simulation to predict potential outcomes of the event by using moran.mc()
function of the spdep package. We will use 1000 simulations for this test.
set.seed(908)
= moran.mc(hunan$GDPPC, listw = rs_wm_q, nsim=999, zero.policy = TRUE, na.action = na.omit)
bperm
bperm
Monte-Carlo simulation of Moran I
data: hunan$GDPPC
weights: rs_wm_q
number of simulations + 1: 1000
statistic = 0.30075, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater
Based on the result, we will reject the null hypothesis as the p-value is less than 0.05. In fact as the p-value is less than 0.01, we can consider that as highly significant even when the statistics is repeated 1000 times.
Therefore, we can conclude that the GDPPC is not randomly distributed based on Moran’s I statistics with Monte Carlo simulation
Visualizing Monte Carlo Moran’s I
It is always a good practice to examine the simulated Moran’s I test statistics in detail. This can be done by plotting the statistical values as a histogram by the code below:
mean(bperm$res[1:999]) #compute mean
[1] -0.01381501
var(bperm$res[1:999]) #compute variance
[1] 0.004192274
sd(bperm$res[1:999]) #compute std dev.
[1] 0.06474778
summary(bperm$res[1:999])
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.16447 -0.06010 -0.01643 -0.01382 0.02926 0.23767
Building the histogram
hist(bperm$res, freq=TRUE, breaks = 20, xlab="Simulated Moran's I")
abline(v=0, col="blue")
Using ggplot, we can reproduce the same graph, however we need to convert the result into a data frame first
= data.frame(bperm$res) #convert to data frame
df
ggplot(df, aes(bperm$res)) + #aes = column name
geom_histogram(bins=20,
color="White",
fill="lightblue") +
labs(x = "Simulated Moran's I",
y = "Frequency") +
geom_vline(aes(xintercept=0),
color="red", linetype="dashed", size=1)
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
Analysis:
The reason why abline is set to 0 is because it must fall between [-1, 1].
Negative correlation is -1, No correlation is 0, Positive correlation is 1
There is a positive correlation based on the result of the histogram for Moran’s I Statistics
Visualising Geary’s C test
Computing Spatial Autocorrelation: Geary’s C
We will perform Geary’s C statistical test with geary.test()
of the spdep package.
geary.test(hunan$GDPPC, list=rs_wm_q)
Geary C test under randomisation
data: hunan$GDPPC
weights: rs_wm_q
Geary C statistic standard deviate = 3.6108, p-value = 0.0001526
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
0.6907223 1.0000000 0.0073364
Based on the result, we will reject the null hypothesis as the p-value is less than 0.05. In fact as the p-value is less than 0.01, we can consider that as highly significant.
Therefore, we can conclude that the GDPPC is not randomly distributed based on Geary’s C statistics
Computing Spatial Autocorrelation: Geary’s C with Monte Carlo simulation
In order to further confirm that the null hypothesis is false, we could use Monte Carlo simulation to predict potential outcomes of the event by using geary.mc()
function of the spdep package. We will use 1000 simulations for this test
set.seed(908)
= geary.mc(hunan$GDPPC, listw = rs_wm_q, nsim=999)
bpermG
bpermG
Monte-Carlo simulation of Geary C
data: hunan$GDPPC
weights: rs_wm_q
number of simulations + 1: 1000
statistic = 0.69072, observed rank = 1, p-value = 0.001
alternative hypothesis: greater
Based on the result, we will reject the null hypothesis as the p-value is less than 0.05. In fact as the p-value is less than 0.01, we can consider that as highly significant even when the statistics is repeated 1000 times.
Therefore, we can conclude that the GDPPC is not randomly distributed based on Geary’s C statistics with Monte Carlo simulation
Visualising Monte Carlo Geary’s C
mean(bpermG$res[1:999]) #compute mean
[1] 1.002273
var(bpermG$res[1:999]) #compute variance
[1] 0.007094831
sd(bpermG$res[1:999]) #compute std dev.
[1] 0.08423082
summary(bpermG$res[1:999])
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.7135 0.9477 1.0010 1.0023 1.0577 1.2441
Building the histogram
hist(bpermG$res, freq=TRUE, breaks=20, xlab = "Simulated Geary c")
abline (v=1, col="blue")
Using ggplot, we can reproduce the same graph, however we need to convert the result into a data frame first
= data.frame(bpermG$res) #convert to data frame
df_G
ggplot(df_G, aes(bpermG$res)) + #aes = column name
geom_histogram(bins=20,
color="White",
fill="lightblue") +
labs(x = "Simulated Geary's C",
y = "Frequency") +
geom_vline(aes(xintercept=1),
color="red", linetype="dashed", size=1)
Analysis:
The reason why abline is set to 1 is because it must fall between [0, 2].
Negative correlation is 2, No correlation is 1, Positive correlation is 0, notice that it is essentially the opposite from Moran’s I
In Moran I the smaller the number, indicates negative correlation (small -> -ve), in contrast in Geary’s C the smaller the number indicates positive correlation (small -> +ve)
There is a positive correlation based on the result of the histogram for Geary’s C statistics
Spatial Correlogram
Examining spatial autocorrelation patterns in the data or model residuals is made simple with spatial correlograms.
They are graphs of some measure of autocorrelation (Moran’s I or Geary’s C) against distance and they demonstrate how correlated pairs of spatial observations are as one increase the distance (lag) between them.
Computing Moran’s I correlogram
We use sp.correlogram()
of spdep package to compute a 6-lag spatial correlogram of GDPPC. The global spatial autocorrelation used in Moran’s I
They are graphs of some measure of autocorrelation (Moran’s I or Geary’s c) against distance and they demonstrate how correlated pairs of spatial observations are as one increase the distance (lag) between them.
Although correlograms are not as fundamental as variograms, which is a fundamental idea in geostatistics, they are nevertheless a very valuable tool for exploratory and descriptive work. They offer deeper insights than variograms do for this purpose.
Computing Moran’s I correlogram
We use sp.correlogram()
of spdep package to compute a 6-lag spatial correlogram of GDPPC. The global spatial autocorrelation used in Moran’s I
Plot()
is used to draw the output
= sp.correlogram(wm_q, hunan$GDPPC, order = 6, method = "I", style = "W")
MI_Corr
plot(MI_Corr)
Plotting the output might not allow us to provide complete interpretation, this is because not all autocorrelation values are statistically significant. Hence we should analyze the report by printing out the result
print(MI_Corr)
Spatial correlogram for hunan$GDPPC
method: Moran's I
estimate expectation variance standard deviate Pr(I) two sided
1 (88) 0.3007500 -0.0114943 0.0043484 4.7351 2.189e-06 ***
2 (88) 0.2060084 -0.0114943 0.0020962 4.7505 2.029e-06 ***
3 (88) 0.0668273 -0.0114943 0.0014602 2.0496 0.040400 *
4 (88) 0.0299470 -0.0114943 0.0011717 1.2107 0.226015
5 (88) -0.1530471 -0.0114943 0.0012440 -4.0134 5.984e-05 ***
6 (88) -0.1187070 -0.0114943 0.0016791 -2.6164 0.008886 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p value is < 0.05 and hence is statically significant except for the 4th neighbour with p value at 0.226.
We can tell that GDPPC is positively correlated for counties up to a distance of 3 neighbours, and negatively correlated from the 5th neighbour onwards.
As the 4th degree neighbour is not statistically significant, we will not reject the null hypothesis of it being random.
Computing Geary’s C correlogram
We use sp.correlogram()
of spdep package to compute a 6-lag spatial correlogram of GDPPC. The global spatial autocorrelation used in Geary’s C
= sp.correlogram(wm_q, hunan$GDPPC, order = 6, method = "C", style = "W")
GC_Corr
plot(GC_Corr)
Plotting the output might not allow us to provide complete interpretation, this is because not all autocorrelation values are statistically significant. Hence we should analyze the report by printing out the result
print(GC_Corr)
Spatial correlogram for hunan$GDPPC
method: Geary's C
estimate expectation variance standard deviate Pr(I) two sided
1 (88) 0.6907223 1.0000000 0.0073364 -3.6108 0.0003052 ***
2 (88) 0.7630197 1.0000000 0.0049126 -3.3811 0.0007220 ***
3 (88) 0.9397299 1.0000000 0.0049005 -0.8610 0.3892612
4 (88) 1.0098462 1.0000000 0.0039631 0.1564 0.8757128
5 (88) 1.2008204 1.0000000 0.0035568 3.3673 0.0007592 ***
6 (88) 1.0773386 1.0000000 0.0058042 1.0151 0.3100407
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In this case, it is only statistically significant for the 1st, 2nd and 5th degree neighbour for GDPPC to be correlated by distance. The rest of the neighbours are not and appears to be random for the Geary’s C method.
Cluster and Outlier Analysis
Statistics called Local Indicators of Spatial Association, or LISA, assess whether clusters exist in the spatial arrangement of a given variable.
Local clusters in the rates, for example, indicate that some census tracts in a given city have greater or lower rates than would be predicted by chance alone; that is, the values observed are higher or lower than those of a random distribution in space.
We will use relevant Local Indicators for Spatial Association (LISA), particularly local Moran, in this section to identify clusters and/or outliers in the GDP per capita 2012 figures for Hunan Province.
Computing local Moran’s I
The localmoran()
function of spdep will be used to calculate local Moran’s I. Given a collection of l_i values, z_i values and a listw object with neighbour weighting details for the polygon associated with the z_i values.
= order(hunan$County)
fips = localmoran(hunan$GDPPC, rs_wm_q)
localMI head(localMI)
Ii E.Ii Var.Ii Z.Ii Pr(z != E(Ii))
1 -0.001468468 -2.815006e-05 4.723841e-04 -0.06626904 0.9471636
2 0.025878173 -6.061953e-04 1.016664e-02 0.26266425 0.7928094
3 -0.011987646 -5.366648e-03 1.133362e-01 -0.01966705 0.9843090
4 0.001022468 -2.404783e-07 5.105969e-06 0.45259801 0.6508382
5 0.014814881 -6.829362e-05 1.449949e-03 0.39085814 0.6959021
6 -0.038793829 -3.860263e-04 6.475559e-03 -0.47728835 0.6331568
localmoran()
function returns a matrix of values whose columns are:
Ii: the local Moran’s I statistics
E.Ii: the expectation (mean) of local Moran statistic under the randomization hypothesis
Var.Ii: the variance of local Moran statistic under the randomization hypothesis
Z.Ii: the standard deviation of local Moran statistic
Pr: the p-value of local Moran statistic where it investigate
We can print the local Moran’s matrix by printCoefmat
printCoefmat(data.frame(localMI[fips,],
row.names=hunan$County[fips]),
check.names=FALSE)
Ii E.Ii Var.Ii Z.Ii Pr.z....E.Ii..
Anhua -2.2493e-02 -5.0048e-03 5.8235e-02 -7.2467e-02 0.9422
Anren -3.9932e-01 -7.0111e-03 7.0348e-02 -1.4791e+00 0.1391
Anxiang -1.4685e-03 -2.8150e-05 4.7238e-04 -6.6269e-02 0.9472
Baojing 3.4737e-01 -5.0089e-03 8.3636e-02 1.2185e+00 0.2230
Chaling 2.0559e-02 -9.6812e-04 2.7711e-02 1.2932e-01 0.8971
Changning -2.9868e-05 -9.0010e-09 1.5105e-07 -7.6828e-02 0.9388
Changsha 4.9022e+00 -2.1348e-01 2.3194e+00 3.3590e+00 0.0008
Chengbu 7.3725e-01 -1.0534e-02 2.2132e-01 1.5895e+00 0.1119
Chenxi 1.4544e-01 -2.8156e-03 4.7116e-02 6.8299e-01 0.4946
Cili 7.3176e-02 -1.6747e-03 4.7902e-02 3.4200e-01 0.7324
Dao 2.1420e-01 -2.0824e-03 4.4123e-02 1.0297e+00 0.3032
Dongan 1.5210e-01 -6.3485e-04 1.3471e-02 1.3159e+00 0.1882
Dongkou 5.2918e-01 -6.4461e-03 1.0748e-01 1.6338e+00 0.1023
Fenghuang 1.8013e-01 -6.2832e-03 1.3257e-01 5.1198e-01 0.6087
Guidong -5.9160e-01 -1.3086e-02 3.7003e-01 -9.5104e-01 0.3416
Guiyang 1.8240e-01 -3.6908e-03 3.2610e-02 1.0305e+00 0.3028
Guzhang 2.8466e-01 -8.5054e-03 1.4152e-01 7.7931e-01 0.4358
Hanshou 2.5878e-02 -6.0620e-04 1.0167e-02 2.6266e-01 0.7928
Hengdong 9.9964e-03 -4.9063e-04 6.7742e-03 1.2742e-01 0.8986
Hengnan 2.8064e-02 -3.2160e-04 3.7597e-03 4.6294e-01 0.6434
Hengshan -5.8201e-03 -3.0437e-05 5.1076e-04 -2.5618e-01 0.7978
Hengyang 6.2997e-02 -1.3046e-03 2.1865e-02 4.3486e-01 0.6637
Hongjiang 1.8790e-01 -2.3019e-03 3.1725e-02 1.0678e+00 0.2856
Huarong -1.5389e-02 -1.8667e-03 8.1030e-02 -4.7503e-02 0.9621
Huayuan 8.3772e-02 -8.5569e-04 2.4495e-02 5.4072e-01 0.5887
Huitong 2.5997e-01 -5.2447e-03 1.1077e-01 7.9685e-01 0.4255
Jiahe -1.2431e-01 -3.0550e-03 5.1111e-02 -5.3633e-01 0.5917
Jianghua 2.8651e-01 -3.8280e-03 8.0968e-02 1.0204e+00 0.3076
Jiangyong 2.4337e-01 -2.7082e-03 1.1746e-01 7.1800e-01 0.4728
Jingzhou 1.8270e-01 -8.5106e-04 2.4363e-02 1.1759e+00 0.2396
Jinshi -1.1988e-02 -5.3666e-03 1.1334e-01 -1.9667e-02 0.9843
Jishou -2.8680e-01 -2.6305e-03 4.4028e-02 -1.3543e+00 0.1756
Lanshan 6.3334e-02 -9.6365e-04 2.0441e-02 4.4972e-01 0.6529
Leiyang 1.1581e-02 -1.4948e-04 2.5082e-03 2.3422e-01 0.8148
Lengshuijiang -1.7903e+00 -8.2129e-02 2.1598e+00 -1.1623e+00 0.2451
Li 1.0225e-03 -2.4048e-07 5.1060e-06 4.5260e-01 0.6508
Lianyuan -1.4672e-01 -1.8983e-03 1.9145e-02 -1.0467e+00 0.2952
Liling 1.3774e+00 -1.5097e-02 4.2601e-01 2.1335e+00 0.0329
Linli 1.4815e-02 -6.8294e-05 1.4499e-03 3.9086e-01 0.6959
Linwu -2.4621e-03 -9.0703e-06 1.9258e-04 -1.7676e-01 0.8597
Linxiang 6.5904e-02 -2.9028e-03 2.5470e-01 1.3634e-01 0.8916
Liuyang 3.3688e+00 -7.7502e-02 1.5180e+00 2.7972e+00 0.0052
Longhui 8.0801e-01 -1.1377e-02 1.5538e-01 2.0787e+00 0.0376
Longshan 7.5663e-01 -1.1100e-02 3.1449e-01 1.3690e+00 0.1710
Luxi 1.8177e-01 -2.4855e-03 3.4249e-02 9.9561e-01 0.3194
Mayang 2.1852e-01 -5.8773e-03 9.8049e-02 7.1663e-01 0.4736
Miluo 1.8704e+00 -1.6927e-02 2.7925e-01 3.5715e+00 0.0004
Nan -9.5789e-03 -4.9497e-04 6.8341e-03 -1.0988e-01 0.9125
Ningxiang 1.5607e+00 -7.3878e-02 8.0012e-01 1.8274e+00 0.0676
Ningyuan 2.0910e-01 -7.0884e-03 8.2306e-02 7.5356e-01 0.4511
Pingjiang -9.8964e-01 -2.6457e-03 5.6027e-02 -4.1698e+00 0.0000
Qidong 1.1806e-01 -2.1207e-03 2.4747e-02 7.6396e-01 0.4449
Qiyang 6.1966e-02 -7.3374e-04 8.5743e-03 6.7712e-01 0.4983
Rucheng -3.6992e-01 -8.8999e-03 2.5272e-01 -7.1814e-01 0.4727
Sangzhi 2.5053e-01 -4.9470e-03 6.8000e-02 9.7972e-01 0.3272
Shaodong -3.2659e-02 -3.6592e-05 5.0546e-04 -1.4510e+00 0.1468
Shaoshan 2.1223e+00 -5.0227e-02 1.3668e+00 1.8583e+00 0.0631
Shaoyang 5.9499e-01 -1.1253e-02 1.3012e-01 1.6807e+00 0.0928
Shimen -3.8794e-02 -3.8603e-04 6.4756e-03 -4.7729e-01 0.6332
Shuangfeng 9.2835e-03 -2.2867e-03 3.1516e-02 6.5174e-02 0.9480
Shuangpai 8.0591e-02 -3.1366e-04 8.9838e-03 8.5358e-01 0.3933
Suining 3.7585e-01 -3.5933e-03 4.1870e-02 1.8544e+00 0.0637
Taojiang -2.5394e-01 -1.2395e-03 1.4477e-02 -2.1002e+00 0.0357
Taoyuan 1.4729e-02 -1.2039e-04 8.5103e-04 5.0903e-01 0.6107
Tongdao 4.6482e-01 -6.9870e-03 1.9879e-01 1.0582e+00 0.2900
Wangcheng 4.4220e+00 -1.1067e-01 1.3596e+00 3.8873e+00 0.0001
Wugang 7.1003e-01 -7.8144e-03 1.0710e-01 2.1935e+00 0.0283
Xiangtan 2.4530e-01 -3.6457e-04 3.2319e-03 4.3213e+00 0.0000
Xiangxiang 2.6271e-01 -1.2703e-03 2.1290e-02 1.8092e+00 0.0704
Xiangyin 5.4525e-01 -4.7442e-03 7.9236e-02 1.9539e+00 0.0507
Xinhua 1.1810e-01 -6.2649e-03 8.6001e-02 4.2409e-01 0.6715
Xinhuang 1.5725e-01 -4.1820e-03 3.6648e-01 2.6667e-01 0.7897
Xinning 6.8928e-01 -9.6674e-03 2.0328e-01 1.5502e+00 0.1211
Xinshao 5.7578e-02 -8.5932e-03 1.1769e-01 1.9289e-01 0.8470
Xintian -7.4050e-03 -5.1493e-03 1.0877e-01 -6.8395e-03 0.9945
Xupu 3.2406e-01 -5.7468e-03 5.7735e-02 1.3726e+00 0.1699
Yanling -6.9021e-02 -5.9211e-04 9.9306e-03 -6.8667e-01 0.4923
Yizhang -2.6844e-01 -2.2463e-03 4.7588e-02 -1.2202e+00 0.2224
Yongshun 6.3064e-01 -1.1350e-02 1.8830e-01 1.4795e+00 0.1390
Yongxing 4.3411e-01 -9.0735e-03 1.5088e-01 1.1409e+00 0.2539
You 7.8750e-02 -7.2728e-03 1.2116e-01 2.4714e-01 0.8048
Yuanjiang 2.0004e-04 -1.7760e-04 2.9798e-03 6.9181e-03 0.9945
Yuanling 8.7298e-03 -2.2981e-06 2.3221e-05 1.8121e+00 0.0700
Yueyang 4.1189e-02 -1.9768e-04 2.3113e-03 8.6085e-01 0.3893
Zhijiang 1.0476e-01 -7.8123e-04 1.3100e-02 9.2214e-01 0.3565
Zhongfang -2.2685e-01 -2.1455e-03 3.5927e-02 -1.1855e+00 0.2358
Zhuzhou 3.2864e-01 -5.2432e-04 7.2391e-03 3.8688e+00 0.0001
Zixing -7.6849e-01 -8.8210e-02 9.4057e-01 -7.0144e-01 0.4830
Mapping the local Moran’s I
Before mapping the local Moran’s I map, we need to append the local Moran’s I data frame (localMI
) onto the Hunan’s spatial polygon data frame by using cbind()
= cbind(hunan, localMI) %>% #pipe
hunan.localMI rename(Pr.Ii = Pr.z....E.Ii..)
After creating the the new data frame hunan.localMI
, we can use the tmap package to plot the local Moran’s I values
tm_shape(hunan.localMI) +
tm_fill(col="Ii", #note that actual value is Ii
style="pretty", palette = "PuRd", title = "Local Moran's I Statistics") +
tm_borders(alpha = 0.5)
Mapping local Moran’s I p-values
The choropleth map shows that there is evidence for both positive & negative li values. However, we need to consider the p-values for each of these values to determine if they are statistically significant
By using breaks and fixed style, we can determine which are the areas that are statistically significant
tm_shape(hunan.localMI) +
tm_fill(col="Pr.Ii", #note that p value is Pr.Ii
breaks=c(-Inf, 0.001, 0.01, 0.05, Inf),
style="fixed",
palette = "-Greens", title = "Local Moran's I p values") + tm_borders(alpha = 0.5)
It is recommended to plot the local Moran’s I values map and its associated p-values map side by side for effective interpretation, we can use tmap_arrange()
to accomplish that.
= tm_shape(hunan.localMI) +
localMI.map tm_fill(col="Ii", #note that actual value is li
style="pretty", palette = "PuRd", title = "Local Moran's I Statistics") +
tm_borders(alpha = 0.5)
= tm_shape(hunan.localMI) +
pvalue.map tm_fill(col="Pr.Ii", #note that p value is Pr.li
breaks=c(-Inf, 0.001, 0.01, 0.05, Inf),
style="fixed",
palette = "-Greens", title = "Local Moran's I p values") + tm_borders(alpha = 0.5)
tmap_arrange(localMI.map, pvalue.map, asp=1, ncol=2)
The Null Hypothesis of Local Moran’s I Statistics
The null hypothesis of Local Moran’s I statistics is that there is no correlation between the value at one site and the values at other locations close by. (Long, n.d.)
Analysis of Results of Local Moran’s I Statistics - Dissimilar Features (< 0)
The figure below shows the various clusters boxed up that are considered outliers as their I value is less than zero.
After superimposing it with the p value map, we can infer that
Only 2 areas are statistically significant (labelled by sig), which we can reject the null hypothesis to conclude that there is indeed a correlation in these 2 areas that their neighbouring features having dissimilar characteristics.
All other regions that does not have a sig label, the null hypothesis is accepted and they have a negative local Moran I value purely due to chance.
Analysis of Results of Local Moran’s I Statistics - Similar Features (>= 0)
The figure below shows the various clusters boxed up with similarly high or low attribute values as the local Moran I Statistics is more than or equal to zero.
After superimposing it with the p value map, we can infer that
Cluster A is the most statistically significant, the area GDPPC is highly influence by its neighbours as we reject the null hypothesis. Only 2 areas has very different features as explained in the previous section. The 2 dissimilar area however, seems to suggest that they are outskirt of cluster A.
In Cluster B, only 4 sites are influence by one another, however the influence is weak as the I statistics is between zero and one
In cluster C, it looks like only its first degree neighbour has some influence over the GDPPC of the area in the statistically significant lone area
In all other regions. the null hypothesis is accepted and they have a positive local Moran I value purely due to chance.
The relevant sites are color coded on the LISA Cluster Map according to the type of spatial autocorrelation.
The Moran scatterplot must first be drawn before we can create the LISA cluster map.
Plotting Moran Scatterplot
A helpful visual tool for exploratory analysis is the Moran scatter plot, which helps one to judge how similar an observed value is to its nearby observations.
The y axis, also referred to as the response axis , is dependent on the values of the observations.
Based on the weighted average or spatial lag of the corresponding observation on the X axis, the Y axis is constructed.
= moran.plot(hunan$GDPPC, rs_wm_q,
nci labels=as.character(hunan$County),
xlab = "GDPPC 2012",
ylab="Spatially lag GDPPC 2012",
xlim=c(0, 90000), ylim=c(0,60000), pch=5
)
The plot is split into 4 quadrants, below is an example of what each quadrant represents.
The global Moran’s I is estimated from the slope of the regression line. The relative density of the dots in the correlation quadrants shows how association between high and/or low values determines the overall measure of spatial relationship. (Figure 5, Gomez, et al, 2011)
Analysis
From the resulting plot, we can see that majority of the points are positively correlated but are below the average.
The areas that are above the average in the high-high quadrant are likely represented by purple and dark red spots on the local Moran’s I map in Cluster A.
ZiXing and LengShuiJiang are likely the 2 areas with dissimilar features in cluster A as previously explained.
Preparing LISA map classes
Create the quadrants
= vector(mode="numeric",length=nrow(localMI)) quadrant
Center the variable of interest around its mean
$lag_GDPPC = lag.listw(rs_wm_q, hunan$GDPPC) hunan= hunan$lag_GDPPC - mean(hunan$lag_GDPPC) DV
Center the local Moran’s I value around the mean
#local moran = (localMI[,1] - mean(localMI[,1])) LM_I
Setup the statistically significant levels for the local Moran
= 0.05 signif
Define the quadrants levels
These four command lines define the low-low (1), low-high (2), high-low (3) and high-high (4) categories.
#L_MI = Local Moran I ard mean <0 & LM_I>0] = 1 quadrant[DV >0 & LM_I<0] = 2 quadrant[DV <0 & LM_I<0] = 3 quadrant[DV >0 & LM_I>0] = 4 quadrant[DV
Place non significant Moran into category 0
5]>signif] = 0 quadrant[localMI[,
Plotting the LISA Map
$quadrant <- quadrant hunan.localMI= c("white", "#2c7bb6", "#abd9e9", "#fdae61", "#d7191c") colors = c("insignificant", "low-low", "low-high", "high-low", "high-high") clusters tm_shape(hunan.localMI) + tm_fill(col="quadrant", style="cat", palette = colors[c(sort(unique(quadrant)))+1], labels = clusters[c(sort(unique(quadrant)))+1]) + tm_borders(alpha=0.5)
For effective interpretation, it is better to plot both the LISA map and its GDPPC map next to each other.
<- qtm(hunan, "GDPPC")
gdppc
$quadrant <- quadrant
hunan.localMI= c("white", "#2c7bb6", "#abd9e9", "#fdae61", "#d7191c")
colors = c("insignificant", "low-low", "low-high", "high-low", "high-high")
clusters
= tm_shape(hunan.localMI) +
LISAMap tm_fill(col="quadrant", style="cat",
palette = colors[c(sort(unique(quadrant)))+1],
labels = clusters[c(sort(unique(quadrant)))+1],
popup.vars = c("")) +
tm_borders(alpha=0.5)
#tmap_arrange(localMI.map, LISAMap, pvalue.map, asp=1, ncol=3)
tmap_arrange(gdppc, LISAMap, asp=1, ncol=2)
Analysis
Comparing the GDPPC and LISA maps, it tallies with the analysis in the Local Moran’s section that the dissimilar areas have low GDPPC, while similar regions have high GDPPC in cluster A
There are also 2 low high areas in cluster B, these are outliers that neighbours affects its GDPPC. They are likely to be ZhuZhou and XiangTan in the Moran Scatter plot
In cluster C, the significant area is likely PingJiang as an outlier in the Moran Scatter plot, where most neighbouring counties have low GDPPC, while it has a GDPPC of between 20k to 40k. However, in the LISA plot, it is insignificant.
The small area at the center of the map, although it has high GDPPC, but only has 3 neighbours, as the number of neighbours is small, it has been considered to be statistically insignificant in hte LISA Map
For reference, the figure below was previously discussed in the Local Moran’s Section.
Hot Spot and Cold Spot Area Analysis
Beside detecting cluster and outliers, localised spatial statistics can be also used to detect hot spot and/or cold spot areas.
The term ‘hot spot’ has been used generically across disciplines to describe a region or value that is higher relative to its surroundings (Lepers et al 2005, Aben et al 2012, Isobe et al 2015).
A hot spot is a location where high values cluster together
A cold spot is a location place where low values cluster together
• Moran’s I and Geary’s C cannot distinguish them
• They only indicate clustering
• Cannot tell if these are hot spots, cold spots, or both
Getis and Ord’s G-Statistics
The G statistic distinguishes between hot spots and cold spots. It identifies spatial concentrations.
G is relatively large if high values cluster together
G is relatively low if low values cluster together
The General G statistic is interpreted relative to its mean (or expected) value. The value for which there is no spatial association
G > expected value -> potential “hot spots”
G < expected value -> potential “cold spots”
The analysis consists of three steps:
Deriving spatial weight matrix
Computing Gi statistics
Mapping Gi statistics
Deriving distance-based weight matrix
We must first specify a new set of neighbours. While the spatial autocorrelation took into account units that shared borders, in Getis-Ord, neighbours are determined based on distance. There are 2 types of distance-based proximity matrix, they are:
fixed distance weight matrix; and
adaptive distance weight matrix.
To get our longitude values we map the st_centroid()
function over the geometry column of us.bound and access the longitude value through double bracket notation [[]] and 1. This allows us to get only the longitude, which is the first value in each centroid.
The reason why G statistics requires centroids is because it is based on point pattern analysis logic, which is very different from LISA or Local Moran’s I that compares local-global correlation.
= map_dbl(hunan$geometry, ~st_centroid(.x)[[1]]) longitude
We do the same for latitude with one key difference. We access the second value per each centroid with [[2]].
<- map_dbl(hunan$geometry, ~st_centroid(.x)[[2]]) latitude
Now that we have latitude and longitude, we use cbind()
to put longitude and latitude into the same object.
= cbind(longitude, latitude) coord
Determine the cut-off distance
Find the lower and upper bounds
Using the k nearest neighbour (knn) algorithm, we can return a matrix with indices of points that belongs to the set of k nearest neighbours of each others by using
knearneigh()
of spdepConvert the knn objects into a neighbours list of class nb with a list of integer vectors containing neighbour region number ids by using
knn2nb()
Return the length of neighbour relationship edges by using
nbdists()
of spdep. The function returns in the units of coordinates if the coordinates are projected, in km otherwise.Remove the list structure of the return objects by using
unlist()
= knn2nb(knearneigh(coord)) #returns a list of nb objects from the result of k nearest neighbours matrix, Step 1 & 2 k1 = unlist(nbdists(k1, coord, longlat = TRUE)) #return the length of neighbour relationship edges and remove the list structures, Step 3 & 4 k1dist summary(k1dist)
Min. 1st Qu. Median Mean 3rd Qu. Max. 24.79 32.57 38.01 39.07 44.52 61.79
From the result, the largest first nearest neighbour is 61.79km, hence by using this as the upper bound, we can be certain that all units will have at least 1 neighbour
dnearneigh
will be used to compute the distance weight matrix= dnearneigh(coord, 0, 62, longlat = TRUE) wm_d62 wm_d62
Neighbour list object: Number of regions: 88 Number of nonzero links: 324 Percentage nonzero weights: 4.183884 Average number of links: 3.681818
Next
nb2listw()
is used to convert the nb object into spatial weights objects= nb2listw(wm_d62, style="B") wm62_lw summary(wm62_lw)
Characteristics of weights list object: Neighbour list object: Number of regions: 88 Number of nonzero links: 324 Percentage nonzero weights: 4.183884 Average number of links: 3.681818 Link number distribution: 1 2 3 4 5 6 6 15 14 26 20 7 6 least connected regions: 6 15 30 32 56 65 with 1 link 7 most connected regions: 21 28 35 45 50 52 82 with 6 links Weights style: B Weights constants summary: n nn S0 S1 S2 B 88 7744 324 648 5440
The fixed distance weight matrix has the property that locations with higher densities of habitation (often urban areas) tend to have more neighbours, whereas areas with lower densities (typically rural areas) tend to have fewer neighbours.
By enforcing symmetry or accepting asymmetric neighbours, as shown in the code below, it is possible to control the number of neighbours of each region using the knn algorithm.
= knn2nb(knearneigh(coord, k=8)) knn8
Next
nb2listw()
is used to convert the nb object into spatial weights objects= nb2listw(knn8, style = "B") knn_lw summary(knn_lw)
Characteristics of weights list object: Neighbour list object: Number of regions: 88 Number of nonzero links: 704 Percentage nonzero weights: 9.090909 Average number of links: 8 Non-symmetric neighbours list Link number distribution: 8 88 88 least connected regions: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 with 8 links 88 most connected regions: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 with 8 links Weights style: B Weights constants summary: n nn S0 S1 S2 B 88 7744 704 1300 23014
Computing Gi statistics
Gi statistics using fixed distance (G Statistics)
= order(hunan$County)
fips = localG(hunan$GDPPC, wm62_lw, return_internals = TRUE)
gi.fixed gi.fixed
[1] 0.436075843 -0.265505650 -0.073033665 0.413017033 0.273070579
[6] -0.377510776 2.863898821 2.794350420 5.216125401 0.228236603
[11] 0.951035346 -0.536334231 0.176761556 1.195564020 -0.033020610
[16] 1.378081093 -0.585756761 -0.419680565 0.258805141 0.012056111
[21] -0.145716531 -0.027158687 -0.318615290 -0.748946051 -0.961700582
[26] -0.796851342 -1.033949773 -0.460979158 -0.885240161 -0.266671512
[31] -0.886168613 -0.855476971 -0.922143185 -1.162328599 0.735582222
[36] -0.003358489 -0.967459309 -1.259299080 -1.452256513 -1.540671121
[41] -1.395011407 -1.681505286 -1.314110709 -0.767944457 -0.192889342
[46] 2.720804542 1.809191360 -1.218469473 -0.511984469 -0.834546363
[51] -0.908179070 -1.541081516 -1.192199867 -1.075080164 -1.631075961
[56] -0.743472246 0.418842387 0.832943753 -0.710289083 -0.449718820
[61] -0.493238743 -1.083386776 0.042979051 0.008596093 0.136337469
[66] 2.203411744 2.690329952 4.453703219 -0.340842743 -0.129318589
[71] 0.737806634 -1.246912658 0.666667559 1.088613505 -0.985792573
[76] 1.233609606 -0.487196415 1.626174042 -1.060416797 0.425361422
[81] -0.837897118 -0.314565243 0.371456331 4.424392623 -0.109566928
[86] 1.364597995 -1.029658605 -0.718000620
attr(,"internals")
Gi E(Gi) V(Gi) Z(Gi) Pr(z != E(Gi))
[1,] 0.064192949 0.05747126 2.375922e-04 0.436075843 6.627817e-01
[2,] 0.042300020 0.04597701 1.917951e-04 -0.265505650 7.906200e-01
[3,] 0.044961480 0.04597701 1.933486e-04 -0.073033665 9.417793e-01
[4,] 0.039475779 0.03448276 1.461473e-04 0.413017033 6.795941e-01
[5,] 0.049767939 0.04597701 1.927263e-04 0.273070579 7.847990e-01
[6,] 0.008825335 0.01149425 4.998177e-05 -0.377510776 7.057941e-01
[7,] 0.050807266 0.02298851 9.435398e-05 2.863898821 4.184617e-03
[8,] 0.083966739 0.04597701 1.848292e-04 2.794350420 5.200409e-03
[9,] 0.115751554 0.04597701 1.789361e-04 5.216125401 1.827045e-07
[10,] 0.049115587 0.04597701 1.891013e-04 0.228236603 8.194623e-01
[11,] 0.045819180 0.03448276 1.420884e-04 0.951035346 3.415864e-01
[12,] 0.049183846 0.05747126 2.387633e-04 -0.536334231 5.917276e-01
[13,] 0.048429181 0.04597701 1.924532e-04 0.176761556 8.596957e-01
[14,] 0.034733752 0.02298851 9.651140e-05 1.195564020 2.318667e-01
[15,] 0.011262043 0.01149425 4.945294e-05 -0.033020610 9.736582e-01
[16,] 0.065131196 0.04597701 1.931870e-04 1.378081093 1.681783e-01
[17,] 0.027587075 0.03448276 1.385862e-04 -0.585756761 5.580390e-01
[18,] 0.029409313 0.03448276 1.461397e-04 -0.419680565 6.747188e-01
[19,] 0.061466754 0.05747126 2.383385e-04 0.258805141 7.957856e-01
[20,] 0.057656917 0.05747126 2.371303e-04 0.012056111 9.903808e-01
[21,] 0.066518379 0.06896552 2.820326e-04 -0.145716531 8.841452e-01
[22,] 0.045599896 0.04597701 1.928108e-04 -0.027158687 9.783332e-01
[23,] 0.030646753 0.03448276 1.449523e-04 -0.318615290 7.500183e-01
[24,] 0.035635552 0.04597701 1.906613e-04 -0.748946051 4.538897e-01
[25,] 0.032606647 0.04597701 1.932888e-04 -0.961700582 3.362000e-01
[26,] 0.035001352 0.04597701 1.897172e-04 -0.796851342 4.255374e-01
[27,] 0.012746354 0.02298851 9.812587e-05 -1.033949773 3.011596e-01
[28,] 0.061287917 0.06896552 2.773884e-04 -0.460979158 6.448136e-01
[29,] 0.014277403 0.02298851 9.683314e-05 -0.885240161 3.760271e-01
[30,] 0.009622875 0.01149425 4.924586e-05 -0.266671512 7.897221e-01
[31,] 0.014258398 0.02298851 9.705244e-05 -0.886168613 3.755267e-01
[32,] 0.005453443 0.01149425 4.986245e-05 -0.855476971 3.922871e-01
[33,] 0.043283712 0.05747126 2.367109e-04 -0.922143185 3.564539e-01
[34,] 0.020763514 0.03448276 1.393165e-04 -1.162328599 2.451020e-01
[35,] 0.081261843 0.06896552 2.794398e-04 0.735582222 4.619850e-01
[36,] 0.057419907 0.05747126 2.338437e-04 -0.003358489 9.973203e-01
[37,] 0.013497133 0.02298851 9.624821e-05 -0.967459309 3.333145e-01
[38,] 0.019289310 0.03448276 1.455643e-04 -1.259299080 2.079223e-01
[39,] 0.025996272 0.04597701 1.892938e-04 -1.452256513 1.464303e-01
[40,] 0.016092694 0.03448276 1.424776e-04 -1.540671121 1.233968e-01
[41,] 0.035952614 0.05747126 2.379439e-04 -1.395011407 1.630124e-01
[42,] 0.031690963 0.05747126 2.350604e-04 -1.681505286 9.266481e-02
[43,] 0.018750079 0.03448276 1.433314e-04 -1.314110709 1.888090e-01
[44,] 0.015449080 0.02298851 9.638666e-05 -0.767944457 4.425202e-01
[45,] 0.065760689 0.06896552 2.760533e-04 -0.192889342 8.470456e-01
[46,] 0.098966900 0.05747126 2.326002e-04 2.720804542 6.512325e-03
[47,] 0.085415780 0.05747126 2.385746e-04 1.809191360 7.042128e-02
[48,] 0.038816536 0.05747126 2.343951e-04 -1.218469473 2.230456e-01
[49,] 0.038931873 0.04597701 1.893501e-04 -0.511984469 6.086619e-01
[50,] 0.055098610 0.06896552 2.760948e-04 -0.834546363 4.039732e-01
[51,] 0.033405005 0.04597701 1.916312e-04 -0.908179070 3.637836e-01
[52,] 0.043040784 0.06896552 2.829941e-04 -1.541081516 1.232969e-01
[53,] 0.011297699 0.02298851 9.615920e-05 -1.192199867 2.331829e-01
[54,] 0.040968457 0.05747126 2.356318e-04 -1.075080164 2.823388e-01
[55,] 0.023629663 0.04597701 1.877170e-04 -1.631075961 1.028743e-01
[56,] 0.006281129 0.01149425 4.916619e-05 -0.743472246 4.571958e-01
[57,] 0.063918654 0.05747126 2.369553e-04 0.418842387 6.753313e-01
[58,] 0.070325003 0.05747126 2.381374e-04 0.832943753 4.048765e-01
[59,] 0.025947288 0.03448276 1.444058e-04 -0.710289083 4.775249e-01
[60,] 0.039752578 0.04597701 1.915656e-04 -0.449718820 6.529132e-01
[61,] 0.049934283 0.05747126 2.334965e-04 -0.493238743 6.218439e-01
[62,] 0.030964195 0.04597701 1.920248e-04 -1.083386776 2.786368e-01
[63,] 0.058129184 0.05747126 2.343319e-04 0.042979051 9.657182e-01
[64,] 0.046096514 0.04597701 1.932637e-04 0.008596093 9.931414e-01
[65,] 0.012459080 0.01149425 5.008051e-05 0.136337469 8.915545e-01
[66,] 0.091447733 0.05747126 2.377744e-04 2.203411744 2.756574e-02
[67,] 0.049575872 0.02298851 9.766513e-05 2.690329952 7.138140e-03
[68,] 0.107907212 0.04597701 1.933581e-04 4.453703219 8.440175e-06
[69,] 0.019616151 0.02298851 9.789454e-05 -0.340842743 7.332220e-01
[70,] 0.032923393 0.03448276 1.454032e-04 -0.129318589 8.971056e-01
[71,] 0.030317663 0.02298851 9.867859e-05 0.737806634 4.606320e-01
[72,] 0.019437582 0.03448276 1.455870e-04 -1.246912658 2.124295e-01
[73,] 0.055245460 0.04597701 1.932838e-04 0.666667559 5.049845e-01
[74,] 0.074278054 0.05747126 2.383538e-04 1.088613505 2.763244e-01
[75,] 0.013269580 0.02298851 9.719982e-05 -0.985792573 3.242349e-01
[76,] 0.049407829 0.03448276 1.463785e-04 1.233609606 2.173484e-01
[77,] 0.028605749 0.03448276 1.455139e-04 -0.487196415 6.261191e-01
[78,] 0.039087662 0.02298851 9.801040e-05 1.626174042 1.039126e-01
[79,] 0.031447120 0.04597701 1.877464e-04 -1.060416797 2.889550e-01
[80,] 0.064005294 0.05747126 2.359641e-04 0.425361422 6.705732e-01
[81,] 0.044606529 0.05747126 2.357330e-04 -0.837897118 4.020885e-01
[82,] 0.063700493 0.06896552 2.801427e-04 -0.314565243 7.530918e-01
[83,] 0.051142205 0.04597701 1.933560e-04 0.371456331 7.102977e-01
[84,] 0.102121112 0.04597701 1.610278e-04 4.424392623 9.671399e-06
[85,] 0.021901462 0.02298851 9.843172e-05 -0.109566928 9.127528e-01
[86,] 0.064931813 0.04597701 1.929430e-04 1.364597995 1.723794e-01
[87,] 0.031747344 0.04597701 1.909867e-04 -1.029658605 3.031703e-01
[88,] 0.015893319 0.02298851 9.765131e-05 -0.718000620 4.727569e-01
attr(,"cluster")
[1] Low Low High High High High High High High Low Low High Low Low Low
[16] High High High High Low High High Low Low High Low Low Low Low Low
[31] Low Low Low High Low Low Low Low Low Low High Low Low Low Low
[46] High High Low Low Low Low High Low Low Low Low Low High Low Low
[61] Low Low Low High High High Low High Low Low High Low High High Low
[76] High Low Low Low Low Low Low High High Low High Low Low
Levels: Low High
attr(,"gstari")
[1] FALSE
attr(,"call")
localG(x = hunan$GDPPC, listw = wm62_lw, return_internals = TRUE)
attr(,"class")
[1] "localG"
The output of localG()
is a vector of G or Gstar values, with attributes “gstari
” set to TRUE or FALSE, “call” set to the function call, and class “localG”.
The Gi statistics is represented as a Z-score. Greater values represent a greater intensity of clustering and the direction (positive or negative) indicates high or low clusters.
Next, we will join the Gi values to their corresponding hunan sf data frame by using the cbind()
= cbind(hunan, as.matrix(gi.fixed)) %>% #pipe
hunan.gi rename(gstat_fixed = as.matrix.gi.fixed.)
The code above performs 3tasks.
- First, it convert the output vector (i.e.
gi.fixed
) into r matrix object by usingas.matrix()
. cbind()
is used to join hunan@data and gi.fixed matrix to produce a new SpatialPolygonDataFrame called hunan.gi.
the field name of the gi values is renamed to gstat_fixed by using rename().
Mapping Gi values with fixed distance weights
We plot the map and the gimap side by side for analysis
= tm_shape(hunan.gi) +
Gimap_fixed tm_fill(col = "gstat_fixed",
style = "pretty",
palette="-RdBu",
title = "local Gi") +
tm_borders(alpha = 0.5)
tmap_arrange(gdppc, Gimap_fixed, asp=1, ncol=2)
Variable(s) "gstat_fixed" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
Analysis
Comparing with Local Moran’s map that was discussed below, we can see that the dissimilar areas were classified as a hot spot.
Conversely, the spatial outliers in Cluster B and C were groups as cold spots in the G Statistics Map.
The reason for this phenomenon is that the G statistics does not take into account outliers as it does not consider spatial correlation. It only take into account hot spots & cold spots.
From the result, we can infer that the GDPPC of the hot spot regions has high GDPPC, while the cold spot regions has low GDPPC.
Mapping Gi values with adaptive distance weights (G* Statistics)
The code below is used to compute the Gi values for GDPPC 2012 by using an adaptive distance weight matrix (i.e knb_lw
).
= order(hunan$County)
fips = localG(hunan$GDPPC, knn_lw, return_internals = TRUE)
gi.adaptive gi.adaptive
[1] 0.274428799 0.300225037 0.030447697 -0.009771412 -0.033921570
[6] -0.154780126 4.034649782 2.057586016 4.378892586 1.479129376
[11] 0.761743842 -0.648205275 -0.773677838 0.589236922 1.040407601
[16] 0.368526533 -0.604240867 -0.241840937 0.031714037 -0.110547691
[21] 0.761314356 1.175580259 -0.884714136 -0.860993329 -1.643096490
[26] -1.290687016 -1.422253022 -0.675281508 -1.719511109 -1.210266137
[31] -1.300914263 -1.599085669 -1.298761870 -1.836622587 1.637619520
[36] -0.721435309 -1.958848641 -1.665195897 -1.868014845 -1.183536130
[41] -0.169560764 -2.084882362 -2.181780084 -2.081025645 -0.499000625
[46] 2.194733590 2.495469794 -1.695557884 -0.745540634 -1.193763093
[51] -1.821073681 -1.894085866 -1.570969008 -1.055766446 -1.299966539
[56] -0.201823610 0.498063690 0.581955247 -0.876827566 -0.955484907
[61] -0.723004897 -0.790993867 -0.183585082 1.129758266 2.271097895
[66] 3.047193741 4.995149600 4.022126163 -0.313165513 0.384924896
[71] 3.018245449 0.561045961 0.210102660 4.365942776 -1.210175378
[76] 2.391729501 -1.188720061 3.068344267 -0.600223372 1.046676007
[81] -1.427632954 -0.156355526 1.176546366 3.726230897 -0.327758027
[86] 2.972571047 -1.009008013 -0.989393051
attr(,"internals")
Gi E(Gi) V(Gi) Z(Gi) Pr(z != E(Gi))
[1,] 0.09720587 0.09195402 0.0003662397 0.274428799 7.837551e-01
[2,] 0.09769063 0.09195402 0.0003651040 0.300225037 7.640055e-01
[3,] 0.09253816 0.09195402 0.0003680612 0.030447697 9.757100e-01
[4,] 0.09176695 0.09195402 0.0003665281 -0.009771412 9.922037e-01
[5,] 0.09130429 0.09195402 0.0003668767 -0.033921570 9.729397e-01
[6,] 0.08898762 0.09195402 0.0003673079 -0.154780126 8.769947e-01
[7,] 0.16751891 0.09195402 0.0003507748 4.034649782 5.468380e-05
[8,] 0.13054918 0.09195402 0.0003518436 2.057586016 3.962989e-02
[9,] 0.17277103 0.09195402 0.0003406253 4.378892586 1.192839e-05
[10,] 0.12001759 0.09195402 0.0003599760 1.479129376 1.391057e-01
[11,] 0.10633361 0.09195402 0.0003563487 0.761743842 4.462129e-01
[12,] 0.07951853 0.09195402 0.0003680448 -0.648205275 5.168522e-01
[13,] 0.07714548 0.09195402 0.0003663568 -0.773677838 4.391213e-01
[14,] 0.10311529 0.09195402 0.0003587953 0.589236922 5.557024e-01
[15,] 0.11178796 0.09195402 0.0003634216 1.040407601 2.981506e-01
[16,] 0.09902122 0.09195402 0.0003677535 0.368526533 7.124807e-01
[17,] 0.08068910 0.09195402 0.0003475655 -0.604240867 5.456835e-01
[18,] 0.08732412 0.09195402 0.0003665092 -0.241840937 8.089034e-01
[19,] 0.09256190 0.09195402 0.0003673900 0.031714037 9.747001e-01
[20,] 0.08984049 0.09195402 0.0003655276 -0.110547691 9.119750e-01
[21,] 0.10653391 0.09195402 0.0003667585 0.761314356 4.464693e-01
[22,] 0.11447605 0.09195402 0.0003670374 1.175580259 2.397626e-01
[23,] 0.07508563 0.09195402 0.0003635312 -0.884714136 3.763108e-01
[24,] 0.07555112 0.09195402 0.0003629457 -0.860993329 3.892417e-01
[25,] 0.06043622 0.09195402 0.0003679474 -1.643096490 1.003630e-01
[26,] 0.06742593 0.09195402 0.0003611483 -1.290687016 1.968122e-01
[27,] 0.06478946 0.09195402 0.0003647974 -1.422253022 1.549528e-01
[28,] 0.07912867 0.09195402 0.0003607191 -0.675281508 4.994969e-01
[29,] 0.05932898 0.09195402 0.0003599915 -1.719511109 8.552135e-02
[30,] 0.06893033 0.09195402 0.0003618998 -1.210266137 2.261768e-01
[31,] 0.06724327 0.09195402 0.0003608067 -1.300914263 1.932878e-01
[32,] 0.06134370 0.09195402 0.0003664310 -1.599085669 1.098016e-01
[33,] 0.06714525 0.09195402 0.0003648812 -1.298761870 1.940257e-01
[34,] 0.05762358 0.09195402 0.0003493969 -1.836622587 6.626563e-02
[35,] 0.12317148 0.09195402 0.0003633868 1.637619520 1.015011e-01
[36,] 0.07825698 0.09195402 0.0003604615 -0.721435309 4.706417e-01
[37,] 0.05490035 0.09195402 0.0003578169 -1.958848641 5.013052e-02
[38,] 0.06013762 0.09195402 0.0003650661 -1.665195897 9.587368e-02
[39,] 0.05649408 0.09195402 0.0003603425 -1.868014845 6.176000e-02
[40,] 0.06958160 0.09195402 0.0003573248 -1.183536130 2.365967e-01
[41,] 0.08870667 0.09195402 0.0003667818 -0.169560764 8.653556e-01
[42,] 0.05226797 0.09195402 0.0003623370 -2.084882362 3.707998e-02
[43,] 0.05058836 0.09195402 0.0003594662 -2.181780084 2.912577e-02
[44,] 0.05256094 0.09195402 0.0003583316 -2.081025645 3.743156e-02
[45,] 0.08249954 0.09195402 0.0003589829 -0.499000625 6.177789e-01
[46,] 0.13351191 0.09195402 0.0003585448 2.194733590 2.818271e-02
[47,] 0.13980943 0.09195402 0.0003677540 2.495469794 1.257905e-02
[48,] 0.05972453 0.09195402 0.0003613115 -1.695557884 8.996964e-02
[49,] 0.07779955 0.09195402 0.0003604495 -0.745540634 4.559450e-01
[50,] 0.06933428 0.09195402 0.0003590369 -1.193763093 2.325707e-01
[51,] 0.05717238 0.09195402 0.0003647919 -1.821073681 6.859566e-02
[52,] 0.05561872 0.09195402 0.0003680088 -1.894085866 5.821361e-02
[53,] 0.06225124 0.09195402 0.0003574860 -1.570969008 1.161898e-01
[54,] 0.07183294 0.09195402 0.0003632178 -1.055766446 2.910749e-01
[55,] 0.06738016 0.09195402 0.0003573408 -1.299966539 1.936124e-01
[56,] 0.08811771 0.09195402 0.0003613143 -0.201823610 8.400546e-01
[57,] 0.10147288 0.09195402 0.0003652580 0.498063690 6.184392e-01
[58,] 0.10310390 0.09195402 0.0003670801 0.581955247 5.605968e-01
[59,] 0.07526754 0.09195402 0.0003621606 -0.876827566 3.805803e-01
[60,] 0.07370784 0.09195402 0.0003646671 -0.955484907 3.393325e-01
[61,] 0.07823737 0.09195402 0.0003599264 -0.723004897 4.696769e-01
[62,] 0.07683091 0.09195402 0.0003655412 -0.790993867 4.289476e-01
[63,] 0.08846487 0.09195402 0.0003612141 -0.183585082 8.543390e-01
[64,] 0.11362359 0.09195402 0.0003678997 1.129758266 2.585781e-01
[65,] 0.13552322 0.09195402 0.0003680335 2.271097895 2.314105e-02
[66,] 0.15029172 0.09195402 0.0003665206 3.047193741 2.309888e-03
[67,] 0.18713548 0.09195402 0.0003630845 4.995149600 5.879018e-07
[68,] 0.16912010 0.09195402 0.0003680793 4.022126163 5.767515e-05
[69,] 0.08597972 0.09195402 0.0003639373 -0.313165513 7.541549e-01
[70,] 0.09930460 0.09195402 0.0003646621 0.384924896 7.002931e-01
[71,] 0.14976364 0.09195402 0.0003668522 3.018245449 2.542429e-03
[72,] 0.10267460 0.09195402 0.0003651229 0.561045961 5.747662e-01
[73,] 0.09598415 0.09195402 0.0003679379 0.210102660 8.335875e-01
[74,] 0.17564058 0.09195402 0.0003674137 4.365942776 1.265756e-05
[75,] 0.06894940 0.09195402 0.0003613546 -1.210175378 2.262116e-01
[76,] 0.13777971 0.09195402 0.0003671080 2.391729501 1.676920e-02
[77,] 0.06924543 0.09195402 0.0003649397 -1.188720061 2.345498e-01
[78,] 0.15052389 0.09195402 0.0003643681 3.068344267 2.152485e-03
[79,] 0.08060684 0.09195402 0.0003573967 -0.600223372 5.483574e-01
[80,] 0.11191592 0.09195402 0.0003637301 1.046676007 2.952490e-01
[81,] 0.06473996 0.09195402 0.0003633737 -1.427632954 1.533975e-01
[82,] 0.08896972 0.09195402 0.0003643008 -0.156355526 8.757528e-01
[83,] 0.11452640 0.09195402 0.0003680752 1.176546366 2.393766e-01
[84,] 0.15719339 0.09195402 0.0003065349 3.726230897 1.943644e-04
[85,] 0.08568420 0.09195402 0.0003659344 -0.327758027 7.430946e-01
[86,] 0.14892272 0.09195402 0.0003672891 2.972571047 2.953169e-03
[87,] 0.07271488 0.09195402 0.0003635650 -1.009008013 3.129708e-01
[88,] 0.07310269 0.09195402 0.0003630331 -0.989393051 3.224709e-01
attr(,"cluster")
[1] Low Low High High High High High High High Low Low High Low Low Low
[16] High High High High Low High High Low Low High Low Low Low Low Low
[31] Low Low Low High Low Low Low Low Low Low High Low Low Low Low
[46] High High Low Low Low Low High Low Low Low Low Low High Low Low
[61] Low Low Low High High High Low High Low Low High Low High High Low
[76] High Low Low Low Low Low Low High High Low High Low Low
Levels: Low High
attr(,"gstari")
[1] FALSE
attr(,"call")
localG(x = hunan$GDPPC, listw = knn_lw, return_internals = TRUE)
attr(,"class")
[1] "localG"
= cbind(hunan, as.matrix(gi.adaptive)) %>% #pipe
hunan.gi rename(gstat_adaptive = as.matrix.gi.adaptive.)
We plot the map and the gimap side by side for analysis
= tm_shape(hunan.gi) +
Gimap_adaptive tm_fill(col = "gstat_adaptive",
style = "pretty",
palette="-RdBu",
title = "local Gi") +
tm_borders(alpha = 0.5)
tmap_arrange(gdppc, Gimap_adaptive, asp=1, ncol=2)
Variable(s) "gstat_adaptive" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_arrange(Gimap_fixed, Gimap_adaptive, asp=1, ncol=2)
Variable(s) "gstat_fixed" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
Variable(s) "gstat_adaptive" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
Analysis
The G Statistics using fixed weight is simply the spatial lag, while the G* statistics using adaptive weights is the weighted average of neighbour value at region i
This allow us to find how much weight one need to give the location relative to its neighbours.
With the G* statistics, we could tell that the hotspot has shrunk and became more intense, as relative weights were assigned, this method is more robust as compared to the G statistics which uses a consistent weight across its analysis.
From the result, we can infer that the GDPPC of the hot spot regions has high GDPPC, while the cold spot regions has low GDPPC.
Based on this exercise, the Local Moran Statistics, LISA Map and G Statistical test has gave consistent results with regards to Cluster A. Thus, we can draw the conclusion that Cluster A is likely to be an urban area with higher degree of economic activity than the rest of Hunan, which leads to a higher GDPPC.
Reference
Anselin L. (2020) Local Spatial Autocorrelation (1) LISA and Local Moran https://geodacenter.github.io/workbook/6a_local_auto/lab6a.html#local-moran
ArcGIS Pro 3.0, How Spatial Autocorrelation (Global Moran’s I) works
https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/h-how-spatial-autocorrelation-moran-s-i-spatial-st.htm
Gomez, Cristina & White, Joanne & Wulder, Michael. (2011). Characterizing the state and processes of change in a dynamic forest environment using hierarchical spatio-temporal segmentation. Remote Sensing of Environment. 115. 1665-1679. 10.1016/j.rse.2011.02.025.
Long, A (n.d.), Local Moran
http://ceadserv1.nku.edu/longa//geomed/stats/localmoran/localmoran.html