Introduction

mgwrsar package estimates linear and local linear models with spatial autocorrelation of the following forms:

\[y=\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (GWR)\]

\[y=\beta_c X_c+\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR)\]

\[y=\lambda Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k,0))\]

\[y=\lambda Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,0,k))\]

\[y=\lambda Wy+\beta_c X_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k_c,k_v))\]

\[y=\lambda(u_i,v_i) Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k,0))\]

\[y=\lambda(u_i,v_i)Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,0,k))\]

\[y=\lambda(u_i,v_i)Wy+\beta_cX_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k_c,k_v))\]

For a deeper understanding of the estimation method, please refer to Geniaux, G. and Martinetti, D. (2017). A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Regional Science and Urban Economics. (https://doi.org/10.1016/j.regsciurbeco.2017.04.001)

The estimation of the aforementioned model can be performed using the MGWRSAR wrapper function, which necessitates a dataframe and a matrix of coordinates. It’s important to note:

Both 2SLS and Best 2SLS methods are applicable. In cases where the model involves local regressions, the specification requires a bandwidth and a kernel type.
Optimal bandwidth estimation can be achieved using the golden_search_bandwidth function, considering either AICc or Leave-One-Out Cross Validation, applicable to all models, whether they involve spatial autocorrelation or not.
There are three modes of predictions available for spatially varying models (refer to Geniaux, 2024, “Speeding up estimation of spatially varying coefficients models,” forthcoming in the Journal of Geographical Systems).
Predictions for models incorporating spatial autocorrelation can be conducted using the Best Linear Unbiased Prediction method proposed by Goulard et al. (2017) (Goulard, M., Laurent, T., & Thomas-Agnan, C., 2017, “About predictions in spatial autoregressive models: Optimal and almost optimal strategies,” published in Spatial Economic Analysis, 12(2-3), 304-325).
model specifications and spatial stationarity of coefficients can be tested using bootstrap method.

In addition to enabling the consideration of spatial autocorrelation in GWR/MGWR-like models, the MGWRSAR function introduces several useful techniques for estimating local regression with space and space-time coordinates:

It employs RCCP and RCCPeigen code to expedite computations and facilitates parallel computing using the doParallel package.
It offers a method for selecting an optimal set of target points (based on a specified size) to enhance computational efficiency.
It accommodates the consideration of space-time kernels (including asymmetric time kernels and asymmetric time cycling kernels) with appropriate diagnostics (such as AICc) for determining optimal bandwidths.

Please note:

When spatial autocorrelation is implied in the model, a row-normalized spatial weight matrix must be provided.
When the model entails mixed GWR regression, the names of stationary covariates must be provided.

Estimation of GWR/MGWR with spatial dependance.

The MGWRSAR function requires a dataframe and a matrix of coordinates, and eventually a spatial weight matrix if model include spatiale dependence. The package includes a simulated data example which is used for this vignette.

library(mgwrsar)
#> Loading required package: sp
#> Loading required package: Matrix
#> Registered S3 method overwritten by 'future':
#>   method               from      
#>   all.equal.connection parallelly
## loading data example
data(mydata)
coord=as.matrix(mydata[,c("x","y")])
head(coord)
#>             x       y
#> [1,] 872142.8 6234075
#> [2,] 872894.3 6233366
#> [3,] 878615.1 6229884
#> [4,] 869936.0 6228408
#> [5,] 866441.8 6231471
#> [6,] 866952.9 6222411

head(mydata)
#>   Y_mgwrsar_1_0_kv X0       X1        X2        X3       Beta0      Beta1
#> 1       0.44146014  1 2.396274 0.4763857 0.6854775  0.11599551 0.26922206
#> 2       0.02223952  1 1.684776 0.3303000 0.8994048  0.11533789 0.25784575
#> 3       3.41607993  1 2.369278 1.5117796 0.5808293  0.20973774 0.55723064
#> 4       0.28497090  1 4.208743 1.0916590 1.4058201 -0.08706293 0.06795084
#> 5      -0.79417336  1 3.109543 0.6487661 1.6977387 -0.08209832 0.21763086
#> 6       6.78538775  1 7.284831 0.8828887 1.5085908 -0.28330886 0.52468859
#>       Beta2      Beta3    lambda     Y_ols    Y_sar       Y_gwr      Y_mgwr
#> 1 0.6772792 -1.1106223 0.2408643 0.9890454 2.010387  0.32246490  0.39829402
#> 2 0.6751576 -1.0326101 0.2273072 0.2732833 1.539485 -0.15597959 -0.12664995
#> 3 0.5623073 -0.5412975 0.3384363 2.1155894 3.637831  2.06565538  1.79922754
#> 4 1.0796267 -0.9258913 0.3543732 1.7902102 3.201578  0.07587226 -0.02831127
#> 5 1.1019372 -1.2759568 0.3772657 0.5057989 1.460611 -0.85670748 -0.38820500
#> 6 1.5488474 -0.7649646 0.4933481 3.0167134 4.654380  3.75240014  3.39782791
#>   Y_mgwrsar_0_0_kv Y_mgwrsar_0_kc_kv Y_mgwrsar_1_kc_kv Y_mgwr_outlier X2dummy
#> 1        0.5673431         0.7137761        0.55150700      0.3982940   FALSE
#> 2        0.2031676         0.2611103        0.06388119     -0.1266500    TRUE
#> 3        3.8183989         3.1882294        2.86908705      1.7992275    TRUE
#> 4        0.3314998         0.1832260        0.14519215     -0.0188881    TRUE
#> 5       -0.7858941        -0.1114446       -0.13694311     -0.3882050   FALSE
#> 6        5.8227160         5.2859121        6.16333174      3.3978279    TRUE
#>   Y_mgwr_X2dummy        x       y
#> 1     0.07564794 872142.8 6234075
#> 2    -0.34965450 872894.3 6233366
#> 3     0.94914284 878615.1 6229884
#> 4    -1.20689547 869936.0 6228408
#> 5    -1.10310449 866441.8 6231471
#> 6     2.03036799 866952.9 6222411
## plot
ggplot(mydata,aes(x,y))+geom_point(aes(color=Beta0))+ ggtitle('Spatial Pattern of coefficient Beta0')+scale_colour_gradientn(colours = terrain.colors(10))

ggplot(mydata,aes(x,y))+geom_point(aes(color=Beta1))+ ggtitle('Spatial Pattern of coefficient Beta1')+scale_colour_gradientn(colours = terrain.colors(10))

ggplot(mydata,aes(x,y))+geom_point(aes(color=lambda))+ ggtitle('Spatial Pattern of spatial autocorrelation coefficient (lambda)')+scale_colour_gradientn(colours = terrain.colors(10))


W=kernel_matW(H=4,kernels='rectangle',coords=coord,NN=5,adaptive=TRUE,diagnull=TRUE)

Note that in this package there are two ways to express a bandwidth \(H\). It can be a distance (adaptive = F) or a number of neighbors (adaptive = T).

Estimation

Estimation of a mgwrsar(1,0,kv) model with all parameter spatially varying using an adapative gaussian kernel (based on 20 nearest neighbors):



res_mgwrsar_1_0_kv<- search_bandwidths(formula ='Y_mgwrsar_1_0_kv~X1+X2+X3',data = mydata,coords = coord,kernels = c("gauss"),Model = "MGWRSAR_1_0_kv",control = list( criterion = "AICc", adaptive = c(TRUE), Type = "GD", verbose = FALSE,W=W),hs_range = c(6, nrow(coord)),ht_range =NULL,n_seq = 10,ncore = 1,show_progress = F,n_rounds = 2,refine=T,tol = c(1),verbose = TRUE)

H_opt=res_mgwrsar_1_0_kv$best$h_s

mgwrsar_1_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'),H=H_opt, Model = 'MGWRSAR_1_0_kv',control=list(SE=T,adaptive=T,W=W))
summary(mgwrsar_1_0_kv)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = H_opt, Model = "MGWRSAR_1_0_kv", control = list(SE = T, 
#>         adaptive = T, W = W))
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 7 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.207 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept              X1                X2               X3          
#>  Min.   :-7.72674   Min.   :0.05003   Min.   :-1.758   Min.   :-2.01204  
#>  1st Qu.:-1.75755   1st Qu.:0.35654   1st Qu.: 0.422   1st Qu.:-1.24745  
#>  Median :-0.62264   Median :0.55660   Median : 1.516   Median :-0.99979  
#>  Mean   :-0.89440   Mean   :0.51027   Mean   : 1.655   Mean   :-0.98199  
#>  3rd Qu.: 0.08947   3rd Qu.:0.64788   3rd Qu.: 2.769   3rd Qu.:-0.68450  
#>  Max.   : 2.02044   Max.   :1.05498   Max.   : 6.594   Max.   :-0.07195  
#>      lambda       
#>  Min.   :-0.3523  
#>  1st Qu.: 0.1821  
#>  Median : 0.3848  
#>  Mean   : 0.3726  
#>  3rd Qu.: 0.5621  
#>  Max.   : 1.2224  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 704.39 
#>    AICc: 3711.12 
#>    Residual sum of squares: 1031.78 
#>    RMSE: 1.0158 
#> ------------------------------------------------------

Estimation of a mgwrsar(0,0,kv) model with stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors):

res_mgwrsar_0_0_kv<- search_bandwidths(formula ='Y_mgwrsar_0_0_kv~X1+X2+X3',data = mydata,coords = coord,kernels = c("gauss"),Model = "MGWRSAR_0_0_kv",control = list( criterion = "AICc", adaptive = c(TRUE), Type = "GD", verbose = FALSE,W=W),hs_range = c(6, nrow(coord)),ht_range =NULL,n_seq = 10,ncore = 1,show_progress = F,n_rounds = 1,refine=T,tol = c(1),verbose = TRUE)

H_opt=res_mgwrsar_0_0_kv$best$h_s

mgwrsar_0_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_0_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'),H=H_opt, Model = 'MGWRSAR_0_0_kv',control=list(SE=T,adaptive=T,W=W))
summary(mgwrsar_0_0_kv)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_0_0_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = H_opt, Model = "MGWRSAR_0_0_kv", control = list(SE = T, 
#>         adaptive = T, W = W))
#> 
#> Model: MGWRSAR_0_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 6 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.168 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#> [1] 0.377765
#> 
#>    [Varying Parameters]
#>    Intercept             X1                 X2                X3          
#>  Min.   :-5.5933   Min.   :-0.02637   Min.   :-2.7354   Min.   :-1.99933  
#>  1st Qu.:-1.1951   1st Qu.: 0.36239   1st Qu.: 0.3739   1st Qu.:-1.25872  
#>  Median :-0.3825   Median : 0.55045   Median : 1.3259   Median :-0.98323  
#>  Mean   :-0.5167   Mean   : 0.50539   Mean   : 1.3059   Mean   :-0.98406  
#>  3rd Qu.: 0.2199   3rd Qu.: 0.64008   3rd Qu.: 2.6088   3rd Qu.:-0.70394  
#>  Max.   : 2.9978   Max.   : 0.97632   Max.   : 5.1919   Max.   :-0.09413  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 710.04 
#>    AICc: 2369.83 
#>    Residual sum of squares: 275.99 
#>    RMSE: 0.5254 
#> ------------------------------------------------------

Estimation of a mgwrsar(0,kc,kv) model with stationary X2 and stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors):

res_mgwrsar_0_kc_kv<- search_bandwidths(formula ='Y_mgwrsar_0_kc_kv~X1+X2+X3',data = mydata,coords = coord,kernels = c("gauss"),fixed_vars='X2',Model = "MGWRSAR_0_kc_kv",control = list( criterion = "AICc", adaptive = c(TRUE), Type = "GD", verbose = FALSE,W=W),hs_range = c(6, nrow(coord)),ht_range =NULL,n_seq = 10,ncore = 1,show_progress = F,n_rounds = 2,refine=T,tol = c(1),verbose = TRUE)

H_opt=res_mgwrsar_0_kc_kv$best$h_s
mgwrsar_0_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars='X2',kernels=c('gauss'),H=H_opt, Model = 'MGWRSAR_0_kc_kv',control=list(SE=T,adaptive=T,W=W))
summary(mgwrsar_0_kc_kv)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_0_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = "X2", kernels = c("gauss"), 
#>     H = H_opt, Model = "MGWRSAR_0_kc_kv", control = list(SE = T, 
#>         adaptive = T, W = W))
#> 
#> Model: MGWRSAR_0_kc_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 6 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.149 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#>        X2      PhWy 
#> 0.7132556 0.4151707 
#> 
#>    [Varying Parameters]
#>    Intercept             X1                X3         
#>  Min.   :-1.6122   Min.   :-0.0617   Min.   :-1.3988  
#>  1st Qu.:-0.3765   1st Qu.: 0.3556   1st Qu.:-1.0765  
#>  Median : 0.1363   Median : 0.5511   Median :-0.9934  
#>  Mean   : 0.1990   Mean   : 0.5056   Mean   :-0.9949  
#>  3rd Qu.: 0.6953   3rd Qu.: 0.6453   3rd Qu.:-0.9114  
#>  Max.   : 2.4668   Max.   : 0.9752   Max.   :-0.6007  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 731.12 
#>    AICc: 2120.19 
#>    Residual sum of squares: 233.26 
#>    RMSE: 0.483 
#> ------------------------------------------------------

Estimation of a mgwrsar(1,kc,kv) model with stationary X2 using an adapative gaussian kernel (based on 20 nearest neighbors):

res_mgwrsar_1_kc_kv<- search_bandwidths(formula ='Y_mgwrsar_1_kc_kv~X1+X2+X3',data = mydata,coords = coord,kernels = c("gauss"),fixed_vars='X2',Model = "MGWRSAR_1_kc_kv",control = list( criterion = "AICc", adaptive = c(TRUE), Type = "GD", verbose = FALSE,W=W),hs_range = c(6, nrow(coord)),ht_range =NULL,n_seq = 10,ncore = 1,show_progress = F,n_rounds = 2,refine=T,tol = c(1),verbose = TRUE)

H_opt=res_mgwrsar_0_kc_kv$best$h_s
mgwrsar_1_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars='X2',kernels=c('gauss'),H=7, Model = 'MGWRSAR_1_kc_kv',control=list(SE=T,adaptive=T,W=W))
summary(mgwrsar_1_kc_kv)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = "X2", kernels = c("gauss"), 
#>     H = 7, Model = "MGWRSAR_1_kc_kv", control = list(SE = T, 
#>         adaptive = T, W = W))
#> 
#> Model: MGWRSAR_1_kc_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 7 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.171 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#>       X2 
#> 1.956348 
#> 
#>    [Varying Parameters]
#>    Intercept              X1                  X3              lambda       
#>  Min.   :-3.23744   Min.   :0.0003216   Min.   :-1.2686   Min.   :0.03101  
#>  1st Qu.:-1.69589   1st Qu.:0.3576303   1st Qu.:-1.0586   1st Qu.:0.42899  
#>  Median :-1.33853   Median :0.5394726   Median :-1.0081   Median :0.57111  
#>  Mean   :-1.36110   Mean   :0.4988842   Mean   :-1.0026   Mean   :0.55998  
#>  3rd Qu.:-1.00408   3rd Qu.:0.6349332   3rd Qu.:-0.9474   3rd Qu.:0.68432  
#>  Max.   : 0.02671   Max.   :0.9032831   Max.   :-0.7237   Max.   :1.43057  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 738.59 
#>    AICc: -708.39 
#>    Residual sum of squares: 14.17 
#>    RMSE: 0.119 
#> ------------------------------------------------------

Estimation of a mgwrsar(1,kc,0) model with stationary X1 using an adapative gaussian kernel (based on 20 nearest neighbors):

res_mgwrsar_1_kc_0<- search_bandwidths(formula ='Y_mgwrsar_1_kc_kv~X1+X2+X3',data = mydata,coords = coord,kernels = c("gauss"),fixed_vars='X2',Model = "MGWRSAR_1_kc_0",control = list( criterion = "AICc", adaptive = c(TRUE), Type = "GD", verbose = FALSE,W=W),hs_range = c(6, nrow(coord)),ht_range =NULL,n_seq = 10,ncore = 1,show_progress = F,n_rounds = 2,refine=T,tol = c(1),verbose = TRUE)


mgwrsar_1_kc_0<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=c('Intercept','X1','X2','X3'),kernels=c('gauss'),H=7, Model = 'MGWRSAR_1_kc_0',control=list(SE=T,adaptive=T,W=W))
summary(mgwrsar_1_kc_0)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = c("Intercept", "X1", "X2", "X3"), 
#>     kernels = c("gauss"), H = 7, Model = "MGWRSAR_1_kc_0", control = list(SE = T, 
#>         adaptive = T, W = W))
#> 
#> Model: MGWRSAR_1_kc_0 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 7 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.108 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#>  Intercept         X1         X2         X3 
#> -1.2829925  0.5155295  1.5111313 -1.0299700 
#> 
#>    [Varying Parameters]
#>      lambda       
#>  Min.   :-2.2507  
#>  1st Qu.: 0.4532  
#>  Median : 0.5995  
#>  Mean   : 0.4923  
#>  3rd Qu.: 0.7094  
#>  Max.   : 0.9941  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 917.89 
#>    AICc: 1509.35 
#>    Residual sum of squares: 221.19 
#>    RMSE: 0.4703 
#> ------------------------------------------------------

Plot and summary

Summary and plot of mgwrsar object using leaflet:


mgwrsar_1_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata,coords=coord,fixed_vars=NULL,kernels=c('gauss'),H=7, Model = 'MGWRSAR_1_0_kv',control=list(SE=T,adaptive=T,W=W))

summary(mgwrsar_1_0_kv)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = 7, Model = "MGWRSAR_1_0_kv", control = list(SE = T, adaptive = T, 
#>         W = W))
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 7 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.287 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept              X1                X2               X3          
#>  Min.   :-7.72674   Min.   :0.05003   Min.   :-1.758   Min.   :-2.01204  
#>  1st Qu.:-1.75755   1st Qu.:0.35654   1st Qu.: 0.422   1st Qu.:-1.24745  
#>  Median :-0.62264   Median :0.55660   Median : 1.516   Median :-0.99979  
#>  Mean   :-0.89440   Mean   :0.51027   Mean   : 1.655   Mean   :-0.98199  
#>  3rd Qu.: 0.08947   3rd Qu.:0.64788   3rd Qu.: 2.769   3rd Qu.:-0.68450  
#>  Max.   : 2.02044   Max.   :1.05498   Max.   : 6.594   Max.   :-0.07195  
#>      lambda       
#>  Min.   :-0.3523  
#>  1st Qu.: 0.1821  
#>  Median : 0.3848  
#>  Mean   : 0.3726  
#>  3rd Qu.: 0.5621  
#>  Max.   : 1.2224  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 704.39 
#>    AICc: 3711.12 
#>    Residual sum of squares: 1031.78 
#>    RMSE: 1.0158 
#> ------------------------------------------------------
p<-plot(mgwrsar_1_0_kv,type='B_coef',var='X2',crs=2154)
#> Warning in plot.mgwrsar(x = x, type = type, var = var, crs = crs, mypalette =
#> mypalette, : n_time_steps ignored (Model is not Spatio-Temporal GDT).
p<-plotly::partial_bundle(p)
p

Prediction

In this example, we utilize an in-sample of size 800 for model estimation and an out-sample of size 200 for prediction. It’s worth noting that for models with spatial autocorrelation, you must provide a global weight matrix of size 1000, ordered by in-sample and then out-sample locations.

For GWR and MGWR_1_0_kv models, where all coefficients vary in space, predictions are made using the corresponding model estimate with the out-sample locations as target points. Hence, the estimated coefficients are not utilized; only the call and the data of the estimated model are employed (method_pred = ‘TP’, default value). However, for mixed models that include some constant coefficients (such as MGWR, MGXWR_0_0_kv, MGXWR_1_kc_kv, MGXWR_1_kc_0), the estimated coefficients are extrapolated using a weighting matrix. By default, the weighting matrix for prediction is derived from the expected weights of out-sample data if they were added to in-sample data to estimate the corresponding MGWRSAR model: method_pred=‘tWtp_model’ (refer to Geniaux, 2024, for further details). Alternatively, the user can choose to extrapolate the model coefficients using a Shepard kernel with a custom number of neighbors (method_pred=‘shepard’,k_extra=12) or using the same kernel and bandwidth as the estimated model (method_pred=‘model’). In case of extrapolation, predition method without the best linear unbiased predictor (‘type = YTC’) and with the best linear unbiased predictor (‘type = BPN’) are available.

Prediction of MGWRSAR_1_0_kv model using shepard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) :


### Global Spatial Weight matrix W should be ordered by in_sample (S) then out_sample

# in_sample and out_sample folds
length_out=800
index_in=sample(1:1000,length_out)
index_out=(1:1000)[-index_in]

#W=kernel_matW(H=4,kernels='rectangle',coords=coord,NN=4,adaptive=TRUE,diagnull=TRUE)
W_in_out=kernel_matW(H=4,kernels='rectangle',coords=rbind(coord[index_in,],coord[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE)
W_in=W_in_out[1:length(index_in),1:length(index_in)]
W_in=mgwrsar::normW(W_in)
newdata=mydata[index_out,]
newdata_coords=coord[index_out,]


### SAR Model
model_SAR_insample<-MGWRSAR(formula = 'Y_sar~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars=NULL,kernels=NULL,H=NULL, Model = 'SAR',control=list(W=W_in))
model_SAR_insample@RMSE
#> [1] 0.1325775
summary(model_SAR_insample)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_sar~X1+X2+X3", data = mydata[index_in, ], 
#>     coords = coord[index_in, ], fixed_vars = NULL, kernels = NULL, 
#>     H = NULL, Model = "SAR", control = list(W = W_in))
#> 
#> Model: SAR 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : NA (Fixed / Distance) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.003 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 800 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#>  Intercept         X1         X2         X3     lambda 
#>  0.2014254  0.5020017  1.1437199 -1.0141987  0.3035669 
#> 
#> ------------------------------------------------------
#> Diagnostics:
#>    AICc:  
#>    Residual sum of squares: 14.06 
#>    RMSE: 0.1326 
#> ------------------------------------------------------

## SAR Model: predictions
## without Best Linear Unbiased Predictor
# prediction YTC
Y_pred10=predict(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='YTC')
head(Y_pred10)
#> [1] 5.011485 3.237824 3.492985 2.679169 1.862020 2.766853
RMSE_YTC=sqrt(mean((mydata$Y_sar[index_out]-Y_pred10)^2))
RMSE_YTC 
#> [1] 0.08964377

## Using Best Linear Unbiased Predictor
Y_pred11=predict(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN')
head(Y_pred11)
#> [1] 5.042327 3.230759 3.513008 2.660348 1.888774 2.704777
RMSE_BPN=sqrt(mean((mydata$Y_sar[index_out]-Y_pred11)^2))
RMSE_BPN 
#> [1] 0.06277489


#### MGWRSAR_1_0_kv model
model_MGWRSAR_1_0_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars=NULL,kernels='gauss',H=20, Model = 'MGWRSAR_1_0_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F))
model_MGWRSAR_1_0_kv_insample@RMSE
#> [1] 0.8045282
summary(model_MGWRSAR_1_0_kv_insample)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata[index_in, 
#>     ], coords = coord[index_in, ], fixed_vars = NULL, kernels = "gauss", 
#>     H = 20, Model = "MGWRSAR_1_0_kv", control = list(W = W_in, 
#>         adaptive = TRUE, isgcv = F))
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 20 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.118 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 800 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept             X1               X2               X3         
#>  Min.   :-4.4001   Min.   :0.1075   Min.   :-1.268   Min.   :-1.8995  
#>  1st Qu.:-1.6387   1st Qu.:0.3755   1st Qu.: 0.964   1st Qu.:-1.3126  
#>  Median :-0.8543   Median :0.5274   Median : 1.708   Median :-1.0658  
#>  Mean   :-0.8422   Mean   :0.4971   Mean   : 2.155   Mean   :-1.0343  
#>  3rd Qu.:-0.2260   3rd Qu.:0.6201   3rd Qu.: 3.494   3rd Qu.:-0.7195  
#>  Max.   : 2.2572   Max.   :0.8375   Max.   : 7.352   Max.   :-0.2145  
#>      lambda       
#>  Min.   :-0.1958  
#>  1st Qu.: 0.1621  
#>  Median : 0.2907  
#>  Mean   : 0.2789  
#>  3rd Qu.: 0.4109  
#>  Max.   : 0.7153  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 704.98 
#>    AICc: 2139.4 
#>    Residual sum of squares: 517.81 
#>    RMSE: 0.8045 
#> ------------------------------------------------------

#### MGWRSAR_1_0_kv model: predictions

## Using Best Linear Unbiased Predictor with method_pred='model' 
Y_pred12=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='model' )
head(Y_pred12)
#> [1] 1.1404331 6.6056158 2.4251779 2.9401019 1.2551243 0.9482162
RMSE_model_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred12)^2))
RMSE_model_BPN 
#> [1] 0.4050825

# prediction with method_pred='tWtp'
Y_pred13=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='tWtp_model')
head(Y_pred13)
#> [1] 1.128355 6.561310 2.489386 3.025743 1.259118 1.018711
RMSE_tWtp_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred13)^2))
RMSE_tWtp_BPN 
#> [1] 0.3991092

## Using Best Linear Unbiased Predictor with method_pred='shepard' 
W_out=kernel_matW(H=4,kernels='rectangle',coords=coord[index_out,],NN=4,adaptive=TRUE,diagnull=TRUE)
Y_pred14=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='shepard',k_extra=8)
head(Y_pred14)
#> [1] 0.6940118 6.8240480 2.5281222 3.1481212 1.2232821 1.0784311
RMSE_shepard_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred14)^2))
RMSE_shepard_BPN
#> [1] 0.4430776

Prediction of MGWRSAR_1_kc_kv model using shepard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) :


### Global Spatial Weight matrix W should be ordered by in_ sample (S) then out_sample

# in_sample and out_sample folds
length_out=800
index_in=sample(1:1000,length_out)
index_out=(1:1000)[-index_in]

W=kernel_matW(H=4,kernels='rectangle',coords=rbind(coord[index_in,],coord[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE)
W_in=W[index_in,index_in]
W_in=mgwrsar::normW(W_in)

### model_MGWRSAR_1_kc_kv
model_MGWRSAR_1_kc_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars='X3',kernels=c('gauss'),H=11, Model = 'MGWRSAR_1_kc_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F))
model_MGWRSAR_1_kc_kv_insample@RMSE
#> [1] 0.3379756
summary(model_MGWRSAR_1_kc_kv_insample)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata[index_in, 
#>     ], coords = coord[index_in, ], fixed_vars = "X3", kernels = c("gauss"), 
#>     H = 11, Model = "MGWRSAR_1_kc_kv", control = list(W = W_in, 
#>         adaptive = TRUE, isgcv = F))
#> 
#> Model: MGWRSAR_1_kc_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 11 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.116 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: NO 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 800 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Constant Parameters]
#>       X3 
#> -1.05894 
#> 
#>    [Varying Parameters]
#>    Intercept               X1                X2              lambda         
#>  Min.   :-10.82461   Min.   :0.07481   Min.   :-0.7886   Min.   :-0.157793  
#>  1st Qu.: -1.91473   1st Qu.:0.37015   1st Qu.: 1.9299   1st Qu.:-0.028099  
#>  Median : -0.89254   Median :0.53437   Median : 2.8884   Median : 0.001659  
#>  Mean   : -1.12218   Mean   :0.51080   Mean   : 3.3694   Mean   : 0.003490  
#>  3rd Qu.: -0.03437   3rd Qu.:0.65271   3rd Qu.: 5.3199   3rd Qu.: 0.036549  
#>  Max.   :  1.79588   Max.   :0.97652   Max.   :11.9464   Max.   : 0.157537  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 668.12 
#>    AICc: 852.14 
#>    Residual sum of squares: 91.38 
#>    RMSE: 0.338 
#> ------------------------------------------------------

### model_MGWRSAR_1_kc_kv: predictions
## without Best Linear Unbiased Predictor
newdata=mydata[index_out,]
newdata_coords=coord[index_out,]
newdata$Y_mgwrsar_1_kc_kv=0

Y_pred15=predict(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='YTC',method_pred='shepard')
head(Y_pred15)
#> [1] 0.5695360 4.5434772 2.5779640 2.7029408 0.9313297 2.2977041
RMSE_YTC=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred15)^2))
RMSE_YTC
#> [1] 0.4463266

## Using Best Linear Unbiased Predictor
Y_pred16=predict(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='BPN',method_pred='shepard')
head(Y_pred16)
#> [1] 0.5864393 4.5429482 2.5936091 2.7319570 0.9341798 2.3247283
RMSE_BPN=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred16)^2))
RMSE_BPN 
#> [1] 0.4305838

Optimal bandwidths search

The mgwrsar package offers a wrapper that facilitates finding the optimal bandwidth for a variety of model and kernel types, utilizing either AICc or CV criteria. It is specifically tailored for spatial kernels (Type=‘GD’) but can also be applied to space-time kernels (Type=‘GDT’) by specifying the bandwidth for the time kernel.

Examples for MGWRSAR_1_0_kv model using first an adaptive gaussian kernel and then a non-adaptive gaussian kernel.

max_dist=max(dist(coord))
### adaptive kernel : the bandwidth is expressed as a distance 
## AICc
res9=golden_search_bandwidth(formula='Y_mgwrsar_1_0_kv~X1+X2+X3',Ht=NULL,data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'), Model = 'MGWRSAR_1_0_kv',control=list(verbose=F,NN=300,criterion='AICc',adaptive=F,W=W),lower.bound=500, upper.bound=max_dist,tolerance=0.001,show_progress=FALSE)
res9$minimum 
#> [1] 985.0077
summary(res9$model)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = data, coords = coords, fixed_vars = fixed_vars, 
#>     kernels = kernels, H = c(res, Ht), Model = Model, control = control_o)
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Fixed / Distance) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 985.01 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.039 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: YES (300 neighbors) 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept              X1                X2               X3         
#>  Min.   :-15.4535   Min.   :0.04892   Min.   :-1.341   Min.   :-2.4178  
#>  1st Qu.: -1.8635   1st Qu.:0.35351   1st Qu.: 1.560   1st Qu.:-1.3051  
#>  Median : -0.8417   Median :0.53706   Median : 2.756   Median :-1.0404  
#>  Mean   : -1.0911   Mean   :0.50284   Mean   : 3.358   Mean   :-1.0521  
#>  3rd Qu.:  0.3301   3rd Qu.:0.63889   3rd Qu.: 5.398   3rd Qu.:-0.7211  
#>  Max.   :  3.5281   Max.   :1.05515   Max.   :13.685   Max.   : 0.1515  
#>      lambda         
#>  Min.   :-0.175209  
#>  1st Qu.:-0.038418  
#>  Median : 0.005728  
#>  Mean   : 0.009294  
#>  3rd Qu.: 0.051659  
#>  Max.   : 0.360849  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 738.58 
#>    AICc: 1054.28 
#>    Residual sum of squares: 82.59 
#>    RMSE: 0.2874 
#> ------------------------------------------------------

## CV
res11=golden_search_bandwidth(formula='Y_mgwrsar_1_0_kv~X1+X2+X3',Ht=NULL,data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'), Model = 'MGWRSAR_1_0_kv',control=list(verbose=F,NN=300,criterion='CV',adaptive=F,W=W),lower.bound=500, upper.bound=max_dist,tolerance=0.001,show_progress=FALSE)
res11$minimum 
#> [1] 715.4225
summary(res11$model)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = data, coords = coords, fixed_vars = fixed_vars, 
#>     kernels = kernels, H = c(res, Ht), Model = Model, control = control_o)
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Fixed / Distance) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 715.42 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.034 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: YES (300 neighbors) 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept              X1                X2               X3         
#>  Min.   :-17.4153   Min.   :0.02566   Min.   :-1.709   Min.   :-2.2927  
#>  1st Qu.: -1.7951   1st Qu.:0.35254   1st Qu.: 1.264   1st Qu.:-1.3030  
#>  Median : -0.5838   Median :0.53315   Median : 2.767   Median :-1.0335  
#>  Mean   : -1.0773   Mean   :0.49767   Mean   : 3.295   Mean   :-1.0231  
#>  3rd Qu.:  0.3951   3rd Qu.:0.63465   3rd Qu.: 5.178   3rd Qu.:-0.7096  
#>  Max.   :  6.1585   Max.   :1.10149   Max.   :14.792   Max.   : 0.4838  
#>      lambda         
#>  Min.   :-0.394586  
#>  1st Qu.:-0.042147  
#>  Median : 0.008316  
#>  Mean   : 0.010220  
#>  3rd Qu.: 0.052383  
#>  Max.   : 0.382000  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 585.34 
#>    AICc: 1245.8 
#>    Residual sum of squares: 49.14 
#>    RMSE: 0.2217 
#> ------------------------------------------------------

### non-adaptive kernel: the bandwidth is expressed as of number of neighbors
## AICc
res10=golden_search_bandwidth(formula='Y_mgwrsar_1_0_kv~X1+X2+X3',Ht=NULL,data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'), Model = 'MGWRSAR_1_0_kv',control=list(verbose=F,NN=300,criterion='AICc',adaptive=T,W=W),lower.bound=500, upper.bound=max_dist,tolerance=0.001,show_progress=FALSE)
res10$minimum 
#> [1] 27485
summary(res10$model)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = data, coords = coords, fixed_vars = fixed_vars, 
#>     kernels = kernels, H = c(res, Ht), Model = Model, control = control_o)
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 27485 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.034 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: YES (300 neighbors) 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept             X1               X2                X3         
#>  Min.   :-3.5292   Min.   :0.2081   Min.   :-0.7708   Min.   :-1.8422  
#>  1st Qu.:-2.3213   1st Qu.:0.3979   1st Qu.: 2.9343   1st Qu.:-1.3463  
#>  Median :-1.7199   Median :0.4804   Median : 4.3005   Median :-1.0754  
#>  Mean   :-1.5448   Mean   :0.4538   Mean   : 4.1361   Mean   :-1.1446  
#>  3rd Qu.:-0.9625   3rd Qu.:0.5236   3rd Qu.: 5.5522   3rd Qu.:-0.8812  
#>  Max.   : 1.7082   Max.   :0.6294   Max.   : 7.5749   Max.   :-0.6308  
#>      lambda         
#>  Min.   :-0.132937  
#>  1st Qu.:-0.039385  
#>  Median : 0.001323  
#>  Mean   : 0.002633  
#>  3rd Qu.: 0.038544  
#>  Max.   : 0.162166  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 978.91 
#>    AICc: 2950.27 
#>    Residual sum of squares: 1070.62 
#>    RMSE: 1.0347 
#> ------------------------------------------------------

## CV
res12=golden_search_bandwidth(formula='Y_mgwrsar_1_0_kv~X1+X2+X3',Ht=NULL,data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'), Model = 'MGWRSAR_1_0_kv',control=list(verbose=F,NN=300,criterion='CV',adaptive=T,W=W),lower.bound=500, upper.bound=max_dist,tolerance=0.001,show_progress=FALSE)
res12$minimum 
#> [1] 27485
summary(res12$model)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = data, coords = coords, fixed_vars = fixed_vars, 
#>     kernels = kernels, H = c(res, Ht), Model = Model, control = control_o)
#> 
#> Model: MGWRSAR_1_0_kv 
#> ------------------------------------------------------
#> Kernels Configuration:
#>    Spatial Kernel  : gauss (Adaptive / Neighbors) 
#> ------------------------------------------------------
#> Bandwidth Configuration:
#>    Spatial Bandwidth (H): 27485 
#> ------------------------------------------------------
#> Model Settings:
#>    Method for spatial autocorrelation: 2SLS 
#>    Computation time: 0.034 sec
#>    Use of Target Points: NO 
#>    Use of rough kernel approximation: YES (300 neighbors) 
#>    Parallel computing: FALSE (ncore = 1) 
#>    Number of data points: 1000 
#> ------------------------------------------------------
#> Coefficients Summary:
#>    [Varying Parameters]
#>    Intercept             X1               X2                X3         
#>  Min.   :-3.5292   Min.   :0.2081   Min.   :-0.7708   Min.   :-1.8422  
#>  1st Qu.:-2.3213   1st Qu.:0.3979   1st Qu.: 2.9343   1st Qu.:-1.3463  
#>  Median :-1.7199   Median :0.4804   Median : 4.3005   Median :-1.0754  
#>  Mean   :-1.5448   Mean   :0.4538   Mean   : 4.1361   Mean   :-1.1446  
#>  3rd Qu.:-0.9625   3rd Qu.:0.5236   3rd Qu.: 5.5522   3rd Qu.:-0.8812  
#>  Max.   : 1.7082   Max.   :0.6294   Max.   : 7.5749   Max.   :-0.6308  
#>      lambda         
#>  Min.   :-0.132937  
#>  1st Qu.:-0.039385  
#>  Median : 0.001323  
#>  Mean   : 0.002633  
#>  3rd Qu.: 0.038544  
#>  Max.   : 0.162166  
#> ------------------------------------------------------
#> Diagnostics:
#>    Effective degrees of freedom: 978.91 
#>    AICc: 2950.27 
#>    Residual sum of squares: 1070.62 
#>    RMSE: 1.0347 
#> ------------------------------------------------------

References

Brunsdon, C., Fotheringham, A., Charlton, M., 1996. Geographically weighted regression: a method for exploring spatial nonstationarity. Geogr. Anal. 28 (4), 281–298.

Friedman, J. H. 1984. A variable span smoother. Laboratory for Computational Statistics, Department of Statistics, Stanford University: Technical Report(5).

Geniaux, G. and Martinetti, D. 2018. A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Regional Science and Urban Economics, 72, 74-85. https://doi.org/10.1016/j.regsciurbeco.2017.04.001

Geniaux, G. and Martinetti, D. 2017b. R package mgwrsar: Functions for computing (mixed) geographycally weighted regression with spatial autocorrelation,. https://CRAN.R-project.org/package=mgwrsar.

Geniaux, G. 2024. Speeding up estimation of spatially varying coefficients models. Jounal of Geographical System 26, 293–327. https://doi.org/10.1007/s10109-024-00442-3

Loader, C. 1999. Local regression and likelihood, volume 47. springer New York.

McMillen, D. P. 1996. One hundred fifty years of land values in chicago: A nonparametric approach. Journal of Urban Economics, 40(1):100 – 124.

GWR and Mixed GWR with spatial autocorrelation

Ghislain Geniaux

2025-12-03