One of the earliest extensions to incorporate the temporal dimension into Geographically Weighted Regression (GWR) is Geographically and Temporally Weighted Regression (GTWR). GTWR enables regression coefficients to vary both spatially and temporally by employing spatiotemporal weights that account for spatial and temporal distances. This approach captures evolving local dynamics more effectively than traditional GWR (Huang et al., 2010).
The mgwrsar package extends this functionality by
incorporating flexible spatiotemporal kernels for all
available models (GWR, MGWR,
MGwrsar).
Our implementation allows for: 1. Asymmetric temporal kernels: Ensuring that only past observations are considered (crucial for forecasting). 2. Cyclic kernels: Modeling seasonal processes (e.g., monthly or annual cycles). 3. Flexible Weight Combination: Using the parameter \(\alpha\) to balance between multiplicative and additive combinations of space and time.
The final weight \(W_{ij}\) between a focal point \(i\) and an observation \(j\) is derived from two components:
The mgwrsar package combines these components using a
mixing parameter \(\alpha \in [0, 1]\)
according to the following logic:
\[W_{ij} = \alpha \underbrace{\left( W_{ij}^S \times W_{ij}^T \right)}_{\text{Multiplicative (Interaction)}} + (1 - \alpha) \underbrace{\left( W_{ij}^S + W_{ij}^T \right)}_{\text{Additive (Union)}}\]
_past): Weights are set to
zero if \(t_j > t_i\) (future
observations)._symmetric): Past and
future observations are treated equally (interpolation/smoothing).To use these kernels, the temporal kernel name follows specific
conventions in the kernels argument:
[BaseKernel]_[Type]_[Cycle]
gauss, bisq,
epane, rectangle, triangle.past (asymmetric) or
symmetric (default).Examples: * 'gauss_past_365': Gaussian
kernel, past-only, 365-unit cycle (seasonality). *
'bisq_symmetric': Standard Bisquare kernel in time.
To enable spatiotemporal modeling, use
control = list(type = 'GDT', ...).
Example Configuration:
kernels = c('gauss', 'gauss_past_365') # Spatial: Gauss, Temporal: Gauss Past Cyclic
H = c(20000, 30) # Spatial BW: 20km, Temporal BW: 30 days
control = list(type = 'GDT', adaptive = c(FALSE, FALSE), alpha = 1)
Visualizing the Kernel Weights: The figures below illustrate the weights (Z-axis) given spatial (X-axis) and temporal (Y-axis) distance. Grey areas indicate zero weight.
We simulate a dataset where coefficients \(\beta_1\) and \(\beta_2\) vary according to both space and time.
# 1. Simulate Spatiotemporal Data
simu2 <- simu_multiscale(
n = 1000,
myseed = 1,
type = 'GG2024',
constant = NULL,
nuls = NULL,
config_beta = 'spatiotemp_old',
config_snr = 0.9,
config_eps = 'normal'
)
mydata <- simu2$mydata
coords <- simu2$coords
# Add coordinates to data for ggplot visualization
mydata$x <- coords[,1]
mydata$y <- coords[,2]
# True Betas for validation
TRUEBETA <- as.matrix(mydata[, c('Beta1', 'Beta2', 'Beta3', 'Beta4')])
# Formula
myformula <- as.formula('Y ~ X1 + X2 + X3')
# 2. Visualize the evolution of Beta1 over time
# Pattern for late observations (time > 0.6)
ggplot(mydata %>% filter(time > 0.6), aes(x, y)) +
geom_point(aes(color = Beta1), size = 2) +
scale_colour_gradientn(colours = terrain.colors(10), limits = c(0, 13)) +
ggtitle('Space-Time Pattern of Beta1 (Time > 0.6)') +
theme_minimal()
# Pattern for early observations (time < 0.3)
ggplot(mydata %>% filter(time < 0.3), aes(x, y)) +
geom_point(aes(color = Beta1), size = 2) +
scale_colour_gradientn(colours = terrain.colors(10), limits = c(0, 13)) +
ggtitle('Space-Time Pattern of Beta1 (Time < 0.3)') +
theme_minimal()
We compare a standard spatial GWR against the Spatiotemporal GWR (GTWR).
Key Implementation Details: * Use
type = 'GDT' in the control list. * Pass the time vector
via Z in the control list:
control = list(..., Z = mydata$time). * The
search_bandwidths function is used here to find optimal
\(h_s\) (spatial) and \(h_t\) (temporal).
First, we estimate a model using only a spatial kernel
(Type = 'GD').
# Spatial GWR with Gaussian Kernel
res_s <- search_bandwidths(
formula = myformula,
data = mydata,
coords = coords,
kernels = c("gauss"),
Model = "GWR",
control = list(
criterion = "AICc",
adaptive = FALSE,
Type = "GD",
verbose = FALSE
),
hs_range = c(0, 1.4), # Range for spatial bandwidth
n_seq = 12,
n_rounds = 3,
refine = TRUE,
ncore = 1,
tol = 0.001,
verbose = FALSE,
show_progress = FALSE
)
model_gwr <- res_s$best_model
summary(model_gwr)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = HEAVY$data, coords = HEAVY$coords,
#> fixed_vars = HEAVY$fixed_vars, kernels = kernels, H = H_final,
#> Model = Model, control = control_o)
#>
#> Model: GWR
#> ------------------------------------------------------
#> Kernels Configuration:
#> Spatial Kernel : gauss (Fixed / Distance)
#> ------------------------------------------------------
#> Bandwidth Configuration:
#> Spatial Bandwidth (H): 0.06
#> ------------------------------------------------------
#> Model Settings:
#> Computation time: 0.091 sec
#> Use of Target Points: NO
#> Use of rough kernel approximation: NO
#> Parallel computing: (ncore = 1)
#> Number of data points: 1000
#> ------------------------------------------------------
#> Coefficients Summary:
#> [Varying Parameters]
#> Intercept X1 X2 X3
#> Min. :-1.521 Min. :-0.5287 Min. :-2.0985 Min. :-3.57999
#> 1st Qu.: 1.937 1st Qu.: 2.0646 1st Qu.:-0.4018 1st Qu.:-1.10614
#> Median : 3.643 Median : 3.5566 Median : 0.1169 Median : 0.17260
#> Mean : 3.467 Mean : 3.9405 Mean : 0.1264 Mean : 0.03299
#> 3rd Qu.: 4.663 3rd Qu.: 5.5754 3rd Qu.: 0.6650 3rd Qu.: 1.06506
#> Max. : 9.356 Max. : 9.3048 Max. : 2.4480 Max. : 3.70926
#> ------------------------------------------------------
#> Diagnostics:
#> Effective degrees of freedom: 840.93
#> AICc: 5558.63
#> Residual sum of squares: 10389.4
#> RMSE: 3.2233
#> ------------------------------------------------------
# Evaluate In-Sample Estimation Quality
Beta_RMSE_GWR <- apply((model_gwr@Betav - TRUEBETA), 2, rmse)
cat("Mean RMSE for Betas (Spatial GWR):", mean(Beta_RMSE_GWR), "\n")
#> Mean RMSE for Betas (Spatial GWR): 1.410495
Now, we estimate the model using a Space-Time kernel
(Type = 'GDT'). Note the definition of
ht_range to cover the temporal span. We use
alpha = 1 (default) for a standard multiplicative
kernel.
# Space-Time GWR with Gaussian Space and Gaussian Time Kernels
res_st <- search_bandwidths(
formula = myformula,
data = mydata,
coords = coords,
kernels = c("gauss", "gauss"),
Model = "GWR",
control = list(
Z = mydata$time, # Time vector is mandatory here
criterion = "AICc",
adaptive = c(FALSE, FALSE),
Type = "GDT",
alpha = 1, # Multiplicative kernel
verbose = FALSE
),
hs_range = c(0, 1.4),
ht_range = c(0, max(mydata$time) - min(mydata$time)), # Range for temporal bandwidth
n_seq = 12,
n_rounds = 3,
ncore = 8,
tol = c(0.001, 0.001),
verbose = FALSE,
refine=F,
show_progress = FALSE
)
model_gtwr <- res_st$best_model
summary(model_gtwr)
#> ------------------------------------------------------
#> Call:
#> MGWRSAR(formula = formula, data = HEAVY$data, coords = HEAVY$coords,
#> fixed_vars = HEAVY$fixed_vars, kernels = kernels, H = H_final,
#> Model = Model, control = control_o)
#>
#> Model: GWR
#> ------------------------------------------------------
#> Kernels Configuration:
#> Spatial Kernel : gauss (Fixed / Distance)
#> Temporal Kernel : gauss (Fixed / Distance)
#> ------------------------------------------------------
#> Bandwidth Configuration:
#> Spatial Bandwidth (H) : 0.06
#> Temporal Bandwidth (Ht): 0.4
#> ------------------------------------------------------
#> Model Settings:
#> Computation time: 0.299 sec
#> Use of Target Points: NO
#> Use of rough kernel approximation: NO
#> Parallel computing: (ncore = 1)
#> Number of data points: 1000
#> ------------------------------------------------------
#> Coefficients Summary:
#> [Varying Parameters]
#> Intercept X1 X2 X3
#> Min. :-2.050 Min. :-0.7509 Min. :-2.48833 Min. :-4.89390
#> 1st Qu.: 1.936 1st Qu.: 2.0258 1st Qu.:-0.41790 1st Qu.:-1.29522
#> Median : 3.570 Median : 3.5687 Median : 0.09706 Median : 0.15005
#> Mean : 3.472 Mean : 3.9410 Mean : 0.13838 Mean : 0.01011
#> 3rd Qu.: 4.705 3rd Qu.: 5.6060 3rd Qu.: 0.71397 3rd Qu.: 1.26170
#> Max. :10.760 Max. :10.5363 Max. : 2.88157 Max. : 4.35230
#> ------------------------------------------------------
#> Diagnostics:
#> Effective degrees of freedom: 747.5
#> AICc: 5484.3
#> Residual sum of squares: 7160.58
#> RMSE: 2.6759
#> ------------------------------------------------------
# Evaluate In-Sample Estimation Quality
Beta_RMSE_GTWR <- apply((model_gtwr@Betav - TRUEBETA), 2, rmse)
cat("Mean RMSE for Betas (GTWR):", mean(Beta_RMSE_GTWR), "\n")
#> Mean RMSE for Betas (GTWR): 1.222939
The reduction in RMSE confirms that incorporating the temporal dimension improves the fit for spatiotemporal processes.
We now assess the predictive performance on a larger dataset (\(N=2000\)). We split the data into an estimation set (Early times, \(t < 0.7\)) and a prediction set (Late times, \(t \ge 0.7\)).
Important: For spatiotemporal prediction, the
newdata_coords argument must be a matrix with 3
columns: [Longitude, Latitude, Time].
# 1. New Simulation for Train/Test split
simu3 <- simu_multiscale(
n = 2000,
myseed = 3,
type = 'GG2024',
constant = NULL,
nuls = NULL,
config_beta = 'spatiotemp_old',
config_snr = 0.7,
config_eps = 'normal'
)
mydata_full <- simu3$mydata
coords_full <- simu3$coords
# Split Data
index_in <- which(mydata_full$time < 0.7)
index_out <- setdiff(1:2000, index_in)
# Training Data
data_in <- mydata_full[index_in, ]
coords_in <- coords_full[index_in, ]
time_in <- mydata_full$time[index_in]
# Test Data
data_out <- mydata_full[index_out, ]
# Matrix with 3 columns for prediction: X, Y, Time
coords_time_out <- cbind(coords_full[index_out, ], mydata_full$time[index_out])
### A. Estimation of Spatial GWR (Benchmark)
res_s_in <- search_bandwidths(
formula = myformula,
data = data_in,
coords = coords_in,
kernels = c("gauss"),
Model = "GWR",
control = list(criterion = "AICc", adaptive = FALSE, Type = "GD"),
hs_range = c(0, 1.4),
n_seq = 12,
n_rounds = 3,
refine = TRUE,
ncore = 2,
verbose = FALSE,
show_progress = FALSE
)
model_gwr_in <- res_s_in$best_model
### B. Estimation of GTWR (Space-Time)
res_st_in <- search_bandwidths(
formula = myformula,
data = data_in,
coords = coords_in,
kernels = c("gauss", "gauss"),
Model = "GWR",
control = list(
Z = time_in,
criterion = "AICc",
adaptive = c(FALSE, FALSE),
Type = "GDT",
alpha = 1
),
hs_range = c(0, 1.4),
ht_range = c(0, max(time_in) - min(time_in)),
n_seq = 16,
n_rounds = 3,
ncore =8,
refine = F,
verbose = F,
show_progress = F
)
model_gtwr_in <- res_st_in$best_model
### C. Prediction Comparison
# Prediction 1: Spatial GWR
# Using 'tWtp_model' to extrapolate coefficients based on spatial position only
Y_pred_gwr <- predict(
model_gwr_in,
newdata = data_out,
newdata_coords = coords_full[index_out, ], # Only spatial coords needed
method_pred = 'tWtp_model'
)
rmse_gwr <- sqrt(mean((data_out$Y - Y_pred_gwr)^2))
# Prediction 2: GTWR
# Using 'model' method which computes exact weights using the provided space-time kernel
Y_pred_gtwr <- predict(
model_gtwr_in,
newdata = data_out,
newdata_coords = coords_time_out, # 3 Columns: X, Y, Time
method_pred = 'model'
)
rmse_gtwr <- sqrt(mean((data_out$Y - Y_pred_gtwr)^2))
# Results
cat("Out-of-sample RMSE (Spatial GWR):", rmse_gwr, "\n")
#> Out-of-sample RMSE (Spatial GWR): 6.212083
cat("Out-of-sample RMSE (GTWR): ", rmse_gtwr, "\n")
#> Out-of-sample RMSE (GTWR): 5.941904
The GTWR model, by leveraging the temporal proximity of the most recent observations in the training set to predict future observations, provides a lower Root Mean Square prediction error.