run_out_of_sample_exercise()
Run out-of-sample exercise to evaluate model performance.
Usage
run_out_of_sample_exercise(
    df,
    macro_names,
    region_names,
    region_covariate_names,
    n_factors=4,
    aggregate_measure="gva_q_on_q",
    aggregation_region="uk",
    region_measure="gva_q_on_4q",
    region_q_on_q_measure=None,
    step_size=1,
    init_chunk_size=20,
    lag_qtrs=6,
    lag_qtrs_qoq=1,
    n_its=100000,
    n_posterior_samples=3000
)
Masks the most recent annual regional data by lag_qtrs quarters and fits the model in rolling chunks. Out-of-sample nowcasts and outturns for each step are collected and returned.
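The rolling procedure described above can be pictured with a short sketch. It assumes an expanding window that starts at init_chunk_size quarters and grows by step_size at each step, with annual regional data hidden for the last lag_qtrs quarters of each window. The actual windowing inside the function may differ, and oos_schedule is a hypothetical helper, not part of the package.

```python
def oos_schedule(n_quarters, init_chunk_size=20, step_size=1, lag_qtrs=6):
    """Return (fit_end, first_masked) index pairs, one per out-of-sample step.

    Assumption: quarters are indexed 0..n_quarters-1 and the window expands.
    fit_end is the exclusive end of the data visible at that step; quarters
    first_masked..fit_end-1 have their annual regional values hidden.
    """
    steps = []
    fit_end = init_chunk_size
    while fit_end <= n_quarters:
        first_masked = max(fit_end - lag_qtrs, 0)
        steps.append((fit_end, first_masked))
        fit_end += step_size
    return steps


schedule = oos_schedule(n_quarters=24, init_chunk_size=20, step_size=2, lag_qtrs=6)
for fit_end, first_masked in schedule:
    print(f"fit on quarters [0, {fit_end}); regional data masked from quarter {first_masked}")
```

A larger step_size gives fewer fits over the same sample, which trades evaluation granularity for run time.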
Parameters
df: pd.DataFrame-
Dataframe of input series in long format (datetime | measure | region | value).
macro_names: list[str]-
Names of macro series.
region_names: list[str]-
Names of regions.
region_covariate_names: list[str]-
Names of by-region covariates.
n_factors: int = 4-
Number of factors. Defaults to 4.
aggregate_measure: str = "gva_q_on_q"-
Nation-wide measure. Defaults to “gva_q_on_q”.
aggregation_region: str = "uk"-
Top level geography. Defaults to “uk”.
region_measure: str = "gva_q_on_4q"-
Regional growth measure. Defaults to “gva_q_on_4q”.
region_q_on_q_measure: str | None = None-
Optional measure name in df supplying published quarterly (q-on-q) regional growth rates as a hard clamp on y_reg. Same format as every other measure (datetime | measure | region | value); values must be decimal growth rates. Inside the out-of-sample window this exercise NaN-masks these rows for every step (as with region_measure), so published quarterly values inside the OOS window do not leak into the nowcast. Older published values remain as clamps. Defaults to None (no q-on-q clamp, backwards compatible).
step_size: int = 1-
Quarters to advance per OOS step. Defaults to 1.
init_chunk_size: int = 20-
Initial learning window size. Defaults to 20.
lag_qtrs: int = 6-
How many quarters before annual regional data are published; drives the OOS mask for region_measure. Tuned for the ONS regional annual GVA release (~6 quarters). Defaults to 6.
lag_qtrs_qoq: int = 1-
How many quarters before quarterly regional data are published; drives a separate OOS mask for region_q_on_q_measure. Scot Gov quarterly GDP publishes with ~1 quarter lag, so a smaller value than lag_qtrs is realistic. Only used when region_q_on_q_measure is not None. Defaults to 1.
n_its: int = 100000-
Iterations of ADVI for Bayesian inference. Defaults to 100000.
n_posterior_samples: int = 3000-
Samples of the posterior. Defaults to 3000.
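The way lag_qtrs and lag_qtrs_qoq drive two separate masks can be sketched as follows. This is an illustration only: it uses plain Python records in the (datetime | measure | region | value) layout instead of a dataframe, integer quarter indices instead of real datetimes, and assumes the masked window is the lag quarters ending at the nowcast quarter. The helper mask_recent, the measure name "regional_q_on_q" and the region name "scotland" are hypothetical.

```python
import math


def mask_recent(rows, measure, lag, nowcast_qtr):
    """Return a copy of rows with `measure` values set to NaN inside the OOS window.

    Assumption: the window covers the `lag` quarters ending at nowcast_qtr
    (inclusive); everything at or before the cutoff counts as published.
    """
    cutoff = nowcast_qtr - lag  # last quarter treated as already published
    masked = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["measure"] == measure and row["datetime"] > cutoff:
            row["value"] = math.nan
        masked.append(row)
    return masked


# Ten quarters of two regional measures; the second measure name is hypothetical.
rows = [
    {"datetime": q, "measure": m, "region": "scotland", "value": 0.01}
    for q in range(10)
    for m in ("gva_q_on_4q", "regional_q_on_q")
]

nowcast_qtr = 9
rows = mask_recent(rows, "gva_q_on_4q", lag=6, nowcast_qtr=nowcast_qtr)      # lag_qtrs
rows = mask_recent(rows, "regional_q_on_q", lag=1, nowcast_qtr=nowcast_qtr)  # lag_qtrs_qoq

n_masked_annual = sum(math.isnan(r["value"]) for r in rows if r["measure"] == "gva_q_on_4q")
n_masked_qoq = sum(math.isnan(r["value"]) for r in rows if r["measure"] == "regional_q_on_q")
print(n_masked_annual, n_masked_qoq)  # 6 1
```

With the default lags, six quarters of the annual-based measure are hidden but only one quarter of the quarterly measure, so older published quarterly values stay available as clamps.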
Returns
pd.DataFrame-
Combined out-of-sample nowcasts and outturns across all rolling steps.
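One way to evaluate model performance from the returned results is to compare nowcasts with outturns region by region. The sketch below is a minimal example under stated assumptions: the column names region, nowcast and outturn are hypothetical, since the returned columns are not listed here, and plain records stand in for the dataframe. rmse_by_region is likewise an illustrative helper, not part of the package.

```python
import math
from collections import defaultdict


def rmse_by_region(records):
    """Root-mean-square nowcast error per region, skipping missing outturns."""
    sq_errors = defaultdict(list)
    for rec in records:
        if math.isnan(rec["outturn"]):
            continue  # outturn not yet published, nothing to score
        sq_errors[rec["region"]].append((rec["nowcast"] - rec["outturn"]) ** 2)
    return {
        region: math.sqrt(sum(errs) / len(errs))
        for region, errs in sq_errors.items()
    }


# Made-up illustrative values; column names are assumptions, not the real schema.
records = [
    {"region": "scotland", "nowcast": 0.012, "outturn": 0.010},
    {"region": "scotland", "nowcast": 0.004, "outturn": 0.008},
    {"region": "wales", "nowcast": 0.006, "outturn": 0.006},
    {"region": "wales", "nowcast": 0.009, "outturn": math.nan},
]
scores = rmse_by_region(records)
for region, score in sorted(scores.items()):
    print(f"{region}: RMSE = {score:.4f}")
```

If the real output is a dataframe, records of this shape can be obtained with its to_dict("records") method after renaming columns to match.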