How to use AMBRIC
Let’s import the package.
from ambric import Ambric
from ambric.utilities import generate_realistic_simulated_data
As users, we have to bring a few things to the party. The first, of course, is data, which we will simulate.
Let’s set up some simulated data. First, we’ll specify how many underlying factors drive the regional dynamics.
n_factors = 2
df = generate_realistic_simulated_data(n_factors=n_factors, R=12)
2026-05-10 15:03:30.396 | INFO | ambric.utilities:generate_realistic_simulated_data:47 - Simulating Data: T=130, R=12, J=4, Macro=2, Factors=2, lag=6
Data that are input into the model must have this structure:
df.sample(10)
| | datetime | value | region | measure |
|---|---|---|---|---|
| 1177 | 1991-12-31 | -0.077809 | region_02 | regional_covar_01 |
| 784 | 2000-03-31 | NaN | region_06 | gva_q_on_4q |
| 2822 | 2013-03-31 | -0.153500 | region_05 | regional_covar_01 |
| 2493 | 1995-12-31 | 0.254217 | region_04 | regional_covar_03 |
| 2843 | 2018-06-30 | -0.266745 | region_05 | regional_covar_01 |
| 2461 | 2020-06-30 | -0.192657 | region_04 | regional_covar_02 |
| 3436 | 2004-03-31 | -0.061521 | region_06 | regional_covar_02 |
| 5563 | 2015-12-31 | 0.125899 | region_10 | regional_covar_02 |
| 1795 | 2016-06-30 | 0.133326 | region_03 | regional_covar_01 |
| 4445 | 1996-06-30 | 0.177914 | region_08 | regional_covar_02 |
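As a minimal sketch of that long format, the snippet below builds a tiny frame with the same four columns and checks that none are missing. The two regional rows are copied from the sample above; the `uk` row assumes the aggregate series is stored in the same long layout, and `REQUIRED_COLUMNS` is a hypothetical helper name, not part of AMBRIC.

```python
import pandas as pd

# Column names taken from the sample table; the constant itself is illustrative.
REQUIRED_COLUMNS = ["datetime", "value", "region", "measure"]

rows = [
    ("2000-03-31", float("nan"), "region_06", "gva_q_on_4q"),     # missing annual value
    ("1991-12-31", -0.077809, "region_02", "regional_covar_01"),  # regional covariate
    ("1990-03-31", 0.000348, "uk", "gva_q_on_q"),                 # assumed aggregate row
]
long_df = pd.DataFrame(rows, columns=REQUIRED_COLUMNS)
long_df["datetime"] = pd.to_datetime(long_df["datetime"])

# One row per (datetime, region, measure); missing observations stay as NaN.
missing = [c for c in REQUIRED_COLUMNS if c not in long_df.columns]
print(missing)
print(long_df)
```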
The user must also specify what the model will use: which regional variables, which macroeconomic indicators (these should always have the aggregate region, i.e. the top-level geography, as their region), and which regional covariates. We’ll just use all of these:
aggregation_region = "uk"
region_names = [x for x in df["region"].unique() if x != aggregation_region]
macro_names = [x for x in df["measure"].unique() if "macro" in x]
region_covariate_names = [x for x in df["measure"].unique() if "regional_covar" in x]
Gotcha: you must ensure that the annual regional data have datetime index entries for all quarters up to the last published annual macro value. For non-year-end quarters, the values should be NaN.
In a typical use case, there will be quarters for which quarterly growth at the aggregate region (e.g. the UK) is available but the regional data are NaN. Those NaNs in the intervening period are what we are nowcasting.
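A minimal sketch of that preparation step in plain pandas, assuming the annual regional values sit on year-end dates: reindex them onto a full quarterly index so that every other quarter is NaN, then count the trailing NaNs that are left to be nowcast. The growth values and variable names here are made up for illustration.

```python
import pandas as pd

# Quarter-end dates from 2019Q1 to 2022Q2 (14 quarters).
quarters = (
    pd.period_range("2019Q1", "2022Q2", freq="Q")
    .to_timestamp(how="end")
    .normalize()
)

# Suppose annual regional growth is published only for 2019 and 2020 (made-up values).
published_year_ends = quarters[quarters.month == 12][:2]
annual = pd.Series([0.012, -0.034], index=published_year_ends)

# Put the annual series on the quarterly index; unpublished quarters become NaN.
quarterly_frame = annual.reindex(quarters)

# Quarters after the last published annual value are the ones to be nowcast.
last_valid = quarterly_frame.last_valid_index()
trailing_nans = int((quarterly_frame.index > last_valid).sum())
print(quarterly_frame)
print(trailing_nans)
```

With the last annual value at 2020-12-31 and the quarterly data running to 2022-06-30, the six quarters from 2021-03-31 onward are trailing NaNs.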
Okay, we’re ready to build an AMBRIC model!
amb = Ambric(
df,
macro_names,
region_names,
region_covariate_names,
n_factors=n_factors,
)
amb
2026-05-10 15:03:30.451 | INFO | ambric:__init__:813 - Initialising ambric model.
2026-05-10 15:03:30.469 | INFO | ambric.utilities:prep_data_for_model_run:177 - Prepping data for model run.
2026-05-10 15:03:30.485 | INFO | ambric.utilities:prep_data_for_model_run:225 - Adding 6 extra rows of nans to q-on-4q regional data to match number of rows in quarterly data; these extra rows will be estimated by the model.
2026-05-10 15:03:30.486 | INFO | ambric.utilities:prep_data_for_model_run:244 - Per-region trailing NaN counts: [6 6 6 6 6 6 6 6 6 6 6 6]; using modal lag_qtrs=6
2026-05-10 15:03:30.499 | INFO | ambric.utilities:prep_data_for_model_run:281 - Input lengths after data prep:
2026-05-10 15:03:30.499 | INFO | ambric.utilities:prep_data_for_model_run:282 - Quarterly gva_q_on_q: T_max = 130
2026-05-10 15:03:30.500 | INFO | ambric.utilities:prep_data_for_model_run:283 - Annual gva_q_on_4q: T_max = 130
2026-05-10 15:03:30.501 | INFO | ambric.utilities:prep_data_for_model_run:284 - Macro series: 2
2026-05-10 15:03:30.502 | INFO | ambric.utilities:prep_data_for_model_run:285 - Regional covariate series: 4 per region
------------ambric Model-------------
Model ID: e9560875
Parameters:
Time periods (quarters), T=130
Earliest: 1990-March; Latest: 2022-June
Regions, R=12
Macroeconomic series, M=2
Regional indicators, J=4
n_factors=2
Model fitted: No
-------------------------------------
Note that the model reports all of its details, including that 6 rows of the regional data are missing and will be estimated by the model. The model also tells us it isn’t fitted, so let’s sort that. We recommend using at least 100k iterations.
n_iterations = 200000
n_posterior_samples = 3000
amb.fit(n_iterations, n_posterior_samples)
2026-05-10 15:03:30.512 | INFO | ambric:fit:969 -
[1/5] Extracting 2 factors from indicator panel...
2026-05-10 15:03:30.513 | INFO | ambric:extract_factors_from_panel:210 - Creating factors:
2026-05-10 15:03:30.513 | INFO | ambric:extract_factors_from_panel:211 - From 4 series per region, creating...
2026-05-10 15:03:30.514 | INFO | ambric:extract_factors_from_panel:212 - ...2 factors
2026-05-10 15:03:30.521 | INFO | ambric:extract_factors_from_panel:232 - FA: Extracted 2 factors. Approximate explained variance: 9.3%
2026-05-10 15:03:30.521 | DEBUG | ambric:extract_factors_from_panel:235 - Noise variances range: 0.784 - 0.999
2026-05-10 15:03:30.522 | INFO | ambric:fit:977 -
[2/5] Training XGBoost on annual regional growth...
2026-05-10 15:03:30.524 | INFO | ambric:train_xgboost_annual:318 - XGBoost: Training on 360 region-year observations (6 features: 4 regional indicators + 2 macro).
2026-05-10 15:03:30.696 | INFO | ambric:train_xgboost_annual:351 - XGBoost: In-sample RMSE = 0.017171
2026-05-10 15:03:30.697 | INFO | ambric:fit:986 -
[3/5] Fitting MIDAS bridge equation...
2026-05-10 15:03:30.989 | DEBUG | ambric:fit_bridge_equation:489 - Bridge equation: Estimated Almon params theta=(-1.1461, 0.3572)
2026-05-10 15:03:30.990 | DEBUG | ambric:fit_bridge_equation:492 - Bridge equation: MIDAS weights = [0.3737, 0.1698, 0.1576, 0.2989] (Q1->Q4)
2026-05-10 15:03:30.993 | INFO | ambric:fit_bridge_equation:537 - Bridge: R² = 0.0266, RMSE = 0.020115
2026-05-10 15:03:30.994 | DEBUG | ambric:fit_bridge_equation:538 - Bridge: delta (XGBoost loading) = 0.0262
2026-05-10 15:03:30.995 | DEBUG | ambric:fit_bridge_equation:539 - Bridge: intercept = 0.0026
2026-05-10 15:03:31.004 | DEBUG | ambric:fit_bridge_equation:556 - Bridge equation: Quarterly signal shape = (130, 12), mean = 0.000658, std = 0.000814
2026-05-10 15:03:31.005 | INFO | ambric:fit:998 -
[4/5] Building AMBRIC Bayesian model...
2026-05-10 15:03:31.006 | INFO | ambric:build_ambric_model:625 - Building AMBRIC model:
2026-05-10 15:03:31.006 | DEBUG | ambric:build_ambric_model:626 - 2 factors, 2 macro series, 130 quarters, 12 regions
2026-05-10 15:03:31.007 | DEBUG | ambric:build_ambric_model:627 - Bridge signal: shape (130, 12)
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pymc/model/core.py:1316: ImputationWarning: Data in obs_annual contains missing values and will be automatically imputed from the sampling distribution.
warnings.warn(impute_message, ImputationWarning)
2026-05-10 15:03:31.254 | INFO | ambric:fit:1009 -
[5/5] Running variational inference (200000 iterations)...
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pytensor/link/c/cmodule.py:2986: UserWarning: PyTensor could not link to a BLAS installation. Operations that might benefit from BLAS will be severely degraded.
This usually happens when PyTensor is installed via pip. We recommend it be installed via conda/mamba/pixi instead.
Alternatively, you can use an experimental backend such as Numba or JAX that perform their own BLAS optimizations, by setting `pytensor.config.mode == 'NUMBA'` or passing `mode='NUMBA'` when compiling a PyTensor function.
For more options and details see https://pytensor.readthedocs.io/en/latest/troubleshooting.html#how-do-i-configure-test-my-blas-library
warnings.warn(
WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Finished [100%]: Average Loss = 171.46
------------ambric Model-------------
Model ID: e9560875
Parameters:
Time periods (quarters), T=130
Earliest: 1990-March; Latest: 2022-June
Regions, R=12
Macroeconomic series, M=2
Regional indicators, J=4
n_factors=2
Bridge R²=0.0266, Model fitted: Yes
Posterior samples: 3000
ADVI iterations: 200000
-------------------------------------
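The MIDAS weights reported in the fit log are consistent with a normalised exponential Almon lag over lags k = 0, 1, 2, 3. The sketch below assumes that functional form (it is inferred from the logged numbers, not taken from the AMBRIC source), and `exp_almon_weights` is a hypothetical helper name.

```python
import math

def exp_almon_weights(theta1, theta2, n_lags):
    """Normalised exponential Almon weights for lags k = 0 .. n_lags - 1."""
    raw = [math.exp(theta1 * k + theta2 * k ** 2) for k in range(n_lags)]
    total = sum(raw)
    return [r / total for r in raw]  # weights sum to 1

# Almon parameters as printed in the bridge-equation log line (rounded to 4 d.p.).
weights = exp_almon_weights(-1.1461, 0.3572, 4)
print([round(w, 4) for w in weights])
```

Because the logged parameters are rounded, the result should only be expected to agree with the logged weights to within rounding.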
That’s it! It’s done. Now let’s look at some results.
First, our regional estimates of quarterly growth must be consistent with the observed national growth. We can check the implied vs the true growth at the national level.
amb.plot_national_quarterly_vs_implied()
2026-05-10 15:08:04.406 | INFO | ambric.diagnostics:rmse_national_quarterly:118 - UK Quarterly RMSE (true vs implied): 0.0013
2026-05-10 15:08:04.691 | INFO | ambric:plot_national_quarterly_vs_implied:1125 - Plotted national quarterly growth rates vs implied estimates from the model.
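To make the consistency check concrete, here is a rough sketch of comparing observed national growth with the growth implied by regional estimates. It assumes the implied national figure is a share-weighted average of regional growth; AMBRIC's own aggregation may differ, and the weights, growth numbers, and the `rmse` helper are all made up for illustration.

```python
import math

def rmse(actual, estimate):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(estimate):
        raise ValueError("series must have the same length")
    squared = [(a - e) ** 2 for a, e in zip(actual, estimate)]
    return math.sqrt(sum(squared) / len(squared))

weights = [0.6, 0.4]  # hypothetical regional shares of national output
regional_growth = [   # one row per quarter, one column per region (made up)
    [0.010, 0.020],
    [-0.010, 0.000],
    [0.004, 0.002],
]
national_growth = [0.013, -0.005, 0.003]  # made-up observed national growth

# Implied national growth: share-weighted average of the regional estimates.
implied = [sum(w * g for w, g in zip(weights, row)) for row in regional_growth]
error = rmse(national_growth, implied)
print(implied, error)
```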
Next, let’s look at what the regional growth (q-on-4q, i.e. each quarter relative to the same quarter a year earlier) looks like for all regions.
amb.plot_regional_annual_estimate()
2026-05-10 15:08:05.247 | INFO | ambric.diagnostics:rmse_regions_annual:98 - RMSE (true vs estimated) per region (annual):
2026-05-10 15:08:05.247 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 1: 0.0114
2026-05-10 15:08:05.248 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 2: 0.0113
2026-05-10 15:08:05.249 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 3: 0.0101
2026-05-10 15:08:05.249 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 4: 0.0161
2026-05-10 15:08:05.250 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 5: 0.0107
2026-05-10 15:08:05.250 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 6: 0.0112
2026-05-10 15:08:05.251 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 7: 0.0095
2026-05-10 15:08:05.251 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 8: 0.0117
2026-05-10 15:08:05.252 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 9: 0.0094
2026-05-10 15:08:05.252 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 10: 0.0098
2026-05-10 15:08:05.252 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 11: 0.0100
2026-05-10 15:08:05.253 | INFO | ambric.diagnostics:rmse_regions_annual:100 - Region 12: 0.0114
2026-05-10 15:08:05.253 | INFO | ambric.diagnostics:rmse_regions_annual:101 - Average: 0.0110
2026-05-10 15:08:07.881 | INFO | ambric:plot_regional_annual_estimate:1151 - Plotted regional annual growth rates vs estimated from the model.
We can also look at the underlying quarterly regional growth estimates (the latent \(y_{t,r}\)):
amb.plot_estimated_regional_quarterly()
And, if we want tables of the nowcasts, there’s a built-in for that, at either q-on-4q:
amb.point_estimates_q_on_4q().iloc[-3:, :]
| datetime | region_00 | region_01 | region_02 | region_03 | region_04 | region_05 | region_06 | region_07 | region_08 | region_09 | region_10 | region_11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-12-31 | -3.75 | -2.83 | -4.84 | -2.50 | -4.25 | -4.11 | -5.20 | -4.39 | -3.95 | -3.13 | -3.30 | -4.29 |
| 2022-03-31 | -3.49 | -2.54 | -4.39 | -2.24 | -3.71 | -3.92 | -4.93 | -4.27 | -3.84 | -3.00 | -3.20 | -3.98 |
| 2022-06-30 | -3.48 | -2.65 | -4.02 | -2.09 | -3.37 | -3.56 | -4.33 | -3.90 | -3.55 | -2.95 | -2.99 | -3.58 |
or q-on-q:
amb.point_estimates_q_on_q().iloc[-3:, :]
| datetime | region_00 | region_01 | region_02 | region_03 | region_04 | region_05 | region_06 | region_07 | region_08 | region_09 | region_10 | region_11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-12-31 | -0.45 | -0.96 | -1.41 | -0.70 | -1.29 | -0.44 | -0.45 | -1.23 | -0.79 | -1.01 | -0.82 | -0.49 |
| 2022-03-31 | -0.61 | -0.26 | -0.54 | -0.82 | -0.65 | -1.55 | -1.19 | -0.43 | -0.90 | -0.35 | -0.73 | -1.04 |
| 2022-06-30 | -0.96 | -1.28 | -0.77 | -0.74 | -1.09 | -0.88 | -0.75 | -0.57 | -0.57 | -1.22 | -0.49 | -0.61 |
These can be turned into a regional index, rebased to 100 at the start of the sample:
amb.to_index_q_on_q().iloc[-3:, :]
| datetime | region_00 | region_01 | region_02 | region_03 | region_04 | region_05 | region_06 | region_07 | region_08 | region_09 | region_10 | region_11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-12-31 | 112.54 | 104.73 | 107.18 | 99.32 | 105.61 | 111.12 | 96.64 | 99.00 | 99.58 | 102.82 | 102.95 | 94.33 |
| 2022-03-31 | 111.85 | 104.45 | 106.60 | 98.50 | 104.93 | 109.40 | 95.49 | 98.58 | 98.68 | 102.46 | 102.20 | 93.35 |
| 2022-06-30 | 110.78 | 103.12 | 105.78 | 97.77 | 103.78 | 108.43 | 94.78 | 98.02 | 98.12 | 101.21 | 101.70 | 92.78 |
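A minimal sketch of how index levels can be chained from q-on-q growth, assuming the growth figures in the q-on-q table are percentages that compound multiplicatively. The helper name `chain_index` is hypothetical and not part of AMBRIC; the inputs are the rounded region_00 values from the tables above.

```python
def chain_index(start_level, growth_pct):
    """Compound a starting index level by successive percentage growth rates."""
    levels = []
    level = start_level
    for g in growth_pct:
        level *= 1 + g / 100  # g is in percent, e.g. -0.61 means -0.61%
        levels.append(level)
    return levels

# region_00: index level at 2021-12-31, then q-on-q growth
# for 2022-03-31 and 2022-06-30 as shown in the tables.
levels = chain_index(112.54, [-0.61, -0.96])
rounded = [round(x, 2) for x in levels]
print(rounded)  # [111.85, 110.78]
```

These two values match the region_00 index entries for 2022-03-31 and 2022-06-30 in the table above.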
For a less granular binned signal, bands_indicator() classifies each period into growth bands rather than just recession/expansion:
amb.bands_indicator().set_index(["region", "datetime"]).unstack(0).tail()
2026-05-10 15:08:11.412 | DEBUG | pydemetra._java:start_jvm:134 - Starting JVM with 13 JARs, max_heap=512m
Classification by region:
| datetime | region_00 | region_01 | region_02 | region_03 | region_04 | region_05 | region_06 | region_07 | region_08 | region_09 | region_10 | region_11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-06-30 | strong contraction | contraction | strong contraction | contraction | contraction | strong contraction | strong contraction | strong contraction | strong contraction | contraction | contraction | strong contraction |
| 2021-09-30 | strong contraction | contraction | strong contraction | contraction | contraction | contraction | strong contraction | strong contraction | strong contraction | contraction | contraction | contraction |
| 2021-12-31 | contraction | contraction | strong contraction | contraction | contraction | contraction | contraction | strong contraction | strong contraction | contraction | contraction | contraction |
| 2022-03-31 | contraction | contraction | contraction | strong contraction | strong contraction | strong contraction | strong contraction | contraction | strong contraction | contraction | contraction | contraction |
| 2022-06-30 | strong contraction | strong contraction | contraction | strong contraction | strong contraction | strong contraction | strong contraction | contraction | contraction | strong contraction | contraction | contraction |
There is also access to all of the internal data generated when the model runs. The raw Bayesian samples can be retrieved using amb.trace, while the full set of predictions and outturns are available through amb.populate_results():
amb.populate_results()
| | datetime | value | region | measure | type |
|---|---|---|---|---|---|
| 0 | 1990-03-31 | 0.000348 | uk | gva_q_on_q | outturn |
| 1 | 1990-06-30 | -0.001297 | uk | gva_q_on_q | outturn |
| 2 | 1990-09-30 | -0.002929 | uk | gva_q_on_q | outturn |
| 3 | 1990-12-31 | -0.003043 | uk | gva_q_on_q | outturn |
| 4 | 1991-03-31 | -0.001388 | uk | gva_q_on_q | outturn |
| ... | ... | ... | ... | ... | ... |
| 1555 | 2021-06-30 | -0.009820 | region_11 | q_on_q | nowcast |
| 1556 | 2021-09-30 | -0.007707 | region_11 | q_on_q | nowcast |
| 1557 | 2021-12-31 | -0.004921 | region_11 | q_on_q | nowcast |
| 1558 | 2022-03-31 | -0.010416 | region_11 | q_on_q | nowcast |
| 1559 | 2022-06-30 | -0.006142 | region_11 | q_on_q | nowcast |
4868 rows × 5 columns
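Since the results come back in long format, a common next step is to filter on `type` and pivot to one column per region. The sketch below uses a three-row stand-in frame with the same columns (values copied from the table above) rather than a fitted model, so it only illustrates the pandas step.

```python
import pandas as pd

# Stand-in for amb.populate_results(): same columns, three rows from the table above.
results = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["1990-03-31", "2022-03-31", "2022-06-30"]),
        "value": [0.000348, -0.010416, -0.006142],
        "region": ["uk", "region_11", "region_11"],
        "measure": ["gva_q_on_q", "q_on_q", "q_on_q"],
        "type": ["outturn", "nowcast", "nowcast"],
    }
)

# Keep only the nowcast rows, then reshape to dates x regions.
nowcasts = results[results["type"] == "nowcast"]
wide = nowcasts.pivot(index="datetime", columns="region", values="value")
print(wide)
```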
Factor and macro loadings
AMBRIC’s hierarchical loadings — \(\Lambda\) for the regional factors, \(\Gamma\) for the macro covariates, and \(\delta_r\) for the XGBoost bridge signal (see the README for the full specification) — can be inspected after fitting.
amb.assemble_loadings_data().head()
2026-05-10 15:08:18.822 | INFO | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds
| | region | loading_name | broad_type | mean | hdi_low | hdi_high | scaled |
|---|---|---|---|---|---|---|---|
| 0 | region_00 | factor_0 | factors | -0.002508 | -0.019945 | 0.014487 | True |
| 1 | region_00 | factor_1 | factors | 0.000643 | -0.016979 | 0.019846 | True |
| 2 | region_01 | factor_0 | factors | -0.001983 | -0.018213 | 0.014985 | True |
| 3 | region_01 | factor_1 | factors | 0.002221 | -0.016776 | 0.022254 | True |
| 4 | region_02 | factor_0 | factors | -0.000808 | -0.017801 | 0.015884 | True |
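One simple way to read this table is to flag loadings whose interval excludes zero. The sketch below applies that filter to a stand-in frame holding the five rows shown above; it assumes `hdi_low` and `hdi_high` bound a posterior interval, and the variable names are illustrative only.

```python
import pandas as pd

# Stand-in for amb.assemble_loadings_data().head(), copied from the table above.
loadings = pd.DataFrame(
    {
        "region": ["region_00", "region_00", "region_01", "region_01", "region_02"],
        "loading_name": ["factor_0", "factor_1", "factor_0", "factor_1", "factor_0"],
        "mean": [-0.002508, 0.000643, -0.001983, 0.002221, -0.000808],
        "hdi_low": [-0.019945, -0.016979, -0.018213, -0.016776, -0.017801],
        "hdi_high": [0.014487, 0.019846, 0.014985, 0.022254, 0.015884],
    }
)

# A loading is clearly signed when its whole interval lies on one side of zero.
excludes_zero = (loadings["hdi_low"] > 0) | (loadings["hdi_high"] < 0)
clear_loadings = loadings[excludes_zero]
print(len(clear_loadings))
```

For the five rows shown, every interval straddles zero, so the filtered frame is empty.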
amb.plot_loadings_by_region()
2026-05-10 15:08:18.857 | INFO | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds
2026-05-10 15:08:19.862 | INFO | ambric:plot_loadings_by_region:1297 - Plotted loadings by region.
amb.plot_loadings_aggregate()
2026-05-10 15:08:19.886 | INFO | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds
2026-05-10 15:08:20.570 | INFO | ambric:plot_loadings_aggregate:1322 - Plotted loadings in aggregate by broad type.
If you want to persist the fitted posterior to disk for reuse, call amb.save_trace('path/to/trace.nc').