How to use AMBRIC

Let’s import the package.

from ambric import Ambric
from ambric.utilities import generate_realistic_simulated_data

As users, we have to bring a few things to the party. The first, of course, is data, which we will simulate here.

Let’s set up some simulated data. We’ll first specify how many underlying factors drive the regional dynamics.

n_factors = 2
df = generate_realistic_simulated_data(n_factors=n_factors, R=12)
2026-05-10 15:03:30.396 | INFO     | ambric.utilities:generate_realistic_simulated_data:47 - Simulating Data: T=130, R=12, J=4, Macro=2, Factors=2, lag=6

Data that are input into the model must have this structure:

df.sample(10)
        datetime     value     region            measure
1177  1991-12-31 -0.077809  region_02  regional_covar_01
784   2000-03-31       NaN  region_06        gva_q_on_4q
2822  2013-03-31 -0.153500  region_05  regional_covar_01
2493  1995-12-31  0.254217  region_04  regional_covar_03
2843  2018-06-30 -0.266745  region_05  regional_covar_01
2461  2020-06-30 -0.192657  region_04  regional_covar_02
3436  2004-03-31 -0.061521  region_06  regional_covar_02
5563  2015-12-31  0.125899  region_10  regional_covar_02
1795  2016-06-30  0.133326  region_03  regional_covar_01
4445  1996-06-30  0.177914  region_08  regional_covar_02
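
As a rough illustration of that long layout, the sketch below builds a tiny frame with the same four columns and runs two basic structural checks. It assumes only pandas; the values are invented, and `macro_01` is a hypothetical measure name rather than one taken from the simulated data.

```python
import pandas as pd

# Toy long-format frame with the same four columns as the sample above.
# All values are made up; "macro_01" is a hypothetical measure name.
rows = [
    ("2021-12-31", 0.004, "uk", "gva_q_on_q"),
    ("2021-12-31", 0.120, "uk", "macro_01"),
    ("2021-12-31", 0.021, "region_00", "gva_q_on_4q"),
    ("2021-12-31", -0.078, "region_00", "regional_covar_01"),
    ("2022-03-31", 0.002, "uk", "gva_q_on_q"),
    ("2022-03-31", float("nan"), "region_00", "gva_q_on_4q"),  # non-year-end quarter
]
toy = pd.DataFrame(rows, columns=["datetime", "value", "region", "measure"])
toy["datetime"] = pd.to_datetime(toy["datetime"])

# Structural checks: the expected columns are present, and no
# (datetime, region, measure) combination appears twice.
expected_columns = ["datetime", "value", "region", "measure"]
has_expected_columns = list(toy.columns) == expected_columns
has_duplicates = bool(toy.duplicated(subset=["datetime", "region", "measure"]).any())
print(has_expected_columns, has_duplicates)
```

Checks like these are cheap to run on real data before handing it to the model.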

The user must also specify what the model will use: which regions, which macroeconomic indicators (these should always have the aggregate region, i.e. the top-level geography, as their region), and which regional covariates. We’ll just use all of them:

aggregation_region = "uk"
region_names = [x for x in df["region"].unique() if x != aggregation_region]
macro_names = [x for x in df["measure"].unique() if "macro" in x]
region_covariate_names = [x for x in df["measure"].unique() if "regional_covar" in x]
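
Because macroeconomic indicators are expected to sit under the aggregate region, it can be worth checking that before building the model. The following is a minimal sketch on a toy frame, assuming pandas; `macro_01` is a hypothetical measure name, and the check itself is not part of the package.

```python
import pandas as pd

aggregation_region = "uk"  # as in the tutorial

# Toy frame; "macro_01" is a hypothetical measure name containing "macro".
toy = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2021-12-31"] * 3),
        "value": [0.12, -0.08, 0.02],
        "region": ["uk", "region_00", "region_00"],
        "measure": ["macro_01", "regional_covar_01", "gva_q_on_4q"],
    }
)

is_macro = toy["measure"].str.contains("macro")
# Regions that carry a macro measure; ideally only the aggregate region.
macro_regions = set(toy.loc[is_macro, "region"])
misplaced = macro_regions - {aggregation_region}
print(sorted(misplaced))
```

An empty list means every macro series is attached to the aggregate region.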

Gotcha: you must ensure that the annual regional data have datetime index entries for all quarters up to the last published annual macro value. For non-year-end quarters, the values should be NaN.

In a typical use case, you will have non-NaN quarters of quarterly growth at the aggregate region (e.g. the UK) for which the regional data are NaN. Those NaNs in the intervening period are what we are nowcasting.
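
One way to get annual figures onto a quarterly index, with NaN everywhere except year-end quarters, is to reindex them. This is a sketch with invented numbers, assuming pandas; the end date of 2022-06-30 is an assumption chosen to mimic two unpublished quarters after the last annual figure.

```python
import pandas as pd

# Hypothetical annual growth figures, dated at the year-end quarter.
annual = pd.Series(
    [0.021, 0.018],
    index=pd.to_datetime(["2020-12-31", "2021-12-31"]),
    name="value",
)

# Quarter-end dates (March, June, September, December) running up to the
# assumed last published quarter, 2022-06-30.
quarters = pd.date_range(
    "2020-03-31", periods=10, freq=pd.offsets.QuarterEnd(startingMonth=12)
)

# Reindexing leaves NaN wherever no annual figure exists.
annual_on_quarterly = annual.reindex(quarters)
print(annual_on_quarterly)
```

The two trailing NaNs after 2021-12-31 are the kind of gap the model is meant to fill.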

Okay, we’re ready to build an AMBRIC model!

amb = Ambric(
    df,
    macro_names,
    region_names,
    region_covariate_names,
    n_factors=n_factors,
)
amb
2026-05-10 15:03:30.451 | INFO     | ambric:__init__:813 - Initialising ambric model.
2026-05-10 15:03:30.469 | INFO     | ambric.utilities:prep_data_for_model_run:177 - Prepping data for model run.
2026-05-10 15:03:30.485 | INFO     | ambric.utilities:prep_data_for_model_run:225 - Adding 6 extra rows of nans to q-on-4q regional data to match number of rows in quarterly data; these extra rows will be estimated by the model.
2026-05-10 15:03:30.486 | INFO     | ambric.utilities:prep_data_for_model_run:244 - Per-region trailing NaN counts: [6 6 6 6 6 6 6 6 6 6 6 6]; using modal lag_qtrs=6
2026-05-10 15:03:30.499 | INFO     | ambric.utilities:prep_data_for_model_run:281 - Input lengths after data prep:
2026-05-10 15:03:30.499 | INFO     | ambric.utilities:prep_data_for_model_run:282 -   Quarterly gva_q_on_q: T_max = 130
2026-05-10 15:03:30.500 | INFO     | ambric.utilities:prep_data_for_model_run:283 -   Annual gva_q_on_4q: T_max = 130
2026-05-10 15:03:30.501 | INFO     | ambric.utilities:prep_data_for_model_run:284 -   Macro series: 2
2026-05-10 15:03:30.502 | INFO     | ambric.utilities:prep_data_for_model_run:285 -   Regional covariate series: 4 per region
------------ambric Model-------------
Model ID: e9560875
Parameters:
   Time periods (quarters), T=130
   Earliest: 1990-March; Latest: 2022-June
   Regions, R=12
   Macroeconomic series, M=2
   Regional indicators, J=4
   n_factors=2
Model fitted: No
-------------------------------------
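
The data-prep log above reports per-region trailing NaN counts and a modal lag. The idea can be sketched as follows with NumPy on a toy panel; this is an illustration of the concept only, not the package's own implementation, and `trailing_nan_count` is a hypothetical helper.

```python
import numpy as np


def trailing_nan_count(column):
    """Number of consecutive NaNs at the end of a 1-D array."""
    is_nan = np.isnan(column)
    if is_nan.all():
        return len(column)
    # Index of the first non-NaN value when reading from the end.
    return int(np.argmax(~is_nan[::-1]))


# Toy panel: 8 quarters x 3 regions, with the last rows unpublished.
panel = np.arange(24, dtype=float).reshape(8, 3)
panel[-2:, 0] = np.nan
panel[-2:, 1] = np.nan
panel[-3:, 2] = np.nan

counts = [trailing_nan_count(panel[:, r]) for r in range(panel.shape[1])]
values, freqs = np.unique(counts, return_counts=True)
modal_lag = int(values[np.argmax(freqs)])
print(counts, modal_lag)
```

Here two regions lag by two quarters and one by three, so the modal lag is two.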

Note that the model reports all of its details, including that 6 rows of the regional data are missing and will be estimated by the model. The model also tells us it isn’t fitted, so let’s sort that. We recommend using at least 100k iterations.

n_iterations = 200000
n_posterior_samples = 3000
amb.fit(n_iterations, n_posterior_samples)
2026-05-10 15:03:30.512 | INFO     | ambric:fit:969 - 
[1/5] Extracting 2 factors from indicator panel...
2026-05-10 15:03:30.513 | INFO     | ambric:extract_factors_from_panel:210 - Creating factors:
2026-05-10 15:03:30.513 | INFO     | ambric:extract_factors_from_panel:211 -    From 4 series per region, creating...
2026-05-10 15:03:30.514 | INFO     | ambric:extract_factors_from_panel:212 -    ...2 factors
2026-05-10 15:03:30.521 | INFO     | ambric:extract_factors_from_panel:232 - FA: Extracted 2 factors. Approximate explained variance: 9.3%
2026-05-10 15:03:30.521 | DEBUG    | ambric:extract_factors_from_panel:235 - Noise variances range: 0.784 - 0.999
2026-05-10 15:03:30.522 | INFO     | ambric:fit:977 - 
[2/5] Training XGBoost on annual regional growth...
2026-05-10 15:03:30.524 | INFO     | ambric:train_xgboost_annual:318 - XGBoost: Training on 360 region-year observations (6 features: 4 regional indicators + 2 macro).
2026-05-10 15:03:30.696 | INFO     | ambric:train_xgboost_annual:351 - XGBoost: In-sample RMSE = 0.017171
2026-05-10 15:03:30.697 | INFO     | ambric:fit:986 - 
[3/5] Fitting MIDAS bridge equation...
2026-05-10 15:03:30.989 | DEBUG    | ambric:fit_bridge_equation:489 - Bridge equation: Estimated Almon params theta=(-1.1461, 0.3572)
2026-05-10 15:03:30.990 | DEBUG    | ambric:fit_bridge_equation:492 - Bridge equation: MIDAS weights = [0.3737, 0.1698, 0.1576, 0.2989] (Q1->Q4)
2026-05-10 15:03:30.993 | INFO     | ambric:fit_bridge_equation:537 - Bridge: R² = 0.0266, RMSE = 0.020115
2026-05-10 15:03:30.994 | DEBUG    | ambric:fit_bridge_equation:538 - Bridge: delta (XGBoost loading) = 0.0262
2026-05-10 15:03:30.995 | DEBUG    | ambric:fit_bridge_equation:539 - Bridge: intercept = 0.0026
2026-05-10 15:03:31.004 | DEBUG    | ambric:fit_bridge_equation:556 - Bridge equation: Quarterly signal shape = (130, 12), mean = 0.000658, std = 0.000814
2026-05-10 15:03:31.005 | INFO     | ambric:fit:998 - 
[4/5] Building AMBRIC Bayesian model...
2026-05-10 15:03:31.006 | INFO     | ambric:build_ambric_model:625 - Building AMBRIC model:
2026-05-10 15:03:31.006 | DEBUG    | ambric:build_ambric_model:626 -    2 factors, 2 macro series, 130 quarters, 12 regions
2026-05-10 15:03:31.007 | DEBUG    | ambric:build_ambric_model:627 -    Bridge signal: shape (130, 12)
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pymc/model/core.py:1316: ImputationWarning: Data in obs_annual contains missing values and will be automatically imputed from the sampling distribution.
  warnings.warn(impute_message, ImputationWarning)
2026-05-10 15:03:31.254 | INFO     | ambric:fit:1009 - 
[5/5] Running variational inference (200000 iterations)...
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pytensor/link/c/cmodule.py:2986: UserWarning: PyTensor could not link to a BLAS installation. Operations that might benefit from BLAS will be severely degraded.
This usually happens when PyTensor is installed via pip. We recommend it be installed via conda/mamba/pixi instead.
Alternatively, you can use an experimental backend such as Numba or JAX that perform their own BLAS optimizations, by setting `pytensor.config.mode == 'NUMBA'` or passing `mode='NUMBA'` when compiling a PyTensor function.
For more options and details see https://pytensor.readthedocs.io/en/latest/troubleshooting.html#how-do-i-configure-test-my-blas-library
  warnings.warn(
WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Finished [100%]: Average Loss = 171.46
------------ambric Model-------------
Model ID: e9560875
Parameters:
   Time periods (quarters), T=130
   Earliest: 1990-March; Latest: 2022-June
   Regions, R=12
   Macroeconomic series, M=2
   Regional indicators, J=4
   n_factors=2
   Bridge R²=0.0266, Model fitted: Yes
   Posterior samples: 3000
   ADVI iterations: 200000
-------------------------------------
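
The fitting log above prints two Almon parameters and four MIDAS weights for Q1 to Q4. Those weights are consistent with a normalised exponential Almon polynomial evaluated at lag indices 0 to 3. That functional form and indexing are inferred from the logged numbers, not taken from the package source, so treat the sketch below as an illustration.

```python
import math

# Almon parameters as printed in the fitting log above (rounded to 4 d.p.).
theta1, theta2 = -1.1461, 0.3572

# Exponential Almon polynomial over the four quarters of a year, with
# lag index k = 0, 1, 2, 3 (an assumption inferred from the log).
raw = [math.exp(theta1 * k + theta2 * k**2) for k in range(4)]
total = sum(raw)
weights = [w / total for w in raw]

print([round(w, 4) for w in weights])  # [0.3737, 0.1698, 0.1576, 0.2989]
```

The weights sum to one by construction, and in this run they are U-shaped, with the first and last quarters weighted most heavily.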

That’s it! It’s done. Now let’s look at some results.

First, our regional estimates of quarterly growth must be consistent with the observed national growth. We can check the implied vs the true growth at the national level.

amb.plot_national_quarterly_vs_implied()
2026-05-10 15:08:04.406 | INFO     | ambric.diagnostics:rmse_national_quarterly:118 - UK Quarterly RMSE (true vs implied): 0.0013

2026-05-10 15:08:04.691 | INFO     | ambric:plot_national_quarterly_vs_implied:1125 - Plotted national quarterly growth rates vs implied estimates from the model.
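
To make the consistency check concrete, here is a minimal sketch of an RMSE between true and implied national growth. It assumes, purely for illustration, that implied national growth is a fixed-weight average of regional growth rates; the weights and all numbers are invented, and the package's actual aggregation may differ.

```python
import numpy as np

# Toy quarterly growth rates: 4 quarters x 3 regions (made-up numbers).
regional_growth = np.array(
    [
        [0.004, 0.002, 0.006],
        [0.001, 0.003, 0.002],
        [-0.002, 0.000, 0.001],
        [0.003, 0.004, 0.005],
    ]
)
# Hypothetical regional shares of national output; they sum to one.
weights = np.array([0.5, 0.3, 0.2])

implied_national = regional_growth @ weights
true_national = np.array([0.004, 0.002, -0.001, 0.004])

rmse = float(np.sqrt(np.mean((true_national - implied_national) ** 2)))
print(implied_national, rmse)
```

A small RMSE relative to the size of the growth rates indicates the regional estimates add up to something close to the observed national series.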

Next, let’s look at regional growth (each quarter on the same quarter a year earlier, q-on-4q) for all regions.

amb.plot_regional_annual_estimate()
2026-05-10 15:08:05.247 | INFO     | ambric.diagnostics:rmse_regions_annual:98 - RMSE (true vs estimated) per region (annual):
2026-05-10 15:08:05.247 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 1: 0.0114
2026-05-10 15:08:05.248 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 2: 0.0113
2026-05-10 15:08:05.249 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 3: 0.0101
2026-05-10 15:08:05.249 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 4: 0.0161
2026-05-10 15:08:05.250 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 5: 0.0107
2026-05-10 15:08:05.250 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 6: 0.0112
2026-05-10 15:08:05.251 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 7: 0.0095
2026-05-10 15:08:05.251 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 8: 0.0117
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 9: 0.0094
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 10: 0.0098
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 11: 0.0100
2026-05-10 15:08:05.253 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 12: 0.0114
2026-05-10 15:08:05.253 | INFO     | ambric.diagnostics:rmse_regions_annual:101 -   Average: 0.0110

2026-05-10 15:08:07.881 | INFO     | ambric:plot_regional_annual_estimate:1151 - Plotted regional annual growth rates vs estimated from the model.

We can also look at the underlying quarterly regional growth estimates (the latent \(y_{t,r}\)):

amb.plot_estimated_regional_quarterly()

And, if we want tables of the nowcasts, there’s a built-in method for that, at either q-on-4q:

amb.point_estimates_q_on_4q().iloc[-3:, :]
datetime    region_00  region_01  region_02  region_03  region_04  region_05  region_06  region_07  region_08  region_09  region_10  region_11
2021-12-31      -3.75      -2.83      -4.84      -2.50      -4.25      -4.11      -5.20      -4.39      -3.95      -3.13      -3.30      -4.29
2022-03-31      -3.49      -2.54      -4.39      -2.24      -3.71      -3.92      -4.93      -4.27      -3.84      -3.00      -3.20      -3.98
2022-06-30      -3.48      -2.65      -4.02      -2.09      -3.37      -3.56      -4.33      -3.90      -3.55      -2.95      -2.99      -3.58

or q-on-q:

amb.point_estimates_q_on_q().iloc[-3:, :]
datetime    region_00  region_01  region_02  region_03  region_04  region_05  region_06  region_07  region_08  region_09  region_10  region_11
2021-12-31      -0.45      -0.96      -1.41      -0.70      -1.29      -0.44      -0.45      -1.23      -0.79      -1.01      -0.82      -0.49
2022-03-31      -0.61      -0.26      -0.54      -0.82      -0.65      -1.55      -1.19      -0.43      -0.90      -0.35      -0.73      -1.04
2022-06-30      -0.96      -1.28      -0.77      -0.74      -1.09      -0.88      -0.75      -0.57      -0.57      -1.22      -0.49      -0.61

These can be turned into a regional index, rebased to 100 at the start of the sample:

amb.to_index_q_on_q().iloc[-3:, :]
datetime    region_00  region_01  region_02  region_03  region_04  region_05  region_06  region_07  region_08  region_09  region_10  region_11
2021-12-31     112.54     104.73     107.18      99.32     105.61     111.12      96.64      99.00      99.58     102.82     102.95      94.33
2022-03-31     111.85     104.45     106.60      98.50     104.93     109.40      95.49      98.58      98.68     102.46     102.20      93.35
2022-06-30     110.78     103.12     105.78      97.77     103.78     108.43      94.78      98.02      98.12     101.21     101.70      92.78
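
The index figures above are consistent with compounding the q-on-q point estimates. The sketch below rolls the region_00 index forward from its 2021-12-31 level using the rounded q-on-q rates from the earlier table. That the package builds the index this way is an assumption based on those figures, and `chain_index` is a hypothetical helper.

```python
def chain_index(start_level, growth_pct):
    """Roll an index level forward through successive percentage growth rates."""
    levels = []
    level = start_level
    for g in growth_pct:
        level = level * (1 + g / 100)
        levels.append(level)
    return levels


# region_00: index level at 2021-12-31, then the q-on-q point estimates
# (in percent) for 2022-03-31 and 2022-06-30, as shown in the tables.
levels = chain_index(112.54, [-0.61, -0.96])
print([round(x, 2) for x in levels])  # [111.85, 110.78]
```

Both values match the region_00 column of the index table to two decimal places.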

For a less granular binned signal, bands_indicator() classifies each period into growth bands rather than just recession/expansion:

amb.bands_indicator().set_index(["region", "datetime"]).unstack(0).tail()
2026-05-10 15:08:11.412 | DEBUG    | pydemetra._java:start_jvm:134 - Starting JVM with 13 JARs, max_heap=512m
classification
datetime    region_00           region_01           region_02           region_03           region_04           region_05           region_06           region_07           region_08           region_09           region_10           region_11
2021-06-30  strong contraction  contraction         strong contraction  contraction         contraction         strong contraction  strong contraction  strong contraction  strong contraction  contraction         contraction         strong contraction
2021-09-30  strong contraction  contraction         strong contraction  contraction         contraction         contraction         strong contraction  strong contraction  strong contraction  contraction         contraction         contraction
2021-12-31  contraction         contraction         strong contraction  contraction         contraction         contraction         contraction         strong contraction  strong contraction  contraction         contraction         contraction
2022-03-31  contraction         contraction         contraction         strong contraction  strong contraction  strong contraction  strong contraction  contraction         strong contraction  contraction         contraction         contraction
2022-06-30  strong contraction  strong contraction  contraction         strong contraction  strong contraction  strong contraction  strong contraction  contraction         contraction         strong contraction  contraction         contraction
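
A long frame of classifications like this is easy to summarise. The sketch below counts how many regions fall into each band per quarter, assuming pandas and the three columns implied by the call above (region, datetime, classification). The labels are copied from the first four regions in the last two rows of the output; how the package assigns bands is not assumed here.

```python
import pandas as pd

# Long-format toy frame: first four regions, last two quarters of the
# output above.
bands = pd.DataFrame(
    {
        "region": ["region_00", "region_01", "region_02", "region_03"] * 2,
        "datetime": pd.to_datetime(["2022-03-31"] * 4 + ["2022-06-30"] * 4),
        "classification": [
            "contraction", "contraction", "contraction", "strong contraction",
            "strong contraction", "strong contraction", "contraction",
            "strong contraction",
        ],
    }
)

# Count how many regions fall into each band in each quarter.
counts = pd.crosstab(bands["datetime"], bands["classification"])
print(counts)
```

In this subset, one region is in strong contraction in 2022-03-31 and three are in 2022-06-30.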

You can also access all of the internal data generated when the model runs. The raw Bayesian samples can be retrieved using amb.trace, while the full set of predictions and outturns is available through amb.populate_results():

amb.populate_results()
        datetime     value     region     measure     type
0     1990-03-31  0.000348         uk  gva_q_on_q  outturn
1     1990-06-30 -0.001297         uk  gva_q_on_q  outturn
2     1990-09-30 -0.002929         uk  gva_q_on_q  outturn
3     1990-12-31 -0.003043         uk  gva_q_on_q  outturn
4     1991-03-31 -0.001388         uk  gva_q_on_q  outturn
...          ...       ...        ...         ...      ...
1555  2021-06-30 -0.009820  region_11      q_on_q  nowcast
1556  2021-09-30 -0.007707  region_11      q_on_q  nowcast
1557  2021-12-31 -0.004921  region_11      q_on_q  nowcast
1558  2022-03-31 -0.010416  region_11      q_on_q  nowcast
1559  2022-06-30 -0.006142  region_11      q_on_q  nowcast

4868 rows × 5 columns
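
Since the results come back in long format with a type column, ordinary pandas filtering and reshaping applies. This is a small sketch on a toy frame that copies four of the rows shown above; it does not call the package.

```python
import pandas as pd

# Toy frame with the same five columns as the output above; the values
# are copied from rows shown there.
results = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["1990-03-31", "1990-06-30", "2022-03-31", "2022-06-30"]
        ),
        "value": [0.000348, -0.001297, -0.010416, -0.006142],
        "region": ["uk", "uk", "region_11", "region_11"],
        "measure": ["gva_q_on_q", "gva_q_on_q", "q_on_q", "q_on_q"],
        "type": ["outturn", "outturn", "nowcast", "nowcast"],
    }
)

# Keep only the nowcasts and reshape to one column per region.
nowcasts = results[results["type"] == "nowcast"]
wide = nowcasts.pivot(index="datetime", columns="region", values="value")
print(wide)
```

The same pattern with `type == "outturn"` would isolate the observed series instead.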

Factor and macro loadings

AMBRIC’s hierarchical loadings — \(\Lambda\) for the regional factors, \(\Gamma\) for the macro covariates, and \(\delta_r\) for the XGBoost bridge signal (see the README for the full specification) — can be inspected after fitting.

amb.assemble_loadings_data().head()
2026-05-10 15:08:18.822 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds
      region  loading_name  broad_type      mean   hdi_low  hdi_high  scaled
0  region_00      factor_0     factors -0.002508 -0.019945  0.014487    True
1  region_00      factor_1     factors  0.000643 -0.016979  0.019846    True
2  region_01      factor_0     factors -0.001983 -0.018213  0.014985    True
3  region_01      factor_1     factors  0.002221 -0.016776  0.022254    True
4  region_02      factor_0     factors -0.000808 -0.017801  0.015884    True
amb.plot_loadings_by_region()
2026-05-10 15:08:18.857 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds

2026-05-10 15:08:19.862 | INFO     | ambric:plot_loadings_by_region:1297 - Plotted loadings by region.
amb.plot_loadings_aggregate()
2026-05-10 15:08:19.886 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds

2026-05-10 15:08:20.570 | INFO     | ambric:plot_loadings_aggregate:1322 - Plotted loadings in aggregate by broad type.
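
One simple way to read a loadings table like the one returned by assemble_loadings_data() is to flag loadings whose interval lies entirely on one side of zero. The sketch below does that with pandas, using the five rows shown earlier. The `excludes_zero` column is a hypothetical addition for illustration, not something the package produces.

```python
import pandas as pd

# The five rows shown in the loadings table above.
loadings = pd.DataFrame(
    {
        "region": ["region_00", "region_00", "region_01", "region_01", "region_02"],
        "loading_name": ["factor_0", "factor_1", "factor_0", "factor_1", "factor_0"],
        "mean": [-0.002508, 0.000643, -0.001983, 0.002221, -0.000808],
        "hdi_low": [-0.019945, -0.016979, -0.018213, -0.016776, -0.017801],
        "hdi_high": [0.014487, 0.019846, 0.014985, 0.022254, 0.015884],
    }
)

# Flag a loading when its interval lies entirely above or below zero.
loadings["excludes_zero"] = (loadings["hdi_low"] > 0) | (loadings["hdi_high"] < 0)
print(loadings[["region", "loading_name", "excludes_zero"]])
```

For these five rows every interval straddles zero, so none of the loadings is flagged.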

If you want to persist the fitted posterior to disk for reuse, call amb.save_trace('path/to/trace.nc').