How to use AMBRIC

Let’s import the package.

from ambric import Ambric
from ambric.utilities import generate_realistic_simulated_data

As users, we have to bring a few things to the party. The first, of course, is data, which we will simulate here.

Let’s set up some simulated data. First, we’ll specify how many underlying factors drive the regional dynamics.

n_factors = 2
df = generate_realistic_simulated_data(n_factors=n_factors, R=12)

2026-05-10 15:03:30.396 | INFO     | ambric.utilities:generate_realistic_simulated_data:47 - Simulating Data: T=130, R=12, J=4, Macro=2, Factors=2, lag=6

Data passed to the model must be in long format, with one row per observation and the columns datetime, value, region, and measure:

df.sample(10)

	datetime	value	region	measure
1177	1991-12-31	-0.077809	region_02	regional_covar_01
784	2000-03-31	NaN	region_06	gva_q_on_4q
2822	2013-03-31	-0.153500	region_05	regional_covar_01
2493	1995-12-31	0.254217	region_04	regional_covar_03
2843	2018-06-30	-0.266745	region_05	regional_covar_01
2461	2020-06-30	-0.192657	region_04	regional_covar_02
3436	2004-03-31	-0.061521	region_06	regional_covar_02
5563	2015-12-31	0.125899	region_10	regional_covar_02
1795	2016-06-30	0.133326	region_03	regional_covar_01
4445	1996-06-30	0.177914	region_08	regional_covar_02
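
Before building a model on your own data, it can help to confirm that the frame has this long format. The sketch below is illustrative only: check_long_format is a hypothetical helper, not part of ambric. The required column names are read off the sample above, and the rule that each (datetime, region, measure) combination appears at most once is an assumption.

```python
import pandas as pd

# Column names read off the sample above; this helper is hypothetical, not part of ambric.
REQUIRED_COLUMNS = ["datetime", "value", "region", "measure"]


def check_long_format(df: pd.DataFrame) -> None:
    """Raise ValueError if df lacks the long-format columns or repeats a key."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Assumption: each (datetime, region, measure) combination appears at most once.
    key = ["datetime", "region", "measure"]
    if df.duplicated(subset=key).any():
        raise ValueError("Duplicate (datetime, region, measure) rows found")


toy = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2000-03-31", "2000-03-31", "2000-06-30"]),
        "value": [0.01, float("nan"), -0.02],  # NaN marks an unobserved value
        "region": ["region_01", "region_02", "region_01"],
        "measure": ["regional_covar_01", "gva_q_on_4q", "regional_covar_01"],
    }
)
check_long_format(toy)  # passes silently on well-formed input
```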

The user must also specify what the model will use: in particular, which regions, macroeconomic indicators (these should always have the aggregate region, that is, the top-level geography, as their region), and regional covariates to include. We’ll just use all of them:

aggregation_region = "uk"
region_names = [x for x in df["region"].unique() if x != aggregation_region]
macro_names = [x for x in df["measure"].unique() if "macro" in x]
region_covariate_names = [x for x in df["measure"].unique() if "regional_covar" in x]

Gotcha: you must ensure that the annual regional data have datetime entries for all quarters up to the last published annual macro value. For non-year-end quarters, the values should be NaN.

In a typical use case, you will have non-NaN quarters of quarterly growth at the aggregate region (e.g. the UK) for which the regional data are NaN. Those NaNs in the intervening period are what we are nowcasting.
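
One way to get annual data onto a quarterly grid is to reindex each annual series against a range of quarter-end dates, which leaves NaN in the non-year-end quarters. This is a sketch under assumptions: the values and dates are made up, and only the gva_q_on_4q measure name and the column layout come from the sample shown earlier.

```python
import pandas as pd

# Made-up annual values for one region, dated at year end.
annual = pd.Series(
    [0.021, 0.018],
    index=pd.date_range("2020-12-31", periods=2, freq=pd.offsets.YearEnd()),
)

# Every quarter-end date from 2020 Q1 to the last annual observation.
quarters = pd.date_range("2020-03-31", "2021-12-31", freq=pd.offsets.QuarterEnd())

# Reindexing inserts NaN wherever no annual value exists for that quarter.
padded = annual.reindex(quarters)

# Back to the long format used for model input.
long_rows = pd.DataFrame(
    {
        "datetime": padded.index,
        "value": padded.to_numpy(),
        "region": "region_01",
        "measure": "gva_q_on_4q",
    }
)
print(long_rows)
```

Only the two year-end rows carry values; the remaining six quarters are NaN.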

Okay, we’re ready to build an AMBRIC model!

amb = Ambric(
    df,
    macro_names,
    region_names,
    region_covariate_names,
    n_factors=n_factors,
)
amb

2026-05-10 15:03:30.451 | INFO     | ambric:__init__:813 - Initialising ambric model.
2026-05-10 15:03:30.469 | INFO     | ambric.utilities:prep_data_for_model_run:177 - Prepping data for model run.
2026-05-10 15:03:30.485 | INFO     | ambric.utilities:prep_data_for_model_run:225 - Adding 6 extra rows of nans to q-on-4q regional data to match number of rows in quarterly data; these extra rows will be estimated by the model.
2026-05-10 15:03:30.486 | INFO     | ambric.utilities:prep_data_for_model_run:244 - Per-region trailing NaN counts: [6 6 6 6 6 6 6 6 6 6 6 6]; using modal lag_qtrs=6
2026-05-10 15:03:30.499 | INFO     | ambric.utilities:prep_data_for_model_run:281 - Input lengths after data prep:
2026-05-10 15:03:30.499 | INFO     | ambric.utilities:prep_data_for_model_run:282 -   Quarterly gva_q_on_q: T_max = 130
2026-05-10 15:03:30.500 | INFO     | ambric.utilities:prep_data_for_model_run:283 -   Annual gva_q_on_4q: T_max = 130
2026-05-10 15:03:30.501 | INFO     | ambric.utilities:prep_data_for_model_run:284 -   Macro series: 2
2026-05-10 15:03:30.502 | INFO     | ambric.utilities:prep_data_for_model_run:285 -   Regional covariate series: 4 per region

------------ambric Model-------------
Model ID: e9560875
Parameters:
   Time periods (quarters), T=130
   Earliest: 1990-March; Latest: 2022-June
   Regions, R=12
   Macroeconomic series, M=2
   Regional indicators, J=4
   n_factors=2
Model fitted: No
-------------------------------------

Note that the model reports all of its details, including that 6 rows of the regional data are missing and will be estimated by the model. It also tells us that it isn’t fitted, so let’s sort that. We recommend using at least 100k iterations.

n_iterations = 200000
n_posterior_samples = 3000
amb.fit(n_iterations, n_posterior_samples)

2026-05-10 15:03:30.512 | INFO     | ambric:fit:969 - 
[1/5] Extracting 2 factors from indicator panel...
2026-05-10 15:03:30.513 | INFO     | ambric:extract_factors_from_panel:210 - Creating factors:
2026-05-10 15:03:30.513 | INFO     | ambric:extract_factors_from_panel:211 -    From 4 series per region, creating...
2026-05-10 15:03:30.514 | INFO     | ambric:extract_factors_from_panel:212 -    ...2 factors
2026-05-10 15:03:30.521 | INFO     | ambric:extract_factors_from_panel:232 - FA: Extracted 2 factors. Approximate explained variance: 9.3%
2026-05-10 15:03:30.521 | DEBUG    | ambric:extract_factors_from_panel:235 - Noise variances range: 0.784 - 0.999
2026-05-10 15:03:30.522 | INFO     | ambric:fit:977 - 
[2/5] Training XGBoost on annual regional growth...
2026-05-10 15:03:30.524 | INFO     | ambric:train_xgboost_annual:318 - XGBoost: Training on 360 region-year observations (6 features: 4 regional indicators + 2 macro).
2026-05-10 15:03:30.696 | INFO     | ambric:train_xgboost_annual:351 - XGBoost: In-sample RMSE = 0.017171
2026-05-10 15:03:30.697 | INFO     | ambric:fit:986 - 
[3/5] Fitting MIDAS bridge equation...
2026-05-10 15:03:30.989 | DEBUG    | ambric:fit_bridge_equation:489 - Bridge equation: Estimated Almon params theta=(-1.1461, 0.3572)
2026-05-10 15:03:30.990 | DEBUG    | ambric:fit_bridge_equation:492 - Bridge equation: MIDAS weights = [0.3737, 0.1698, 0.1576, 0.2989] (Q1->Q4)
2026-05-10 15:03:30.993 | INFO     | ambric:fit_bridge_equation:537 - Bridge: R² = 0.0266, RMSE = 0.020115
2026-05-10 15:03:30.994 | DEBUG    | ambric:fit_bridge_equation:538 - Bridge: delta (XGBoost loading) = 0.0262
2026-05-10 15:03:30.995 | DEBUG    | ambric:fit_bridge_equation:539 - Bridge: intercept = 0.0026
2026-05-10 15:03:31.004 | DEBUG    | ambric:fit_bridge_equation:556 - Bridge equation: Quarterly signal shape = (130, 12), mean = 0.000658, std = 0.000814
2026-05-10 15:03:31.005 | INFO     | ambric:fit:998 - 
[4/5] Building AMBRIC Bayesian model...
2026-05-10 15:03:31.006 | INFO     | ambric:build_ambric_model:625 - Building AMBRIC model:
2026-05-10 15:03:31.006 | DEBUG    | ambric:build_ambric_model:626 -    2 factors, 2 macro series, 130 quarters, 12 regions
2026-05-10 15:03:31.007 | DEBUG    | ambric:build_ambric_model:627 -    Bridge signal: shape (130, 12)
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pymc/model/core.py:1316: ImputationWarning: Data in obs_annual contains missing values and will be automatically imputed from the sampling distribution.
  warnings.warn(impute_message, ImputationWarning)
2026-05-10 15:03:31.254 | INFO     | ambric:fit:1009 - 
[5/5] Running variational inference (200000 iterations)...
/home/runner/work/ambric/ambric/.venv/lib/python3.14/site-packages/pytensor/link/c/cmodule.py:2986: UserWarning: PyTensor could not link to a BLAS installation. Operations that might benefit from BLAS will be severely degraded.
This usually happens when PyTensor is installed via pip. We recommend it be installed via conda/mamba/pixi instead.
Alternatively, you can use an experimental backend such as Numba or JAX that perform their own BLAS optimizations, by setting `pytensor.config.mode == 'NUMBA'` or passing `mode='NUMBA'` when compiling a PyTensor function.
For more options and details see https://pytensor.readthedocs.io/en/latest/troubleshooting.html#how-do-i-configure-test-my-blas-library
  warnings.warn(
WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Finished [100%]: Average Loss = 171.46

------------ambric Model-------------
Model ID: e9560875
Parameters:
   Time periods (quarters), T=130
   Earliest: 1990-March; Latest: 2022-June
   Regions, R=12
   Macroeconomic series, M=2
   Regional indicators, J=4
   n_factors=2
   Bridge R²=0.0266, Model fitted: Yes
   Posterior samples: 3000
   ADVI iterations: 200000
-------------------------------------
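
The MIDAS weights in the fit log can be reproduced from the logged Almon parameters if one assumes the normalized exponential Almon form, \(w_k \propto \exp(\theta_1 k + \theta_2 k^2)\), with the lag index \(k\) running from 0 to 3. Both the functional form and the indexing are assumptions inferred from the logged numbers, not taken from the ambric source, and almon_weights is a hypothetical helper.

```python
import math


def almon_weights(theta1: float, theta2: float, n_lags: int = 4) -> list[float]:
    """Normalized exponential Almon weights for lag indices 0..n_lags-1."""
    raw = [math.exp(theta1 * k + theta2 * k**2) for k in range(n_lags)]
    total = sum(raw)
    return [r / total for r in raw]


# theta as printed in the fit log above.
weights = almon_weights(-1.1461, 0.3572)
print([round(w, 4) for w in weights])  # [0.3737, 0.1698, 0.1576, 0.2989]
```

Under these assumptions the result agrees with the logged weights to four decimals, and the weights sum to one by construction.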

That’s it! It’s done. Now let’s look at some results.

First, our regional estimates of quarterly growth must be consistent with the observed national growth. We can check the implied vs the true growth at the national level.

amb.plot_national_quarterly_vs_implied()

2026-05-10 15:08:04.406 | INFO     | ambric.diagnostics:rmse_national_quarterly:118 - UK Quarterly RMSE (true vs implied): 0.0013

2026-05-10 15:08:04.691 | INFO     | ambric:plot_national_quarterly_vs_implied:1125 - Plotted national quarterly growth rates vs implied estimates from the model.
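
The check above compares observed national growth with the growth implied by the regional estimates. As a rough illustration only, the sketch below assumes the implied national figure is a fixed-weight average of regional growth. The weights, the toy numbers, and the implied_national helper are all hypothetical, and the aggregation rule ambric actually uses may differ.

```python
def implied_national(regional_growth: list[list[float]], weights: list[float]) -> list[float]:
    """Weighted average of regional growth for each quarter (one row per quarter)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * g for w, g in zip(weights, row)) for row in regional_growth]


# Toy q-on-q growth for three regions over two quarters (made-up numbers).
regional = [[0.004, 0.002, -0.001], [0.001, 0.003, 0.002]]
weights = [0.5, 0.3, 0.2]  # hypothetical regional shares

implied = implied_national(regional, weights)
observed = [0.0025, 0.0018]  # made-up national outturns
gaps = [obs - imp for obs, imp in zip(observed, implied)]
print(implied, gaps)
```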

Next, let’s look at regional growth (q-on-4q, that is, relative to four quarters earlier) for all regions.

amb.plot_regional_annual_estimate()

2026-05-10 15:08:05.247 | INFO     | ambric.diagnostics:rmse_regions_annual:98 - RMSE (true vs estimated) per region (annual):
2026-05-10 15:08:05.247 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 1: 0.0114
2026-05-10 15:08:05.248 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 2: 0.0113
2026-05-10 15:08:05.249 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 3: 0.0101
2026-05-10 15:08:05.249 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 4: 0.0161
2026-05-10 15:08:05.250 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 5: 0.0107
2026-05-10 15:08:05.250 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 6: 0.0112
2026-05-10 15:08:05.251 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 7: 0.0095
2026-05-10 15:08:05.251 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 8: 0.0117
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 9: 0.0094
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 10: 0.0098
2026-05-10 15:08:05.252 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 11: 0.0100
2026-05-10 15:08:05.253 | INFO     | ambric.diagnostics:rmse_regions_annual:100 -   Region 12: 0.0114
2026-05-10 15:08:05.253 | INFO     | ambric.diagnostics:rmse_regions_annual:101 -   Average: 0.0110

2026-05-10 15:08:07.881 | INFO     | ambric:plot_regional_annual_estimate:1151 - Plotted regional annual growth rates vs estimated from the model.
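
The figures logged above are root-mean-square errors between true and estimated annual growth. A minimal sketch of the metric follows. The rmse helper and the toy series are illustrative, while the per-region values are copied from the log to show that their simple mean is consistent with the reported average.

```python
import math


def rmse(true: list[float], estimated: list[float]) -> float:
    """Root-mean-square error between two equal-length series."""
    if len(true) != len(estimated):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(true, estimated)) / len(true))


# Toy example with made-up growth rates.
print(rmse([0.02, 0.01, -0.03], [0.01, 0.01, -0.01]))

# Per-region RMSEs as printed in the log above (already rounded to four decimals).
per_region = [0.0114, 0.0113, 0.0101, 0.0161, 0.0107, 0.0112,
              0.0095, 0.0117, 0.0094, 0.0098, 0.0100, 0.0114]
average = sum(per_region) / len(per_region)
print(f"{average:.5f}")  # close to the logged average of 0.0110
```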

We can also look at the underlying quarterly regional growth estimates (the latent \(y_{t,r}\)):

amb.plot_estimated_regional_quarterly()

If we want tables of the nowcasts, there are built-ins for that, at either q-on-4q:

amb.point_estimates_q_on_4q().iloc[-3:, :]

	region_00	region_01	region_02	region_03	region_04	region_05	region_06	region_07	region_08	region_09	region_10	region_11
datetime
2021-12-31	-3.75	-2.83	-4.84	-2.50	-4.25	-4.11	-5.20	-4.39	-3.95	-3.13	-3.30	-4.29
2022-03-31	-3.49	-2.54	-4.39	-2.24	-3.71	-3.92	-4.93	-4.27	-3.84	-3.00	-3.20	-3.98
2022-06-30	-3.48	-2.65	-4.02	-2.09	-3.37	-3.56	-4.33	-3.90	-3.55	-2.95	-2.99	-3.58

or q-on-q:

amb.point_estimates_q_on_q().iloc[-3:, :]

	region_00	region_01	region_02	region_03	region_04	region_05	region_06	region_07	region_08	region_09	region_10	region_11
datetime
2021-12-31	-0.45	-0.96	-1.41	-0.70	-1.29	-0.44	-0.45	-1.23	-0.79	-1.01	-0.82	-0.49
2022-03-31	-0.61	-0.26	-0.54	-0.82	-0.65	-1.55	-1.19	-0.43	-0.90	-0.35	-0.73	-1.04
2022-06-30	-0.96	-1.28	-0.77	-0.74	-1.09	-0.88	-0.75	-0.57	-0.57	-1.22	-0.49	-0.61

These can be turned into a regional index, rebased to 100 at the start of the sample:

amb.to_index_q_on_q().iloc[-3:, :]

	region_00	region_01	region_02	region_03	region_04	region_05	region_06	region_07	region_08	region_09	region_10	region_11
datetime
2021-12-31	112.54	104.73	107.18	99.32	105.61	111.12	96.64	99.00	99.58	102.82	102.95	94.33
2022-03-31	111.85	104.45	106.60	98.50	104.93	109.40	95.49	98.58	98.68	102.46	102.20	93.35
2022-06-30	110.78	103.12	105.78	97.77	103.78	108.43	94.78	98.02	98.12	101.21	101.70	92.78
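
The index values are consistent with compounding the q-on-q growth rates, read as percentages. The sketch below applies the last two q-on-q rates for region_00 to its 2021-12-31 index level, with all inputs copied from the tables above. The chain helper is hypothetical and simply illustrates the arithmetic, not ambric’s implementation.

```python
from itertools import accumulate


def chain(level: float, growth_pct: list[float]) -> list[float]:
    """Compound a starting level forward using growth rates given in percent."""
    path = accumulate(growth_pct, lambda lvl, g: lvl * (1 + g / 100), initial=level)
    return list(path)[1:]  # drop the starting level itself


# region_00: index level at 2021-12-31, then q-on-q growth for the next two quarters.
levels = chain(112.54, [-0.61, -0.96])
print([round(x, 2) for x in levels])  # [111.85, 110.78]
```

These match the tabulated index values for 2022-03-31 and 2022-06-30.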

For a coarser, binned signal, bands_indicator() classifies each period into growth bands, which is richer than a binary recession/expansion flag:

amb.bands_indicator().set_index(["region", "datetime"]).unstack(0).tail()

2026-05-10 15:08:11.412 | DEBUG    | pydemetra._java:start_jvm:134 - Starting JVM with 13 JARs, max_heap=512m

	classification
region	region_00	region_01	region_02	region_03	region_04	region_05	region_06	region_07	region_08	region_09	region_10	region_11
datetime
2021-06-30	strong contraction	contraction	strong contraction	contraction	contraction	strong contraction	strong contraction	strong contraction	strong contraction	contraction	contraction	strong contraction
2021-09-30	strong contraction	contraction	strong contraction	contraction	contraction	contraction	strong contraction	strong contraction	strong contraction	contraction	contraction	contraction
2021-12-31	contraction	contraction	strong contraction	contraction	contraction	contraction	contraction	strong contraction	strong contraction	contraction	contraction	contraction
2022-03-31	contraction	contraction	contraction	strong contraction	strong contraction	strong contraction	strong contraction	contraction	strong contraction	contraction	contraction	contraction
2022-06-30	strong contraction	strong contraction	contraction	strong contraction	strong contraction	strong contraction	strong contraction	contraction	contraction	strong contraction	contraction	contraction

There is also access to all of the internal data generated when the model runs. The raw Bayesian samples can be retrieved using amb.trace, while the full set of predictions and outturns are available through amb.populate_results():

amb.populate_results()

	datetime	value	region	measure	type
0	1990-03-31	0.000348	uk	gva_q_on_q	outturn
1	1990-06-30	-0.001297	uk	gva_q_on_q	outturn
2	1990-09-30	-0.002929	uk	gva_q_on_q	outturn
3	1990-12-31	-0.003043	uk	gva_q_on_q	outturn
4	1991-03-31	-0.001388	uk	gva_q_on_q	outturn
...	...	...	...	...	...
1555	2021-06-30	-0.009820	region_11	q_on_q	nowcast
1556	2021-09-30	-0.007707	region_11	q_on_q	nowcast
1557	2021-12-31	-0.004921	region_11	q_on_q	nowcast
1558	2022-03-31	-0.010416	region_11	q_on_q	nowcast
1559	2022-06-30	-0.006142	region_11	q_on_q	nowcast

4868 rows × 5 columns
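
Because the results come back in long format, a pivot is a convenient way to get one column per region. The sketch below runs on a small hand-built frame with the same columns as the output above. The uk and region_11 values are copied from the rows shown, and the region_10 values are rounded from the q-on-q table, so this is an illustration rather than real model output. With a fitted model you would start from amb.populate_results() instead.

```python
import pandas as pd

# Hand-built stand-in for the long-format results frame.
results = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["1990-03-31", "2022-03-31", "2022-06-30", "2022-03-31", "2022-06-30"]
        ),
        "value": [0.000348, -0.010416, -0.006142, -0.0073, -0.0049],
        "region": ["uk", "region_11", "region_11", "region_10", "region_10"],
        "measure": ["gva_q_on_q", "q_on_q", "q_on_q", "q_on_q", "q_on_q"],
        "type": ["outturn", "nowcast", "nowcast", "nowcast", "nowcast"],
    }
)

# Keep only the quarterly nowcasts, then spread regions across columns.
mask = (results["type"] == "nowcast") & (results["measure"] == "q_on_q")
wide = results[mask].pivot(index="datetime", columns="region", values="value")
print(wide)
```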

Factor and macro loadings

AMBRIC’s hierarchical loadings — \(\Lambda\) for the regional factors, \(\Gamma\) for the macro covariates, and \(\delta_r\) for the XGBoost bridge signal (see the README for the full specification) — can be inspected after fitting.

amb.assemble_loadings_data().head()

2026-05-10 15:08:18.822 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds

	region	loading_name	broad_type	mean	hdi_low	hdi_high	scaled
0	region_00	factor_0	factors	-0.002508	-0.019945	0.014487	True
1	region_00	factor_1	factors	0.000643	-0.016979	0.019846	True
2	region_01	factor_0	factors	-0.001983	-0.018213	0.014985	True
3	region_01	factor_1	factors	0.002221	-0.016776	0.022254	True
4	region_02	factor_0	factors	-0.000808	-0.017801	0.015884	True
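
One common way to read a table like this is to ask whether each loading’s interval excludes zero. The sketch below applies that test to the five rows shown above, entered as plain tuples. The excludes_zero helper is hypothetical, and hdi_low and hdi_high are assumed to bound a highest-density interval whose coverage level is not stated here.

```python
def excludes_zero(hdi_low: float, hdi_high: float) -> bool:
    """True when the interval lies entirely above or entirely below zero."""
    return hdi_low > 0 or hdi_high < 0


# (region, loading_name, mean, hdi_low, hdi_high), copied from the table above.
rows = [
    ("region_00", "factor_0", -0.002508, -0.019945, 0.014487),
    ("region_00", "factor_1", 0.000643, -0.016979, 0.019846),
    ("region_01", "factor_0", -0.001983, -0.018213, 0.014985),
    ("region_01", "factor_1", 0.002221, -0.016776, 0.022254),
    ("region_02", "factor_0", -0.000808, -0.017801, 0.015884),
]

clear = [(region, name) for region, name, _, low, high in rows if excludes_zero(low, high)]
print(clear)  # [] because every interval shown straddles zero
```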

amb.plot_loadings_by_region()

2026-05-10 15:08:18.857 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds

2026-05-10 15:08:19.862 | INFO     | ambric:plot_loadings_by_region:1297 - Plotted loadings by region.

amb.plot_loadings_aggregate()

2026-05-10 15:08:19.886 | INFO     | ambric.diagnostics:assemble_loadings_data:688 - Assembled loadings data: 60 rows across 12 regions, broad types: ['factors', 'macro', 'boost_signal'], scaled by variable stds

2026-05-10 15:08:20.570 | INFO     | ambric:plot_loadings_aggregate:1322 - Plotted loadings in aggregate by broad type.

If you want to persist the fitted posterior to disk for reuse, call amb.save_trace('path/to/trace.nc').