----------------------------------------------------------------------
This is the API documentation for the ambric library.
----------------------------------------------------------------------


## Classes

Main classes provided by the package


Ambric(df: pandas.core.frame.DataFrame, macro_names: list[str], region_names: list[str], region_covariate_names: list[str], n_factors: int = 4, aggregate_measure: str = 'gva_q_on_q', aggregation_region: str = 'uk', region_measure: str = 'gva_q_on_4q', region_q_on_q_measure: str | None = None)

Augmented Mixed-frequency Bayesian Regional Inference with Constraints.

Combines factor-analytic Bayesian state-space inference with XGBoost-driven
predictions via a MIDAS bridge equation for regional nowcasting.


## Ambric Methods

Methods for the Ambric class


__repr__(self) -> str

fit(self, n_model_fit_iterations: int = 200000, n_posterior_samples: int = 3000, xgb_params: dict | None = None, bridge_use_almon: bool = True, bridge_ridge_alpha: float = 1.0) -> 'Ambric'

Fit the ambric model.

Pipeline:
    1. Extract factors from regional indicator panel.
    2. Train XGBoost on annually-aggregated raw indicators to predict
       annual regional growth.
    3. Fit MIDAS bridge equation to disaggregate XGBoost annual
       predictions to quarterly frequency.
    4. Build Bayesian state-space model with factors, macro, and
       bridge signal.
    5. Run variational inference.

Args:
    n_model_fit_iterations: Number of ADVI iterations.
    n_posterior_samples: Number of posterior samples to draw.
    xgb_params: Optional XGBoost hyperparameters override.
    bridge_use_almon: Use Almon polynomial for MIDAS weights.
    bridge_ridge_alpha: Ridge regularisation for bridge equation.

Returns:
    Self for method chaining.
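
The `bridge_use_almon` option refers to Almon-style lag weights in the MIDAS bridge equation. The library's exact parameterisation, lag count, and normalisation are not documented here, so the sketch below is only an illustration. It uses the exponential Almon variant (chosen because it guarantees positive weights that sum to one), and the names `exp_almon_weights`, `theta`, and `n_lags` are assumptions rather than part of the ambric API.

```python
import numpy as np


def exp_almon_weights(theta: tuple[float, float], n_lags: int) -> np.ndarray:
    """Exponential Almon lag weights: w_j proportional to exp(theta1*j + theta2*j^2)."""
    j = np.arange(1, n_lags + 1, dtype=float)
    z = theta[0] * j + theta[1] * j**2
    z -= z.max()  # stabilise the exponent before normalising
    w = np.exp(z)
    return w / w.sum()


# Four quarterly lag weights with hypothetical shape parameters
weights = exp_almon_weights(theta=(0.1, -0.05), n_lags=4)
```

With `theta=(0.0, 0.0)` the weights collapse to equal weighting, which is a convenient sanity check.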

save_trace(self, path: str | pathlib.Path) -> None

Saves the model trace to a NetCDF file.

Args:
    path (str | Path): Path to save the trace file

populate_results(self) -> pandas.core.frame.DataFrame

Returns results from model estimation, and original data, in format:

| datetime | region | value | measure | type

where type can be "outturn" or "nowcast"

Raises:
    ValueError: If model not fitted

Returns:
    pd.DataFrame: Dataframe of results

plot_national_quarterly_vs_implied(self, path: str | pathlib.Path | None = None) -> None

Plot national quarterly growth rates vs implied estimates from the model.

Args:
    path (str | Path | None, optional): Save dir for image. Defaults to None.

Raises:
    ValueError: If model not fitted.

plot_regional_annual_estimate(self, path: str | pathlib.Path | None = None) -> None

Plot regional annual growth rates vs estimates from the model.

Args:
    path (str | Path | None, optional): Dir to save fig to. Defaults to None.

Raises:
    ValueError: If model not fitted.

plot_single_region_annual_estimate(self, region_name: str, path: str | pathlib.Path | None = None) -> None

Plot a single region's annual growth rates vs estimates from the model.

Args:
    region_name (str): The region to plot.
    path (str | Path | None, optional): Dir to save fig to. Defaults to None.

Raises:
    ValueError: If model not fitted.

plot_estimated_regional_quarterly(self, path: str | pathlib.Path | None = None) -> None

Plot estimated regional quarterly growth rates from the model.

Args:
    path (str | Path | None, optional): Dir to save fig to. Defaults to None.

Raises:
    ValueError: If model not fitted.

plot_current_nowcast(self, path: pathlib.Path | None = None) -> None

Plot the latest nowcast (i.e. the period for which no annual regional observations are available).

Args:
    path (Path | None, optional): Dir to save figure to. Defaults to None.

Raises:
    ValueError: If model not fitted.

assemble_loadings_data(self) -> pandas.core.frame.DataFrame

Assemble estimated loadings from the model posterior.

Separates data assembly from plotting so the returned frame can be
inspected, exported, or passed to the companion plot methods.  The
frame contains one row per (region, loading) combination with the
posterior mean and 94 % HDI bounds.  Loadings are scaled by the
standard deviation of their corresponding input variable so that
the three signal types are on a comparable *contribution* scale.

Raises:
    ValueError: If the model has not been fitted yet.

Returns:
    pd.DataFrame: Long-format loadings frame; see
        :func:`~ambric.diagnostics.assemble_loadings_data` for
        column details.
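
The summary described above (posterior mean, 94 % HDI bounds, and scaling by the input's standard deviation) can be sketched for a single loading as follows. This is a minimal illustration, not the library's implementation: the `hdi` helper, the simulated draws, and `input_std` are all assumptions made for the example.

```python
import numpy as np


def hdi(samples: np.ndarray, prob: float = 0.94) -> tuple[float, float]:
    """Narrowest interval containing a share `prob` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))  # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]  # width of every window of k sorted samples
    i = int(np.argmin(widths))
    return float(x[i]), float(x[i + k - 1])


rng = np.random.default_rng(0)
loading_draws = rng.normal(0.5, 0.1, size=3000)  # stand-in for posterior draws of one loading
input_std = 2.0  # hypothetical standard deviation of the matching input variable

scaled = loading_draws * input_std  # put the loading on a contribution scale
mean = scaled.mean()
low, high = hdi(scaled)
```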

plot_loadings_by_region(self, path: pathlib.Path | None = None) -> None

Plot estimated loadings for each region, coloured by broad type.

Assembles loadings from the posterior and passes them to
:func:`~ambric.diagnostics.plot_loadings_by_region`.  One panel per
region shows all factor, macro, and bridge-signal loadings as a
horizontal dot chart with 94 % HDI bars, enabling within-region
comparison of the three signal categories.

Args:
    path (Path | None): Directory in which to save the figure as
        SVG.  When ``None`` the figure is displayed interactively.

Raises:
    ValueError: If the model has not been fitted yet.

plot_loadings_aggregate(self, path: pathlib.Path | None = None) -> None

Plot loading distributions across regions, grouped by broad type.

Assembles loadings from the posterior and passes them to
:func:`~ambric.diagnostics.plot_loadings_aggregate`.  One panel per
broad loading type (factors, macro, boost_signal) compares individual
region estimates against the cross-region mean, enabling assessment
of which signal category dominates model dynamics and how consistently
loadings behave across regions.

Args:
    path (Path | None): Directory in which to save the figure as
        SVG.  When ``None`` the figure is displayed interactively.

Raises:
    ValueError: If the model has not been fitted yet.

bands_indicator(self, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Produce a table classifying each region and quarter into growth bands.

Uses seasonally adjusted q-on-q growth estimates at quarterly frequency.

Args:
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.

Raises:
    ValueError: If model not fitted.

Returns:
    pd.DataFrame: Wide-format table with datetime index and one
        column per region containing the classification.

point_estimates_q_on_4q(self, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Produce a table of nowcast point estimates by region.

Returns q-on-4q annual growth estimates (in percentage points,
rounded to 2 d.p.) at quarterly frequency.

Args:
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.

Raises:
    ValueError: If model not fitted.

Returns:
    pd.DataFrame: Wide-format table with datetime index and one
        column per region containing the point estimate.
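
For readers relating the two growth measures: q-on-4q growth is the compound of four consecutive q-on-q rates. The sketch below shows that arithmetic identity on decimal growth rates. It is not necessarily how the model aggregates internally, and `q_on_4q_from_q_on_q` is a hypothetical helper name.

```python
import math


def q_on_4q_from_q_on_q(qoq: list[float]) -> list[float]:
    """Compound four consecutive decimal q-on-q rates into q-on-4q growth."""
    out = [math.nan] * len(qoq)  # the first three quarters have no full year behind them
    for t in range(3, len(qoq)):
        level_ratio = math.prod(1.0 + g for g in qoq[t - 3 : t + 1])
        out[t] = level_ratio - 1.0
    return out


annual = q_on_4q_from_q_on_q([0.01, 0.01, 0.01, 0.01, 0.0])
```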

point_estimates_q_on_q(self, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Produce a table of nowcast point estimates by region.

Returns q-on-q quarterly growth estimates (in percentage points,
rounded to 2 d.p.) at quarterly frequency.

Args:
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.

Raises:
    ValueError: If model not fitted.

Returns:
    pd.DataFrame: Wide-format table with datetime index and one
        column per region containing the point estimate.

to_index_q_on_q(self, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Produce a table of the nowcast index by region.

Returns an index based at the earliest data point (rounded to 2 d.p.) at quarterly frequency.

Args:
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.

Raises:
    ValueError: If model not fitted.

Returns:
    pd.DataFrame: Wide-format table with datetime index and one
        column per region containing the index value.
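
Chaining growth rates into an index can be sketched as below. This assumes q-on-q growth in percentage points and an index that starts at 100 one period before the first growth rate; the library's exact rebasing convention may differ, and `index_from_growth` is a hypothetical helper.

```python
def index_from_growth(growth_pp: list[float], base: float = 100.0) -> list[float]:
    """Chain q-on-q growth (percentage points) into an index starting at `base`."""
    index = [base]
    for g in growth_pp:
        index.append(index[-1] * (1.0 + g / 100.0))
    return [round(v, 2) for v in index]


idx = index_from_growth([1.0, 2.0, -1.0])
```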

seasonally_adjusted_index_and_growth_by_region(self, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Produce a seasonally adjusted index and q-on-q growth rates by region.

The index is based at the earliest data point; values are rounded to 2 d.p. and reported at quarterly frequency.

Args:
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.

Raises:
    ValueError: If model not fitted.

Returns:
    pd.DataFrame: Wide-format table with datetime index and one
        column per region containing the point estimate.


## Functions

Utility functions


bands_indicator_out_of_sample_results(df_results: pandas.core.frame.DataFrame, path: pathlib.Path | None = None, bands: list[float] = [-0.8, -0.1, 0.1, 0.8]) -> pandas.core.frame.DataFrame

Bands classification of out-of-sample trend q-on-q nowcasts.

For every ``(region, datetime, quarters_to_publication, nowcast_index)``
in the OOS results, extracts the X13 trend of the q-on-q nowcast (per
``quarters_to_publication`` slice), converts it back to q-on-q growth in
percentage points, and classifies into bands using
:func:`~ambric.diagnostics.bands_indicator`.

Args:
    df_results (pd.DataFrame): Output of :func:`run_out_of_sample_exercise`.
    path (Path | None): Directory to save the table as Parquet. When
        ``None`` no file is written.
    bands (list[float]): Interior bin edges in percentage points.
        Defaults to ``[-0.8, -0.1, 0.1, 0.8]``.

Returns:
    pd.DataFrame: Long-format frame with columns ``region``, ``datetime``,
        ``quarters_to_publication``, ``nowcast_index``, ``classification``.
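
To illustrate how four interior edges produce five bands, here is a small sketch using the documented default edges. The label names and the treatment of values that fall exactly on an edge (assigned to the upper band here) are assumptions; the actual behaviour of `ambric.diagnostics.bands_indicator` is not documented in this section.

```python
from bisect import bisect_right

BANDS = (-0.8, -0.1, 0.1, 0.8)  # interior bin edges in percentage points (documented default)
# Hypothetical labels: the library's real class names may differ.
LABELS = ("strong contraction", "contraction", "flat", "growth", "strong growth")


def classify(growth_pp: float, bands: tuple[float, ...] = BANDS) -> str:
    """Map a q-on-q growth rate (percentage points) to one of len(bands) + 1 bands."""
    return LABELS[bisect_right(bands, growth_pp)]


example = [classify(g) for g in (-1.2, 0.0, 0.5, 0.9)]
```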

build_ambric_model(y_uk: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], y_annual: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], factors: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], macro: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], bridge_signal: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], region_q_on_q: numpy.ndarray[tuple[Any, ...], numpy.dtype[numpy.float64]] | None = None) -> pymc.model.core.Model

Build the AMBRIC Bayesian state-space model.

The exogenous mean for latent regional growth is:

    mu_exog[t,r] = Lambda[r] @ F[t] + Gamma[r] @ X[t] + delta_r * s[t,r]

where s[t,r] is the quarterly bridge signal from XGBoost + MIDAS.
delta_r has a hierarchical shrinkage prior centred at zero.

Args:
    y_uk: UK quarterly growth rates, shape (T,).
    y_annual: Regional annual growth rates, shape (T, R).
    factors: Extracted factors, shape (T, K).
    macro: Macro UK series, shape (T, M).
    bridge_signal: Quarterly bridge signal, shape (T, R).
    region_q_on_q: Optional published quarterly regional growth rates,
        shape ``(T, R)`` with NaN in unobserved cells. Values must be
        decimal growth rates (e.g. ``0.005`` for 0.5%). When supplied,
        ``y_reg`` is built as a hybrid ``pm.Deterministic``: at every
        ``(t, r)`` with an observed value, ``y_reg[t, r]`` is
        hard-clamped to that value; at NaN cells, ``y_reg[t, r]``
        equals the sampled latent ``y_reg_free[t, r]``. The clamped
        values then propagate through the UK aggregation, annual
        aggregation, and AR(1) dynamics, informing unobserved
        neighbours rather than competing with them through a
        likelihood term. Defaults to ``None`` (no clamp — backwards
        compatible).

Returns:
    PyMC model object.
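
The exogenous-mean formula and the hybrid clamp described above can be checked in plain NumPy, outside of PyMC. This is a sketch on random toy arrays with the documented shapes; the variable names mirror the docstring, but nothing here calls the ambric or PyMC APIs.

```python
import numpy as np

rng = np.random.default_rng(0)
T, R, K, M = 8, 3, 2, 2  # quarters, regions, factors, macro series (toy sizes)

F = rng.normal(size=(T, K))       # factors F[t]
X = rng.normal(size=(T, M))       # macro series X[t]
s = rng.normal(size=(T, R))       # bridge signal s[t, r]
Lambda = rng.normal(size=(R, K))  # factor loadings, one row per region
Gamma = rng.normal(size=(R, M))   # macro loadings, one row per region
delta = rng.normal(size=R)        # bridge-signal loading per region

# mu_exog[t, r] = Lambda[r] @ F[t] + Gamma[r] @ X[t] + delta[r] * s[t, r]
mu_exog = F @ Lambda.T + X @ Gamma.T + delta * s

# Hybrid clamp: observed cells take the published value, NaN cells keep the latent draw.
y_reg_free = rng.normal(scale=0.01, size=(T, R))  # stand-in for sampled latent growth
region_q_on_q = np.full((T, R), np.nan)
region_q_on_q[:4, 0] = 0.005  # region 0 published for the first four quarters
y_reg = np.where(np.isnan(region_q_on_q), y_reg_free, region_q_on_q)
```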

oos_q_on_4q_performance_table(df_results: pandas.core.frame.DataFrame, region_measure: str, path: pathlib.Path | None = None) -> pandas.core.frame.DataFrame

Compute up/down classification accuracy by region and horizon.

Compares the sign of each nowcast to its corresponding outturn.
Returns a pivot table of percentage accuracy grouped by region and
quarters-to-publication.

Args:
    df_results (pd.DataFrame): Out-of-sample results from
        :func:`~ambric.run_out_of_sample_exercise`.
    region_measure (str): Regional measure name to filter on.
    path (Path | None): Directory to save a CSV of the table. When
        ``None`` no file is written.

Returns:
    pd.DataFrame: Pivot table of classification accuracy (%) with
        regions as rows and quarters-to-publication as columns.
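
The sign-comparison idea can be sketched without pandas as follows. The records and region names are invented for illustration, and treating a value of exactly zero as "not positive" is an assumption; the library's tie handling is not documented here.

```python
from collections import defaultdict

# Hypothetical (region, quarters_to_publication, nowcast, outturn) records
records = [
    ("north", 1, 0.4, 0.6),
    ("north", 1, -0.2, 0.1),
    ("north", 2, 0.3, 0.2),
    ("south", 1, -0.5, -0.1),
    ("south", 2, 0.2, -0.3),
]

hits = defaultdict(list)
for region, horizon, nowcast, outturn in records:
    # A hit is recorded when nowcast and outturn point in the same direction.
    hits[(region, horizon)].append((nowcast > 0) == (outturn > 0))

# Percentage accuracy per (region, horizon) cell of the pivot table
accuracy = {key: 100.0 * sum(v) / len(v) for key, v in hits.items()}
```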

plot_out_of_sample_nowcasts(df_results: pandas.core.frame.DataFrame, region_measure: str, path: pathlib.Path | None = None) -> None

Plot out-of-sample nowcasts vs outturns for each region.

Produces one figure per region showing observed outturns as dots and
nowcasts at varying horizons with transparency indicating proximity to
publication.

Args:
    df_results (pd.DataFrame): Out-of-sample results from
        :func:`~ambric.run_out_of_sample_exercise`.
    region_measure (str): Regional measure name to filter on.
    path (Path | None): Directory to save the figures. When ``None``
        the figures are displayed interactively.

plot_out_of_sample_rmse(df_results: pandas.core.frame.DataFrame, region_measure: str, path: pathlib.Path | None = None) -> None

Plot out-of-sample RMSEs by quarters-to-publication for each region.

One subplot per region in a grid layout showing how forecast accuracy
improves as publication approaches.

Args:
    df_results (pd.DataFrame): Out-of-sample results from
        :func:`~ambric.run_out_of_sample_exercise`.
    region_measure (str): Regional measure name to filter on.
    path (Path | None): Directory to save the figure. When ``None``
        the figure is displayed interactively.
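
The quantity being plotted is an RMSE grouped by quarters-to-publication. A minimal sketch of that grouping, using invented records for a single region, might look like this:

```python
import math
from collections import defaultdict

# Hypothetical (quarters_to_publication, nowcast, outturn) records for one region
records = [
    (6, 1.0, 2.0),
    (6, 0.0, -1.0),
    (1, 1.5, 2.0),
    (1, -0.5, -1.0),
]

sq_errors = defaultdict(list)
for horizon, nowcast, outturn in records:
    sq_errors[horizon].append((nowcast - outturn) ** 2)

# Root mean squared error per horizon
rmse = {h: math.sqrt(sum(e) / len(e)) for h, e in sq_errors.items()}
```

In this toy data the error halves between six quarters and one quarter before publication, which is the pattern the plot is designed to reveal.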

plot_seasonally_adjusted_q_on_q_growth(df_sa_trend_orig: pandas.core.frame.DataFrame, path: pathlib.Path | None) -> None

Plot seasonally adjusted q-on-q growth.

Args:
    df_sa_trend_orig (pd.DataFrame): Data to plot.

prep_data_for_model_run(df: pandas.core.frame.DataFrame, macro_names: list[str], region_names: list[str], region_covariate_names: list[str], aggregate_measure: str = 'gva_q_on_q', aggregation_region: str = 'uk', region_measure: str = 'gva_q_on_4q', region_q_on_q_measure: str | None = None) -> tuple[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], list[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], int, numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] | None]

Expects a data frame in following format:
datetime | measure | region | value


Args:
    df (pd.DataFrame): Long format dataframe with all data in
    macro_names (list[str]): Names of macro UK series
    region_names (list[str]): Names of (local) regions
    region_covariate_names (list[str]): Names of regional level series. These will be absorbed into exogenous factors.
    aggregate_measure (str, optional): UK-wide measure in q-on-q growth rate. Defaults to "gva_q_on_q".
    aggregation_region (str, optional): Highest level geography, which other regions sum to. Defaults to "uk".
    region_measure (str, optional): Regional measure, q-on-4q growth rate. Defaults to "gva_q_on_4q".
    region_q_on_q_measure (str | None, optional): Optional measure name in ``df`` supplying
        published quarterly (q-on-q) growth rates for any subset of regions and quarters.
        Rows live in the same long dataframe as every other input, with the standard
        columns ``datetime | measure | region | value``:

            * ``datetime``: quarter-end timestamp.
            * ``measure``: equal to the string passed here.
            * ``region``: one of the names in ``region_names``.
            * ``value``: decimal q-on-q growth rate (e.g. ``0.005`` for 0.5% —
              **not** a percentage), matching the convention used for
              ``aggregate_measure`` / ``region_measure``.

        Partial coverage is fully supported: rows may be supplied for only a
        subset of regions and a subset of quarters. After pivoting, absent
        region/quarter pairs become NaN and are automatically masked out of
        the downstream likelihood, so uncovered regions and quarters
        contribute nothing. Defaults to ``None`` (no q-on-q observations —
        backwards compatible).

Returns:
    tuple: ``(y_uk_extracted, y_a_r_extracted, Z_panel_extract, macro_extracted, lag_qtrs, y_qoq_r_extracted)``.
    ``y_qoq_r_extracted`` is a ``(T, R)`` array aligned to ``region_names`` with NaN in
    unobserved cells, or ``None`` when ``region_q_on_q_measure`` is ``None``.
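
The partial-coverage behaviour (absent region/quarter pairs become NaN after pivoting) can be illustrated with a small sketch. The measure name `gva_q_on_q_regional`, the region names, and the dates are hypothetical, and the loop stands in for whatever pivot the library performs internally.

```python
import numpy as np

region_names = ["north", "south"]
quarters = ["2023-03-31", "2023-06-30", "2023-09-30"]
measure = "gva_q_on_q_regional"  # hypothetical value for region_q_on_q_measure

rows = [  # (datetime, measure, region, value) in long format
    ("2023-03-31", "gva_q_on_q_regional", "north", 0.005),
    ("2023-06-30", "gva_q_on_q_regional", "north", 0.002),
    ("2023-06-30", "other_measure", "south", 9.9),
]

t_index = {q: i for i, q in enumerate(quarters)}
r_index = {r: j for j, r in enumerate(region_names)}

# Start from an all-NaN (T, R) array and fill only the cells that were supplied.
y_qoq = np.full((len(quarters), len(region_names)), np.nan)
for dt, m, region, value in rows:
    if m == measure and region in r_index:
        y_qoq[t_index[dt], r_index[region]] = value

observed_mask = ~np.isnan(y_qoq)  # True where a published value exists
```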

run_out_of_sample_exercise(df: pandas.core.frame.DataFrame, macro_names: list[str], region_names: list[str], region_covariate_names: list[str], n_factors: int = 4, aggregate_measure: str = 'gva_q_on_q', aggregation_region: str = 'uk', region_measure: str = 'gva_q_on_4q', region_q_on_q_measure: str | None = None, step_size: int = 1, init_chunk_size: int = 20, lag_qtrs: int = 6, lag_qtrs_qoq: int = 1, n_its: int = 100000, n_posterior_samples: int = 3000) -> pandas.core.frame.DataFrame

Run out-of-sample exercise to evaluate model performance.

Masks the most recent annual regional data by ``lag_qtrs`` quarters
and fits the model in rolling chunks.  Out-of-sample nowcasts and
outturns for each step are collected and returned.

Args:
    df (pd.DataFrame): Dataframe containing relevant columns.
    macro_names (list[str]): Names of macro series.
    region_names (list[str]): Names of regions.
    region_covariate_names (list[str]): Names of by-region covariates.
    n_factors (int): Number of factors. Defaults to 4.
    aggregate_measure (str): Nation-wide measure. Defaults to "gva_q_on_q".
    aggregation_region (str): Top level geography. Defaults to "uk".
    region_measure (str): Growth measure regional. Defaults to "gva_q_on_4q".
    region_q_on_q_measure (str | None): Optional measure name in ``df`` supplying
        published quarterly (q-on-q) regional growth rates as a hard clamp on
        ``y_reg``. Same format as every other measure (``datetime | measure |
        region | value``); values must be decimal growth rates. Inside the
        out-of-sample window this exercise NaN-masks these rows for every step
        (as with ``region_measure``), so published quarterly values inside the
        OOS window do not leak into the nowcast. Older published values
        remain as clamps. Defaults to ``None`` (no q-on-q clamp, backwards
        compatible).
    step_size (int): Quarters to advance per OOS step. Defaults to 1.
    init_chunk_size (int): Initial learning window size. Defaults to 20.
    lag_qtrs (int): How many quarters before annual regional data are
        published; drives the OOS mask for ``region_measure``. Tuned for the
        ONS regional annual GVA release (~6 quarters). Defaults to 6.
    lag_qtrs_qoq (int): How many quarters before quarterly regional data are
        published; drives a separate OOS mask for ``region_q_on_q_measure``.
        Scot Gov quarterly GDP publishes with ~1 quarter lag, so a smaller
        value than ``lag_qtrs`` is realistic. Only used when
        ``region_q_on_q_measure`` is not ``None``. Defaults to 1.
    n_its (int): Iterations of ADVI for Bayesian inference. Defaults to 100000.
    n_posterior_samples (int): Samples of the posterior. Defaults to 3000.

Returns:
    pd.DataFrame: Combined out-of-sample nowcasts and outturns across
        all rolling steps.
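
One way to picture the stepping logic is shown below. It is a sketch only: it assumes an expanding window that starts at `init_chunk_size` quarters and grows by `step_size`, with the last `lag_qtrs` rows of regional data masked at each step. The real loop inside `run_out_of_sample_exercise` may differ, and `oos_steps` is a hypothetical helper.

```python
def oos_steps(n_quarters: int, init_chunk_size: int = 20, step_size: int = 1, lag_qtrs: int = 6):
    """Yield (window_end, first_masked_row) for each out-of-sample step."""
    for end in range(init_chunk_size, n_quarters + 1, step_size):
        # Rows [end - lag_qtrs, end) of the regional measure would be set to NaN.
        yield end, end - lag_qtrs


steps = list(oos_steps(n_quarters=24, init_chunk_size=20, step_size=2))
```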

trace_to_series(trace: arviz.data.inference_data.InferenceData) -> tuple[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]]

Convert a trace to the relevant estimated series coming out of the model.

Args:
    trace (az.InferenceData): Trace containing posterior.

Returns:
    tuple[npt.NDArray[np.float64], npt.NDArray[np.float64], npt.NDArray[np.float64]]: Estimated UK quarterly, regional quarterly, and annual growth rates

trend_adjust_out_of_sample_results(df_results: pandas.core.frame.DataFrame, path: pathlib.Path | None = None, quarters_to_pub: float = 6.0)

generate_realistic_simulated_data(T: int = 130, R: int = 12, J: int = 4, n_factors: int = 2, n_macro: int = 2, lag_qtrs: int = 6) -> pandas.core.frame.DataFrame

Generate simulated mixed-frequency regional data in long format.

Produces a ``pd.DataFrame`` resembling real-world data suitable for
initialising an :class:`~ambric.Ambric` model, including UK quarterly
growth, macro series, regional covariates, and lagged annual regional
growth.

Args:
    T (int): Number of quarterly time periods.
    R (int): Number of regions.
    J (int): Number of regional covariate panels.
    n_factors (int): Number of latent factors in the DGP.
    n_macro (int): Number of macro indicator series.
    lag_qtrs (int): Publication lag in quarters for annual regional data.

Returns:
    pd.DataFrame: Long-format frame with columns
        ``datetime``, ``measure``, ``region``, ``value``.

simulate_data(T: int = 80, R: int = 6, J: int = 3, n_factors: int = 2, n_macro: int = 2, seed: int = 42) -> tuple[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], list[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]]

Simulate mixed-frequency data with regional panels and macro indicators.

Args:
    T (int): Number of quarterly time periods.
    R (int): Number of regions.
    J (int): Number of regional covariate panels.
    n_factors (int): Number of latent factors.
    n_macro (int): Number of macro indicator series.
    seed (int): Random seed for reproducibility.

Returns:
    tuple: ``(y_uk, y_annual, y_reg_true, Z_panel, macro)`` — national
        quarterly growth (T,), annual regional growth (T, R) with NaNs,
        true quarterly regional growth (T, R), regional covariate
        panels (list of J arrays each (T, R)), and macro series (T, M).

simulate_real_time_data(T: int = 80, R: int = 6, J: int = 3, n_factors: int = 2, n_macro: int = 2, annnual_regional_lag_qrtrs: int = 6, seed: int = 42) -> tuple[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], list[numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]]

Simulate mixed-frequency data with a realistic publication lag.

Wraps :func:`simulate_data` and removes the most recent
``annnual_regional_lag_qrtrs`` quarters of annual regional data to
mimic real-time data availability.

Args:
    T (int): Number of quarterly time periods.
    R (int): Number of regions.
    J (int): Number of regional covariate panels.
    n_factors (int): Number of latent factors.
    n_macro (int): Number of macro indicator series.
    annnual_regional_lag_qrtrs (int): Quarters of annual data to mask.
    seed (int): Random seed for reproducibility.

Returns:
    tuple: ``(y_uk, y_annual, y_reg_true, Z_panel, macro, y_annual_no_lags)``
        — same as :func:`simulate_data` with an additional copy of the
        annual data before the lag was applied.
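
The lag-masking step can be sketched as follows. `apply_publication_lag` is a hypothetical helper written for this example; it only shows the idea of keeping an unmasked copy alongside a version whose most recent rows are NaN.

```python
import numpy as np


def apply_publication_lag(y_annual: np.ndarray, lag_qtrs: int) -> np.ndarray:
    """Return a copy with the most recent `lag_qtrs` rows set to NaN."""
    masked = y_annual.astype(float, copy=True)
    if lag_qtrs > 0:
        masked[-lag_qtrs:, :] = np.nan
    return masked


y_annual_no_lags = np.arange(24, dtype=float).reshape(8, 3)  # toy (T, R) array
y_annual = apply_publication_lag(y_annual_no_lags, lag_qtrs=6)
```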


## Constants

Module-level constants and data


utilities.OMEGA



----------------------------------------------------------------------
This is the User Guide documentation for the package.
----------------------------------------------------------------------

### How to use AMBRIC

```{python}
#| echo: false
import matplotlib_inline.backend_inline

matplotlib_inline.backend_inline.set_matplotlib_formats("svg")
```

Let's import the package.

```{python}
from ambric import Ambric
from ambric.utilities import generate_realistic_simulated_data
```

As a user, we have to bring a few things to the party. The first, of course, is data, which we will simulate.

Let's set up some simulated data. First, we'll specify how many underlying factors drive regional dynamics.

```{python}
n_factors = 2
df = generate_realistic_simulated_data(n_factors=n_factors, R=12)
```

Data that are input into the model must have this structure:

```{python}
df.sample(10)
```

The user must also specify the details of what the model will use. In particular, which regional variables, macroeconomic indicators (these should always have the aggregate region/top level geography as their region), and regional covariates to use. We'll just use all of these:

```{python}
aggregation_region = "uk"
region_names = [x for x in df["region"].unique() if x != aggregation_region]
macro_names = [x for x in df["measure"].unique() if "macro" in x]
region_covariate_names = [x for x in df["measure"].unique() if "regional_covar" in x]
```

Gotcha: you must ensure that the annual regional data have datetime index entries for **all** quarters up to the last published annual macro value. For non-year-end quarters, the values should be NaN.

In a typical use case, you will have quarters with non-NaN quarterly growth for the aggregate region (e.g. the UK) for which the regional data are NaN. Those NaNs in the intervening period are what we are nowcasting.
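
As a rough sketch of the layout this gotcha describes, the snippet below builds one region's annual series on a full quarterly grid, with values only in year-end quarters and NaN everywhere else. It assumes the year-end quarter is Q4, and the years, values, and `last_quarter` are invented; in practice you would build the equivalent rows in the long dataframe.

```python
import math

annual_values = {2021: 0.021, 2022: 0.034}  # hypothetical annual growth for one region
last_quarter = (2023, 2)  # latest quarter with aggregate data (hypothetical)

rows = []
year, quarter = 2021, 1
while (year, quarter) <= last_quarter:
    # Only year-end quarters carry the annual figure; all other quarters stay NaN.
    value = annual_values.get(year, math.nan) if quarter == 4 else math.nan
    rows.append({"year": year, "quarter": quarter, "value": value})
    year, quarter = (year + 1, 1) if quarter == 4 else (year, quarter + 1)
```

Every quarter from 2021 Q1 to 2023 Q2 gets a row, and the trailing NaN rows after 2022 Q4 are the periods to be nowcast.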

Okay, we're ready to build an **AMBRIC** model!

```{python}
amb = Ambric(
    df,
    macro_names,
    region_names,
    region_covariate_names,
    n_factors=n_factors,
)
amb
```

Note that the model reports all of its details, including that 6 rows of regional data are missing and will be estimated by the model. The model also tells us it isn't fitted, so let's sort that. We recommend using at least 100k iterations.

```{python}
n_iterations = 200000
n_posterior_samples = 3000
amb.fit(n_iterations, n_posterior_samples)
```

That's it! It's done. Now let's look at some results.

First, our regional estimates of quarterly growth must be consistent with the observed national growth. We can check the implied vs the true growth at the national level.

```{python}
amb.plot_national_quarterly_vs_implied()
```

Next, let's look at regional growth (each quarter on the same quarter a year earlier, q-on-4q) for all regions.

```{python}
amb.plot_regional_annual_estimate()
```

We can also look at the underlying quarterly regional growth estimates (the latent $y_{t,r}$):

```{python}
amb.plot_estimated_regional_quarterly()
```

And, if we want tables of the nowcasts, there's a built-in for that, at either q-on-4q:

```{python}
amb.point_estimates_q_on_4q().iloc[-3:, :]
```

or q-on-q:

```{python}
amb.point_estimates_q_on_q().iloc[-3:, :]
```

These can be turned into a regional index, rebased to 100 at the start of the sample:

```{python}
amb.to_index_q_on_q().iloc[-3:, :]
```

For a less granular binned signal, `bands_indicator()` classifies each period into growth bands rather than just recession/expansion:

```{python}
amb.bands_indicator().set_index(["region", "datetime"]).unstack(0).tail()
```

There is also access to all of the internal data generated when the model runs. The raw Bayesian samples can be retrieved using `amb.trace`, while the full set of predictions and outturns are available through `amb.populate_results()`:

```{python}
amb.populate_results()
```

## Factor and macro loadings

AMBRIC's hierarchical loadings — $\Lambda$ for the regional factors, $\Gamma$ for the macro covariates, and $\delta_r$ for the XGBoost bridge signal (see the README for the full specification) — can be inspected after fitting.

```{python}
amb.assemble_loadings_data().head()
```

```{python}
amb.plot_loadings_by_region()
```

```{python}
amb.plot_loadings_aggregate()
```

If you want to persist the fitted posterior to disk for reuse, call `amb.save_trace('path/to/trace.nc')`.