Custom Extensions

Custom extensions are first-class method-research APIs. A researcher should be able to add a Layer 2 representation method, a Layer 3 model, a target-side transform, or a final-matrix preprocessor, then compare the custom method against built-in baselines under the same Layer 0 task, Layer 1 data treatment, split, benchmark, and evaluation path.

See also: Partial layer execution – when developing a custom hook the partial-layer helpers are the fastest way to iterate without re-running upstream every time.

Extension map

Five runtime extension points; all are decorator APIs from macroforecast.custom (re-exported under the top-level mf namespace):

import macroforecast as mf

mf.custom_feature_block(name, block_kind="temporal" | "rotation" | "factor")
mf.custom_feature_combiner(name)
mf.custom_preprocessor(name)
mf.target_transformer(name)
mf.custom_model(name)

Each section below documents that hook with: decorator usage, required signature, input contract, output contract, a runnable worked example, and the most common errors.

The matching contract metadata (returned as plain dicts) is available from:

mf.custom_method_extension_contracts()       # all five at once
mf.custom_model_contract_metadata()
mf.custom_preprocessor_contract_metadata()
mf.target_transformer_contract_metadata()

Custom benchmarks and custom metrics are not decorator APIs yet – benchmark plugins go through benchmark configuration, metric plugins should be added as a separate evaluation-layer contract.

1. Layer 3: `custom_model`

Use custom_model when the method changes the estimator, loss, likelihood, optimization, recursive rule, or prediction function but should consume the same X_train, y_train, X_test interface as built-in models.

Decorator usage

import macroforecast as mf

@mf.custom_model("my_constant_model")
def my_constant_model(X_train, y_train, X_test, context):
    return float(y_train.mean())

Required signature

def fn(
    X_train: pd.DataFrame,   # n_train x n_features
    y_train: pd.Series,      # n_train, aligned to X_train.index
    X_test: pd.DataFrame,    # 1 x n_features (one prediction row)
    context: dict[str, Any],
) -> float | Sequence[float]:
    ...

The wrapper iterates row-by-row when L4 calls predict on multi-row matrices, so the callable always sees a one-row X_test.

Input contract

Argument	Type	Shape / contents
`X_train`	`pd.DataFrame`	`n_train x n_features`; columns = Layer 2 feature names; floats with no NaNs (L3 dropna ran upstream).
`y_train`	`pd.Series`	`n_train`; index aligned with `X_train.index`; target column from the L3 `y_final` artifact.
`X_test`	`pd.DataFrame`	`1 x n_features`; one prediction row.
`context`	`dict[str, Any]`	Required keys: `contract_version` (`"custom_model_v1"`), `model_name`, `feature_names`, `params` (the recipe-supplied `params:` dict for the fit_model node).

Optional context fields appear in mf.custom_model_contract_metadata()["optional_context_fields"] – e.g. train_index, forecast_type, forecast_object, recursive_step, auxiliary_payloads (FRED-SD mixed-frequency payloads only).

Output contract

Return one of:

a Python float – single forecast value, or
a one-element sequence/array ([value], np.array([value]), etc.).

The wrapper coerces both forms via float(...).

Worked example

import macroforecast as mf
from macroforecast import custom

custom.clear_custom_models()

@mf.custom_model("constant_train_mean")
def constant_train_mean(X_train, y_train, X_test, context):
    return float(y_train.mean())

recipe = """
0_meta:
  fixed_axes: {failure_policy: fail_fast, reproducibility_mode: seeded_reproducible}
1_data:
  fixed_axes:
    custom_source_policy: custom_panel_only
    frequency: monthly
    horizon_set: custom_list
  leaf_config:
    target: y
    target_horizons: [1]
    custom_panel_inline:
      date: [2018-01-01, 2018-02-01, 2018-03-01, 2018-04-01, 2018-05-01,
             2018-06-01, 2018-07-01, 2018-08-01, 2018-09-01, 2018-10-01]
      y:  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
      x1: [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
2_preprocessing:
  fixed_axes: {transform_policy: no_transform, outlier_policy: none,
               imputation_policy: none_propagate, frame_edge_policy: keep_unbalanced}
3_feature_engineering:
  nodes:
    - {id: src_X, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
    - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
    - {id: lag_x, type: step, op: lag, params: {n_lag: 1}, inputs: [src_X]}
    - {id: y_h,   type: step, op: target_construction,
       params: {mode: point_forecast, method: direct, horizon: 1}, inputs: [src_y]}
  sinks:
    l3_features_v1: {X_final: lag_x, y_final: y_h}
    l3_metadata_v1: auto
4_forecasting_model:
  nodes:
    - {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
    - {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
    - id: fit_main
      type: step
      op: fit_model
      params: {family: constant_train_mean, min_train_size: 4,
               forecast_strategy: direct, training_start_rule: expanding,
               refit_policy: every_origin, search_algorithm: none}
      inputs: [src_X, src_y]
    - {id: predict, type: step, op: predict, inputs: [fit_main, src_X]}
  sinks:
    l4_forecasts_v1: predict
    l4_model_artifacts_v1: fit_main
    l4_training_metadata_v1: auto
5_evaluation:
  fixed_axes: {primary_metric: mse, point_metrics: [mse]}
"""

result = mf.run(recipe)
forecasts = result.cells[0].runtime_result.artifacts["l4_forecasts_v1"].forecasts
print("first three forecasts:", list(forecasts.values())[:3])

art = result.cells[0].runtime_result.artifacts["l4_model_artifacts_v1"]
fitted = next(iter(art.artifacts.values()))
assert fitted.family == "constant_train_mean"
assert fitted.framework == "custom"

The fitted L4ModelArtifactsArtifact records family = "constant_train_mean" and framework = "custom" – those two fields distinguish a custom model from a built-in family in the manifest.

Common errors

Symptom	Cause	Fix
`ValueError: unknown model family 'foo'` at validate time	Recipe references `family: foo` but the registered name is different, or the module that calls `@mf.custom_model("foo")` was never imported in this Python process.	Import the module before `Experiment(...).run()`; YAML alone does not register Python callables.
`RuntimeError: custom model 'foo' predict() called before fit()`	The runtime called `predict` on the adapter without `fit` – usually a custom L4 op short-circuiting the fit path.	Use the canonical `fit_model` -> `predict` node pair (the `examples/recipes/l4_minimal_ridge.yaml` shape).
Predictions are all NaN	The custom callable returned `np.nan` or a non-numeric value; the wrapper’s `float(value)` keeps NaN as-is.	Guard inside the callable; raise `ValueError` on degenerate fit windows so `failure_policy` decides.

2. Layer 2: `custom_preprocessor`

Custom preprocessors are matrix hooks that fit between the L1 raw panel and the L3 feature DAG. v0.8.6 routes the same registered name to two distinct points inside L2 depending on applied_at.

Decorator usage

import macroforecast as mf

@mf.custom_preprocessor("clip_x1_at_2")
def clip_x1_at_2(X_train, y_train, X_test, context):
    cleaned_train = X_train.copy()
    cleaned_test = X_test.copy()
    if "x1" in cleaned_train.columns:
        cleaned_train["x1"] = cleaned_train["x1"].clip(upper=2.0)
        cleaned_test["x1"] = cleaned_test["x1"].clip(upper=2.0)
    return cleaned_train, cleaned_test

Required signature

def fn(
    X_train: pd.DataFrame,           # full L2 panel (or post-pipeline panel; see table below)
    y_train: pd.Series | None,       # None at the L2 boundary -- target is not yet split out
    X_test: pd.DataFrame,            # same frame as X_train at the L2 boundary
    context: dict[str, Any],         # the recipe's L2.leaf_config dict (read-only)
) -> tuple[pd.DataFrame, pd.DataFrame] | pd.DataFrame:
    ...

At the L2 boundary the runtime substitutes X_train = X_test = frame so the callable can do a single-pass clean. Returning a single pd.DataFrame is also accepted – the runtime treats it as both the train and pred output.

Input contract

Field	Contents
`X_train` index	`pd.DatetimeIndex` of the panel rows.
`X_train` columns	Predictor names + the target column (the target stays in the panel until L3 splits it out).
`X_train` dtypes	`float64` numeric columns; `pd.NA` for missing cells.
`y_train`	`None` at the L2 boundary – target lives inside `X_train`.
`context`	Copy of `2_preprocessing.leaf_config` from the recipe; useful for reading user-supplied options.

Output contract

Same pd.DatetimeIndex as X_train (drop / re-index only when intentional and the target column drop is acceptable).
May add or remove columns; must preserve the target column when applied_at='l2' because the canonical pipeline (transform / outlier / impute / frame_edge) still needs to run on it.
For applied_at='l3' the returned frame becomes the L2 clean panel directly – L3 reads it via the l2_clean_panel_v1 selector.

`applied_at='l2'` vs `applied_at='l3'`

The same registered callable can be wired in at two distinct timing points within L2. The choice is made by the recipe author through Experiment.use_preprocessor(name, applied_at=...) (or by setting the right L2 leaf_config key directly).

Aspect	`applied_at='l2'` (pre-pipeline)	`applied_at='l3'` (post-pipeline)
Runs at	Top of `materialize_l2`, before `transform` / `outlier` / `impute` / `frame_edge`	Bottom of `materialize_l2`, output becomes the L2 clean panel that L3 reads
Input	Raw L1 panel (untransformed; contains official t-codes if a code map was applied at L1)	L2 clean panel (post-canonical pipeline)
Output replaces	The raw input feeding the canonical pipeline	The clean panel that L3 reads
Use when	You need to clean / deflate / normalise before the McCracken-Ng pipeline	You want to override or replace the canonical pipeline output entirely
`leaf_config` key	`custom_preprocessor`	`custom_postprocessor`
Cleaning-log entry	`{"custom_preprocessor": "<name>", "applied": True}` (first step)	`{"custom_postprocessor": "<name>", "applied": True}` (last step)
Available since	v0.8.6 (#PR-A)	v0.2.5 (#251)

Worked example

import macroforecast as mf
from macroforecast import custom

custom.clear_custom_preprocessors()

@mf.custom_preprocessor("clip_x1_at_2")
def clip_x1_at_2(X_train, y_train, X_test, context):
    out_train = X_train.copy()
    out_test = X_test.copy()
    if "x1" in out_train.columns:
        out_train["x1"] = out_train["x1"].clip(upper=2.0)
        out_test["x1"] = out_test["x1"].clip(upper=2.0)
    return out_train, out_test

# Set ``custom_postprocessor`` for ``applied_at='l3'`` (post-pipeline);
# set ``custom_preprocessor`` for ``applied_at='l2'`` (pre-pipeline).
recipe = """
0_meta:
  fixed_axes: {failure_policy: fail_fast, reproducibility_mode: seeded_reproducible}
1_data:
  fixed_axes:
    custom_source_policy: custom_panel_only
    frequency: monthly
    horizon_set: custom_list
  leaf_config:
    target: y
    target_horizons: [1]
    custom_panel_inline:
      date: [2018-01-01, 2018-02-01, 2018-03-01, 2018-04-01, 2018-05-01,
             2018-06-01, 2018-07-01, 2018-08-01, 2018-09-01, 2018-10-01]
      y:  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
      x1: [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
2_preprocessing:
  fixed_axes: {transform_policy: no_transform, outlier_policy: none,
               imputation_policy: none_propagate, frame_edge_policy: keep_unbalanced}
  leaf_config:
    custom_postprocessor: clip_x1_at_2
3_feature_engineering:
  nodes:
    - {id: src_X, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
    - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
    - {id: lag_x, type: step, op: lag, params: {n_lag: 1}, inputs: [src_X]}
    - {id: y_h,   type: step, op: target_construction,
       params: {mode: point_forecast, method: direct, horizon: 1}, inputs: [src_y]}
  sinks:
    l3_features_v1: {X_final: lag_x, y_final: y_h}
    l3_metadata_v1: auto
4_forecasting_model:
  nodes:
    - {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
    - {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
    - id: fit_main
      type: step
      op: fit_model
      params: {family: ridge, alpha: 0.1, min_train_size: 4,
               forecast_strategy: direct, training_start_rule: expanding,
               refit_policy: every_origin, search_algorithm: none}
      inputs: [src_X, src_y]
    - {id: predict, type: step, op: predict, inputs: [fit_main, src_X]}
  sinks:
    l4_forecasts_v1: predict
    l4_model_artifacts_v1: fit_main
    l4_training_metadata_v1: auto
5_evaluation:
  fixed_axes: {primary_metric: mse}
"""

result = mf.run(recipe)
panel = result.cells[0].runtime_result.artifacts["l2_clean_panel_v1"].panel.data
print("max(x1) after clip:", panel["x1"].max())
assert panel["x1"].max() == 2.0

The same callable can be wired in as leaf_config.custom_preprocessor (applied_at='l2' – pre-pipeline) instead of custom_postprocessor (applied_at='l3' – post-pipeline). The cleaning-log entry that lands in l2_clean_panel_v1.cleaning_log['steps'] records which timing point ran.

If you prefer the high-level builder, the equivalent two lines are:

exp = mf.Experiment(dataset="fred_md", target="INDPRO", horizons=[1])
exp.use_preprocessor("clip_x1_at_2", applied_at="l3")  # or "l2"

Note: Experiment(...) always wires the official FRED path; reach for the recipe-dict / YAML form whenever you need to drive a custom inline panel like the example above.

Switch to applied_at='l2' to run the same callable before the McCracken-Ng pipeline – useful when you want the official t-code map to operate on already-deflated columns, for example.

Common errors

Symptom	Cause	Fix
`ValueError: minimal L3 runtime requires target column in L2 clean panel`	Returned a frame that dropped the target column under `applied_at='l2'`.	Preserve the target column; the L2 pipeline downstream still needs it.
Output ignored, no error	Runtime saw an exception and silently fell back to identity (the dispatcher wraps the call in `try / except` for safety).	Manually call `mf.custom.get_custom_preprocessor(name).function(...)` once during dev to surface the traceback.
Index drift (rows missing in `l2_clean_panel_v1`)	Returned a smaller / re-indexed DataFrame and the canonical pipeline subsequently dropped more rows.	Preserve the input `DatetimeIndex` unless drop is the explicit goal of the hook.

3. Layer 2: `target_transformer`

target_transformer is the target-side extension. Use it when y itself must be transformed for fitting and forecasts must be inverted to the raw target scale before they reach L5 evaluation.

Decorator usage

import macroforecast as mf
import numpy as np

@mf.target_transformer("scale_by_std")
class ScaleByStd:
    def fit(self, target_train, context):
        self.scale_ = float(target_train.std() or 1.0)

    def transform(self, target, context):
        return target / self.scale_

    def inverse_transform_prediction(self, target_pred, context):
        return np.asarray(target_pred) * self.scale_

The decorator accepts either a class (instantiated once per L3 run) or a factory () -> instance.

Required signature

The registered object (or factory’s product) must implement three methods:

class TargetTransformer:
    def fit(self, target_train: pd.Series, context: dict[str, Any]) -> None | "TargetTransformer":
        ...

    def transform(self, target: pd.Series, context: dict[str, Any]) -> pd.Series:
        ...

    def inverse_transform_prediction(
        self,
        target_pred: Sequence[float] | np.ndarray,
        context: dict[str, Any],
    ) -> np.ndarray:
        ...

Input / output contract

Method	Input	Output
`fit`	`target_train`: `pd.Series` of training-window y values; `context`: dict (currently empty).	Side-effects only; return value is ignored.
`transform`	`target`: `pd.Series` (training y or live y); `context`: dict.	`pd.Series` on the transformed scale.
`inverse_transform_prediction`	`target_pred`: 1-D iterable of model predictions (transformed scale); `context`: dict.	`np.ndarray` on the raw target scale.

Scale rule: model fits on the transformed scale, but every reported forecast and metric is on the raw target scale. The runtime applies inverse_transform_prediction between L4 and L5.

Worked example

import macroforecast as mf
from macroforecast import custom
import numpy as np

custom.clear_custom_target_transformers()

@mf.target_transformer("scale_by_std")
class ScaleByStd:
    def fit(self, target_train, context):
        self.scale_ = float(target_train.std() or 1.0)
    def transform(self, target, context):
        return target / self.scale_
    def inverse_transform_prediction(self, target_pred, context):
        return np.asarray(target_pred) * self.scale_

experiment = mf.Experiment(dataset="custom_panel_only", target="y", horizons=[1])
recipe = experiment.to_recipe_dict()
recipe["1_data"]["leaf_config"]["target_transformer"] = "scale_by_std"
result = mf.run(recipe)

forecasts = result.cells[0].runtime_result.artifacts["l4_forecasts_v1"].forecasts
print("raw-scale forecasts:", list(forecasts.values())[:3])

Final forecasts are on the raw target scale – the inverse step ran between L4 and L5.

Common errors

Symptom	Cause	Fix
`TypeError: target transformer 'foo' must provide callable methods: ['fit', ...]`	Registered object missing one of the three required methods.	Implement all three; `fit` may be a no-op but must exist.
Forecast values look 1000x off	Forgot to wire `inverse_transform_prediction` (returned the same value).	Ensure `inverse_transform_prediction` undoes `transform` round-trip.
`feature_runtime` rejection	The feature runtime is not in the supported allowlist. See Target transformer for the gate.	Use `ols` / `ridge` / `lasso` / `elasticnet` / a registered `custom_model` with target-lag or raw-panel runtimes.

For deeper rules see Target transformer.

4. Layer 3: `custom_feature_block`

Layer 3 builds the research representation. Use custom_feature_block when the method changes what predictors exist before the model is fit – e.g. temporal filters, lag-polynomial summaries, factor construction, rotation, research-specific transforms.

A registered feature block is dispatched whenever an L3 step node’s op matches the registered name (the runtime checks the registry before falling through to built-in ops).

Decorator usage

import macroforecast as mf

@mf.custom_feature_block("double_it", block_kind="temporal")
def double_it(frame, params):
    return frame * 2.0

block_kind is one of "temporal", "rotation", "factor" and governs which Layer 2 axis exposes the registered name.

Required signature

def fn(
    frame: pd.DataFrame,         # the upstream L3 node's output (predictors)
    params: dict[str, Any],      # the recipe's params dict for this step node
) -> pd.DataFrame:
    ...

Input / output contract

Aspect	Contract
`frame`	`pd.DataFrame` from the upstream node; index is the panel `DatetimeIndex`. May contain `pd.NA` (drop / fill is the block’s responsibility).
`params`	The L3 step node’s `params:` dict; `block_kind` is injected by the dispatcher when present.
Return	`pd.DataFrame` with the same index (or a deterministic subset). New columns are stable feature names.

Worked example

import macroforecast as mf
from macroforecast import custom

custom.clear_custom_feature_blocks()

@mf.custom_feature_block("double_it", block_kind="temporal")
def double_it(frame, params):
    return frame * float(params.get("scale", 2.0))

recipe = {
    "0_meta": {"fixed_axes": {"failure_policy": "fail_fast", "reproducibility_mode": "seeded_reproducible"}},
    "1_data": {
        "fixed_axes": {"custom_source_policy": "custom_panel_only", "frequency": "monthly", "horizon_set": "custom_list"},
        "leaf_config": {
            "target": "y", "target_horizons": [1],
            "custom_panel_inline": {
                "date": ["2018-01-01", "2018-02-01", "2018-03-01", "2018-04-01", "2018-05-01",
                         "2018-06-01", "2018-07-01", "2018-08-01", "2018-09-01", "2018-10-01"],
                "y":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
                "x1": [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0],
            },
        },
    },
    "2_preprocessing": {"fixed_axes": {"transform_policy": "no_transform", "outlier_policy": "none",
                                        "imputation_policy": "none_propagate", "frame_edge_policy": "keep_unbalanced"}},
    "3_feature_engineering": {
        "nodes": [
            {"id": "src_X", "type": "source",
             "selector": {"layer_ref": "l2", "sink_name": "l2_clean_panel_v1", "subset": {"role": "predictors"}}},
            {"id": "src_y", "type": "source",
             "selector": {"layer_ref": "l2", "sink_name": "l2_clean_panel_v1", "subset": {"role": "target"}}},
            {"id": "scaled_x", "type": "step", "op": "double_it",
             "params": {"block_kind": "temporal", "scale": 3.0}, "inputs": ["src_X"]},
            {"id": "y_h", "type": "step", "op": "target_construction",
             "params": {"mode": "point_forecast", "method": "direct", "horizon": 1}, "inputs": ["src_y"]},
        ],
        "sinks": {"l3_features_v1": {"X_final": "scaled_x", "y_final": "y_h"}, "l3_metadata_v1": "auto"},
    },
    "4_forecasting_model": {
        "nodes": [
            {"id": "src_X", "type": "source",
             "selector": {"layer_ref": "l3", "sink_name": "l3_features_v1", "subset": {"component": "X_final"}}},
            {"id": "src_y", "type": "source",
             "selector": {"layer_ref": "l3", "sink_name": "l3_features_v1", "subset": {"component": "y_final"}}},
            {"id": "fit_main", "type": "step", "op": "fit_model",
             "params": {"family": "ridge", "alpha": 1.0, "min_train_size": 4,
                        "forecast_strategy": "direct", "training_start_rule": "expanding",
                        "refit_policy": "every_origin", "search_algorithm": "none"},
             "inputs": ["src_X", "src_y"]},
            {"id": "predict", "type": "step", "op": "predict", "inputs": ["fit_main", "src_X"]},
        ],
        "sinks": {"l4_forecasts_v1": "predict", "l4_model_artifacts_v1": "fit_main", "l4_training_metadata_v1": "auto"},
    },
}

result = mf.run(recipe)
print("custom block applied:", "scaled_x" in result.cells[0].runtime_result.artifacts["l3_features_v1"].X_final.column_names or
      list(result.cells[0].runtime_result.artifacts["l3_features_v1"].X_final.data.columns))

The L3 step node’s op: double_it triggers the registered callable; the output DataFrame becomes scaled_x and feeds the L4 fit node.

Common errors

Symptom	Cause	Fix
`NotImplementedError: L3 runtime does not support op 'foo'`	Step node references `op: foo` but no callable named `foo` is registered.	Import the module that calls `@mf.custom_feature_block("foo", ...)` before `mf.run(...)`.
Custom block silently ignored, fallback ran instead	Registered callable raised inside the dispatcher (the dispatcher swallows exceptions for safety).	Call `mf.custom.get_custom_feature_block(name, block_kind=...).function(frame, params)` directly during dev to surface the traceback.
Index drift in the L3 sink	Block returned a smaller / re-indexed frame and the L3 `concat` later mis-aligned.	Keep the input `DatetimeIndex` (or document the drop precisely).

5. Layer 3: `custom_feature_combiner`

Feature combiners are broader than feature blocks. Use a combiner when the method changes how already-built blocks become final Z – nonlinear interactions across blocks, supervised block weighting, residualized composition, custom low-dim representations.

Decorator usage

import macroforecast as mf
import pandas as pd

@mf.custom_feature_combiner("merge_concat")
def merge_concat(inputs, params):
    if isinstance(inputs, list):
        return pd.concat(inputs, axis=1)
    return inputs

Required signature

def fn(
    inputs: list[pd.DataFrame],   # list of upstream block frames
    params: dict[str, Any],       # the recipe's params dict for this step node
) -> pd.DataFrame:
    ...

The combiner receives the list of upstream frames (multi-frame merge contract). Always handle both the list form and – defensively – the single frame form, since the dispatcher only routes the list form when the upstream has multiple inputs.

Input / output contract

Aspect	Contract
`inputs`	`list[pd.DataFrame]`; each frame shares the panel `DatetimeIndex`.
`params`	The L3 step node’s `params:` dict (e.g. weights, components).
Return	`pd.DataFrame` with the same index. Columns are the public feature names of the combined representation.

Worked example

import macroforecast as mf
from macroforecast import custom
import pandas as pd

custom.clear_custom_feature_combiners()

@mf.custom_feature_combiner("merge_concat")
def merge_concat(inputs, params):
    return pd.concat(inputs, axis=1)

# Inside a recipe, an L3 step node like
#   {id: combine, type: step, op: merge_concat, inputs: [block_a, block_b]}
# triggers the combiner when both upstream blocks are pd.DataFrames.

a = pd.DataFrame({"x": [1.0, 2.0]})
b = pd.DataFrame({"y": [3.0, 4.0]})

# Direct invocation for unit tests / debugging:
spec = mf.custom.get_custom_feature_combiner("merge_concat")
print(spec.function([a, b], {}))

To wire it into a real recipe, define an L3 step node whose op matches the registered name and whose inputs list the upstream block ids – the dispatcher routes the list of upstream frames to the combiner.

Common errors

Symptom	Cause	Fix
`KeyError` inside combiner	Indexed into `inputs` as if it were always a list, but the dispatcher passed a single frame.	Branch on `isinstance(inputs, list)` (the worked example does).
Column collisions in the combined frame	Two upstream blocks emit a column with the same name.	Rename inside the combiner – `pd.concat([a, b.add_prefix("b_")], axis=1)`.

Method comparison sweeps

Method researchers often compare combinations:

built-in Layer 2 representation with built-in Layer 4 generator;
custom Layer 2 representation with built-in Layer 4 generator;
built-in Layer 2 representation with custom Layer 4 generator;
custom Layer 2 representation with custom Layer 4 generator.

Use sweep_axes for registry axes and leaf_sweep_axes for variant-specific configuration names. leaf_sweep_axes materializes into leaf_config before each variant is compiled, so registered custom names can vary across variants without editing package internals.

path:
  0_meta:
    fixed_axes:
      study_scope: one_target_compare_methods
      failure_policy: continue_on_failure
  2_preprocessing:
    sweep_axes:
      temporal_feature_block: [moving_average_features, custom_temporal_features]
  4_forecasting_model:
    nodes:
      - id: src_X
        type: source
        selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}
      - id: src_y
        type: source
        selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}
      - id: fit_candidate
        type: step
        op: fit_model
        params: {family: my_custom_generator}
        inputs: [src_X, src_y]
      - id: predict_candidate
        type: step
        op: predict
        inputs: [fit_candidate, src_X]
    sinks:
      l4_forecasts_v1: predict_candidate
      l4_model_artifacts_v1: fit_candidate
      l4_training_metadata_v1: auto

If the custom name should apply only when the parent custom axis is selected, use nested_sweep_axes with a leaf_config.<key> child. This avoids duplicate built-in variants that differ only by an unused custom name.

path:
  2_preprocessing:
    nested_sweep_axes:
      temporal_feature_block:
        moving_average_features: {}
        custom_temporal_features:
          leaf_config.custom_temporal_feature_block:
            - my_temporal_block
            - my_second_temporal_block

For custom combiners, bind the combiner name the same way:

path:
  2_preprocessing:
    nested_sweep_axes:
      feature_block_combination:
        concatenate_named_blocks: {}
        custom_feature_combiner:
          leaf_config.custom_feature_combiner:
            - my_combiner

The compiler expands the grid first, then validates each variant. With failure_policy=continue_on_failure, unsupported cells remain visible in the manifest; supported built-in/custom cells run and appear in the same study manifest.

FRED-SD mixed-frequency payloads

FRED-SD can emit extra Layer 2 payloads for custom Layer 3 models:

Payload contract	Context key	Present when
`fred_sd_native_frequency_block_payload_v1`	`context["auxiliary_payloads"]["fred_sd_native_frequency_block_payload"]`	`fred_sd_mixed_frequency_representation` is `native_frequency_block_payload` or `mixed_frequency_model_adapter`.
`fred_sd_mixed_frequency_model_adapter_v1`	`context["auxiliary_payloads"]["fred_sd_mixed_frequency_model_adapter"]`	`fred_sd_mixed_frequency_representation` is `mixed_frequency_model_adapter`.

The native-frequency payload includes blocks, block_order, and column_to_native_frequency. The adapter payload records the adapter route and the block-payload contract consumed by the model. This is the supported place to implement research-specific mixed-frequency likelihoods, weighting schemes, state updates, or direct forecast rules while leaving Layer 1 source / t-code policy and Layer 2 representation choices auditable.

Use examples/custom_fred_sd_mixed_frequency_model.py as an executable Python template and examples/recipes/templates/fred-sd-custom-mixed-frequency-model.yaml as the matching recipe skeleton. YAML can select the registered model_family, but it does not import Python code – import the module that calls @mf.custom_model(...) before compiling or running the recipe.

Fair-comparison checklist

For a custom-method comparison to be valid:

Keep Layer 0 task axes fixed unless task design is the object of comparison.
Keep Layer 1 raw-data treatment fixed unless raw missing/outlier policy is the object of comparison.
Put representation construction in Layer 2/3, not inside a model closure.
Put estimator changes in Layer 3, not inside a feature block.
Record all fitted choices in fit_state, provenance, or model tuning payload.
Use identical split, horizon, benchmark, and evaluation settings across built-in and custom variants.

This separation matters because most method papers compare combinations: a new representation with existing models, existing representations with a new model, or a custom representation and custom model together. macroforecast should make each of those comparisons executable without package-internal edits.

Custom Extensions

Extension map

1. Layer 3: custom_model

Decorator usage

Required signature

Input contract

Output contract

Worked example

Common errors

2. Layer 2: custom_preprocessor

Decorator usage

Required signature

Input contract

Output contract

applied_at='l2' vs applied_at='l3'

Worked example

Common errors

3. Layer 2: target_transformer

Decorator usage

Required signature

Input / output contract

Worked example

Common errors

4. Layer 3: custom_feature_block

Decorator usage

Required signature

Input / output contract

Worked example

Common errors

5. Layer 3: custom_feature_combiner

Decorator usage

Required signature

Input / output contract

Worked example

Common errors

Method comparison sweeps

FRED-SD mixed-frequency payloads

Fair-comparison checklist

1. Layer 3: `custom_model`

2. Layer 2: `custom_preprocessor`

`applied_at='l2'` vs `applied_at='l3'`

3. Layer 2: `target_transformer`

4. Layer 3: `custom_feature_block`

5. Layer 3: `custom_feature_combiner`