Benchmark & Predictor Universe (1.4)
Declares which benchmark to compare against, which predictors the model sees, which raw variables are even in play, and which deterministic features augment the X panel. The group has four axes in v1.0, and every value is operational via a `leaf_config` input channel or a simple in-code filter.
| Section | Axis | Role |
|---|---|---|
| 1.4.1 | `benchmark_family` | The reference forecast used for relative metrics |
| 1.4.2 | `predictor_family` | Which columns of the raw panel are fed to the model |
| 1.4.3 | `variable_universe` | Which columns of the raw panel are available in the first place |
| 1.4.4 | `deterministic_components` | Deterministic features appended to X (trend, seasonals, break dummies) |
Note on dropped values:
- `predictor_family.text_only` / `mixed_feature_blocks`: require NN/text embeddings stack (v2).
- `variable_universe.feature_selection_dynamic_subset`: CV-in-training feature selection loop; deferred to v1.1 tuning-engine extension.
- `deterministic_components.trend_and_quadratic`: redundant with `linear_trend` + a future `leaf_config.trend_order` channel.
target_family (the old 1.4.1 axis) was dropped in PR #32 — subsumed by target_structure.
At a glance (defaults):
- `benchmark_family`: no default; you always pick one (most studies start with `historical_mean` or `autoregressive_bic`).
- `predictor_family`: feature-builder dynamic default. `target_lag_features` → `target_lags_only`; `raw_feature_panel` → `all_macro_vars`. You rarely set it.
- `variable_universe = all_variables`: the full raw panel is available. Switch to a subset only when the recipe explicitly narrows the candidate variables.
- `deterministic_components = none`: no X augmentation. Switch to `linear_trend` / seasonals / `break_dummies` when your target needs them.
Most research runs pick benchmark_family and leave the other three at the default.
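The default rules above can be read as a small resolution step. The following is a minimal sketch under stated assumptions: `resolve_axes`, `DEFAULTS`, and `PREDICTOR_DEFAULT_BY_BUILDER` are hypothetical names invented for illustration, and raising `ValueError` for a missing benchmark is an assumption, not the package's documented behavior.

```python
from typing import Dict

# Static defaults taken from the "At a glance" list (hypothetical constant name).
DEFAULTS: Dict[str, str] = {
    "variable_universe": "all_variables",
    "deterministic_components": "none",
}

# The predictor_family default depends on the feature builder in use.
PREDICTOR_DEFAULT_BY_BUILDER: Dict[str, str] = {
    "target_lag_features": "target_lags_only",
    "raw_feature_panel": "all_macro_vars",
}


def resolve_axes(fixed_axes: Dict[str, str], feature_builder: str) -> Dict[str, str]:
    """Fill unset 1.4 axes with the defaults listed above."""
    if "benchmark_family" not in fixed_axes:
        # Assumption: a missing benchmark is treated as an error in this sketch.
        raise ValueError("benchmark_family has no default; pick one explicitly")
    resolved = dict(DEFAULTS)
    resolved["predictor_family"] = PREDICTOR_DEFAULT_BY_BUILDER[feature_builder]
    resolved.update(fixed_axes)  # explicit user choices always win
    return resolved


print(resolve_axes({"benchmark_family": "historical_mean"}, "raw_feature_panel"))
```

Applying the user's `fixed_axes` last keeps the precedence rule in one place: anything set explicitly overrides both the static and the builder-dependent defaults.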
1.4.1 benchmark_family
Selects the reference forecast for relative metrics. All 12 kept values are operational in v1.0. Values that require user-supplied inputs are validated at compile time.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `historical_mean` | operational | Training-set mean. Default. |
| `zero_change` | operational | Random-walk at … |
| `autoregressive_bic` | operational | AR model with BIC-selected lag order. |
| `autoregressive_fixed_lag` | operational | AR model at a fixed lag … |
| `autoregressive_diffusion_index` | operational | AR + Diffusion Index (factor) model. |
| `rolling_mean` | operational | Rolling-window mean (…) |
| | operational | Arbitrary callable supplied in … |
| | operational | Callable supplied in … |
| `factor_model_benchmark` | operational | Single-factor OLS on the leading principal factor (v1.0 self-contained impl). |
| `benchmark_suite` | operational | Runs each member in … |
| `paper_specific_benchmark` | operational | Pre-computed forecast series supplied via … |
| `survey_forecast` | operational | Same pattern, … |
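For orientation, the three simplest reference forecasts in the catalog can be sketched in a few lines of standard-library Python. This is an illustration only, not the package's implementation: the function names and the `window` parameter are hypothetical, and the random walk is read here as carrying the last observed value forward.

```python
from statistics import fmean
from typing import Sequence


def training_mean(y_train: Sequence[float]) -> float:
    """Training-set mean: the same forecast for every horizon."""
    return fmean(y_train)


def random_walk(y_train: Sequence[float]) -> float:
    """Random walk: carry the last observed value forward (assumed reading)."""
    return y_train[-1]


def rolling_mean(y_train: Sequence[float], window: int) -> float:
    """Mean of the last `window` observations (`window` is a hypothetical name)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    return fmean(y_train[-window:])


y = [1.0, 2.0, 3.0, 6.0]
print(training_mean(y))    # 3.0
print(random_walk(y))      # 6.0
print(rolling_mean(y, 2))  # 4.5
```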
Functions & features
- `macroforecast.execution.build._run_benchmark_executor` dispatches by `benchmark_family` value.
- `factor_model_benchmark`: z-scored leading-factor regression; falls back to `historical_mean` for training windows shorter than 6 rows.
- `benchmark_suite`: inline dispatch over `leaf_config.benchmark_suite` members (allowed set: `historical_mean`, `zero_change`, `autoregressive_bic`, `rolling_mean`, `autoregressive_fixed_lag`, `autoregressive_diffusion_index`). Missing or unsupported members raise `CompileValidationError`.
- `paper_specific_benchmark` / `survey_forecast`: look up the forecast at `train.index[-1] + horizon` months (monthly freq); fall back to the most recent trailing value on miss. The required target-keyed series dict is checked at compile time.
- `expert_benchmark`: programmatic only; requires `leaf_config.benchmark_config.expert_callable`.
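The lookup rule for `paper_specific_benchmark` / `survey_forecast` can be sketched as follows. This is a hedged illustration, not the package's code: `add_months` and `lookup_forecast` are hypothetical helpers, dates are assumed to be month-start `datetime.date` values, and "most recent trailing value" is interpreted as the latest entry dated before the requested month.

```python
from datetime import date
from typing import Mapping, Optional


def add_months(d: date, months: int) -> date:
    """Shift a month-start date by `months` months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def lookup_forecast(
    series: Mapping[date, float], last_train: date, horizon: int
) -> Optional[float]:
    """Return the forecast dated last_train + horizon months, else fall back."""
    target_date = add_months(last_train, horizon)
    if target_date in series:
        return series[target_date]
    # Miss: fall back to the most recent value dated before the target month.
    earlier = [d for d in series if d < target_date]
    return series[max(earlier)] if earlier else None


paper = {date(2020, 1, 1): 1.5, date(2020, 2, 1): 1.7}
print(lookup_forecast(paper, date(2019, 12, 1), 1))  # 1.5 (exact hit)
print(lookup_forecast(paper, date(2020, 2, 1), 3))   # 1.7 (fallback)
```

Returning `None` when no earlier value exists is a choice made for this sketch; the source does not say what happens in that case.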
Recipe usage
# Paper-replication: compare against the paper's published forecast
path:
  1_data_task:
    leaf_config:
      paper_forecast_series:
        INDPRO: ...   # pd.Series keyed by date
  4_forecasting_model:
    nodes:
      - {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
      - {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
      - {id: paper_benchmark, type: step, op: benchmark_forecast, params: {family: paper_specific_benchmark}, inputs: [src_X, src_y]}
    sinks:
      l4_forecasts_v1: paper_benchmark
1.4.2 predictor_family
Selects which columns of the raw panel become model predictors. 6 operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `target_lags_only` | operational | Only the target’s own lags (forces …) |
| `all_macro_vars` | operational | Every column except the target. Default for raw-panel recipes. |
| `category_based` | operational | User-supplied category mapping: … |
| | operational | Columns whose name starts with … |
| `explicit_variable_list` | operational | User-supplied column list: … |
Functions & features
- `macroforecast.execution.build._raw_panel_columns(frame, target, predictor_family, spec)` dispatches on the rule.
- Target column is always excluded from the predictor set.
- Compile guards: `explicit_variable_list` requires `leaf_config.handpicked_columns`; `category_based` requires `leaf_config.predictor_category_columns` and `leaf_config.predictor_category`.
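The dispatch and guards described above can be sketched on plain column names. This is an assumption-laden illustration, not `_raw_panel_columns` itself: `select_predictors` is a hypothetical function, the local `CompileValidationError` is a stand-in class, `predictor_category_columns` is assumed to map a category name to a list of columns, and the guards are folded into the selection step for brevity even though the source says they run at compile time.

```python
from typing import Any, List, Mapping, Sequence


class CompileValidationError(ValueError):
    """Local stand-in for the package's error type of the same name."""


def select_predictors(
    columns: Sequence[str], target: str, rule: str, leaf_config: Mapping[str, Any]
) -> List[str]:
    """Pick predictor columns for one predictor_family rule."""
    candidates = [c for c in columns if c != target]  # target is never a predictor
    if rule == "target_lags_only":
        return []  # assumption: no raw-panel columns, only lags of the target
    if rule == "all_macro_vars":
        return candidates
    if rule == "explicit_variable_list":
        wanted = leaf_config.get("handpicked_columns")
        if not wanted:
            raise CompileValidationError(
                "explicit_variable_list requires leaf_config.handpicked_columns"
            )
        return [c for c in candidates if c in set(wanted)]
    if rule == "category_based":
        mapping = leaf_config.get("predictor_category_columns")
        category = leaf_config.get("predictor_category")
        if not mapping or not category:
            raise CompileValidationError(
                "category_based requires predictor_category_columns and predictor_category"
            )
        return [c for c in candidates if c in set(mapping.get(category, []))]
    raise CompileValidationError(f"unsupported predictor_family: {rule}")


cols = ["INDPRO", "RPI", "UNRATE", "CPIAUCSL", "PAYEMS"]
cfg = {"handpicked_columns": ["RPI", "UNRATE", "CPIAUCSL"]}
print(select_predictors(cols, "INDPRO", "explicit_variable_list", cfg))
```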
Dropped values
- `text_only`: requires text-embedding / NN domain stack; deferred to v2 (Transformer scope).
- `mixed_feature_blocks`: multi-block NN architecture; deferred to v2.
Recipe usage
path:
  1_data_task:
    fixed_axes:
      predictor_family: explicit_variable_list
    leaf_config:
      handpicked_columns: [RPI, UNRATE, CPIAUCSL]
  3_feature_engineering:
    nodes:
      - {id: src_x, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
      - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
      - {id: y_h, type: step, op: target_construction, params: {horizon: 1}, inputs: [src_y]}
    sinks:
      l3_features_v1: {X_final: src_x, y_final: y_h}
1.4.3 variable_universe
Selects which columns of the raw panel survive dataset filtering before any training begins. 8 operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `all_variables` | operational | Default. No filter. |
| | operational | FRED-MD core macro variables (…) |
| `explicit_variable_list` | operational | User-supplied column list: … |
| | operational | |
| | operational | |
Functions & features
- `macroforecast.execution.build._apply_variable_universe(raw_result, rule, spec, target)` is called during dataset loading in `execute_recipe`.
- Target and date columns are always preserved after filtering.
- Runtime discovery (stability / correlation) is out of scope; users supply the subset.
- Compile guards: `explicit_variable_list` requires `leaf_config.variable_universe_columns`; `category_variables` requires `leaf_config.variable_universe_category_columns` and `leaf_config.variable_universe_category`; `target_specific_variables` requires `leaf_config.target_specific_columns` entries for the current target(s).
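A minimal sketch of the filtering step, assuming it operates on a list of column names. `apply_universe` and the `date_col` parameter (with its `"date"` default) are hypothetical; only the rule names and `leaf_config` keys come from the text above, and the error types are placeholders.

```python
from typing import Any, List, Mapping, Sequence


def apply_universe(
    columns: Sequence[str],
    rule: str,
    leaf_config: Mapping[str, Any],
    target: str,
    date_col: str = "date",  # hypothetical name for the date column
) -> List[str]:
    """Keep the columns allowed by the rule, plus the target and date columns."""
    if rule == "all_variables":
        return list(columns)  # default: no filter
    if rule == "explicit_variable_list":
        allowed = set(leaf_config["variable_universe_columns"])
    elif rule == "target_specific_variables":
        per_target = leaf_config["target_specific_columns"]
        if target not in per_target:
            raise KeyError(f"no target_specific_columns entry for {target}")
        allowed = set(per_target[target])
    else:
        raise ValueError(f"rule not covered by this sketch: {rule}")
    always_keep = {target, date_col}  # preserved regardless of the subset
    return [c for c in columns if c in allowed or c in always_keep]


panel = ["date", "INDPRO", "RPI", "UNRATE", "CPIAUCSL", "PAYEMS", "AWHMAN"]
cfg = {"target_specific_columns": {"INDPRO": ["RPI", "UNRATE", "CPIAUCSL"]}}
print(apply_universe(panel, "target_specific_variables", cfg, "INDPRO"))
# ['date', 'INDPRO', 'RPI', 'UNRATE', 'CPIAUCSL']
```

Filtering by membership rather than by indexing keeps the original column order and silently ignores requested names that are absent from the panel; whether the package does the same is not stated in the source.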
Dropped values
- `feature_selection_dynamic_subset`: CV-in-training feature selection loop requires a tuning-engine extension; deferred to v1.1.
- `paper_replication_subset`, `expert_curated_subset`, `stability_filtered_subset`, `correlation_screened_subset` (2026-04-21): four labels shared identical runtime semantics (single `list[str]` input + column filter). Consolidated into `explicit_variable_list`.
Recipe usage
# target-specific subset
path:
  1_data_task:
    fixed_axes:
      variable_universe: target_specific_variables
    leaf_config:
      target_specific_columns:
        INDPRO: [RPI, UNRATE, CPIAUCSL]
        PAYEMS: [UNRATE, AWHMAN, CPIAUCSL]

# hand-picked column list
path:
  1_data_task:
    fixed_axes:
      variable_universe: explicit_variable_list
    leaf_config:
      variable_universe_columns: [RPI, UNRATE, CPIAUCSL]
1.4.4 deterministic_components
Appends deterministic feature columns to the X matrix. 6 operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `none` | operational | Default. No augmentation. |
| | operational | Explicit column of 1s (redundant with …) |
| `linear_trend` | operational | Adds a … |
| `monthly_seasonal` | operational | Adds 11 monthly dummies (…) |
| `quarterly_seasonal` | operational | Adds 3 quarterly dummies (…) |
| `break_dummies` | operational | One 0/1 dummy per date in … |
Functions & features
- Module: `macroforecast.execution.deterministic`, which provides `augment_frame(df, component, *, index=None, break_dates=None)` and `augment_array(X, component, *, index, break_dates=None)`.
- Wired into `_build_raw_panel_training_data` after preprocessing. Both X_train and X_pred are augmented identically so the fitted coefficients apply at prediction time.
- `monthly_seasonal` / `quarterly_seasonal` require a `DatetimeIndex`.
- Compile guard: `break_dummies` requires non-empty `leaf_config.break_dates`.
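To make the augmentation concrete, here is a standard-library sketch of the deterministic columns. It is not the `augment_frame` / `augment_array` implementation. Several details are assumptions made for illustration: the function and column names are hypothetical, the trend starts at 1, January and Q1 are the omitted seasonal baselines, and each break dummy is a step (0 before the break date, 1 from it onward).

```python
from datetime import date
from typing import Dict, List, Optional, Sequence


def deterministic_columns(
    index: Sequence[date],
    component: str,
    break_dates: Optional[Sequence[date]] = None,
) -> Dict[str, List[float]]:
    """Build the extra X columns for one deterministic_components value."""
    if component == "none":
        return {}
    if component == "linear_trend":
        return {"trend": [float(t) for t in range(1, len(index) + 1)]}
    if component == "monthly_seasonal":
        # 11 dummies; January is the omitted baseline (assumption).
        return {
            f"month_{m}": [1.0 if d.month == m else 0.0 for d in index]
            for m in range(2, 13)
        }
    if component == "quarterly_seasonal":
        # 3 dummies; Q1 is the omitted baseline (assumption).
        return {
            f"quarter_{q}": [1.0 if (d.month - 1) // 3 + 1 == q else 0.0 for d in index]
            for q in range(2, 5)
        }
    if component == "break_dummies":
        if not break_dates:
            raise ValueError("break_dummies requires non-empty break_dates")
        # Step dummy per break date (assumption about the dummy's shape).
        return {
            f"break_{b.isoformat()}": [1.0 if d >= b else 0.0 for d in index]
            for b in break_dates
        }
    raise ValueError(f"unknown component: {component}")


# Build over train + prediction dates together so the trend continues.
train_idx = [date(2020, m, 1) for m in range(1, 5)]
pred_idx = [date(2020, 5, 1)]
cols = deterministic_columns(train_idx + pred_idx, "linear_trend")
n = len(train_idx)
print(cols["trend"][:n], cols["trend"][n:])  # [1.0, 2.0, 3.0, 4.0] [5.0]
```

Generating the columns over the concatenated index and then splitting is one simple way to honor the "augmented identically" requirement: the prediction row gets the next trend value and the same dummy definitions as the training rows.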
Dropped values
- `trend_and_quadratic`: redundant with `linear_trend` + a future `leaf_config.trend_order` channel. The quadratic / higher-order polynomial trend will re-enter as a trend-order parameter when needed rather than a separate axis value.
Recipe usage
path:
  1_data_task:
    fixed_axes:
      deterministic_components: break_dummies
    leaf_config:
      break_dates: ["2008-09-01", "2020-03-01"]
  3_feature_engineering:
    nodes:
      - {id: src_x, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
      - {id: break_x, type: step, op: deterministic_components, params: {component: break_dummies}, inputs: [src_x]}
    sinks:
      l3_features_v1: {X_final: break_x}
Benchmark & Predictor Universe (1.4) takeaways
- Every value in every 1.4 axis is operational in v1.0. Zero `registry_only` entries remain.
- `benchmark_family` gains 4 formerly-metadata variants as real implementations: `factor_model_benchmark`, `benchmark_suite`, `paper_specific_benchmark`, `survey_forecast`.
- `predictor_family` and `variable_universe` use the same design pattern: the user provides a pre-computed column list (or category mapping) via `leaf_config`; runtime discovery is out of scope.
- `deterministic_components` augments the raw-panel X with classical econometric terms (trend / seasonals / break dummies) via a dedicated module.
Next group: 1.5 Data handling policies (coming).