Layer Contract Design

macroforecast is organized as explicit layer contracts. Each layer either exposes a list of axes or a DAG of nodes. The contract is the public interface: recipe YAML, Navigator choices, validators, runtime artifacts, and L8 manifests must agree on the same layer IDs, sink names, and option names.

Layer Map

L0 -> L1 -> L2 -> L3(DAG) -> L4(DAG) -> L5 -> L6 -> L7(DAG) -> L8
      |     |     |          |
     L1.5  L2.5  L3.5       L4.5  (diagnostics)

Layer	Category	Mode	Purpose
L0	setup	list	runtime policy: failure handling, reproducibility, compute layout
L1	construction	list	data source, target, predictor universe, geography, sample, horizons, regimes
L2	construction	list	raw-to-clean preprocessing
L3	construction	graph	feature engineering and target construction
L4	construction	graph	model fitting, forecasting, benchmarks, ensembles, tuning
L5	consumption	list	metrics, benchmark-relative evaluation, aggregation, ranking
L6	consumption	list	statistical tests; default off
L7	consumption	graph	interpretation, importance, transformation attribution; default off
L8	consumption	list	export, saved objects, provenance, artifact layout
L1.5	diagnostic	list	raw data summary; default off
L2.5	diagnostic	list	pre/post preprocessing comparison; default off
L3.5	diagnostic	list	feature diagnostics; default off
L4.5	diagnostic	list	model-fit and generator diagnostics; default off

Rules That Matter

L0-L4 are the sweepable construction surface. L5-L8 and diagnostics describe, test, interpret, or export existing cells.
L3, L4, and L7 are graph layers. Use nodes and sinks; fixed-axis sugar is not accepted for L3/L4.
L6, L7, and all .5 diagnostic layers are default off. When a diagnostic layer has enabled: false, it produces no DAG nodes and no sink.
L8 derives default saved_objects from active upstream layers. Active diagnostics are exported as diagnostics_l1_5, diagnostics_l2_5, diagnostics_l3_5, and diagnostics_l4_5.
Forecast combination belongs in L4. L3 rejects L4 forecast-combine ops.
study_scope is not a Layer 0 axis in the current layer-contract system. It is derived into manifest metadata when needed.
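
The shape and default-off rules above can be expressed as a small check. This is a minimal sketch, not macroforecast's own validator: the function names and error messages are hypothetical, and it assumes L6 and L7 are switched on with the same enabled flag that the diagnostic layers use.

```python
GRAPH_LAYERS = {"l3", "l4", "l7"}
DEFAULT_OFF = {"l6", "l7", "l1_5", "l2_5", "l3_5", "l4_5"}
NO_FIXED_AXIS_SUGAR = {"l3", "l4"}


def is_active(layer_id: str, block: dict) -> bool:
    """Default-off layers run only when enabled (assumed flag name: enabled)."""
    if layer_id in DEFAULT_OFF:
        return bool(block.get("enabled", False))
    return True


def shape_errors(layer_id: str, block: dict) -> list[str]:
    """Return violations of the list/graph shape rules for one layer block."""
    if not is_active(layer_id, block):
        return []  # an inactive layer produces no nodes and no sink
    errors = []
    if layer_id in GRAPH_LAYERS:
        for key in ("nodes", "sinks"):
            if key not in block:
                errors.append(f"{layer_id}: graph layer needs '{key}'")
    if layer_id in NO_FIXED_AXIS_SUGAR and "fixed_axes" in block:
        errors.append(f"{layer_id}: fixed-axis sugar is not accepted")
    return errors
```

For example, shape_errors("l3", {"fixed_axes": {}}) reports the missing nodes and sinks as well as the rejected fixed_axes key, while a disabled L7 block reports nothing.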

Core Data Flow

L1 defines raw data and regime metadata. L2 consumes L1 and emits the cleaned panel (l2_clean_panel_v1). L3 consumes the cleaned panel, with optional access to raw data and regime metadata, and emits l3_features_v1 and l3_metadata_v1. L4 consumes L3 features and emits forecasts (l4_forecasts_v1), model artifacts (l4_model_artifacts_v1), and training metadata (l4_training_metadata_v1). L5 consumes forecasts and produces evaluation artifacts. L6 and L7 are optional consumption layers. L8 collects all active sinks into an export manifest.

Diagnostics are side branches. They inspect upstream artifacts but do not modify construction-layer sinks.
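
One way to picture this flow is as a dependency map that can be sorted into a run order. The sketch below uses Python's standard graphlib module. The UPSTREAM mapping is a reading of the paragraphs above, and the inputs given for L6, L7, and the diagnostic layers are assumptions made for illustration, not documented contracts.

```python
from graphlib import TopologicalSorter

# layer -> layers it reads from; entries marked "assumed" are not stated in the text
UPSTREAM = {
    "l0": set(),
    "l1": {"l0"},
    "l2": {"l1"},
    "l3": {"l2", "l1"},      # cleaned data plus optional raw/regime access
    "l4": {"l3"},
    "l5": {"l4"},
    "l6": {"l5"},            # assumed
    "l7": {"l4"},            # assumed
    "l1_5": {"l1"},          # assumed
    "l2_5": {"l1", "l2"},    # assumed: pre/post preprocessing comparison
    "l3_5": {"l3"},          # assumed
    "l4_5": {"l4"},          # assumed
}


def run_order(active: set[str]) -> list[str]:
    """Order active layers so each follows its active upstream layers; L8 runs last."""
    layers = active - {"l8"}
    graph = {layer: UPSTREAM[layer] & layers for layer in layers}
    # L8 is appended because it collects every active sink
    return list(TopologicalSorter(graph).static_order()) + ["l8"]
```

Diagnostic layers appear only as leaves in this map: nothing lists them as upstream, which mirrors the rule that they inspect artifacts without modifying construction-layer sinks.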

YAML Shape

Minimal construction path:

1_data:
  fixed_axes:
    dataset: fred_md
  leaf_config:
    target: CPIAUCSL

2_preprocessing:
  fixed_axes: {}

3_feature_engineering:
  nodes:
    - {id: src_x, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
    - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
    - {id: x_lag, type: step, op: lag, params: {n_lag: 4}, inputs: [src_x]}
    - {id: y_h, type: step, op: target_construction, params: {horizon: 1}, inputs: [src_y]}
  sinks:
    l3_features_v1: {X_final: x_lag, y_final: y_h}
    l3_metadata_v1: auto

4_forecasting_model:
  nodes:
    - {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
    - {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
    - {id: fit_ridge, type: step, op: fit_model, params: {family: ridge}, inputs: [src_X, src_y]}
    - {id: predict_ridge, type: step, op: predict, inputs: [fit_ridge, src_X]}
  sinks:
    l4_forecasts_v1: predict_ridge
    l4_model_artifacts_v1: fit_ridge
    l4_training_metadata_v1: auto

5_evaluation:
  fixed_axes: {}

8_output:
  fixed_axes: {}
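
Because L3 and L4 are graphs, a recipe is only well-formed if every inputs entry and every sink points at a declared node. The following sketch checks that on a plain dict shaped like the 3_feature_engineering block above, as a YAML loader would return it. The function name and error strings are illustrative, and the real validators may enforce more than this.

```python
def graph_reference_errors(block: dict) -> list[str]:
    """Check that node ids are unique and that inputs and sinks name declared nodes."""
    errors = []
    seen = set()
    for node in block.get("nodes", []):
        if node["id"] in seen:
            errors.append(f"duplicate node id: {node['id']}")
        seen.add(node["id"])
    for node in block.get("nodes", []):
        for ref in node.get("inputs", []):
            if ref not in seen:
                errors.append(f"{node['id']}: unknown input {ref}")
    for sink, target in block.get("sinks", {}).items():
        # A sink is one node id, a mapping of component -> node id, or "auto".
        refs = target.values() if isinstance(target, dict) else [target]
        for ref in refs:
            if ref != "auto" and ref not in seen:
                errors.append(f"{sink}: unknown node {ref}")
    return errors


# The L3 block from the recipe above; source selectors are omitted for brevity.
l3_block = {
    "nodes": [
        {"id": "src_x", "type": "source"},
        {"id": "src_y", "type": "source"},
        {"id": "x_lag", "type": "step", "op": "lag",
         "params": {"n_lag": 4}, "inputs": ["src_x"]},
        {"id": "y_h", "type": "step", "op": "target_construction",
         "params": {"horizon": 1}, "inputs": ["src_y"]},
    ],
    "sinks": {
        "l3_features_v1": {"X_final": "x_lag", "y_final": "y_h"},
        "l3_metadata_v1": "auto",
    },
}
print(graph_reference_errors(l3_block))  # prints []
```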

Optional diagnostics:

1_5_data_summary:
  enabled: true
  fixed_axes: {}

3_5_feature_diagnostics:
  enabled: true
  fixed_axes:
    comparison_stages: raw_vs_cleaned_vs_features

8_output:
  fixed_axes:
    saved_objects: [forecasts, metrics, ranking, diagnostics_all]
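
The export names for active diagnostics follow a regular pattern, so the L8 default can be sketched as a function of which diagnostic blocks are enabled. This is an illustration under assumptions: the helper name is made up, the recipe is taken as an already-parsed dict, and how the diagnostics_all value shown above relates to the per-layer names is not specified here.

```python
import re

# Matches diagnostic YAML keys such as 1_5_data_summary or 3_5_feature_diagnostics.
_DIAGNOSTIC_KEY = re.compile(r"^(\d)_5_")


def diagnostic_saved_objects(recipe: dict) -> list[str]:
    """List diagnostics_l<N>_5 names for diagnostic layers with enabled: true."""
    names = []
    for key, block in recipe.items():
        match = _DIAGNOSTIC_KEY.match(key)
        if match and isinstance(block, dict) and block.get("enabled", False):
            names.append(f"diagnostics_l{match.group(1)}_5")
    return sorted(names)


recipe = {
    "1_5_data_summary": {"enabled": True, "fixed_axes": {}},
    "3_5_feature_diagnostics": {"enabled": True, "fixed_axes": {}},
    "4_5_generator_diagnostics": {"enabled": False},
    "5_evaluation": {"fixed_axes": {}},
}
print(diagnostic_saved_objects(recipe))
# prints ['diagnostics_l1_5', 'diagnostics_l3_5']
```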

Naming Notes

Layer IDs in code are lower snake case: l1, l3_5, l8.
YAML layer keys are the layer number followed by a descriptive name: 1_data, 3_feature_engineering, 4_5_generator_diagnostics.
Sink names are versioned: l3_features_v1, l4_forecasts_v1, l8_artifacts_v1.
L3 supports canonical design names such as varimax, polynomial, kernel, and nystroem, while retaining compatibility aliases such as varimax_rotation, polynomial_expansion, kernel_features, and nystroem_features.
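
These conventions are mechanical enough to sketch as helpers. The code below is illustrative only: the function names are invented, the key-to-ID rule is inferred from the examples in this section, and the alias table contains just the four pairs listed above.

```python
import re

# Compatibility alias -> canonical L3 design name (only the pairs listed above).
L3_ALIASES = {
    "varimax_rotation": "varimax",
    "polynomial_expansion": "polynomial",
    "kernel_features": "kernel",
    "nystroem_features": "nystroem",
}


def layer_id_from_yaml_key(key: str) -> str:
    """Map a YAML layer key to a layer ID, e.g. '4_5_generator_diagnostics' -> 'l4_5'."""
    match = re.match(r"^(\d)(_5)?_[a-z]", key)
    if match is None:
        raise ValueError(f"not a layer key: {key!r}")
    return "l" + match.group(1) + (match.group(2) or "")


def split_sink_name(sink: str) -> tuple[str, int]:
    """Split a versioned sink name, e.g. 'l3_features_v1' -> ('l3_features', 1)."""
    match = re.fullmatch(r"(.+)_v(\d+)", sink)
    if match is None:
        raise ValueError(f"sink name has no version suffix: {sink!r}")
    return match.group(1), int(match.group(2))


def canonical_l3_name(name: str) -> str:
    """Map a compatibility alias to its canonical name; other names pass through."""
    return L3_ALIASES.get(name, name)
```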
