Layer Contract Design
macroforecast is organized as explicit layer contracts. Each layer either exposes a list of axes or a DAG of nodes. The contract is the public interface: recipe YAML, Navigator choices, validators, runtime artifacts, and L8 manifests must agree on the same layer IDs, sink names, and option names.
Layer Map
    L0 -> L1 -> L2 -> L3(DAG) -> L4(DAG) -> L5 -> L6 -> L7(DAG) -> L8
          |     |     |          |
          L1.5  L2.5  L3.5       L4.5  diagnostics
| Layer | Category | Mode | Purpose |
|---|---|---|---|
| L0 | setup | list | runtime policy: failure handling, reproducibility, compute layout |
| L1 | construction | list | data source, target, predictor universe, geography, sample, horizons, regimes |
| L2 | construction | list | raw-to-clean preprocessing |
| L3 | construction | graph | feature engineering and target construction |
| L4 | construction | graph | model fitting, forecasting, benchmarks, ensembles, tuning |
| L5 | consumption | list | metrics, benchmark-relative evaluation, aggregation, ranking |
| L6 | consumption | list | statistical tests; default off |
| L7 | consumption | graph | interpretation, importance, transformation attribution; default off |
| L8 | consumption | list | export, saved objects, provenance, artifact layout |
| L1.5 | diagnostic | list | raw data summary; default off |
| L2.5 | diagnostic | list | pre/post preprocessing comparison; default off |
| L3.5 | diagnostic | list | feature diagnostics; default off |
| L4.5 | diagnostic | list | model-fit and generator diagnostics; default off |
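
As a reading aid, the layer table can be transcribed into a small registry and queried. This is a sketch only: `LayerSpec` and `LAYERS` are hypothetical names made up for this illustration and are not taken from the macroforecast code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    layer_id: str     # lower-snake layer ID, e.g. "l3_5"
    category: str     # setup | construction | consumption | diagnostic
    mode: str         # "list" (axes) or "graph" (DAG of nodes)
    default_on: bool  # False for rows marked "default off"


# Hypothetical registry transcribed row by row from the layer table.
LAYERS = [
    LayerSpec("l0", "setup", "list", True),
    LayerSpec("l1", "construction", "list", True),
    LayerSpec("l2", "construction", "list", True),
    LayerSpec("l3", "construction", "graph", True),
    LayerSpec("l4", "construction", "graph", True),
    LayerSpec("l5", "consumption", "list", True),
    LayerSpec("l6", "consumption", "list", False),
    LayerSpec("l7", "consumption", "graph", False),
    LayerSpec("l8", "consumption", "list", True),
    LayerSpec("l1_5", "diagnostic", "list", False),
    LayerSpec("l2_5", "diagnostic", "list", False),
    LayerSpec("l3_5", "diagnostic", "list", False),
    LayerSpec("l4_5", "diagnostic", "list", False),
]

# Layers that take nodes/sinks instead of a list of axes.
graph_layers = [s.layer_id for s in LAYERS if s.mode == "graph"]
# Layers that stay inactive unless explicitly enabled.
default_off = [s.layer_id for s in LAYERS if not s.default_on]

print(graph_layers)  # ['l3', 'l4', 'l7']
print(default_off)   # ['l6', 'l7', 'l1_5', 'l2_5', 'l3_5', 'l4_5']
```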
Rules That Matter
- L0-L4 are the sweepable construction surface. L5-L8 and diagnostics describe, test, interpret, or export existing cells.
- L3, L4, and L7 are graph layers. Use `nodes` and `sinks`; fixed-axis sugar is not accepted for L3/L4.
- L6, L7, and all `.5` diagnostic layers are default off. When a diagnostic layer has `enabled: false`, it produces no DAG nodes and no sink.
- L8 derives default `saved_objects` from active upstream layers. Active diagnostics are exported as `diagnostics_l1_5`, `diagnostics_l2_5`, `diagnostics_l3_5`, and `diagnostics_l4_5`.
- Forecast combination belongs in L4. L3 rejects L4 forecast-combine ops.
- `study_scope` is not a Layer 0 axis in the current layer-contract system. It is derived into manifest metadata when needed.
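
A few of these rules can be expressed as plain checks over a recipe. The sketch below is illustrative rather than the project's validator. It assumes the recipe has already been re-keyed by layer ID and that L6 and L7 use the same `enabled` flag as the diagnostic layers. The function names `is_active`, `check_rules`, and `diagnostic_exports` are hypothetical.

```python
GRAPH_LAYERS = {"l3", "l4", "l7"}
DIAGNOSTICS = {"l1_5", "l2_5", "l3_5", "l4_5"}
DEFAULT_OFF = {"l6", "l7"} | DIAGNOSTICS


def is_active(layer_id: str, body: dict) -> bool:
    """Default-off layers are active only when enabled (flag name assumed for L6/L7)."""
    if layer_id in DEFAULT_OFF:
        return bool(body.get("enabled", False))
    return True


def check_rules(layers: dict) -> list:
    """Return human-readable violations of the graph-layer rules."""
    problems = []
    for layer_id, body in layers.items():
        if layer_id not in GRAPH_LAYERS or not is_active(layer_id, body):
            continue
        if "nodes" not in body or "sinks" not in body:
            problems.append(f"{layer_id}: graph layer needs nodes and sinks")
        if layer_id in {"l3", "l4"} and "fixed_axes" in body:
            problems.append(f"{layer_id}: fixed-axis sugar is not accepted for L3/L4")
    return problems


def diagnostic_exports(layers: dict) -> list:
    """Names under which active diagnostic layers are exported by L8."""
    return [
        f"diagnostics_{layer_id}"
        for layer_id in sorted(DIAGNOSTICS)
        if layer_id in layers and is_active(layer_id, layers[layer_id])
    ]


# Made-up input: L3 wrongly uses fixed_axes; only the L1.5 diagnostic is enabled.
layers = {
    "l3": {"fixed_axes": {}},
    "l4": {"nodes": [], "sinks": {}},
    "l1_5": {"enabled": True, "fixed_axes": {}},
    "l2_5": {"enabled": False},
}
print(check_rules(layers))
print(diagnostic_exports(layers))
```

With this input, `check_rules` reports two problems for `l3` and none for `l4`, and `diagnostic_exports` returns only `diagnostics_l1_5` because the L2.5 layer is disabled.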
Core Data Flow
L1 defines raw data and regime metadata. L2 consumes L1 and emits the cleaned panel. L3 consumes cleaned data plus optional raw/regime access, then emits `l3_features_v1` and `l3_metadata_v1`. L4 consumes L3 features and emits forecasts, model artifacts, and training metadata. L5 consumes forecasts and produces evaluation artifacts. L6 and L7 are optional consumption layers. L8 collects all active sinks into an export manifest.
Diagnostics are side branches. They inspect upstream artifacts but do not modify construction-layer sinks.
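
The dependency chain described above can be written down as a small lookup, which makes it easy to ask which layers must run before a given one. This is a sketch under stated assumptions: `UPSTREAM` and `required_layers` are hypothetical names, and L6, L7, L8, and the diagnostics are left out because the text does not spell out their exact inputs.

```python
# Direct upstream layers, listed in pipeline order (insertion order matters below).
UPSTREAM = {
    "l1": set(),
    "l2": {"l1"},
    "l3": {"l2", "l1"},  # cleaned data plus optional raw/regime access
    "l4": {"l3"},
    "l5": {"l4"},
}


def required_layers(target: str) -> list:
    """Return the target and all of its transitive upstream layers, in run order."""
    needed = set()
    stack = [target]
    while stack:
        layer_id = stack.pop()
        if layer_id not in needed:
            needed.add(layer_id)
            stack.extend(UPSTREAM[layer_id])
    # UPSTREAM is declared in pipeline order, so filtering it preserves run order.
    return [layer_id for layer_id in UPSTREAM if layer_id in needed]


print(required_layers("l3"))  # ['l1', 'l2', 'l3']
print(required_layers("l5"))  # ['l1', 'l2', 'l3', 'l4', 'l5']
```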
YAML Shape
Minimal construction path:
    1_data:
      fixed_axes:
        dataset: fred_md
      leaf_config:
        target: CPIAUCSL
    2_preprocessing:
      fixed_axes: {}
    3_feature_engineering:
      nodes:
        - {id: src_x, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
        - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
        - {id: x_lag, type: step, op: lag, params: {n_lag: 4}, inputs: [src_x]}
        - {id: y_h, type: step, op: target_construction, params: {horizon: 1}, inputs: [src_y]}
      sinks:
        l3_features_v1: {X_final: x_lag, y_final: y_h}
        l3_metadata_v1: auto
    4_forecasting_model:
      nodes:
        - {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
        - {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
        - {id: fit_ridge, type: step, op: fit_model, params: {family: ridge}, inputs: [src_X, src_y]}
        - {id: predict_ridge, type: step, op: predict, inputs: [fit_ridge, src_X]}
      sinks:
        l4_forecasts_v1: predict_ridge
        l4_model_artifacts_v1: fit_ridge
        l4_training_metadata_v1: auto
    5_evaluation:
      fixed_axes: {}
    8_output:
      fixed_axes: {}
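
To make the graph shape concrete, the L3 block of the recipe above can be mirrored as a Python dict and checked for dangling references. This is a minimal sketch, not the library's validator: `validate_graph` is a hypothetical helper, selectors and params are omitted for brevity, and the only real API used is the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter

# The L3 layer from the minimal recipe, with selectors and params left out.
l3 = {
    "nodes": [
        {"id": "src_x", "type": "source"},
        {"id": "src_y", "type": "source"},
        {"id": "x_lag", "type": "step", "op": "lag", "inputs": ["src_x"]},
        {"id": "y_h", "type": "step", "op": "target_construction", "inputs": ["src_y"]},
    ],
    "sinks": {
        "l3_features_v1": {"X_final": "x_lag", "y_final": "y_h"},
        "l3_metadata_v1": "auto",
    },
}


def validate_graph(layer: dict) -> list:
    """Check node and sink references, then return one valid execution order."""
    ids = [node["id"] for node in layer["nodes"]]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate node id")
    known = set(ids)

    # Map each node to the nodes it depends on.
    deps = {node["id"]: set(node.get("inputs", [])) for node in layer["nodes"]}
    for node_id, inputs in deps.items():
        missing = inputs - known
        if missing:
            raise ValueError(f"{node_id}: unknown inputs {sorted(missing)}")

    # A sink is either one node reference, the literal "auto", or a mapping of references.
    for sink_name, ref in layer["sinks"].items():
        refs = ref.values() if isinstance(ref, dict) else [ref]
        for r in refs:
            if r != "auto" and r not in known:
                raise ValueError(f"{sink_name}: unknown node {r!r}")

    # static_order() raises CycleError (a ValueError subclass) if the graph has a cycle.
    return list(TopologicalSorter(deps).static_order())


order = validate_graph(l3)
print(order)
```

The returned order always places each source before the step that consumes it. The relative order of independent nodes is not fixed.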
Optional diagnostics:
    1_5_data_summary:
      enabled: true
      fixed_axes: {}
    3_5_feature_diagnostics:
      enabled: true
      fixed_axes:
        comparison_stages: raw_vs_cleaned_vs_features
    8_output:
      fixed_axes:
        saved_objects: [forecasts, metrics, ranking, diagnostics_all]
Naming Notes
- Layer IDs in code are lower-snake: `l1`, `l3_5`, `l8`.
- YAML layer keys are numeric names: `1_data`, `3_feature_engineering`, `4_5_generator_diagnostics`.
- Sink names are versioned: `l3_features_v1`, `l4_forecasts_v1`, `l8_artifacts_v1`.
- L3 supports canonical design names such as `varimax`, `polynomial`, `kernel`, and `nystroem`, while retaining compatibility aliases such as `varimax_rotation`, `polynomial_expansion`, `kernel_features`, and `nystroem_features`.
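
The naming conventions are regular enough to parse mechanically. The helpers below are hypothetical and only illustrate the patterns. The regular expressions are inferred from the examples in these notes, and the alias table covers just the four aliases named here, so treat both as assumptions rather than the library's actual rules.

```python
import re

# Compatibility aliases mapped to the canonical L3 names listed in the notes.
ALIASES = {
    "varimax_rotation": "varimax",
    "polynomial_expansion": "polynomial",
    "kernel_features": "kernel",
    "nystroem_features": "nystroem",
}


def layer_id_from_yaml_key(key: str) -> str:
    """Map a YAML layer key such as '4_5_generator_diagnostics' to 'l4_5'."""
    m = re.match(r"(\d)(_5)?_[a-z]", key)
    if m is None:
        raise ValueError(f"not a layer key: {key!r}")
    return f"l{m.group(1)}{m.group(2) or ''}"


def sink_parts(name: str) -> tuple:
    """Split a versioned sink name into (layer_id, stem, version)."""
    m = re.fullmatch(r"(l\d(?:_5)?)_(\w+)_v(\d+)", name)
    if m is None:
        raise ValueError(f"not a versioned sink name: {name!r}")
    return m.group(1), m.group(2), int(m.group(3))


def canonical_op(name: str) -> str:
    """Resolve a compatibility alias; canonical names pass through unchanged."""
    return ALIASES.get(name, name)


print(layer_id_from_yaml_key("3_feature_engineering"))  # l3
print(sink_parts("l4_forecasts_v1"))                    # ('l4', 'forecasts', 1)
print(canonical_op("kernel_features"))                  # kernel
```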
Next: Data.