Layer 5: Evaluation
Layer 5 computes forecast accuracy, benchmark-relative metrics, aggregation, slicing, decomposition, ranking, and reporting artifacts. It is descriptive evaluation; inference belongs to Layer 6.
Contract
Inputs: l4_forecasts_v1; l4_model_artifacts_v1; l1_data_definition_v1; optional l1_regime_metadata_v1; l3_metadata_v1.
Output: l5_evaluation_v1.
Sub-Layers
| Slot | Purpose |
|---|---|
| L5.A | metric specification |
| L5.B | benchmark comparison |
| L5.C | aggregation |
| L5.D | sample slicing and decomposition |
| L5.E | ranking and reporting |
Main Axes
metrics: primary_metric, point_metrics, density_metrics, direction_metrics, relative_metrics
benchmark: benchmark_window, benchmark_scope
aggregation: agg_time, agg_horizon, agg_target, agg_state
slicing: oos_period, regime_use, regime_metrics
decomposition: decomposition_target, decomposition_order
output shape: ranking, report_style
Gates
Relative metrics require an L4 benchmark.
Density metrics require quantile or density forecasts.
agg_target requires multi-target data.
agg_state and by_state decomposition require FRED-SD.
Regime metrics and by_regime decomposition require an active L1 regime.
ranking: mcs_inclusion requires active L6 MCS.
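The gates above can be read as simple preconditions on the axis values. The following is a minimal sketch of such a check, not the package's actual validator: the `Context` class, its field names, and `check_gates` are hypothetical, and it assumes that `by_state` and `by_regime` are values of `decomposition_target` and that an axis counts as "used" when it is present with a non-empty value.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical summary of upstream state (not the real API)."""
    has_l4_benchmark: bool = False
    has_density_forecasts: bool = False  # quantile or density forecasts
    multi_target: bool = False
    fred_sd: bool = False
    l1_regime_active: bool = False
    l6_mcs_active: bool = False


def check_gates(axes: dict, ctx: Context) -> list[str]:
    """Return one message per violated gate; an empty list means all pass."""
    errors = []
    decomposition = axes.get("decomposition_target")  # assumed axis for by_state / by_regime

    if axes.get("relative_metrics") and not ctx.has_l4_benchmark:
        errors.append("relative_metrics requires an L4 benchmark")
    if axes.get("density_metrics") and not ctx.has_density_forecasts:
        errors.append("density_metrics requires quantile or density forecasts")
    if axes.get("agg_target") and not ctx.multi_target:
        errors.append("agg_target requires multi-target data")
    if (axes.get("agg_state") or decomposition == "by_state") and not ctx.fred_sd:
        errors.append("agg_state / by_state decomposition require FRED-SD")
    if (axes.get("regime_metrics") or decomposition == "by_regime") and not ctx.l1_regime_active:
        errors.append("regime metrics / by_regime decomposition require an active L1 regime")
    if axes.get("ranking") == "mcs_inclusion" and not ctx.l6_mcs_active:
        errors.append("ranking: mcs_inclusion requires active L6 MCS")
    return errors


# Usage: relative metrics with a benchmark pass; MCS ranking without L6 MCS fails.
axes = {"relative_metrics": ["relative_mse", "r2_oos"], "ranking": "mcs_inclusion"}
violations = check_gates(axes, Context(has_l4_benchmark=True))
```

Collecting all violations in a list, rather than raising on the first one, lets a caller report every failed gate for a recipe at once.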
Example
5_evaluation:
  fixed_axes:
    primary_metric: relative_mse
    point_metrics: [mse, mae]
    relative_metrics: [relative_mse, r2_oos]
    benchmark_scope: per_target_horizon
    agg_horizon: per_horizon_separate
    ranking: by_relative_metric
    report_style: per_target_horizon_panel
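To make the example concrete, here is a rough sketch of what relative_mse and r2_oos could look like when computed separately for each (target, horizon) pair. It assumes the common textbook definitions, relative_mse = MSE_model / MSE_benchmark and r2_oos = 1 - MSE_model / MSE_benchmark; the package's own definitions may differ in detail. The function name and the row layout are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def relative_metrics_by_target_horizon(rows):
    """Compute mse, relative_mse and r2_oos per (target, horizon).

    rows: iterable of (target, horizon, actual, model_forecast, benchmark_forecast).
    This row layout is an assumption for illustration only.
    """
    model_sq = defaultdict(list)
    bench_sq = defaultdict(list)
    for target, horizon, actual, f_model, f_bench in rows:
        key = (target, horizon)
        model_sq[key].append((actual - f_model) ** 2)
        bench_sq[key].append((actual - f_bench) ** 2)

    results = {}
    for key, sq_errors in model_sq.items():
        mse_model = fmean(sq_errors)
        mse_bench = fmean(bench_sq[key])
        # Guard against a benchmark with zero error, where the ratio is undefined.
        ratio = mse_model / mse_bench if mse_bench > 0 else float("nan")
        results[key] = {
            "mse": mse_model,
            "relative_mse": ratio,
            "r2_oos": 1.0 - ratio,
        }
    return results


# Usage with two made-up observations for one target at horizon 1.
rows = [
    ("gdp", 1, 1.0, 1.5, 2.0),
    ("gdp", 1, 2.0, 2.5, 1.0),
]
metrics = relative_metrics_by_target_horizon(rows)
```

Under these definitions a relative_mse below 1 (equivalently, a positive r2_oos) means the model beats the benchmark for that target and horizon, which is the comparison a ranking by relative metric would order on.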
See encyclopedia
For the full per-axis × per-option catalogue (every value with its OptionDoc summary, when-to-use / when-NOT, references), see encyclopedia/l5/.