Forecast-Time And Source-Frame Policies (1.5)
Declares how forecast-time information and source-frame quality are handled
before Layer 2 sees the data. Data revision / vintage status is controlled by
information_set_type in the forecast-time information page. This page covers
publication lag, same-period x access, raw-source missing/outlier treatment
before T-codes, and frame availability after the Layer 1 source frame exists.
| Section | axis | Role |
|---|---|---|
| 1.5.1 | `missing_availability` | Frame Availability Policy: what to do when predictor / target rows contain NaN after the Layer 1 source frame exists |
| 1.5.2 | `raw_missing_policy` | Whether to repair raw-source missing values before FRED transforms/T-codes |
| 1.5.3 | `raw_outlier_policy` | Whether to repair raw-source outliers before FRED transforms/T-codes |
| 1.5.4 | `release_lag_rule` | Publication Lag Rule: when predictor observations are treated as published and usable |
| 1.5.5 | `contemporaneous_x_rule` | Same-Period Predictor Rule: whether x observed at the target date may enter the model |
Note on dropped axes:
- `alignment_rule`: mixed-frequency calendar axis; meaningful mainly for FRED-SD. The current runtime uses explicit monthly/quarterly conversion plus provenance reports, Layer 2 native-frequency block payloads, custom mixed-frequency adapters, and narrow built-in MIDAS routes (`midas_almon`, `midasr` with `nealmon`/`almonp`/`nbeta`/`genexp`/`harstep`). State-space mixed-frequency likelihoods remain future work.
- `evaluation_scale`: re-homed to Layer 2 (`PreprocessContract.evaluation_scale`), where the actual runtime effect lives.
- `exogenous_block`: redundant with `feature_builder` default logic.
- `regime_task`: duplicates 1.3 `oos_period.recession_only_oos` / `expansion_only_oos`.
- `vintage_policy`: dropped as a separate axis. Current data revision / vintage control is handled by `information_set_type` plus `leaf_config.data_vintage`, including FRED-SD vintages.
- `x_map_policy`: single-op non-axis; multi-target X mapping is owned by `study_scope` (0.2).
At a glance (defaults):
- `missing_availability = zero_fill_leading_predictor_gaps`: Frame Availability Policy. After the selected sample period is sliced, leading missing values before each predictor column's first valid observation are filled with zero and recorded in provenance. Switch to `require_complete_rows`, `keep_available_rows`, or `impute_predictors_only` only when a specific missing-data treatment matters.
- `raw_missing_policy = preserve_raw_missing`: leave raw-source missing values unchanged before FRED transforms/T-codes. Switch only when the research design intentionally cleans raw data before T-code construction.
- `raw_outlier_policy = preserve_raw_outliers`: leave raw-source outliers unchanged before FRED transforms/T-codes. Switch only when the research design intentionally clips or flags raw data before T-code construction.
- `release_lag_rule = ignore_release_lag`: Publication Lag Rule. Every column is available at its nominal date. Switch to `fixed_lag_all_series` / `series_specific_lag` when you need to simulate a publication lag.
- `contemporaneous_x_rule = forbid_same_period_predictors`: Same-Period Predictor Rule. Realistic real-time constraint. Switch to `allow_same_period_predictors` only for oracle / data-leak benchmarks.
Most research runs leave all five at the default.
1.5.1 missing_availability
Frame Availability Policy. Selects how NaN rows are handled after the source frame exists and before Layer 2 representation construction. Four operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `zero_fill_leading_predictor_gaps` | operational | Default. Within the selected sample period, leading missing values in each predictor are filled with 0. Fully missing predictors are also filled with 0 and trigger a warning. Leading missing target dates are reported; a mid-sample missing target blocks execution. |
| `keep_available_rows` | operational | No panel-level filter; downstream executors handle NaNs per their own policy. |
| `require_complete_rows` | operational | Drop rows where any non-date column has NaN before training. Aggressive but legitimate on short fixture windows. |
| `impute_predictors_only` | operational | Impute predictor (non-target) columns using `leaf_config.x_imputation`. |
Functions & features
- Sample-period availability path: `macroforecast.execution.build._apply_sample_period_and_availability(raw_result, recipe, *, target)` implements `zero_fill_leading_predictor_gaps` and records `data_reports["availability"]`.
- General missing policy path: `macroforecast.execution.build._apply_missing_availability(raw_result, rule, *, target, spec)`.
- Called during dataset loading in `execute_recipe` after official transforms have produced the selected frame and before researcher preprocessing runs.
- Compile guard: `impute_predictors_only` without a valid `leaf_config.x_imputation` raises `CompileValidationError`.
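The default `zero_fill_leading_predictor_gaps` behaviour described above can be illustrated with a small standalone sketch. This is not the body of `_apply_sample_period_and_availability`; the function name `zero_fill_leading_gaps`, the dict-of-lists panel, and the report layout are all hypothetical and chosen only to show the rule: leading gaps in predictors become 0, later gaps and the target are left alone.

```python
import math


def zero_fill_leading_gaps(columns, target):
    """Return (filled_columns, report) with each predictor's leading NaN run set to 0.0.

    `columns` maps a series name to a list of floats ordered by date.
    The target column is copied unchanged.
    """
    filled, report = {}, {}
    for name, values in columns.items():
        values = list(values)
        if name != target:
            # Position of the first valid observation (len(values) if there is none).
            first_valid = next(
                (i for i, v in enumerate(values) if not math.isnan(v)), len(values)
            )
            if first_valid:
                values[:first_valid] = [0.0] * first_valid
                report[name] = {
                    "filled": first_valid,
                    "fully_missing": first_valid == len(values),
                }
        filled[name] = values
    return filled, report


panel = {
    "INDPRO": [1.0, 2.0, math.nan, 4.0],            # target: mid-sample gap stays visible
    "X_LATE": [math.nan, math.nan, 3.0, math.nan],  # starts late; the later gap is untouched
    "X_EMPTY": [math.nan] * 4,                      # fully missing predictor
}
filled, report = zero_fill_leading_gaps(panel, target="INDPRO")
```

Here `X_LATE` comes back as `[0.0, 0.0, 3.0, nan]`, `X_EMPTY` is all zeros and flagged as fully missing, and the target keeps its gap, which mirrors the split between "fill predictors" and "report or block on the target" in the value catalog.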
Dropped values
`target_date_drop_if_missing`, `real_time_missing_as_missing`, `state_space_fill`, `factor_fill`, `em_fill`: complex / niche imputation strategies; v1.1+.
Recipe usage
# Forward-fill predictor columns, keep target NaNs visible
path:
  1_data_task:
    fixed_axes:
      missing_availability: impute_predictors_only
    leaf_config:
      x_imputation: ffill
1.5.2 raw_missing_policy
Selects raw-source missing treatment before FRED transforms/T-codes. Four operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `preserve_raw_missing` | operational | Default. Leave raw-source missing values untouched before FRED transforms/T-codes. |
|  | operational | Within the selected sample period, fill predictor leading missing values with 0 before FRED transforms/T-codes. |
| `impute_raw_predictors` | operational | Impute raw predictor columns before FRED transforms/T-codes using `leaf_config.raw_x_imputation`. |
|  | operational | Drop rows with any raw-source missing value before FRED transforms/T-codes. Aggressive; use only for explicit full-mode designs. |
Functions & features
- Runtime path: `macroforecast.execution.build._apply_raw_missing_policy(raw_result, rule, *, target, spec)`.
- Called before `macroforecast.execution.build._apply_tcode_preprocessing(...)`, so any changes affect T-code construction.
- Compile guard: `impute_raw_predictors` without a valid `leaf_config.raw_x_imputation` raises `CompileValidationError`.
- Provenance: runtime records `data_reports["raw_missing"]` with `before_official_transform: true`.
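Because this policy runs before T-code construction, it changes what the transform sees. The sketch below shows the effect on a log first difference (the transform FRED-MD labels T-code 5). The helpers `ffill` and `log_diff` are illustrative stand-ins written for this example, not the library's `_apply_raw_missing_policy` or `_apply_tcode_preprocessing`.

```python
import math


def ffill(values):
    """Carry the last valid observation forward; leading NaNs stay NaN."""
    out, last = [], math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        out.append(last)
    return out


def log_diff(values):
    """Log first difference; NaN wherever either neighbour is missing."""
    if not values:
        return []
    out = [math.nan]  # the first observation has no predecessor
    for prev, cur in zip(values, values[1:]):
        out.append(math.log(cur) - math.log(prev))
    return out


raw = [100.0, math.nan, 104.0, 106.0]

# Preserving the raw gap: one missing level knocks out two growth rates.
preserved = log_diff(raw)

# Forward-filling first: the gap becomes a zero growth rate followed by a
# two-period jump, so the transformed series differs, not just its NaN count.
imputed = log_diff(ffill(raw))
```

With the raw gap preserved, three of four transformed values are NaN; after forward-filling only the first is. This is the trade-off the default `preserve_raw_missing` avoids making on the researcher's behalf.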
Recipe usage
# Clean raw predictors before official FRED T-codes are applied
path:
  1_data_task:
    fixed_axes:
      raw_missing_policy: impute_raw_predictors
    leaf_config:
      raw_x_imputation: ffill
1.5.3 raw_outlier_policy
Selects raw-source outlier treatment before FRED transforms/T-codes. Six operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `preserve_raw_outliers` | operational | Default. Leave raw-source outliers untouched before FRED transforms/T-codes. |
|  | operational | Clip raw numeric columns at the 1st and 99th percentiles. |
| `iqr_clip_raw` | operational | Clip raw numeric columns by 1.5 IQR fences. |
|  | operational | Clip raw numeric columns by 3 MAD fences. |
|  | operational | Clip raw numeric columns by 3 standard deviations. |
|  | operational | Convert values outside the 1st and 99th percentiles to missing before FRED transforms/T-codes. |
Functions & features
- Runtime path: `macroforecast.execution.build._apply_raw_outlier_policy(raw_result, rule, *, spec)`.
- Called before `macroforecast.execution.build._apply_tcode_preprocessing(...)`, so any changes affect T-code construction.
- Optional column subset: `leaf_config.raw_outlier_columns`. If omitted, all raw numeric non-date columns are eligible.
- Provenance: runtime records `data_reports["raw_outliers"]` with `before_official_transform: true`.
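As a rough illustration of the 1.5 IQR fence rule and the optional column subset, the following sketch clips a dict-of-lists panel. The names `iqr_clip` and `clip_columns` are hypothetical, and the quartile convention (linear interpolation via `statistics.quantiles(..., method="inclusive")`) is an assumption; the runtime may compute quartiles differently.

```python
import math
import statistics


def iqr_clip(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]; NaNs are ignored and preserved."""
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < 2:
        return list(values)  # not enough data to estimate quartiles
    q1, _, q3 = statistics.quantiles(valid, n=4, method="inclusive")
    lower, upper = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v if math.isnan(v) else min(max(v, lower), upper) for v in values]


def clip_columns(columns, subset=None):
    """Apply iqr_clip to the named columns, or to every column when subset is None."""
    names = columns.keys() if subset is None else subset
    return {
        name: iqr_clip(vals) if name in names else list(vals)
        for name, vals in columns.items()
    }


raw = {
    "INDPRO": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0],
    "UNRATE": [4.0, 4.1, 50.0, 4.2],
}
# Only INDPRO is listed, so UNRATE keeps its spike.
# INDPRO: Q1 = 3, Q3 = 7, upper fence = 7 + 1.5 * 4 = 13, so 100.0 becomes 13.0.
clipped = clip_columns(raw, subset=["INDPRO"])
```

Passing `subset=None` corresponds to omitting `raw_outlier_columns`, in which case every column in the panel is eligible.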
Recipe usage
# Clip selected raw columns before official FRED T-codes are applied
path:
  1_data_task:
    fixed_axes:
      raw_outlier_policy: iqr_clip_raw
    leaf_config:
      raw_outlier_columns: [INDPRO, RPI]
1.5.4 release_lag_rule
Publication Lag Rule. Selects publication-lag policy for predictor shifts. Three operational values.
Value catalog
| Value | Status | What it does |
|---|---|---|
| `ignore_release_lag` | operational | Default, no-op. Every column is available at its nominal date. |
| `fixed_lag_all_series` | operational | Shift every non-date column by 1 period (uniform 1-month publication lag). |
| `series_specific_lag` | operational | Per-column shift declared via `leaf_config.release_lag_per_series`. |
Functions & features
- Module: `macroforecast.execution.build._apply_release_lag(raw_result, rule, *, spec)`.
- Compile guard: `series_specific_lag` without a non-empty `leaf_config.release_lag_per_series` dict raises `CompileValidationError`.
Dropped values
`calendar_exact_lag`, `lag_conservative`, `lag_aggressive`: pure duplicates (observable behaviour identical to other values).
Recipe usage
# Typical FRED-MD release convention: most series lag 1 month, UNRATE lags 0
path:
  1_data_task:
    fixed_axes:
      release_lag_rule: series_specific_lag
    leaf_config:
      release_lag_per_series:
        INDPRO: 1
        CPIAUCSL: 1
        UNRATE: 0
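A minimal sketch of what a per-series publication-lag shift does to a panel, using the lag dict from the recipe above. The function `apply_release_lag` here is a hypothetical stand-in written for illustration, not the library's `_apply_release_lag`, and it assumes that a lag of k moves each observation k rows later and pads the start with NaN.

```python
import math


def apply_release_lag(columns, lag_per_series, default_lag=0):
    """Shift each series later by its publication lag, padding the start with NaN."""
    shifted = {}
    for name, values in columns.items():
        lag = lag_per_series.get(name, default_lag)
        if lag < 0:
            raise ValueError(f"negative release lag for {name!r}")
        lag = min(lag, len(values))
        # The value dated t only becomes usable at row t + lag.
        shifted[name] = [math.nan] * lag + list(values[: len(values) - lag])
    return shifted


panel = {
    "INDPRO": [100.0, 101.0, 102.0],
    "CPIAUCSL": [250.0, 251.0, 252.0],
    "UNRATE": [4.0, 4.1, 4.2],
}
lagged = apply_release_lag(panel, {"INDPRO": 1, "CPIAUCSL": 1, "UNRATE": 0})
# INDPRO and CPIAUCSL move down one row; UNRATE stays at its nominal dates.

# A uniform one-period lag corresponds to an empty dict with default_lag=1.
uniform = apply_release_lag(panel, {}, default_lag=1)
```

Under this reading, `ignore_release_lag` is the case where every lag is 0 and the panel passes through unchanged.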
Moved Out Of Layer 1
structural_break_segmentation is now a Layer 2 representation/feature-block decision. It augments the model input with break dummies, so it no longer belongs to the FRED data-frame task. For user-supplied break dates, use Layer 2 deterministic_components=break_dummies with leaf_config.break_dates.
1.5.5 contemporaneous_x_rule
Same-Period Predictor Rule. Selects whether x observed at the target date may enter the model. Two operational values.
Value catalog
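| Value | Status | What it does |
|---|---|---|
| `forbid_same_period_predictors` | operational | Default. x observed at the target date may not enter the model (realistic real-time constraint). |
| `allow_same_period_predictors` | operational | x observed at the target date may enter the model; intended only for oracle / data-leak benchmarks. |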
Functions & features
- Wired inside `macroforecast.execution.build._build_raw_panel_training_data`: the axis value selects how `X_train` and `X_pred` align with the target.
- Applies to raw-panel recipes only (`target_lag_features` uses target lags, so the axis is irrelevant there).
Recipe usage
# Oracle contemporaneous-X benchmark
path:
  1_data_task:
    fixed_axes:
      contemporaneous_x_rule: allow_same_period_predictors
  3_feature_engineering:
    nodes:
      - {id: src_x, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
      - {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
      - {id: y_h, type: step, op: target_construction, params: {horizon: 1}, inputs: [src_y]}
    sinks:
      l3_features_v1: {X_final: src_x, y_final: y_h}
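This page does not spell out the exact row alignment, so the following is only one plausible reading of the two values, not a description of `_build_raw_panel_training_data`. It assumes that with horizon h the target is dated h steps after the forecast origin, that the forbidding rule pairs it with x at the origin, and that the permissive rule pairs it with x at the target date. The function `align_xy` is hypothetical.

```python
def align_xy(x, y, horizon, allow_same_period):
    """Pair predictor rows with targets dated `horizon` steps after the forecast origin."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    pairs = []
    for origin in range(len(y) - horizon):
        target_pos = origin + horizon
        # Forbid: use x known at the origin. Allow: use x dated at the target.
        x_pos = target_pos if allow_same_period else origin
        pairs.append((x[x_pos], y[target_pos]))
    return pairs


x = [10.0, 11.0, 12.0, 13.0]
y = [1.0, 2.0, 3.0, 4.0]

real_time = align_xy(x, y, horizon=1, allow_same_period=False)
# [(10.0, 2.0), (11.0, 3.0), (12.0, 4.0)]

oracle = align_xy(x, y, horizon=1, allow_same_period=True)
# [(11.0, 2.0), (12.0, 3.0), (13.0, 4.0)]
```

In the oracle pairing every target is matched with a predictor value that would not yet be observed at forecast time, which is why that setting is reserved for leak benchmarks.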
Forecast-Time And Source-Frame Policies (1.5) takeaways
- Every value in every 1.5 axis is operational in v1.0. Zero `registry_only` entries remain.
- All required non-default inputs are compile-time contracts (`release_lag_per_series`, `x_imputation`, `raw_x_imputation`) and are propagated into `data_task_spec`.
- `raw_missing_policy` and `raw_outlier_policy` run before official transforms/T-codes; `missing_availability` and Layer 2 preprocessing policies run after the Layer 1 source frame exists.
- `contemporaneous_x_rule` is the only 1.5 axis that affects fit-time X alignment; the other data-handling axes act on the raw or source frame before researcher preprocessing.
Layer 1 data-handling walk complete — the current policy axes are operational.