FRED-SD Transform Policy
FRED-SD transformation codes are a research layer in macroforecast. They are not source metadata.
Policy
Default runtime policy:
FRED-MD official t-codes are applied when tcode_policy="official_tcode_only".
FRED-QD official t-codes are applied when tcode_policy="official_tcode_only".
FRED-SD inferred t-codes are not applied by default.
Opt-in runtime policy:
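    import macroforecast as mf

    # Opt in to the reviewed FRED-SD inferred t-codes for a QD+SD experiment.
    exp = (
        mf.Experiment(
            dataset="fred_qd+fred_sd",
            target="GDPC1",
            start="1985-01",
            end="2019-12",
            horizons=[1, 2, 4],
        )
        .use_sd_inferred_tcodes()
    )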
The opt-in map is sd-analog-v0.1. It is stored in
macroforecast.raw.sd_inferred_tcodes.SD_INFERRED_TCODE_MAP.
Important interpretation:
sd-analog-v0.1 is not a state-by-state stationarity optimizer.
It applies one reviewed code to every state column for a given FRED-SD variable; for example, all UR_* columns share code 2.
The reviewed code is anchored to the closest FRED-MD/FRED-QD national analog when that analog is economically direct, then checked against state-level diagnostics.
The map therefore answers “what transformation is defensible for this SD variable as a cross-state panel?” not “what code maximizes stationarity for each individual state series?”
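The "one code per variable, shared by every state column" rule can be sketched as follows. This is a minimal illustration, not the package implementation: the helper name `expand_variable_codes` and the assumed column naming convention `<VARIABLE>_<STATE>` are hypothetical.

```python
from typing import Dict, Iterable


def expand_variable_codes(
    variable_codes: Dict[str, int], columns: Iterable[str]
) -> Dict[str, int]:
    """Give every state column the single code reviewed for its SD variable."""
    column_codes: Dict[str, int] = {}
    for column in columns:
        # Assumed naming convention: "<VARIABLE>_<STATE>", e.g. "UR_CA".
        variable, _, state = column.rpartition("_")
        if state and variable in variable_codes:
            column_codes[column] = variable_codes[variable]
    return column_codes


# All UR_* columns receive code 2; columns without a reviewed code are skipped.
codes = expand_variable_codes({"UR": 2}, ["UR_CA", "UR_TX", "STHPI_NY"])
print(codes)
```

Columns whose variable has no reviewed code are left out of the result, which mirrors the idea that unreviewed SD series stay untransformed.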
Policy Choices
FRED-SD has no official transform row, so there are three distinct policies a
researcher could choose. They should not be conflated.
| policy | unit of decision | what it does | benefit | risk | macroforecast status |
|---|---|---|---|---|---|
| National-analog transfer | SD variable | If an SD variable is a state version of a national FRED-MD/QD object, use the national official t-code analog and apply it to every state. | Anchored to official MD/QD source metadata; keeps state panels comparable. | May not maximize stationarity for every state; weak for variables without direct national analogs. | Current opt-in |
| Variable-global empirical | SD variable | Search candidate codes on all states and select one code per SD variable using state/aggregate diagnostics. | Targets stationarity while keeping one transform per cross-state panel. | Can disagree with official national analog semantics; sample/vintage dependent. | Explicit opt-in |
| State-variable empirical | SD variable x state | Search candidate codes independently for each state series, for example | Maximizes stationarity diagnostics column by column. | Breaks cross-state comparability, can overfit small/state-specific samples, and may change by vintage. | Explicit override |
The recommended default for package runtime remains:
Use official FRED-MD/QD t-codes where the source provides them.
Leave FRED-SD untransformed unless the user explicitly opts in.
When opting into FRED-SD, use the reviewed national-analog map first.
Treat variable-global or state-variable empirical t-codes as a separate research design, with manifest evidence and audit artifacts.
Runtime opt-ins:
# National-analog reviewed policy.
exp.use_sd_inferred_tcodes()
# Empirical stationarity policy: one code per SD variable, shared across states.
exp.use_sd_empirical_tcodes(unit="variable_global")
# Empirical stationarity override: one explicit code per selected state column.
exp.use_sd_empirical_tcodes(
    unit="state_series",
    code_map={"UR_CA": 2, "UR_TX": 5},
    audit_uri="artifacts/sd_state_series_audit.csv",
)
Review Status
By default, runtime applies only the statuses marked yes:
| status | runtime use | meaning |
|---|---|---|
|  | yes | Direct analog and sufficient diagnostics. |
|  | yes | Plausible analog, lower confidence. |
|  | yes | Code depends on experiment frequency. |
|  | yes | Source frequency is clear, but diagnostics require caution. |
|  | no | Statistics may fit, but the economic object changes. |
|  | no | No runtime use before targeted review. |
|  | no | No inferred code in v0.1. |
Users can restrict runtime application:
exp.use_sd_inferred_tcodes(statuses=["tentative_accept"])
They can also include additional statuses, but that should be treated as a research override.
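The status gate can be pictured with a small filter. This is a sketch under stated assumptions: the function `select_runtime_codes`, the entry layout (`code`, `status`), and the variables `VAR_A` and `VAR_B` are hypothetical, and the default is modeled simply as "everything except `semantic_review`, `manual_review`, and `reject`".

```python
from typing import Dict, Iterable, Optional

# Statuses this document says are never applied by default.
BLOCKED_BY_DEFAULT = {"semantic_review", "manual_review", "reject"}


def select_runtime_codes(
    review_map: Dict[str, dict], statuses: Optional[Iterable[str]] = None
) -> Dict[str, int]:
    """Return variable -> code for entries whose status is allowed at runtime."""
    allowed = None if statuses is None else set(statuses)
    selected: Dict[str, int] = {}
    for variable, entry in review_map.items():
        status = entry["status"]
        if allowed is None:
            keep = status not in BLOCKED_BY_DEFAULT
        else:
            keep = status in allowed
        if keep:
            selected[variable] = entry["code"]
    return selected


# Hypothetical review entries, for illustration only.
review_map = {
    "VAR_A": {"code": 2, "status": "tentative_accept"},
    "STHPI": {"code": 5, "status": "frequency_specific_provisional"},
    "VAR_B": {"code": 5, "status": "semantic_review"},
}

default_codes = select_runtime_codes(review_map)
restricted_codes = select_runtime_codes(review_map, statuses=["tentative_accept"])
print(default_codes, restricted_codes)
```

Passing a `statuses` list that includes a blocked status would admit it, which is why the text above treats that as a research override rather than a default.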
Manifest Contract
When SD inferred t-codes are used, the run manifest includes
data_reports.sd_inferred_tcodes.
Expected fields:
| field | meaning |
|---|---|
|  | Research map version, for example |
|  | Always |
|  | Research source identifier, for example |
|  | Normalized runtime policy: |
|  | Normalized experiment frequency. |
|  | Review statuses allowed for this run. |
|  | Column-to-code map used by t-code preprocessing. |
|  | Column-to-review-status map for columns not used. |
|  | Variable-level reviewed metadata. |
The manifest also emits a warning:
FRED-SD inferred/empirical t-codes are macroforecast research metadata, not official FRED-SD metadata
Frequency-Specific Rules
BPPRIVSA must not be represented as one global code.
| experiment frequency | analog source | code | reason |
|---|---|---|---|
| monthly | FRED-MD | 4 | Same-frequency official MD permits/housing transform. |
| quarterly | FRED-QD | 5 | Same-frequency official QD permits/housing transform. |
STHPI follows QD USSTHPI code 5.
Reason:
It is a house price index object.
The reviewed same-concept analog is quarterly QD USSTHPI.
Monthly interpolation should happen after the source-frequency transform decision; interpolation does not create a new official monthly analog.
Stationarity diagnostics are weak, so the status is frequency_specific_provisional.
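A frequency-keyed lookup makes the rule concrete. The sketch below is illustrative only: `FREQUENCY_SPECIFIC_CODES` and `resolve_sd_code` are hypothetical names, and it assumes STHPI uses code 5 at either experiment frequency because the transform decision is made at the source (quarterly) frequency.

```python
# Hypothetical lookup table built from the rules stated above.
FREQUENCY_SPECIFIC_CODES = {
    "BPPRIVSA": {"monthly": 4, "quarterly": 5},
    # Assumption: STHPI keeps the QD USSTHPI code 5 regardless of frequency.
    "STHPI": {"monthly": 5, "quarterly": 5},
}


def resolve_sd_code(variable: str, frequency: str) -> int:
    """Return the t-code for a frequency-specific SD variable."""
    if variable not in FREQUENCY_SPECIFIC_CODES:
        raise KeyError(f"{variable} has no frequency-specific rule")
    by_frequency = FREQUENCY_SPECIFIC_CODES[variable]
    if frequency not in by_frequency:
        raise ValueError(f"unsupported frequency: {frequency!r}")
    return by_frequency[frequency]


print(resolve_sd_code("BPPRIVSA", "monthly"))
print(resolve_sd_code("BPPRIVSA", "quarterly"))
```

Keying the code on both variable and frequency avoids storing BPPRIVSA as a single global code.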
State-Level Stationarity Audit
On 2026-04-26, macroforecast checked the actual FRED-SD live by-series workbook
series-2026-03.xlsx against candidate codes (1, 2, 4, 5, 6) for every
state column, using observations from 2005-06 onward. For each
SD variable x state, the audit selected the candidate with the best simple
ADF/KPSS stationarity score. This is a diagnostic only; it does not define the
runtime policy.
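For reference, the candidate codes follow the standard FRED-MD convention: 1 is the level, 2 the first difference, 4 the log, 5 the first difference of logs, and 6 the second difference of logs. The sketch below applies them to a plain list; `apply_tcode` is a hypothetical helper and is not the package's preprocessing function. The stationarity scoring itself (ADF/KPSS) is not shown.

```python
import math
from typing import List, Optional, Sequence


def apply_tcode(values: Sequence[float], code: int) -> List[Optional[float]]:
    """Apply a FRED-MD style t-code; undefined leading values become None."""
    if code == 1:  # level
        return list(values)
    if code == 2:  # first difference
        return [None] + [b - a for a, b in zip(values, values[1:])]
    if code in (4, 5, 6):
        logs = [math.log(v) for v in values]  # requires strictly positive data
        if code == 4:  # log level
            return logs
        diffs = [b - a for a, b in zip(logs, logs[1:])]
        if code == 5:  # first difference of logs
            return [None] + diffs
        # code 6: second difference of logs
        return [None, None] + [b - a for a, b in zip(diffs, diffs[1:])]
    raise ValueError(f"unsupported candidate code: {code}")


series = [1.0, 2.0, 4.0]
print(apply_tcode(series, 2))
print(apply_tcode(series, 5))
```

Each candidate would then be scored per state column on the transformed values, dropping the leading `None` entries first.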
The diagnostic confirms that a stationarity-only state-level choice often
differs from the national-analog code. For example, many employment panels
have FRED-MD/QD analog code 5, while state-by-state ADF/KPSS screening often
favors code 2. That does not automatically make code 2 the correct package
default; it means the empirical-stationarity policy is a different research
design from the national-analog policy.
| SD variable | current opt-in code | dominant state-level stationarity code | dominant state share | state-level distribution |
|---|---|---|---|---|
|  | monthly | 2 | 0.92 | 2:0.92; 4:0.02; 5:0.04; 6:0.02 |
|  | 5 | 2/6 tie | 0.45 | 1:0.02; 2:0.45; 5:0.08; 6:0.45 |
|  | 5, not runtime-default status | 6 | 0.61 | 2:0.22; 5:0.16; 6:0.61 |
|  | 5 | 2 | 0.96 | 2:0.96; 5:0.04 |
|  | 5 | 2 | 0.67 | 2:0.67; 5:0.02; 6:0.31 |
|  | none | 6 | 0.39 | 2:0.24; 5:0.37; 6:0.39 |
|  | 5, not runtime-default status | 5 | 0.43 | 2:0.39; 5:0.43; 6:0.18 |
|  | 5 | 2 | 0.98 | 2:0.98; 6:0.02 |
|  | 5 | 2 | 0.96 | 1:0.02; 2:0.96; 5:0.02 |
|  | 5 | 2 | 0.98 | 2:0.98; 6:0.02 |
|  | 5 | 2 | 0.86 | 2:0.86; 5:0.06; 6:0.08 |
|  | none | 2 | 0.59 | 2:0.59; 5:0.18; 6:0.24 |
|  | 5 | 2 | 0.88 | 2:0.88; 5:0.06; 6:0.06 |
|  | 5, not runtime-default status | 2 | 0.94 | 2:0.94; 5:0.04; 6:0.02 |
|  | 5 | 2 | 0.67 | 2:0.67; 5:0.12; 6:0.22 |
|  | 1 | 2 | 0.98 | 2:0.98; 6:0.02 |
|  | 5 | 2 | 0.88 | 2:0.88; 5:0.02; 6:0.10 |
|  | 5 | 2 | 1.00 | 2:1.00 |
|  | none | 2 | 0.97 | 2:0.97; 6:0.03 |
|  | 5 | 5 | 0.67 | 2:0.20; 5:0.67; 6:0.14 |
|  | 5 | 5 | 0.55 | 2:0.39; 5:0.55; 6:0.06 |
|  | 2 | 2 | 0.98 | 2:0.98; 6:0.02 |
|  | 5 | 2 | 1.00 | 2:1.00 |
|  | 5, not runtime-default status | 5 | 0.69 | 2:0.27; 5:0.69; 6:0.04 |
|  | none | 2 | 0.77 | 2:0.77; 5:0.06; 6:0.17 |
|  | 5 | 6 | 0.90 | 2:0.04; 5:0.06; 6:0.90 |
|  | 2 | 2 | 1.00 | 2:1.00 |
|  | none | 2 | 0.88 | 2:0.88; 5:0.06; 6:0.06 |
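The "dominant code", "dominant share", and "distribution" columns are simple tallies over the per-state selections. A minimal sketch of that summary step, assuming the per-state winners are already known; `summarize_state_codes` and the column names `X_AA` through `X_EE` are hypothetical, and the numbers are made up for illustration.

```python
from collections import Counter
from typing import Dict


def summarize_state_codes(selected: Dict[str, int]) -> dict:
    """Tally per-state selected codes into a dominant code, share, and distribution."""
    counts = Counter(selected.values())
    total = sum(counts.values())
    top = max(counts.values())
    # More than one code at the top count is reported as a tie, e.g. [2, 6].
    dominant = sorted(code for code, n in counts.items() if n == top)
    distribution = {code: round(n / total, 2) for code, n in sorted(counts.items())}
    return {
        "dominant": dominant,
        "share": round(top / total, 2),
        "distribution": distribution,
    }


# Made-up selections for five hypothetical state columns.
summary = summarize_state_codes(
    {"X_AA": 2, "X_BB": 2, "X_CC": 6, "X_DD": 6, "X_EE": 5}
)
print(summary)
```

Reporting ties explicitly keeps rows such as "2/6 tie" from being resolved silently by dictionary ordering.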
Interpretation:
The current opt-in map keeps cross-state comparability by using one variable-level code.
The state-level diagnostic is useful for sensitivity analysis and future research modes, but it should not silently replace the national-analog map.
The variable-global empirical runtime policy uses the dominant state-level stationarity code column above as sd-variable-global-stationarity-v0.1.
The state-variable empirical runtime policy does not ship a silent built-in full state-by-series map. It accepts only an explicit sd_tcode_code_map provided by the user or recipe.
State-variable empirical runs must write every selected column code, sample window, vintage, test battery, tie-break rule, and audit location into the manifest.
State-series override recipe shape:
    path:
      2_preprocessing:
        leaf_config:
          sd_tcode_policy: state_series_stationarity_override_v0_1
          sd_tcode_map_version: sd-state-series-stationarity-override-v0.1
          sd_tcode_audit_uri: artifacts/sd_state_series_audit.csv
          sd_tcode_code_map:
            UR_CA: 2
            UR_TX: 5
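Because the override accepts only an explicit map, it helps to check that map before a run. The following is a hedged sketch of such a check, not the package's validator: `validate_state_series_override` is a hypothetical helper, and the accepted code range 1 to 7 is the standard FRED-MD t-code range rather than something this policy states.

```python
from typing import Dict

# Assumption: standard FRED-MD t-codes run from 1 to 7.
VALID_TCODES = frozenset(range(1, 8))


def validate_state_series_override(code_map: Dict[str, int], audit_uri: str) -> Dict[str, int]:
    """Reject empty maps, missing audit locations, and out-of-range codes."""
    if not code_map:
        raise ValueError("state-series override requires an explicit code map")
    if not audit_uri:
        raise ValueError("state-series override requires an audit location")
    for column, code in code_map.items():
        if type(code) is not int or code not in VALID_TCODES:
            raise ValueError(f"invalid t-code for {column}: {code!r}")
    return dict(code_map)


checked = validate_state_series_override(
    {"UR_CA": 2, "UR_TX": 5}, "artifacts/sd_state_series_audit.csv"
)
print(checked)
```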
Validation Protocol
The validation script is:
    python tools/research/build_sd_tcode_validation.py \
        --sd-workbook /tmp/fred_sd_series_2026_03_validation.xlsx \
        --md-csv /tmp/fred_md_current_validation.csv \
        --qd-csv /tmp/fred_qd_current_validation.csv \
        --output-dir /tmp/macrocast_sd_tcode_validation_rigorous \
        --sample-start 2005-01 \
        --sample-end 2025-12
Generated artifacts:
sd_tcode_candidate_results.csv
sd_tcode_selected_map.json
sd_tcode_report.md
The report shows every candidate-code/analog row. The JSON selected map merges
score-ranked diagnostics with reviewed status metadata from
SD_INFERRED_TCODE_MAP.
Diagnostics
The validation output includes:
| diagnostic | purpose |
|---|---|
|  | Pearson correlation between transformed SD aggregate and transformed MD/QD analog. |
|  | Pearson correlation p-value. |
|  | Rank correlation for monotone relation. |
|  | Spearman p-value. |
|  | Average rolling aggregate correlation. |
|  | Worst rolling aggregate correlation. |
|  | Median state-level correlation to analog. |
|  | State-level correlation dispersion. |
|  | Share of states with positive analog correlation. |
|  | State-level ADF stationarity pass rate. |
|  | State-level KPSS stationarity pass rate. |
|  | ADF result for SD aggregate. |
|  | KPSS result for SD aggregate. |
|  | ADF result for analog. |
|  | KPSS result for analog. |
|  | Distance between SD and analog autocorrelation profiles. |
|  | SD low-frequency variance share. |
|  | Analog low-frequency variance share. |
|  | Distance between SD and analog low-frequency ratios. |
|  | SD transformed outlier share. |
|  | Analog transformed outlier share. |
|  | SD aggregate volatility divided by analog volatility. |
|  | Missing share after transform. |
|  | Number of state columns with enough observations. |
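Two of these diagnostics are easy to illustrate: the Pearson correlation with the analog, and the share of states whose correlation with the analog is positive. The sketch below uses only the standard library; the function names and the toy series are hypothetical and do not reproduce the validation script.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def positive_correlation_share(
    states: Dict[str, Sequence[float]], analog: Sequence[float]
) -> float:
    """Share of state series whose correlation with the analog is positive."""
    correlations = [pearson(series, analog) for series in states.values()]
    return sum(c > 0 for c in correlations) / len(correlations)


# Toy transformed series: one state moves with the analog, one against it.
analog = [1.0, 2.0, 3.0, 4.0]
states = {"X_AA": [2.0, 4.0, 6.0, 8.0], "X_BB": [4.0, 3.0, 2.0, 1.0]}
share = positive_correlation_share(states, analog)
print(share)
```

Both inputs should already be transformed with the candidate code under test, since the diagnostics compare transformed series.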
Runtime Order
For composite datasets, macroforecast first aligns component frequencies:
fred_md+fred_sd uses monthly frequency.
fred_qd+fred_sd uses quarterly frequency.
Then SD inferred t-code metadata is added to transform_codes only when the
user opted in. The existing tcode_policy="official_tcode_only" preprocessing path then
applies MD/QD official codes and any opted-in SD inferred codes together.
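The ordering can be sketched as a merge that only happens on opt-in. This is an illustration under stated assumptions: `build_transform_codes` and its arguments are hypothetical, and the overlap check is an added safeguard rather than documented behavior.

```python
from typing import Dict


def build_transform_codes(
    official_codes: Dict[str, int], sd_codes: Dict[str, int], sd_opt_in: bool
) -> Dict[str, int]:
    """Combine official MD/QD codes with SD inferred codes only when opted in."""
    transform_codes = dict(official_codes)
    if sd_opt_in:
        overlap = set(transform_codes) & set(sd_codes)
        if overlap:
            # Assumed safeguard: SD codes must never overwrite official codes.
            raise ValueError(f"SD codes collide with official codes: {sorted(overlap)}")
        transform_codes.update(sd_codes)
    return transform_codes


official = {"GDPC1": 5}
sd_inferred = {"UR_CA": 2, "UR_TX": 2}

print(build_transform_codes(official, sd_inferred, sd_opt_in=False))
print(build_transform_codes(official, sd_inferred, sd_opt_in=True))
```

Without the opt-in, the SD columns never enter the code map, so they pass through preprocessing untransformed.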
Non-Goals
This policy does not make SD inferred t-codes official.
This policy does not apply semantic_review, manual_review, or reject
entries by default.
This policy does not add extra filtering, scaling, missing-value imputation, or target-specific transformations beyond the existing experiment preprocessing contract.