Bring Your Own Data
macroforecast works with any time-series panel you supply. This guide covers monthly and quarterly CSV / Parquet files.
If you prefer the official FRED-MD or FRED-QD panels, start with FRED-MD or FRED-QD instead.
FRED-MD/QD format note: the raw FRED CSV files include a
Transform:header row above the data. Your custom CSV must not include that row – it is an artefact of the official FRED format and is stripped automatically only whendataset=fred_md/fred_qduses the built-in adapter. Custom CSV files are plain panels: date index + numeric columns only.
When to use this guide
Use this guide when you have:
A proprietary indicator panel (e.g., firm-level surveys, regional prices).
A monthly or quarterly series not available in FRED.
A country-specific macro panel.
If you have a few additional series you want to add on top of the official FRED panel, see Merging with FRED-MD or FRED-QD.
File format contract
Monthly CSV
date,my_target,x1,x2
1990-01-01,1.23,0.45,2.10
1990-02-01,1.31,0.47,2.05
1990-03-01,1.29,0.46,1.99
Rules:
First column: date, parseable by pandas (
YYYY-MM-DDis the safest format;YYYY-MMalso works when the day is not meaningful).Remaining columns: numeric. Non-numeric cells are coerced to
NaN; columns that are entirelyNaNare dropped silently.No
Transform:row. No multi-level headers. No trailing metadata rows.The column you name as
targetin the recipe must be present.
Quarterly CSV
Same rules. Use YYYY-01-01, YYYY-04-01, YYYY-07-01, YYYY-10-01 as
quarterly date stamps, or any convention pandas parses to quarterly periods.
The recipe axis frequency: quarterly tells the runtime to interpret the
dates as quarterly.
Parquet
Same schema as CSV. The Parquet file may have either a DatetimeIndex or
a date column as its first column. Column names and numeric typing rules
are identical.
Running with your own data
Option A: YAML recipe (recommended)
Set custom_source_policy: custom_panel_only and point custom_source_path
at your file. The runtime infers CSV vs Parquet from the file extension
(.csv -> CSV loader; .parquet or .pq -> Parquet loader).
Monthly example
0_meta:
fixed_axes:
failure_policy: fail_fast
reproducibility_mode: seeded_reproducible
1_data:
fixed_axes:
custom_source_policy: custom_panel_only
dataset: fred_md # labels the panel as "monthly" in the runtime
frequency: monthly
horizon_set: custom_list
leaf_config:
target: my_target
target_horizons: [1, 3, 6]
custom_source_path: data/my_monthly_panel.csv
sample_start_date: "1990-01"
sample_end_date: "2019-12"
2_preprocessing:
fixed_axes:
transform_policy: no_transform
outlier_policy: none
imputation_policy: none_propagate
frame_edge_policy: keep_unbalanced
3_feature_engineering:
nodes:
- {id: src_X, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: predictors}}}
- {id: src_y, type: source, selector: {layer_ref: l2, sink_name: l2_clean_panel_v1, subset: {role: target}}}
- {id: lag_x, type: step, op: lag, params: {n_lag: 1}, inputs: [src_X]}
- {id: y_h, type: step, op: target_construction, params: {mode: point_forecast, method: direct, horizon: 1}, inputs: [src_y]}
sinks:
l3_features_v1: {X_final: lag_x, y_final: y_h}
l3_metadata_v1: auto
4_forecasting_model:
nodes:
- {id: src_X, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: X_final}}}
- {id: src_y, type: source, selector: {layer_ref: l3, sink_name: l3_features_v1, subset: {component: y_final}}}
- id: fit_ridge
type: step
op: fit_model
params: {family: ridge, alpha: 1.0, min_train_size: 24, forecast_strategy: direct,
training_start_rule: expanding, refit_policy: every_origin, search_algorithm: none}
inputs: [src_X, src_y]
- {id: predict_ridge, type: step, op: predict, inputs: [fit_ridge, src_X]}
sinks:
l4_forecasts_v1: predict_ridge
l4_model_artifacts_v1: fit_ridge
l4_training_metadata_v1: auto
5_evaluation:
fixed_axes:
primary_metric: mse
point_metrics: [mse, rmse, mae]
8_output:
fixed_axes:
saved_objects: [forecasts, metrics, ranking]
leaf_config:
output_directory: ./output/my_study/
Run it:
import macroforecast as mf
result = mf.run("my_study.yaml", output_directory="output/my_study/")
print(result.cells[0].sink_hashes)
Quarterly example
Change two lines:
dataset: fred_qd # labels the panel as "quarterly"
frequency: quarterly
Everything else stays the same. The quarterly panel uses the same date-index
format rules as monthly; the runtime resolves the frequency from dataset.
Option B: Python helper functions
mf.load_custom_csv and mf.load_custom_parquet load your file and return
a RawLoadResult you can inspect before running a full study.
import macroforecast as mf
# Monthly panel
result = mf.load_custom_csv("data/my_monthly_panel.csv", dataset="fred_md")
print(result.data.head()) # pandas DataFrame, date index
print(result.dataset_metadata) # frequency, data_through, etc.
# Quarterly panel
result_q = mf.load_custom_csv("data/my_quarterly_panel.csv", dataset="fred_qd")
# Parquet
result_pq = mf.load_custom_parquet("data/my_panel.parquet", dataset="fred_md")
dataset must be one of "fred_md" (monthly), "fred_qd" (quarterly), or
"fred_sd" (state-level monthly). It labels the schema downstream – it does
not require your columns to match FRED mnemonics.
These helper functions are for inspection only. To run a full study, use the YAML recipe path (Option A).
Merging with FRED-MD or FRED-QD
If you want McCracken-Ng’s curated 126 monthly (or 245 quarterly) series
plus a few custom series, use official_plus_custom:
1_data:
fixed_axes:
custom_source_policy: official_plus_custom
dataset: fred_md
frequency: monthly
leaf_config:
target: CPIAUCSL
target_horizons: [1, 3, 6]
custom_source_path: data/my_extra_series.csv
custom_merge_rule: left_join # inner_join / left_join / outer_join
sample_start_date: "1990-01"
sample_end_date: "2019-12"
custom_merge_rule is required. Choose:
Rule |
Keeps dates from |
|---|---|
|
Rows present in both FRED and your file |
|
All FRED dates; your series gets |
|
All dates in either file |
The custom file must have the same date column format. Duplicate column names (same mnemonic as a FRED series) will be suffixed by the runtime; rename before merging if the intent is to replace a FRED series.
Common pitfalls
Symptom |
Cause |
Fix |
|---|---|---|
|
Date column is not the first column, or the date format is not parseable. |
Move the date column first; use ISO format |
Target column is silently missing from the panel |
Column name in |
Check column names with |
All-NaN columns dropped silently |
A series has no numeric values after type coercion. |
Inspect the raw file for text entries or hidden characters. |
|
|
Apply your own transforms in |
|
Relative path resolves from where |
Use an absolute path or change your working directory to the project root before calling |
|
Your extra file’s date range does not overlap the FRED vintage dates. |
Use |
For FRED-MD / FRED-QD column definitions and T-code reference, see FRED-MD and FRED-QD.
See also
FRED-MD column dictionary – 126 monthly series, T-codes, groups
FRED-QD column dictionary – 245 quarterly series, T-codes
Custom function quickstart – bring your own model, preprocessor, or target transformer
Quickstart – minimal recipe walkthrough
First study – full study with diagnostics, tests, and output