FRED-MD

Monthly U.S. macroeconomic panel maintained by the Federal Reserve Bank of St. Louis. Loaded via macroforecast.load_fred_md() when path.1_data_task.fixed_axes.dataset == "fred_md".

Citation & authoritative source

Original paper: Michael W. McCracken and Serena Ng, “FRED-MD: A Monthly Database for Macroeconomic Research,” Journal of Business & Economic Statistics 34(4): 574–589, 2016. Working paper: Federal Reserve Bank of St. Louis WP 2015-012.
Official landing page: St. Louis Fed — FRED-MD & FRED-QD (current documentation, appendix, and historical vintages).
Variable appendix (current): FRED-MD_updated_appendix.pdf, the authoritative list of every series, its T-code, and its source. The macroforecast package does not redistribute this appendix; users who need the exact current variable list should fetch it from the St. Louis Fed.

What macroforecast downloads

Current vintage: https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/monthly/current.csv (this exact URL is used by macroforecast/raw/datasets/fred_md.py). Replaced at the start of every month by the maintainers.
Historical vintage: per-month CSVs at https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/monthly/{vintage}.csv where vintage is e.g. 2020-06. Accessed when the recipe sets leaf_config.data_vintage and information_set_type == "real_time_vintage".
Bundle (historical): St. Louis Fed periodically publishes a ZIP of all past vintages. macroforecast supports extraction from such a zip via the local_zip_source loader argument.
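The two URL patterns above can be sketched as a small helper. This is an illustration only: fred_md_url is a hypothetical name, not a function exported by macroforecast, and the YYYY-MM check is an assumption based on the 2020-06 example.

```python
import re

# Base path copied from the URLs listed above.
BASE_URL = (
    "https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/"
    "research/fred-md/monthly"
)


def fred_md_url(vintage=None):
    """Return the CSV URL for a "YYYY-MM" vintage, or for the current file.

    Hypothetical helper for illustration; not part of macroforecast.
    """
    if vintage is None:
        return f"{BASE_URL}/current.csv"
    # Assumption: vintages are always named YYYY-MM, as in "2020-06".
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", vintage):
        raise ValueError(f"vintage must look like YYYY-MM, got {vintage!r}")
    return f"{BASE_URL}/{vintage}.csv"
```

Calling fred_md_url() gives the current-vintage address, and fred_md_url("2020-06") gives the June 2020 vintage.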

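The CSV has two header rows: the first holds the column names (a date column followed by one FRED series ID per column); the second holds the transformation code (T-code) for each series. All subsequent rows are observations indexed by month.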
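As a rough sketch of how such a file can be split by hand, the snippet below uses only the standard library. It assumes the layout of the published file: a column-name row starting with sasdate, then a row starting with Transform: that carries the T-codes. The sample values are made up for illustration, and split_fred_md is a hypothetical helper, not the package's parse_fred_csv.

```python
import csv
import io

# Synthetic two-series sample in the assumed layout (values are made up).
SAMPLE = """sasdate,INDPRO,UNRATE
Transform:,5,2
1/1/1959,22.0,6.0
2/1/1959,22.4,5.9
"""


def split_fred_md(text):
    """Split FRED-MD-style CSV text into (tcodes, observations)."""
    rows = list(csv.reader(io.StringIO(text)))
    names = rows[0][1:]  # series IDs, skipping the date column
    tcodes = {name: int(float(code)) for name, code in zip(names, rows[1][1:])}
    observations = []
    for row in rows[2:]:
        if not row or not row[0].strip():
            continue  # skip blank trailing lines
        # Empty cells become None so missing observations stay visible.
        values = [float(v) if v.strip() else None for v in row[1:]]
        observations.append((row[0], dict(zip(names, values))))
    return tcodes, observations


tcodes, observations = split_fred_md(SAMPLE)
```

Dates are left as strings here; converting them to a monthly index is a separate step.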

Structure — 8 variable categories

The paper organises the panel into eight groups, unchanged since the 2016 publication:

Output and income — industrial production aggregates and sectoral indices (e.g., INDPRO, IPFINAL, IPMANSICS), real personal income (RPI).
Labor market — nonfarm payrolls (PAYEMS), unemployment rate (UNRATE), hours, earnings, initial claims.
Housing — housing starts and permits at national and regional level (HOUST, PERMIT).
Consumption, orders, and inventories — retail sales, new orders for durable goods, wholesale / retail inventories, consumer sentiment (UMCSENTx).
Money and credit — monetary aggregates (M1SL, M2SL), bank reserves, consumer and real-estate loans, commercial paper outstanding.
Interest and exchange rates — fed funds rate (FEDFUNDS), T-bill and Treasury yields across the curve (GS1, GS5, GS10), credit spreads, exchange rates against major currencies.
Prices — CPI (CPIAUCSL), PCE price index, PPI, commodity prices (oil, metals).
Stock market — S&P 500 composite index (column name S&P 500), dividend yield, P/E ratio, aggregate market returns.

Exact membership of each group at any given point in time is in the appendix PDF — macroforecast does not encode it. For code that needs category-aware feature grouping, the feature_grouping axis in Layer 2 / 3 will eventually surface a fred_category value (reserved for v1.1).

Transformation codes (T-codes)

The T-code row of current.csv (the row directly below the column names) encodes the recommended stationarity transform for each series. From the 2016 paper, appendix Table 1:

T-code	Transform
1	No transformation
2	First difference $\Delta x_t$
3	Second difference $\Delta^2 x_t$
4	Natural logarithm $\log x_t$
5	First difference of logs $\Delta \log x_t$
6	Second difference of logs $\Delta^2 \log x_t$
7	First difference of percent change $\Delta (x_t / x_{t-1} - 1)$
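The seven transforms in the table can be sketched in plain Python as follows. This is a minimal illustration, not macroforecast's implementation: apply_tcode is a hypothetical name, inputs are assumed to contain no missing values, and entries lost to differencing are returned as None so the output keeps the input length.

```python
import math


def _diff(values):
    """First difference that keeps the input length; undefined entries are None."""
    if not values:
        return []
    out = [None]
    for prev, cur in zip(values, values[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


def apply_tcode(x, tcode):
    """Apply a FRED-MD T-code (1 to 7) to a sequence of floats."""
    x = list(x)
    if tcode == 1:
        return x
    if tcode == 2:
        return _diff(x)
    if tcode == 3:
        return _diff(_diff(x))
    if tcode in (4, 5, 6):
        logs = [math.log(v) for v in x]  # requires strictly positive values
        if tcode == 4:
            return logs
        return _diff(logs) if tcode == 5 else _diff(_diff(logs))
    if tcode == 7:
        # Percent change first, then its first difference.
        pct = [None] + [cur / prev - 1.0 for prev, cur in zip(x, x[1:])]
        return _diff(pct) if x else []
    raise ValueError(f"unknown T-code: {tcode!r}")
```

For a series growing at a constant 10% per period, codes 6 and 7 both return values close to zero after the first two entries, which is a quick sanity check on the implementation.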

In macroforecast these codes are part of the Layer 1 official-frame decision:

official_transform_policy: keep_official_raw_scale with official_transform_scope: none → ignore T-codes and keep raw levels.
official_transform_policy: apply_official_tcode with official_transform_scope: target_and_predictors → apply the CSV’s per-series transform before Layer 2 researcher preprocessing.
Legacy tcode_policy bridge fields are still accepted for old recipes, but new recipes should express official transforms through the Layer 1 axes.

Changes from the 2015–2016 working paper to current

The 2015 working paper documented 134 series. The current panel (circa 2024–2026) has evolved through monthly maintenance, and the exact current count fluctuates because the St. Louis Fed:

Drops series when the underlying FRED ID is discontinued (e.g., an index whose source survey is retired). Example pattern: some housing-permits breakdowns were trimmed when the source Census tables consolidated.
Adds series when FRED adds a directly comparable index or when a new sub-indicator becomes useful for factor estimation. Additions are rare and flagged in the appendix’s change log.
Re-codes T-codes when a series’ stationarity profile visibly changes (e.g., a regime shift in a price index warranting a log-diff instead of log-level). Such changes are also flagged in the appendix.
Renames source FRED IDs when the Fed updates its own taxonomy. The paper-era name remains in the first-row header for backward compatibility.

The authoritative change log is maintained by the St. Louis Fed in the appendix PDF (the “change history” section); macroforecast does not attempt to mirror it. Users who need bit-identical replication of a published study that cites FRED-MD should set information_set_type: real_time_vintage together with leaf_config.data_vintage: "YYYY-MM", where YYYY-MM is the vintage the study used.

Loader behaviour — things to know

Download is cached at ~/.cache/macroforecast/raw/ (override with cache_root on the loader). The cache key is (dataset, vintage, source_url).
No data redistribution — the package never bundles the CSV. Network access or a user-provided local_source path is required on first load.
Parsing: parse_fred_csv at macroforecast/raw/shared_csv.py separates the T-code header row from the observation rows and returns both (T-codes surface only if Layer 2 preprocessing consumes them).
Custom file conformance: FRED-MD’s column naming follows FRED series IDs (INDPRO, CPIAUCSL, …). Any user-side CSV or Parquet used with custom_source_policy: custom_panel_only on a monthly FRED-MD route must use a monthly date index and numeric columns. Matching FRED IDs is recommended for replacement panels; appended custom files may add study-specific column names. Duplicate names are renamed with a __custom suffix at runtime.
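The duplicate-name rule for appended custom files can be pictured with the sketch below. It assumes that the custom column, not the official one, receives the __custom suffix; the text above does not say which side is renamed, and append_custom_columns is a hypothetical helper rather than the package's runtime code.

```python
def append_custom_columns(official, custom, suffix="__custom"):
    """Return the combined column list, renaming custom names that collide.

    Assumption: a custom column whose name matches an official FRED-MD
    column is the one that gets the suffix.
    """
    official_names = set(official)
    renamed = [
        name + suffix if name in official_names else name
        for name in custom
    ]
    return list(official) + renamed


columns = append_custom_columns(
    ["INDPRO", "CPIAUCSL"],
    ["CPIAUCSL", "my_index"],  # "my_index" is a made-up study-specific name
)
```

Here columns keeps both official series and ends with CPIAUCSL__custom and my_index.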

Known limitations in macroforecast v1.0

No variable-level metadata surface — the package does not expose each FRED ID’s description / units / source URL. Users who want that enrichment should query FRED’s REST API directly.
No automated T-code validation — if St. Louis Fed changes a T-code in a new vintage, official_transform_policy: apply_official_tcode will use the new code silently. For strict reproducibility pin the vintage.
data_vintage is required when information_set_type=real_time_vintage; a bare fred_md dataset assumes information_set_type=revised (latest available revision).
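Because T-code changes are not validated automatically, a user-side check can compare the T-code rows of two vintages before trusting a new download. The sketch below assumes each vintage's T-codes have already been read into a dict keyed by series ID; diff_tcodes and the series names in the example are hypothetical.

```python
def diff_tcodes(old, new):
    """Compare two {series_id: tcode} mappings from different vintages.

    Returns (changed, dropped, added), where changed maps a series ID
    to its (old_code, new_code) pair.
    """
    shared = old.keys() & new.keys()
    changed = {k: (old[k], new[k]) for k in sorted(shared) if old[k] != new[k]}
    dropped = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    return changed, dropped, added


# Made-up series IDs and codes, purely to show the output shape.
changed, dropped, added = diff_tcodes(
    {"SERIES_A": 5, "SERIES_B": 2, "SERIES_C": 6},
    {"SERIES_A": 5, "SERIES_B": 5, "SERIES_D": 1},
)
```

A non-empty changed mapping signals that apply_official_tcode would transform those series differently under the newer vintage.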