What each funder funds: specialization, complementarity, and the surprising temporal stability of funder field-portfolios in a reconciled NIH/NSF/EC grant→output graph
Corpus: research-atlas v0.4.0 — 1,670,434 grants / 75 funders / 470,269 grant→work edges (concept DOI 10.5281/zenodo.20774322)
Abstract
Paper 01 in this series characterized who gets research grants (institutional concentration), who shares the resulting papers (co-funding), and how much output accompanies a dollar (the funding→output rate). It did not ask what is, structurally, a prior question: what does each funder actually fund, and how distinctively?
Here we answer that on the same reconciled graph by mapping every funder's linked output to the 26 OpenAlex top-level fields — purely on distinct-work counts, with no dollar column anywhere — and measuring three things. (1) A specialization gradient. Across the 23 funders with enough linked output to estimate a portfolio, the field-concentration of that portfolio (an HHI over the 26 fields) spans a clean 5× range, from the two pure generalists — the EC (HHI 0.092) and NSF (0.095) — to the hyper-specialist NHGRI (HHI 0.466), which puts 66% of its linked output in a single field. (2) Complementarity, recovered from data. Cosine similarity of funder field-share vectors recovers the agency map with no labels: the NIH Institutes form a tight cluster (internal mean cosine 0.819), NSF is the maximal outlier (mean cosine to the NIH cluster 0.385; most-distinct pair NIDCD↔NSF at 0.217), and the EC sits in between as a partial bridge (0.621). (3) Specialization is a stable fingerprint. Comparing each funder's portfolio HHI in 2016–2019 versus 2021–2024, the rank order is almost perfectly preserved (Spearman ρ = 0.973) and the mean absolute change in HHI is only 0.011.
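As a minimal sketch of the two portfolio measures named above, assuming each funder is represented as a mapping from field name to distinct-work count: the HHI is the sum of squared field shares (with 26 fields it ranges from 1/26 ≈ 0.038 for a perfectly even portfolio to 1.0 for a single-field one), and complementarity is the cosine between two share vectors. The function names and the toy counts are hypothetical and are not taken from the released code or the corpus.

```python
from math import sqrt


def field_shares(counts):
    """Convert distinct-work counts per field into shares that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("funder has no linked output")
    return {field: n / total for field, n in counts.items()}


def hhi(shares):
    """Herfindahl-Hirschman index: the sum of squared field shares."""
    return sum(s * s for s in shares.values())


def cosine(a, b):
    """Cosine similarity of two field-share vectors keyed by field name."""
    fields = set(a) | set(b)
    dot = sum(a.get(f, 0.0) * b.get(f, 0.0) for f in fields)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


# Toy counts, invented for illustration only (not corpus values).
generalist = field_shares(
    {"Medicine": 25, "Physics": 25, "Chemistry": 25, "Engineering": 25}
)
specialist = field_shares({"Medicine": 10, "Biochemistry": 90})

print(f"generalist HHI = {hhi(generalist):.3f}")
print(f"specialist HHI = {hhi(specialist):.3f}")
print(f"cosine         = {cosine(generalist, specialist):.3f}")
```

Because both measures use shares rather than raw counts, a funder with many linked works and one with few are compared on the shape of their portfolio, not its size.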
The one aggregate compositional move we detect — Physical Sciences' share of funded output rising +3.83pp (95% CI [+2.87, +4.85]) from 2016 to 2024 — vanishes under mix control: within NSF alone the Physical-Sciences share fell 0.88pp, so the aggregate wobble is a composition/coverage-endpoint artifact, not a secular shift. We state the scope limit plainly throughout — field assignment requires output edges, which exist for NIH/NSF/EC only — and release all code and a Zenodo-ready metadata record.
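The logic of the mix-control check can be illustrated with a toy example: a pooled share can rise while the share inside every funder falls, purely because a field-heavy funder contributes a larger fraction of the linked output in the later period. The sketch below assumes one simple form of mix control (holding each funder's weight at its early-period value); this is an illustrative choice, not necessarily the procedure used in the paper, and the funder labels and counts are invented.

```python
def aggregate_share(rows):
    """Pooled share; rows maps funder -> (field_works, total_works)."""
    field_works = sum(p for p, _ in rows.values())
    total_works = sum(t for _, t in rows.values())
    return field_works / total_works


def within_shares(rows):
    """Share of the field inside each funder's own output."""
    return {funder: p / t for funder, (p, t) in rows.items()}


def mix_controlled_change(early, late):
    """Change in share with each funder's weight fixed at its early value.

    Assumes the same funders appear in both periods.
    """
    early_total = sum(t for _, t in early.values())
    weights = {f: t / early_total for f, (_, t) in early.items()}
    before = within_shares(early)
    after = within_shares(late)
    return sum(weights[f] * (after[f] - before[f]) for f in weights)


# Invented counts: funder X is field-heavy and grows; funder Y shrinks.
early = {"X": (60, 100), "Y": (90, 900)}
late = {"X": (174, 300), "Y": (63, 700)}

raw_change = aggregate_share(late) - aggregate_share(early)
controlled = mix_controlled_change(early, late)

print(f"raw aggregate change:  {100 * raw_change:+.2f}pp")
print(f"mix-controlled change: {100 * controlled:+.2f}pp")
```

In this toy case the raw pooled share rises even though the share falls within both funders, which is the same sign pattern as the aggregate rise and within-NSF fall reported above.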
Key findings
- Funders span a clean 5× specialization gradient: generalists EC (HHI 0.092) and NSF (0.095) at one end, the hyper-specialist NHGRI (0.466, 66% in one field) at the other.
- Cosine similarity of field-share vectors recovers the agency map with no labels: a tight NIH cluster (internal cosine 0.819), NSF as the maximal outlier (0.385 to NIH), the EC as a partial bridge (0.621).
- Specialization is a stable fingerprint, not a fad: early-vs-late HHI rank order is preserved at Spearman ρ = 0.973, mean |ΔHHI| just 0.011.
- The apparent +3.83pp swing toward the physical sciences is a coverage-endpoint composition artifact — within NSF the share actually fell 0.88pp.
- The analysis touches no dollar column, so the graph's known dollar-noise sources cannot reach any reported number; every constant is pinned by tests/test_funder_specialization.py.
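The stability finding rests on two summary statistics over paired early and late HHI values: Spearman's ρ (the Pearson correlation of the ranks) and the mean absolute change. The following is a standard-library sketch of both, assuming the two lists are aligned by funder. The helper names and the toy HHI values are hypothetical and do not come from the released code or the corpus.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_abs_change(x, y):
    """Mean absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Invented HHI values for four funders, early period vs late period.
hhi_early = [0.10, 0.20, 0.30, 0.45]
hhi_late = [0.11, 0.19, 0.33, 0.44]

print(f"Spearman rho = {spearman(hhi_early, hhi_late):.3f}")
print(f"mean |dHHI|  = {mean_abs_change(hhi_early, hhi_late):.3f}")
```

The two statistics answer different questions: ρ is sensitive only to reordering of funders, while the mean absolute change captures how far the HHI values themselves moved, so reporting both guards against a preserved order hiding a uniform drift.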
Cite this paper
@misc{bucket2026funderspecialization,
  title        = {What each funder funds: specialization, complementarity, and
                  the surprising temporal stability of funder field-portfolios
                  in a reconciled NIH/NSF/EC grant-to-output graph},
  author       = {{Bucket Foundation research-atlas working group}},
  year         = {2026},
  howpublished = {Bucket Foundation preprint},
  doi          = {10.5281/zenodo.20836205},
  url          = {https://doi.org/10.5281/zenodo.20836205},
  note         = {research-atlas v0.4.0}
}