
What each funder funds: specialization, complementarity, and the surprising temporal stability of funder field-portfolios in a reconciled NIH/NSF/EC grant→output graph

Bucket Foundation · research-atlas working group
Bucket Foundation preprint · 1.0 (preprint draft) · 2026-06-24 · CC-BY-4.0

Corpus: research-atlas v0.4.0 — 1,670,434 grants / 75 funders / 470,269 grant→work edges (concept DOI 10.5281/zenodo.20774322)


Abstract

Paper 01 in this series characterized who gets research grants (institutional concentration), who shares the resulting papers (co-funding), and how much output accompanies a dollar (the funding→output rate). It did not ask what is, structurally, a prior question: what does each funder actually fund, and how distinctively?

Here we answer that on the same reconciled graph by mapping every funder's linked output to the 26 OpenAlex top-level fields — purely on distinct-work counts, with no dollar column anywhere — and measuring three things. (1) A specialization gradient. Across the 23 funders with enough linked output to estimate a portfolio, the field-concentration of that portfolio (an HHI over the 26 fields) spans a clean 5× range, from the two pure generalists — the EC (HHI 0.092) and NSF (0.095) — to the hyper-specialist NHGRI (HHI 0.466), which puts 66% of its linked output in a single field. (2) Complementarity, recovered from data. Cosine similarity of funder field-share vectors recovers the agency map with no labels: the NIH Institutes form a tight cluster (internal mean cosine 0.819), NSF is the maximal outlier (mean cosine to the NIH cluster 0.385; most-distinct pair NIDCD↔NSF at 0.217), and the EC sits in between as a partial bridge (0.621). (3) Specialization is a stable fingerprint. Comparing each funder's portfolio HHI in 2016–2019 versus 2021–2024, the rank order is almost perfectly preserved (Spearman ρ = 0.973) and the mean absolute change in HHI is only 0.011.
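The first two measures above can be illustrated with a minimal sketch. It is not the paper's released code: the three field names, the toy work lists, and the helper names `field_shares`, `hhi`, and `cosine` are all hypothetical, and it assumes each distinct work carries exactly one primary field label.

```python
import math
from collections import Counter

# Hypothetical stand-ins for the 26 OpenAlex top-level fields.
FIELDS = ["Medicine", "Physics", "Chemistry"]

def field_shares(primary_fields, fields=FIELDS):
    """Share of a funder's distinct linked works in each field.

    Assumes every work appears once, labelled with a single primary field.
    """
    counts = Counter(primary_fields)
    total = sum(counts[f] for f in fields)
    return [counts[f] / total for f in fields]

def hhi(shares):
    """Herfindahl-Hirschman index: the sum of squared shares."""
    return sum(s * s for s in shares)

def cosine(u, v):
    """Cosine similarity between two field-share vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy portfolios with made-up counts, not values from the corpus.
generalist = field_shares(["Medicine", "Medicine", "Physics", "Chemistry"])
specialist = field_shares(["Medicine"] * 4)

print("HHI generalist:", hhi(generalist))
print("HHI specialist:", hhi(specialist))
print("cosine:", cosine(generalist, specialist))
```

With 26 fields the HHI is bounded below by 1/26 ≈ 0.038 (a perfectly uniform portfolio) and above by 1.0 (all output in one field), which gives a scale for reading the reported values between 0.092 and 0.466.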

The one aggregate compositional move we detect — Physical Sciences' share of funded output rising +3.83pp (95% CI [+2.87, +4.85]) from 2016 to 2024 — vanishes under mix control: within NSF alone the Physical-Sciences share fell 0.88pp, so the aggregate wobble is a composition/coverage-endpoint artifact, not a secular shift. We state the scope limit plainly throughout — field assignment requires output edges, which exist for NIH/NSF/EC only — and release all code and a Zenodo-ready metadata record.
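How an aggregate share can rise while a within-funder share falls is easiest to see in a toy case. The sketch below uses a standard shift-share decomposition, which is a swapped-in illustration and not necessarily the paper's mix-control procedure; the funder names and counts are invented, and it assumes the same funders appear in both periods.

```python
def aggregate_share(period):
    """Field share of all linked works pooled across funders.

    period maps funder -> (works in the field, all linked works).
    """
    in_field = sum(p for p, _ in period.values())
    total = sum(t for _, t in period.values())
    return in_field / total

def decompose(early, late):
    """Split the aggregate change into within-funder and mix terms.

    within: change in each funder's own field share, at its average weight.
    mix:    change in each funder's weight in the pool, at its average share.
    The two terms sum exactly to the change in the aggregate share.
    """
    tot0 = sum(t for _, t in early.values())
    tot1 = sum(t for _, t in late.values())
    within = mix = 0.0
    for funder in early:
        p0, t0 = early[funder]
        p1, t1 = late[funder]
        w0, w1 = t0 / tot0, t1 / tot1   # funder's weight in the pool
        s0, s1 = p0 / t0, p1 / t1       # funder's own field share
        within += (w0 + w1) / 2 * (s1 - s0)
        mix += (s0 + s1) / 2 * (w1 - w0)
    return within, mix

# Invented counts: funder_x is field-heavy and gains coverage late.
early = {"funder_x": (50, 100), "funder_y": (45, 900)}
late = {"funder_x": (144, 300), "funder_y": (35, 700)}

within, mix = decompose(early, late)
print("aggregate change:", aggregate_share(late) - aggregate_share(early))
print("within-funder term:", within)
print("mix term:", mix)
```

In this toy case funder_x's own share drops from 50% to 48%, yet the pooled share rises because funder_x makes up more of the late-period pool. That is the same qualitative pattern as the aggregate rise alongside a within-NSF fall described above.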

Key findings

Funders span a clean 5× specialization gradient: generalists EC (HHI 0.092) and NSF (0.095) at one end, the hyper-specialist NHGRI (0.466, 66% in one field) at the other.
Cosine similarity of field-share vectors recovers the agency map with no labels: a tight NIH cluster (internal cosine 0.819), NSF as the maximal outlier (0.385 to NIH), the EC as a partial bridge (0.621).
Specialization is a stable fingerprint, not a fad: early-vs-late HHI rank order is preserved at Spearman ρ = 0.973, mean |ΔHHI| just 0.011.
The apparent +3.83pp swing toward the physical sciences is a coverage-endpoint composition artifact — within NSF the share actually fell 0.88pp.
The analysis touches no dollar column, so the graph's known dollar-noise sources cannot reach any reported number; every constant is pinned by tests/test_funder_specialization.py.
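The stability finding rests on two statistics: a Spearman rank correlation between early and late HHI, and the mean absolute change in HHI. A minimal sketch of both follows, using only the standard library. The HHI values are invented for illustration, and the helper names `average_ranks`, `spearman`, and `mean_abs_change` are hypothetical rather than taken from the released code.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_abs_change(x, y):
    """Mean absolute difference between paired values."""
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)

# Invented per-funder HHI values for the two windows.
hhi_early = [0.09, 0.10, 0.30, 0.47]
hhi_late = [0.10, 0.09, 0.31, 0.46]

print("Spearman rho:", spearman(hhi_early, hhi_late))
print("mean |dHHI|:", mean_abs_change(hhi_early, hhi_late))
```

Because Spearman ρ depends only on rank order, it can stay high even if every funder's HHI drifts, which is why the mean absolute change is reported alongside it.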

Figures

Figure 1. Funder specialization gradient: portfolio HHI over the 26 OpenAlex fields, generalist (top) to specialist (bottom). Navy = US (NSF + NIH ICs), red = EC/supranational. Each bar is labelled with that funder's dominant field and its share. Distinct linked works, 2016–2024.

Figure 2. Funder portfolio similarity: cosine between 26-field share vectors. The bright NIH×NIH block (internal mean cosine 0.819) and the dark NSF column (mean 0.385 to NIH) are the complementarity structure.

Figure 3. Specialization stability: each funder's portfolio HHI in 2016–2019 (x) vs 2021–2024 (y). Points hug the y = x line (Spearman ρ = 0.973; mean |ΔHHI| = 0.011): funders do not re-specialize.

Figure 4. Aggregate vs within-NSF domain composition. Left: aggregate domain composition of funded output by year; Physical Sciences (navy) appears to rise at the 2024 endpoint. Right: within NSF alone (mix-controlled) the Physical-Sciences share is flat-to-falling, showing the aggregate move is a composition/coverage artifact.

Cite this paper

DOI: 10.5281/zenodo.20836205

@misc{bucket2026funderspecialization,
  title        = {What each funder funds: specialization, complementarity, and
                  the surprising temporal stability of funder field-portfolios
                  in a reconciled NIH/NSF/EC grant-to-output graph},
  author       = {{Bucket Foundation research-atlas working group}},
  year         = {2026},
  howpublished = {Bucket Foundation preprint},
  doi          = {10.5281/zenodo.20836205},
  url          = {https://doi.org/10.5281/zenodo.20836205},
  note         = {research-atlas v0.4.0}
}
