Fields / Topics dataset.
Fields and topics from the OpenAlex topic hierarchy (topic / subfield / field / domain), with parent links.
download
The canonical artifact is a content-addressed parquet file committed in the research-atlas repo. Reading and citing are free; a one-time price applies only to re-publication in a paid work (see the cite section below).
data/processed/sample/field.parquet
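A minimal sketch of fetching the artifact and fingerprinting it, using the raw GitHub URL that appears as download_url in the citation block. The helper names fetch_bytes and content_address are hypothetical, and SHA-256 is an assumption: this page does not say which digest the content address actually uses.

```python
import hashlib
import urllib.request

# URL copied from the dataset's citation block (download_url).
DOWNLOAD_URL = (
    "https://raw.githubusercontent.com/bucket-foundation/research-atlas/"
    "main/data/processed/sample/field.parquet"
)


def fetch_bytes(url: str = DOWNLOAD_URL, timeout: float = 30.0) -> bytes:
    """Download the parquet file and return its raw bytes (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def content_address(payload: bytes) -> str:
    """Return the hex SHA-256 digest of the payload.

    Assumption: SHA-256 is chosen only for illustration; the real
    content-addressing scheme is not documented on this page.
    """
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    data = fetch_bytes()
    print(len(data), content_address(data))
```

The downloaded bytes can then be handed to any parquet reader, for example pandas.read_parquet(io.BytesIO(data)) when pandas and a parquet engine are installed.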
schema
- atlas_id: Stable surrogate key, derived deterministically from the most-stable identifier (ROR/ORCID/DOI/OpenAlex/Crossref Funder id, else source+source_id).
- name: Display name.
- openalex_id: OpenAlex topic/work id.
- level: topic / subfield / field / domain.
- parent_atlas_id: Parent field's atlas_id in the topic hierarchy.
- source: Short source key (nsf, openalex, nih, cordis, …).
- source_id: The record's id in that source's own namespace.
- source_url: Canonical, citeable URL to the record at the source (the per-fact attribution chain).
- as_of: ISO-8601 UTC timestamp the row was fetched / normalized.
provenance
Every row carries its own source, source_id, source_url, and as_of — a per-fact attribution chain back to the original record. The dataset itself was produced by this pipeline:
- published: research-atlas/0.1.0 · data/MANIFEST.json · 2026-06-18T16:55:10Z
- vendored: bucket-foundation/sync-research-atlas-manifest · github.com/bucket-foundation/research-atlas · 2026-06-18T16:59:02.546Z
- cataloged: bucket-foundation/research-datasets · data/processed: data/processed/sample/field.parquet · 2026-06-20T18:19:28.420Z
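The row-level columns can be used roughly as follows: a sketch that walks parent_atlas_id from a topic up to its domain and formats one row's attribution from source, source_id, source_url, and as_of. The rows, ids, and example.org URLs are made up for illustration, and make_row, lineage, and attribution are hypothetical helper names, not part of the research-atlas code.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def make_row(atlas_id: str, name: str, level: str, parent: Optional[str]) -> Row:
    """Build a made-up row with the nine schema columns (values are illustrative)."""
    return {
        "atlas_id": atlas_id,
        "name": name,
        "openalex_id": None,
        "level": level,
        "parent_atlas_id": parent,
        "source": "openalex",
        "source_id": atlas_id,
        "source_url": f"https://example.org/{atlas_id}",
        "as_of": "2026-01-01T00:00:00Z",
    }


# Hypothetical four-level chain: topic -> subfield -> field -> domain.
ROWS: List[Row] = [
    make_row("d1", "Example domain", "domain", None),
    make_row("f1", "Example field", "field", "d1"),
    make_row("s1", "Example subfield", "subfield", "f1"),
    make_row("t1", "Example topic", "topic", "s1"),
]


def lineage(rows: List[Row], atlas_id: str) -> List[Row]:
    """Return the row for atlas_id followed by its ancestors, nearest first."""
    by_id = {row["atlas_id"]: row for row in rows}
    chain: List[Row] = []
    seen = set()
    current = by_id.get(atlas_id)
    while current is not None:
        if current["atlas_id"] in seen:
            raise ValueError("cycle in parent_atlas_id links")
        seen.add(current["atlas_id"])
        chain.append(current)
        parent_id = current["parent_atlas_id"]
        current = by_id.get(parent_id) if parent_id else None
    return chain


def attribution(row: Row) -> str:
    """Format the per-row attribution fields as a single line."""
    return (
        f'{row["name"]} [{row["source"]}:{row["source_id"]}] '
        f'{row["source_url"]} (as of {row["as_of"]})'
    )


if __name__ == "__main__":
    for row in lineage(ROWS, "t1"):
        print(row["level"], "-", attribution(row))
```

The cycle check is a defensive choice: a well-formed hierarchy should never loop, but a malformed parent link would otherwise make the walk run forever.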
cite — born citeable
This dataset ships the same feed402/0.2 envelope the rest of bucket.foundation speaks. Reading and citing it costs nothing; the cite block is passive, forward-looking license metadata describing what a downstream publisher would owe to re-publish it in a paid work.
{
  "data": {
    "dataset": "field",
    "kind": "entity",
    "title": "Fields / Topics",
    "row_count": 19,
    "columns": [
      "atlas_id",
      "name",
      "openalex_id",
      "level",
      "parent_atlas_id",
      "source",
      "source_id",
      "source_url",
      "as_of"
    ]
  },
  "citation": {
    "type": "source",
    "source_id": "research-atlas:field@0.1.0",
    "provider": "bucket-foundation",
    "dataset": "research-atlas",
    "retrieved_at": "2026-06-20T18:19:28.420Z",
    "license": "CC-BY-4.0",
    "canonical_url": "https://www.bucket.foundation/research/datasets/field",
    "download_url": "https://raw.githubusercontent.com/bucket-foundation/research-atlas/main/data/processed/sample/field.parquet",
    "title": "research-atlas — Fields / Topics (field)",
    "as_of": "2026-06-18T16:51:15Z",
    "schema_version": "0.1.0",
    "row_count": 19,
    "sources": [
      "nsf"
    ]
  },
  "receipt": {
    "tier": "raw",
    "status": "open_dataset",
    "price_usd": 0,
    "paid_by": "bucket-foundation (open data, CC-BY-4.0; reader pays nothing)"
  },
  "cite": {
    "applies_to": "downstream_republication_in_a_paid_work",
    "reader_owes": 0,
    "price_usd": 0.05,
    "payout_wallet": "0xa91115B1AB8412f380Fd62446F523559F668b96B",
    "license": "bucket.foundation/cite-forever/v0.1"
  },
  "provenance": [
    {
      "action": "published",
      "at": "2026-06-18T16:55:10Z",
      "by": "research-atlas/0.1.0",
      "via": "data/MANIFEST.json"
    },
    {
      "action": "vendored",
      "at": "2026-06-18T16:59:02.546Z",
      "by": "bucket-foundation/sync-research-atlas-manifest",
      "via": "github.com/bucket-foundation/research-atlas"
    },
    {
      "action": "cataloged",
      "at": "2026-06-20T18:19:28.420Z",
      "by": "bucket-foundation/research-datasets",
      "via": "data/processed: data/processed/sample/field.parquet"
    }
  ],
  "canon_tier": "candidate"
}
queryable at /api/research/datasets?dataset=field
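A consumer might read the envelope above along these lines. This is a sketch under stated assumptions: ENVELOPE_JSON is a trimmed copy of the envelope with only the fields used here, query_path and summarize are hypothetical helper names, and the API host is left out because the page gives only the path.

```python
import json
from urllib.parse import urlencode

# Trimmed copy of the envelope shown above (only the fields used below).
ENVELOPE_JSON = """
{
  "data": {
    "dataset": "field",
    "row_count": 19,
    "columns": ["atlas_id", "name", "openalex_id", "level", "parent_atlas_id",
                "source", "source_id", "source_url", "as_of"]
  },
  "citation": {
    "source_id": "research-atlas:field@0.1.0",
    "license": "CC-BY-4.0",
    "row_count": 19
  },
  "receipt": {"status": "open_dataset", "price_usd": 0},
  "cite": {
    "applies_to": "downstream_republication_in_a_paid_work",
    "reader_owes": 0,
    "price_usd": 0.05
  }
}
"""


def query_path(dataset: str) -> str:
    """Build the query path for a dataset; the host is deliberately omitted."""
    return "/api/research/datasets?" + urlencode({"dataset": dataset})


def summarize(envelope: dict) -> dict:
    """Pull out what a reader pays versus what a paid re-publisher would owe."""
    data = envelope["data"]
    citation = envelope["citation"]
    if data["row_count"] != citation["row_count"]:
        raise ValueError("row_count differs between data and citation blocks")
    return {
        "dataset": data["dataset"],
        "column_count": len(data["columns"]),
        "license": citation["license"],
        # What a reader pays: the receipt price plus anything the cite block asks of readers.
        "reader_cost_usd": envelope["receipt"]["price_usd"] + envelope["cite"]["reader_owes"],
        # Forward-looking price for re-publication in a paid work.
        "republication_price_usd": envelope["cite"]["price_usd"],
    }


if __name__ == "__main__":
    print(query_path("field"))
    print(summarize(json.loads(ENVELOPE_JSON)))
```

Keeping the reader cost and the re-publication price as separate fields mirrors the envelope's own split between the receipt block and the cite block.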
doi — be cited forever (seam)
For permanent, scholarly-citeable identity, a published dataset gets a real DOI via Zenodo: the content-addressed parquet is deposited and the DOI is recorded alongside its feed402/0.2 cite-forever block. Reading and citing stay free; fees owed for re-publication in a paid work flow to the dataset's authors over feed402/x402. There is no blockchain, no Story Protocol, no IP-NFT, just a DOI and the open cite-forever envelope. No wallet is ever required to read, download, or cite.