Quick Start

Basic usage

Every PBDB API endpoint corresponds to a Julia function. Calling the function with keyword arguments matching the API parameters returns a DataFrame.

using PaleobiologyDB

# Fossil occurrences — all Canidae in the Miocene
canids = pbdb_occurrences(base_name = "Canidae", interval = "Miocene", show = "full")

# Taxonomic information
canis_info = pbdb_taxon(name = "Canis", extids = true, show = ["attr", "app", "size"])

# A specific collection (string or numeric ID)
collection = pbdb_collection("col:1003", show = ["loc", "stratext"], extids = true)
collection = pbdb_collection(1003, show = ["loc", "stratext"])

The interface and REPL help

Every function has a docstring that can be read from the Julia REPL. Press ? at the julia> prompt to enter help mode, then type the function name:

help?> pbdb_occurrences
search: pbdb_occurrences pbdb_occurrence pbdb_ref_occurrences ...

  pbdb_occurrences(; kwargs...)

  Get information about fossil occurrence records stored in the Paleobiology Database.

  Arguments
  ≡≡≡≡≡≡≡≡≡

    •  kwargs...: Filtering and output parameters. Common options include:
       • limit: Maximum number of records to return (Int or "all").
       • taxon_name: Return only records with the specified taxonomic name(s).
       • base_name: Return records for the specified name(s) and all descendant taxa.
       • lngmin, lngmax, latmin, latmax: Geographic bounding box.
       • min_ma, max_ma: Minimum and maximum age in millions of years.
       • interval: Named geologic interval (e.g. "Miocene").
       • cc: Country/continent codes (ISO two-letter or three-letter).
       • show: Extra information blocks ("coords", "classext", "ident", etc.).
       • extids: Set extids = true to show the newer string identifiers.
       • vocab: Vocabulary for field names ("pbdb" for full names, "com" for short codes).

  Examples
  ≡≡≡≡≡≡≡≡

  # `taxon_name` retrieves *only* units of this exact rank
  occs = pbdb_occurrences(taxon_name = "Canis", limit = 100)

  # `base_name` retrieves units of this and all nested ranks
  occs = pbdb_occurrences(base_name = "Canis", show = ["coords", "classext"], limit = 100)

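Note the distinction: taxon_name matches only occurrences assigned to the named taxon itself (here, the genus Canis), while base_name also includes occurrences of all its descendant taxa (for example, the species within Canis).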

See the Interactive Help page for the ApiHelp submodule, which lets you browse parameters, fields, and endpoints interactively.

Fossil occurrences

# Simple query
occs = pbdb_occurrences(base_name = "Mammalia", limit = 10)

# Single occurrence
single_occ = pbdb_occurrence("occ:1001", show = "full")
single_occ = pbdb_occurrence(1001, show = "full")

# Geographic and temporal filtering
pliocene_mammals = pbdb_occurrences(
    base_name = "Mammalia",
    interval = "Pliocene",
    lngmin = -130.0, lngmax = -60.0,
    latmin = 25.0, latmax = 70.0,
    show = ["coords", "classext", "stratext"],
)

# taxon_name matches exact rank; base_name includes all subtaxa
occs_exact = pbdb_occurrences(taxon_name = "Canis", limit = 100)
occs_subtaxa = pbdb_occurrences(base_name = "Canis", show = ["coords", "classext"], limit = 100)

Taxonomic data

mammalia = pbdb_taxon(name = "Mammalia", show = ["attr", "size"])

carnivores = pbdb_taxa(name = "Carnivora", rel = "children", show = ["attr", "app"])

suggestions = pbdb_taxa_auto(name = "Cani", limit = 10)

Collections and geography

european_collections = pbdb_collections(
    lngmin = -10.0, lngmax = 40.0,
    latmin = 35.0, latmax = 65.0,
    interval = "Cenozoic",
)

clusters = pbdb_collections_geo(
    level = 2,
    lngmin = 0.0, lngmax = 15.0,
    latmin = 45.0, latmax = 55.0,
)
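The four bounding-box keywords recur in several queries, so it can help to build them once and splat them into each call. The following is a minimal sketch in plain Julia; `bbox` is a hypothetical helper, not part of PaleobiologyDB, and it assumes the query functions accept the keywords exactly as shown in the examples above.

```julia
# Hypothetical helper (not provided by PaleobiologyDB): validate a bounding box
# and return it as a NamedTuple keyed by the API parameter names.
function bbox(; west::Real, east::Real, south::Real, north::Real)
    (-180 <= west <= 180 && -180 <= east <= 180) ||
        throw(ArgumentError("longitudes must lie in [-180, 180]"))
    (-90 <= south <= north <= 90) ||
        throw(ArgumentError("need -90 <= south <= north <= 90"))
    # west > east is deliberately not rejected here.
    return (lngmin = west, lngmax = east, latmin = south, latmax = north)
end

europe = bbox(west = -10.0, east = 40.0, south = 35.0, north = 65.0)

# With the package loaded (requires network access), the NamedTuple can be
# splatted into any query that takes these keywords:
# european_collections = pbdb_collections(; europe..., interval = "Cenozoic")
```

Keeping the box in one value means the same region can be reused for collections, occurrences, and strata without retyping the coordinates.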

Specimens and measurements

whale_specimens = pbdb_specimens(base_name = "Cetacea", interval = "Miocene")

measurements = pbdb_measurements(
    spec_id = ["spm:1505", "spm:30050"],
    show = ["spec", "methods"],
)

Advanced query options

Field name vocabulary

Full descriptive column names are returned by default. Use vocab = "com" for compact 3-letter codes:

df_full  = pbdb_occurrences(base_name = "Canis", limit = 5)
df_short = pbdb_occurrences(base_name = "Canis", limit = 5, vocab = "com")

Additional information blocks

Use show to request extra data blocks alongside the default fields:

detailed_occs = pbdb_occurrences(
    base_name = "Dinosauria",
    interval = "Cretaceous",
    show = ["coords", "classext", "stratext", "ident", "loc"],
)

show = "full" returns all available blocks at once.

Time and stratigraphy

Filter by geological age in millions of years or by named interval:

old_mammals  = pbdb_occurrences(base_name = "Mammalia", min_ma = 50.0, max_ma = 65.0)
miocene_data = pbdb_occurrences(interval = "Miocene", cc = "NOA")
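Because min_ma is the younger bound and max_ma the older one, the two are easy to swap by accident. A small sketch of a guard in plain Julia follows; `age_window` is a hypothetical helper, not part of the package.

```julia
# Hypothetical helper (not provided by PaleobiologyDB): accept two ages in
# either order and return them under the API parameter names.
function age_window(a::Real, b::Real)
    (a >= 0 && b >= 0) || throw(ArgumentError("ages must be non-negative"))
    # min_ma is the smaller number (younger), max_ma the larger (older).
    return (min_ma = min(a, b), max_ma = max(a, b))
end

paleocene_ish = age_window(65.0, 50.0)

# With the package loaded (requires network access):
# old_mammals = pbdb_occurrences(; base_name = "Mammalia", paleocene_ish...)
```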

Query stratigraphic formations directly:

formations = pbdb_strata(
    rank   = "formation",
    lngmin = -120, lngmax = -100,
    latmin = 30,   latmax = 50,
)

References and bibliography

# References for a taxon group
refs = pbdb_ref_taxa(name = "Canidae", show = ["both", "comments"])

# References cited in occurrence records
occ_refs = pbdb_ref_occurrences(base_name = "Canis", ref_pubyr = 2000)

# A specific reference record
ref_detail = pbdb_reference(1003, show = "both")

Common parameters

Parameter                        Description
base_name                        Taxon name including all subtaxa and synonyms
taxon_name                       Taxon name including synonyms, exact rank only
interval                         Named geologic interval (e.g. "Miocene")
min_ma, max_ma                   Age range in millions of years
lngmin, lngmax, latmin, latmax   Geographic bounding box
cc                               Country/continent codes (e.g. "US,CA", "NOA")
show                             Extra data blocks ("coords", "classext", "full", …)
extids                           Use string identifiers ("occ:1001") instead of integers
vocab                            Field vocabulary: "pbdb" (full names) or "com" (short codes)
limit                            Maximum records to return (integer or "all")

Counting records without downloading

pbdb_count(:occurrences; base_name = "Canidae")
pbdb_count(:collections; interval = "Miocene", cc = "ASI")
pbdb_count(:taxa; base_name = "Mammalia")

# Dict splatting works too
params = Dict(:base_name => "Cetacea", :interval => "Miocene")
pbdb_count(:specimens; params...)

Valid symbols: :occurrences, :collections, :taxa, :references, :specimens, :opinions.
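One use for a count is to check the size of a result before downloading it. The sketch below shows that pattern in plain Julia. `fetch_if_small` is a hypothetical helper, and it assumes the counting function returns an integer; this page does not state the return type of pbdb_count, so check it before relying on this.

```julia
# Hypothetical helper (not provided by PaleobiologyDB): run a count query first
# and only run the download if the result is small enough.
function fetch_if_small(count_query, fetch_query; max_records::Integer = 10_000, params...)
    n = count_query(; params...)   # assumed to return an integer
    n > max_records &&
        error("query matches $n records, above the limit of $max_records")
    return fetch_query(; params...)
end

# With the package loaded (requires network access), the two queries could be
# wired up like this:
# count_occs = (; kw...) -> pbdb_count(:occurrences; kw...)
# canids = fetch_if_small(count_occs, pbdb_occurrences; base_name = "Canidae")
```

Passing the two queries as arguments keeps the helper independent of any one endpoint, so the same guard works for collections or specimens.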

Error handling

try
    data = pbdb_occurrences(base_name = "InvalidTaxon", limit = 10)
catch e
    @warn "PBDB request failed" exception = e
end
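For transient failures such as a dropped connection, the same try/catch can be wrapped in a retry loop. This is a minimal sketch in plain Julia; `with_retries` is a hypothetical helper, and it assumes a failed request surfaces as a thrown exception, as the example above suggests.

```julia
# Hypothetical helper (not provided by PaleobiologyDB): call `f` up to
# `attempts` times, sleeping `pause` seconds after each failed try.
function with_retries(f; attempts::Integer = 3, pause::Real = 1.0)
    attempts >= 1 || throw(ArgumentError("attempts must be at least 1"))
    for attempt in 1:attempts
        try
            return f()
        catch err
            # Out of attempts: propagate the last error to the caller.
            attempt == attempts && rethrow()
            @warn "PBDB request failed, retrying" attempt exception = err
            sleep(pause)
        end
    end
end

# With the package loaded (requires network access):
# occs = with_retries(() -> pbdb_occurrences(base_name = "Canidae", limit = 10))
```

Retrying only helps with temporary problems. A request that is invalid in itself, such as a misspelled taxon name, will fail on every attempt and should be corrected instead.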