Publication Details

Category Text Publication
Reference Category Journals
DOI 10.1111/ddi.70205
Licence Creative Commons licence
Title (Primary) A global, taxon-stratified, high-resolution sampling-effort dataset from GBIF for bias-aware ecological modelling
Author El-Gabbas, A.
Source Title Diversity and Distributions
Year 2026
Department BZF
Volume 32
Issue 5
Page From e70205
Language English
Topic T5 Future Landscapes
Data and Software links https://doi.org/10.5281/zenodo.17591681
Supplements Supplement 1
Supplement 2
Supplement 3
Keywords biodiversity informatics; data gaps; GBIF; sampling bias; sampling efforts; species distribution modelling
Abstract
Introduction and Aim
Spatiotemporal and taxonomic sampling bias in biodiversity occurrence data poses critical challenges for robust ecological inference, species distribution models (SDMs), and conservation planning. Despite the exponential growth in global biodiversity records over recent decades, these biases persist. This study converts raw occurrence records from the Global Biodiversity Information Facility (GBIF) into global, publicly available, taxon-stratified, and temporally resolved sampling-effort rasters using a reproducible workflow, providing transparent and standardised measures of observation count and species richness to support bias-aware ecological analyses.

Main variables included
Two complementary raster variables: observation count and species richness, each provided across major taxonomic groups and their descendant levels (e.g., classes, orders, families).
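
As a rough illustration of how these two variables differ, the sketch below bins a handful of occurrence records into grid cells and tallies both quantities per cell. It is a minimal sketch, not the published workflow: the record format, the function names (`grid_cell`, `summarise`), and the use of a regular 1-degree longitude/latitude grid are assumptions made for illustration, whereas the dataset itself is provided at approximately 1, 5, 10, and 20 km.

```python
import math
from collections import defaultdict


def grid_cell(lon, lat, cell_deg):
    """Map a coordinate to an integer (column, row) index on a regular lon/lat grid."""
    col = math.floor((lon + 180.0) / cell_deg)
    row = math.floor((90.0 - lat) / cell_deg)
    return col, row


def summarise(records, cell_deg=1.0):
    """Return per-cell observation counts and species richness.

    records: iterable of (species, lon, lat) tuples (hypothetical input format).
    """
    counts = defaultdict(int)    # number of records per cell
    species = defaultdict(set)   # distinct species names per cell
    for name, lon, lat in records:
        cell = grid_cell(lon, lat, cell_deg)
        counts[cell] += 1
        species[cell].add(name)
    richness = {cell: len(names) for cell, names in species.items()}
    return dict(counts), richness


# Toy records, invented for illustration only.
records = [
    ("Parus major", 12.3, 51.4),
    ("Parus major", 12.7, 51.1),
    ("Turdus merula", 12.5, 51.9),
    ("Turdus merula", 31.2, 30.1),
]
counts, richness = summarise(records)
```

In this toy case the first three records fall in the same cell, which therefore has an observation count of 3 but a species richness of 2. Repeated records of one species raise the count without raising the richness, which is why the two variables are complementary.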

Time Coverage
Annual and cumulative rasters span 1980–2025.

Spatial Coverage
Global; four spatial resolutions (~1, 5, 10, and 20 km).

Taxa
Nine major taxonomic groups: Amphibia, Arachnida, Aves (birds), Fungi, Insecta, Mammalia, Mollusca, Reptilia, and Tracheophyta (vascular plants), with descendant-level outputs.

Applications
Based on ~3 billion records for > 730,000 species, this study provides annual and cumulative global rasters quantifying observation count and species richness at four resolutions, stratified by nine taxonomic groups and their descendants. At 1 km resolution, 95% of records occupy merely 0.33% of Earth's surface (0.93% of land), whilst the remaining 5% of records extend across only a further 1.77% (3.88% of land), leaving approximately 98% of Earth's surface (95% of land) unsampled. This extreme concentration persists across all taxonomic groups, underscoring the need for taxon-specific bias correction. Annual data enable exploration of long-term trends in data mobilisation and sampling effort. The rasters support bias correction in presence-only SDMs, including MaxEnt bias files, target-group backgrounds, and model-based approaches. Beyond SDMs, they can inform macroecological synthesis, biodiversity monitoring, and systematic conservation planning by identifying spatial and temporal knowledge gaps. All data and code are openly available under FAIR principles, promoting transparent and reproducible biodiversity science.
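
The concentration figures reported above are of the kind that can be derived from a raster of observation counts. The following is a minimal sketch of that arithmetic, assuming equal-area cells and using invented toy counts; the function name `cells_holding_share` is hypothetical and this is not the author's code.

```python
def cells_holding_share(cell_counts, share=0.95):
    """Smallest number of cells whose counts sum to at least `share` of all records."""
    total = sum(cell_counts)
    if total == 0:
        return 0
    target = share * total
    running = 0
    # Visit the most heavily sampled cells first.
    for n_cells, count in enumerate(sorted(cell_counts, reverse=True), start=1):
        running += count
        if running >= target:
            return n_cells
    return len(cell_counts)


# Toy raster flattened to a list: 100 equal-area cells, 1000 records (invented values).
counts = [900, 50, 30, 10, 5, 5] + [0] * 94

n = cells_holding_share(counts)                       # cells holding 95% of records
core_fraction = n / len(counts)                       # share of area holding 95% of records
sampled_fraction = sum(1 for c in counts if c > 0) / len(counts)
unsampled_fraction = 1.0 - sampled_fraction
```

With these toy values, two of the 100 cells hold 95% of the records, six cells contain any record at all, and 94% of the area is unsampled. The same calculation applied to a real raster would need cell areas as weights if the grid is not equal-area.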

El-Gabbas, A. (2026):
A global, taxon-stratified, high-resolution sampling-effort dataset from GBIF for bias-aware ecological modelling
Divers. Distrib. 32 (5), e70205
10.1111/ddi.70205