Publication Details

Category Text Publication
Reference Category Journals
DOI 10.1111/jbi.12953
Title (Primary) Structural bias in aggregated species-level variables driven by repeated species co-occurrences: a pervasive problem in community and assemblage data
Author Hawkins, B.A.; Leroy, B.; Rodríguez, M.Á.; Singer, A.; Vilela, B.; Villalobos, F.; Wang, X.; Zelený, D.
Source Title Journal of Biogeography
Year 2017
Department OESA; iDiv
Volume 44
Issue 6
Page From 1199
Page To 1211
Language English
Keywords community structure; community weighted means; geographical ecology; intrinsic variables; spatial analysis; species composition; species co-occurrence; species richness gradients; trait analysis
UFZ wide themes RU5


Aim

Species attributes are often used to explain diversity patterns across assemblages/communities. However, repeated species co-occurrences can generate spatial pattern and strong statistical relationships between aggregated attributes and richness in the absence of biological information. Our aim is to increase awareness of this problem.


Location

North America.


Methods

We generated empirical species richness patterns using two data structures: (1) birds gridded from range maps and (2) tree communities from the US Forest Service's Forest Inventory and Analysis. We analysed richness using linear regression, regression trees, generalized additive models, geographically weighted regression and simultaneous autoregression, with ‘random intrinsic variables’ as predictors. These predictors were generated by assigning random numbers to species and calculating their averages within assemblages. We then generated simulations in which species with cohesive or patchy distributions were placed with respect to the North American temperature gradient, with or without a broad-scale richness gradient. Random intrinsic variables were again used as predictors of richness. Finally, we analysed one simulated scenario with random intrinsic variables as both response and predictor variables.
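The construction of a random intrinsic variable described above can be illustrated with a minimal sketch. It is not the authors' code: the one-dimensional toy gradient, the range widths, the function names (`random_intrinsic_variable`, `r_squared`) and the use of a simple squared correlation in place of the regression methods listed above are all assumptions made for illustration.

```python
import numpy as np


def random_intrinsic_variable(presence, rng):
    """Average a random per-species number over the species present at each site.

    presence: (n_sites, n_species) array of 0/1; every site must hold >= 1 species.
    """
    species_values = rng.random(presence.shape[1])  # one random number per species
    richness = presence.sum(axis=1)
    return (presence @ species_values) / richness


def r_squared(predictor, response):
    """R^2 of a simple linear regression of response on predictor."""
    return float(np.corrcoef(predictor, response)[0, 1] ** 2)


rng = np.random.default_rng(42)

# Hypothetical toy assemblages: sites along a 1-D gradient, each species
# occupying one contiguous range, so neighbouring sites share many species.
n_sites, n_species = 300, 60
position = np.linspace(0.0, 1.0, n_sites)
centre = rng.random(n_species)
half_width = rng.uniform(0.05, 0.5, n_species)
presence = (np.abs(position[:, None] - centre[None, :]) <= half_width[None, :]).astype(float)
presence = presence[presence.sum(axis=1) > 0]  # drop empty sites, if any

richness = presence.sum(axis=1)
riv = random_intrinsic_variable(presence, rng)

# Repeat with fresh random species values to see the spread of R^2.
r2_values = [
    r_squared(random_intrinsic_variable(presence, rng), richness)
    for _ in range(100)
]
print(f"median R^2 over 100 random intrinsic variables: {np.median(r2_values):.2f}")
```

Because the species values carry no biological information, any R² clearly above zero in such a run comes only from the shared species composition among sites.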


Results

The models of bird and tree richness often explained moderate to large proportions of the variance. Regression trees, geographically weighted regression and simultaneous autoregression were very sensitive to the problem; generalized additive models were moderately affected, and multiple regression was affected to a lesser extent. In the virtual data, the variance explained increased with increasing species co-occurrences, but range cohesion, a richness gradient and spatial autocorrelation in predictors did not have major impacts on the variance explained. The problem persisted when the response variable was also a random intrinsic variable.

Main conclusions

Repeated species co-occurrences can generate strong spurious relationships between richness and aggregated species attributes. Models that use assemblage variables aggregated from species-level values, as well as maps illustrating their spatial patterns, therefore cannot be taken at face value.

Persistent UFZ Identifier
Hawkins, B.A., Leroy, B., Rodríguez, M.Á., Singer, A., Vilela, B., Villalobos, F., Wang, X., Zelený, D. (2017):
Structural bias in aggregated species-level variables driven by repeated species co-occurrences: a pervasive problem in community and assemblage data
J. Biogeogr. 44 (6), 1199-1211. DOI 10.1111/jbi.12953