Abstract

Aims

It is important to explore the underlying mechanisms that cause triphasic species–area relationship (triphasic SAR) across different scales in order to understand the spatial patterns of biodiversity.

Methods

Instead of theory establishment or field data derivation, I adopted a data simulation method that used the power function of SAR to fit log-normal distribution of species abundance.

Important Findings

The results showed that one-step sampling caused biphasic SAR and that n-step sampling could cause 2n-phasic SAR. Practical two-step sampling produced triphasic SAR due to the Preston and Pan effects in large areas. Furthermore, before exploring biological or ecological mechanisms for natural phenomena, we should identify or exclude potential mathematical, statistical or sampling reasons.

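Abstract (Chinese version, translated)

Two-step sampling can produce triphasic species–area relationship

Elucidating the underlying mechanisms that cause the triphasic species–area relationship (triphasic SAR) across different scales is very important for understanding the spatial patterns of biodiversity. Rather than theory construction or field data derivation, this study used the power function of SAR to fit log-normally distributed species abundance. The results showed that one-step sampling can cause biphasic SAR and that n-step sampling can cause 2n-phasic SAR. Because of the Preston and Pan effects in large areas, practical two-step sampling can produce triphasic SAR. In addition, before exploring biological or ecological mechanisms for natural phenomena, we should identify or exclude potential mathematical, statistical or sampling reasons.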

INTRODUCTION

The shape of species–area relationship (SAR) reflects the spatial pattern of biodiversity distribution. Various equations have been developed to describe species–area curves, especially the widely used power function of SAR that assumes linear correlation between log(area) and log(number of species). However, this assumption is not always true for the SAR at large scales. Instead, triphasic species–area relationship (triphasic SAR) is predicted and found in the log–log plot, with a steeper slope in species richness at both small and large spatial scales (Brian and Cathy 2003; Hubbell 2001; Preston 1960).

This triphasic relationship between the number of species and the area varies across spatial scales (Fridley et al. 2005; O’Dwyer and Green 2010; Palmer 2007; Sclafani and Holland 2013). Many explanations, based either on theory or on field data, have been proposed for the mechanism, including sampling, aggregation and range size (Allen and White 2003; Rosindell and Cornell 2007; Storch 2016). The species–area curves without asymptotes are regarded as type I (Scheiner 2003; Zillio and He 2010). For instance, two thresholds are proposed to describe the two-stage increasing segments (Lomolino 2000). The finite-area effect reveals self-similar distributions (Šizling and Storch 2004). A growing body of research indicates that sampling effect or sampling design could explain the shape of SAR (Bueno et al. 2020; Gooriah and Chase 2020).

It is important to explore the underlying mechanism that causes triphasic SAR across different scales in order to understand diversity patterns. As stated in our previous papers, we should exclude potential mathematical or statistical reasons before we propose biological or ecological hypotheses, which are usually intertwined with the sampling design and implementation (Pan 2013, 2015, 2016; Pan et al. 2016; Pan and Zhu 2015). When using simulated data from both log-normal and negative-binomial distributions, we found that the overall shape of the relationship between log(area) and log(number of species) for these two distributions was not linear but convex. Thus, I believe the triphasic shape of SAR needs to be explained mathematically or statistically first, for example by using data simulation methods to find possible reasons in sampling tactics.

In this paper, the impact of sampling tactics with different numbers of steps on the curve shape of SAR was analyzed. I simulated five areas with different species compositions but the same number of species and the same log-normal distribution of abundance. The random and independent distribution model was used to plot the relationship between log(area) and log(number of species) under one-step sampling and two-step sampling. Then the power formulas of SAR were employed for the fitting. Finally, mathematical induction was applied to extend the curve shape of SAR from one-step and two-step sampling to n-step sampling.

MATERIALS AND METHODS

Data simulation

A simulation program in the R platform was used to generate sampling data (R Core Team, 2013). Each one of five areas (A1, A2, A3, A4 and A5) was set with 1 000 000 points, and an individual of each species occupied one point. The species composition in A1 was {s1, s2, …, s100}, the species composition in A2 {s26, s27, …, s125}, the species composition in A3 {s51, s52, …, s150}, the species composition in A4 {s76, s77, …, s175} and the species composition in A5 {s101, s102, …, s200}. The occurrence of these species in A1–A5 was simulated following log-normal distributions (selected from dozens of SADs, McGill et al. 2007; Pan 2016) (Fig. 1a).
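The setup described above can be sketched in Python as follows. This is a minimal illustration and not the author's R program: the log-normal parameters (`mu`, `sigma`), the rounding scheme, the seed and all function names are assumptions, because the text does not report them.

```python
import numpy as np

POINTS_PER_AREA = 1_000_000   # points (individuals) in each area, from the text
SPECIES_PER_AREA = 100        # species in each area, from the text

def area_species(k):
    """Species indices of area A_k (k = 1..5): A1 = s1..s100, A2 = s26..s125, ..."""
    start = 25 * (k - 1) + 1
    return list(range(start, start + SPECIES_PER_AREA))

def lognormal_abundances(rng, mu=0.0, sigma=1.5):
    """Integer abundances summing to POINTS_PER_AREA (mu and sigma are assumed)."""
    raw = rng.lognormal(mean=mu, sigma=sigma, size=SPECIES_PER_AREA)
    counts = np.floor(raw / raw.sum() * POINTS_PER_AREA).astype(int)
    counts = np.maximum(counts, 1)   # every species keeps at least one individual
    # Let the most abundant species absorb the rounding drift.
    counts[np.argmax(counts)] += POINTS_PER_AREA - counts.sum()
    return counts

rng = np.random.default_rng(2013)    # arbitrary seed, for reproducibility only
areas = {k: dict(zip(area_species(k), lognormal_abundances(rng)))
         for k in range(1, 6)}
```

Each entry of `areas` maps a species index to its abundance in that area, so consecutive areas share 75 species by construction.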

Figure 1: Simulated species abundance used for the SAR fitting in (a) each area and (b) two areas.

Sampling procedure

There are two sampling procedures in this study. One is two-step sampling, meaning sampling A1 first and then sampling A2, A3, A4 and A5, respectively. The other is one-step sampling, meaning sampling randomly the species composition in A1∪A2 (the union of areas A1 and A2, Pan 2015), A1∪A3, A1∪A4 and A1∪A5, respectively. The total number of species in A1∪A2, A1∪A3, A1∪A4 and A1∪A5 was 125, 150, 175 and 200, respectively (Fig. 1b). In this study, the number of species accumulated in each step was sampled without replacement. Thus, the main difference between the two-step sampling and one-step sampling was the sampling total area in each step.

Similarity analysis

Sorensen’s similarity measure (Sørensen 1948) is used here.

$$S_{ij} = \frac{2a}{b + c} \tag{1}$$

where a is the number of shared species in both Ai and Aj, and b and c are the number of species in the Ai and Aj, respectively. In this study, the Sorensen’s similarity between A1 and A2, A1 and A3, A1 and A4 and A1 and A5 was 3/4, 1/2, 1/4 and 0, respectively.
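As a quick check of these values, the sketch below computes Sørensen's similarity in its standard form, 2a/(b + c), for the species sets defined in the Data simulation section. The function names are illustrative assumptions.

```python
def area_set(k):
    """Species indices of area A_k: A1 = {1..100}, A2 = {26..125}, ..., A5 = {101..200}."""
    start = 25 * (k - 1) + 1
    return set(range(start, start + 100))

def sorensen(species_i, species_j):
    """Sorensen similarity 2a / (b + c) between two species sets."""
    a = len(species_i & species_j)   # shared species
    b = len(species_i)               # species in A_i
    c = len(species_j)               # species in A_j
    return 2 * a / (b + c)

similarities = [sorensen(area_set(1), area_set(k)) for k in (2, 3, 4, 5)]
union_sizes = [len(area_set(1) | area_set(k)) for k in (2, 3, 4, 5)]
print(similarities)  # [0.75, 0.5, 0.25, 0.0]
print(union_sizes)   # [125, 150, 175, 200]
```

The union sizes reproduce the total numbers of species in A1∪A2 to A1∪A5 given in the sampling procedure.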

Calculation of number of species

As proposed in earlier work (Coleman 1981), for a community in which resident species are randomly and independently distributed, the SAR curve can be formulated as

$$S_A = S_{TA} - \sum_{i=1}^{S_{TA}} \left(1 - \frac{A}{TA}\right)^{N_i} \tag{2}$$

where SA is the number of species in area A, STA is the total number of species in the total area (TA) and Ni is the number of individuals per species i. This formula was used to calculate SAR for the sampling without replacement based on simulated data.
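A minimal sketch of this calculation is given below. It assumes the standard random placement form attributed to Coleman (1981), in which the expected number of species in area A is S_TA minus the sum over species of (1 - A/TA)^N_i. The function name and the toy abundances are illustrative assumptions.

```python
import numpy as np

def expected_species(area, total_area, abundances):
    """Expected number of species in `area` under random placement.

    abundances: individuals per species (N_i) in the total area.
    Assumed form: S_A = S_TA - sum_i (1 - area / total_area) ** N_i.
    """
    n = np.asarray(abundances, dtype=float)
    return n.size - np.sum((1.0 - area / total_area) ** n)

# Toy community (assumed values): four species in a total area of 10 000 points.
toy = [5, 50, 500, 9445]
curve = [expected_species(a, 10_000, toy) for a in (1, 10, 100, 1_000, 10_000)]
```

Sampling the whole area recovers every species, and a species with a single individual has a probability of A/TA of being found in area A.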

Data fitting

The logarithmic form of the power function of SAR, $\log S_A = \log c + z \log A$ (where c and z are the fitting parameters), was used to fit SARs based on simulated data. Fourteen values of area (1, 10, 100, 1000, 10 000, 100 000, 1 000 000, 1 000 001, 1 000 010, 1 000 100, 1 001 000, 1 010 000, 1 100 000 and 2 000 000) were used for fitting, and the corresponding number of species was derived from Equation (2). Both the area and the number of species were log-transformed in the power SAR fitting. Linear regression was employed for the fitting under both sampling tactics.
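The fitting procedure can be sketched as follows for A1∪A2. This is one possible reading of the two sampling tactics, not the author's implementation: two-step sampling is taken as sampling A1 up to 1 000 000 points and then adding the species new to A1 found in a growing part of A2, while one-step sampling pools both areas into 2 000 000 points. The log-normal parameters, the seed and all names are assumptions, so the fitted values will not match Table 1.

```python
import numpy as np

TA = 1_000_000   # points per area, from the text
AREAS = np.array([1, 10, 100, 1_000, 10_000, 100_000, 1_000_000,
                  1_000_001, 1_000_010, 1_000_100, 1_001_000,
                  1_010_000, 1_100_000, 2_000_000], dtype=float)

def coleman(area, total_area, abundances):
    """Random placement: S_A = S_TA - sum_i (1 - area/total_area) ** N_i."""
    n = np.asarray(abundances, dtype=float)
    return n.size - np.sum((1.0 - area / total_area) ** n)

def make_abundances(rng, n_species=100, sigma=1.5):
    """Log-normal abundances scaled to about TA individuals (sigma is assumed)."""
    raw = rng.lognormal(mean=0.0, sigma=sigma, size=n_species)
    return np.maximum(np.round(raw / raw.sum() * TA), 1.0)

def two_step_curve(n1, n2, shared):
    """Sample A1 first, then A2; the first `shared` species of A2 also occur in A1."""
    new_only = n2[shared:]
    return np.array([coleman(a, TA, n1) if a <= TA
                     else n1.size + coleman(a - TA, TA, new_only)
                     for a in AREAS])

def one_step_curve(n1, n2, shared):
    """Sample the pooled area A1 ∪ A2 (2 * TA points) in a single step."""
    pooled = np.concatenate([n1[:n1.size - shared],
                             n1[n1.size - shared:] + n2[:shared],
                             n2[shared:]])
    return np.array([coleman(a, 2 * TA, pooled) for a in AREAS])

def fit_power(areas, richness):
    """Least-squares fit of log S = log c + z log A; returns (z, log c, R-square)."""
    x, y = np.log10(areas), np.log10(richness)
    z, log_c = np.polyfit(x, y, 1)
    resid = y - (log_c + z * x)
    return z, log_c, 1.0 - resid.var() / y.var()

rng = np.random.default_rng(1)       # arbitrary seed
n1, n2 = make_abundances(rng), make_abundances(rng)
two = two_step_curve(n1, n2, shared=75)
one = one_step_curve(n1, n2, shared=75)
z_two, _, r2_two = fit_power(AREAS, two)
z_one, _, r2_one = fit_power(AREAS, one)
```

Passing `shared=50`, `25` or `0` gives the corresponding curves for A1∪A3, A1∪A4 and A1∪A5 under the same assumptions.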

RESULTS

The results of two-step and one-step sampling for A1∪A2, A1∪A3, A1∪A4 and A1∪A5 are shown in Fig. 2. The shapes of the curves of two-step sampling and one-step sampling were similar. Each sampling step, whether in two-step sampling or one-step sampling, formed a typical convex curve (Pan et al. 2016). Two-step sampling therefore formed two successive convex curves, one for each sampling step.

Figure 2: (a) Two-step and (b) one-step sampling for the A1∪A2, A1∪A3, A1∪A4 and A1∪A5.

Parameter fitting of the power function of SAR with two-step and one-step sampling is shown in Table 1. Whether two-step sampling or one-step sampling was adopted for A1∪A2, A1∪A3, A1∪A4 and A1∪A5, the slope z of the SAR fitting fell in the range [0.2379, 0.2721], and the R-square of the SAR fitting fell in the range [0.7086, 0.7742]. Generally, the R-square of two-step sampling was higher than that of one-step sampling, because there were fewer middle points in two-step sampling than in one-step sampling. R-square and the slope z increased when the total number of species increased (i.e. when Sørensen’s similarity decreased) in the same area for both two-step sampling and one-step sampling. Thus, if there is a great difference between two areas in species composition, which means low Sørensen’s similarity, the species–area curve of two-step sampling will be more likely to show linearity. If there is little difference between two areas in species composition, which means high Sørensen’s similarity, the species–area curve of two-step sampling will be more similar to the convex curve of one-step sampling.

Table 1: SAR fitting of two-step and one-step sampling

Sampling area    Two-step sampling         One-step sampling
                 Slope z    R-square       Slope z    R-square
A1∪A2            0.2379     0.7333         0.2388     0.7086
A1∪A3            0.2475     0.7542         0.2511     0.7180
A1∪A4            0.2551     0.7669         0.2620     0.7274
A1∪A5            0.2610     0.7742         0.2721     0.7398

In this study, the SAR from two-step sampling was lower than the SAR from one-step sampling (Fig. 3a). Each sampling step shapes one convex curve, which appears as a biphasic SAR, so one-step sampling causes biphasic SAR. By mathematical induction, n-step sampling then leads to 2n-phasic SAR. Let n be the number of sampling steps and m the number of phases of the SAR curve.

Figure 3: The scheme of (a) triphasic SAR from possible superposition of two convex curves and (b) multiphasic SAR.

Assume that for $n_k = k$ (k a natural number) we have $m_k = 2n_k$. Then for $n_{k+1} = k + 1$, $m_{k+1} = 2n_k + 2 = 2(n_k + 1) = 2n_{k+1}$.

However, because sampling at large scales is limited and therefore mainly underestimates the number of species, it is impossible to fully present the last half of the second convex curve, which turns out to look like a straight line (Pan et al. 2016). If the points on the SAR curve from two-step sampling are smoothed due to the Preston and Pan effects (Pan and Zhu 2015), then triphasic SAR or even five-phasic SAR will be shown (Fig. 3b). In the field, two large areas are very likely to have different species compositions (low Sørensen’s similarity), so two-step sampling can produce triphasic SAR.

DISCUSSION

For the negative-binomial and log-normal abundance distributions fitted by the power SAR, the curve is convex instead of a straight line, and power SAR formulas are good for the fitting if the sampling area is small (Pan et al. 2016). This two-convex curve is also shown in the figure on scale-dependent nature of the SAR (Lomolino 2000). More importantly, due to spatial heterogeneity of species distribution (low Sorensen’s similarity) and sampling effect, this 2n-phasic SAR will be more linear. This is one potential sampling reason for the linear relationship of the SAR, which also explains the linearity of the island SAR due to the Allee effect and Pan effect.

In fact, two-step sampling mimics the actual sampling process. A team samples one plot in region/island X and generates the species–area curve, which should be typically convex. Then the team continues to sample another plot at a certain distance in region/island Y for comparison or other research purposes, which is another sampling step. Another set of points is then added to the above species–area curve if the two are put together to reflect the spatial pattern of total biodiversity (possibly just one point, representing the total number of species in Y for the isolate species–area relationship, or in X + Y for the species accumulation curve). Usually, for research or logistical reasons, the team would not sample different plots in the region/island X + Y (XY) at the same time and so obtain only one convex curve. In fact, the island (isolate) species–area relationship and the species accumulation curve cannot be compared directly, as they have different properties (Matthews et al. 2021).

Different sampling strategies can cause different SARs and other ecological phenomena. Thus, it is of great significance to establish coherent and comprehensive sampling from local to global scales, and it is time to reconstruct ecological sampling theory and methods. In addition, care should be taken when comparing SARs obtained from different sampling methods. Before exploring biological or ecological mechanisms for natural phenomena, we should first identify or exclude potential mathematical, statistical or sampling reasons. Though many hypotheses have been adopted to explain practical phenomena, this does not necessarily mean that the results can be fully or only explained by such hypotheses. The results might instead be caused by sampling tactics or by the combination of sampling strategy and biological or ecological mechanisms. For triphasic SAR, speciation may result in more species in the large area, but sampling can lead to several phases.

Funding

The work was supported by the National Key R&D Program of China (2018YFF0214905 and 2016YFC1200802).

Acknowledgements

The author is thankful to the reviewers for their valuable inputs on a previous draft of this article. The views and opinions presented in this article are those of the author.

Conflict of interest statement. The author declares that he has no conflict of interest.

REFERENCES

Allen AP, White EP (2003) Effects of range size on species–area relationships. Evol Ecol Res 5:493–499.

Brian M, Cathy C (2003) A unified theory for macroecology based on spatial patterns of abundance. Evol Ecol Res 5:469–492.

Bueno AS, Masseli GS, Kaefer IL, et al. (2020) Sampling design may obscure species–area relationships in landscape-scale field studies. Ecography 43:107–118.

Coleman BD (1981) On random placement and species-area relations. Math Biosci 54:191–215.

Fridley JD, Peet RK, Wentworth TR, et al. (2005) Connecting fine- and broad-scale species–area relationships of Southeastern U.S. flora. Ecology 86:1172–1177.

Gooriah LD, Chase JM (2020) Sampling effects drive the species–area relationship in lake zooplankton. Oikos 129:124–132.

Hubbell SP (2001) The Unified Neutral Theory of Biodiversity and Biogeography. Princeton, NJ: Princeton University Press, 161–375.

Lomolino MV (2000) Ecology’s most general, yet protean pattern: the species-area relationship. J Biogeogr 27:17–26.

Matthews TJ, Triantis KA, Whittaker RJ (2021) The species–area relationship: both general and protean? In Matthews TJ, Triantis KA, Whittaker RJ (eds). The Species–Area Relationship: Theory and Application. Cambridge, UK: Cambridge University Press.

McGill BJ, Etienne RS, Gray JS, et al. (2007) Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework. Ecol Lett 10:995–1015.

O’Dwyer JP, Green JL (2010) Field theory for biogeography: a spatially explicit model for predicting patterns of biodiversity. Ecol Lett 13:87–95.

Palmer MW (2007) Species-area curves and the geometry of nature. In Storch D, Marquet D, Brown JH (eds). Scaling Biodiversity. Cambridge, UK: Cambridge University Press.

Pan X (2013) Fundamental equations for species-area theory. Sci Rep 3:1334.

Pan X (2015) Reconstruct the species-area theory using set theory. Natl Acad Sci Lett 38:173–177.

Pan X (2016) Application of fundamental equations to species-area theory. BMC Ecol 16:42.

Pan X, Zhang X, Wang F, et al. (2016) Potential global-local inconsistency in species-area relationships fitting. Front Plant Sci 7:1282.

Pan X, Zhu S (2015) Matthew effect in counting the number of species. Biodivers Conserv 24:2865–2868.

Preston FW (1960) Time and space and variation of species. Ecology 41:254–283.

R Core Team (2013) A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org

Rosindell J, Cornell SJ (2007) Species–area relationships from a spatially explicit neutral model in an infinite landscape. Ecol Lett 10:586–595.

Scheiner SM (2003) Six types of species-area curves. Glob Ecol Biogeogr 12:441–447.

Sclafani JA, Holland SM (2013) The species-area relationship in the Late Ordovician: a test using neutral theory. Diversity 5:240–262.

Šizling AL, Storch D (2004) Power-law species–area relationships and self-similar species distributions within finite areas. Ecol Lett 7:60–68.

Sørensen TA (1948) A method of establishing groups of equal amplitude in plant sociology based on similarity of species content, and its application to analyses of the vegetation on Danish commons. Kongelige Danske Videnskabernes Selskabs Biol Skr 5:1–34.

Storch D (2016) The theory of the nested species-area relationship: geometric foundations of biodiversity scaling. J Veg Sci 27:880–891.

Zillio T, He FL (2010) Inferring species abundance distribution across spatial scales. Oikos 119:71–80.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Handling Editor: Da-Yong Zhang