On this page find a summary of the custom R commands developed during production of the book. These are mentioned throughout the text and are available as part of the associated data file (see: Support Files). See also the outline and Table of Contents.
Each entry is laid out in a similar manner to standard R-help entries. I’ve included links to other R commands as appropriate. Most of the examples link to data used in the book (see: Support Files).
See the News page for details about updated files, new custom commands and so on.
See our sister site, DataAnalytics: Ecology Matters, for resources for Ecology Students & Teachers. Including: data examples to use for practise and demonstration, and Custom Functions for R: The Statistical Programming Language.
Custom Commands Index
A | C | D | E | F | H | I | P | Q | R | S
Table: Custom R functions in Community Ecology support file.
aic |
Model-building: add single terms to a model. |
comm_grp |
Group community data using grouping variable. |
Dapp |
Bootstrap confidence intervals for Simpson’s diversity index for multiple samples. |
dist_chi |
Calculates pairwise chi squared statistics for a community dataset. |
diversity_comp |
Significance of diversity between two samples using permutation. |
eacc |
Estimated species accumulation for multiple sample groups. |
entropy_plot |
Visualize the results from the renyi() or tsallis() commands (in vegan ). |
fisher_alpha |
Calculates Fisher’s alpha, a measure of ecological diversity. |
fisher_fit |
Fits Fisher’s logseries to abundance data. |
H_boot |
Bootstrap confidence intervals for ecological diversity indices. |
H_bss |
Significance between diversity index of two samples using bootstrapping. |
H_ci |
bootstrap resampling to estimate confidence intervals for a single sample. It is intended to be used with the apply() command to provide bootstrap confidence intervals for multiple samples. |
H_CI |
computes the confidence interval for the t-test used with Shannon or Simpson’s indices. |
H_sig |
carries out a version of the t-test for either Shannon or Simpson’s indices. |
Happ |
Bootstrap confidence intervals to be computed for Shannon entropy for multiple samples. |
Iapp |
Bootstrap confidence intervals to be computed for Simpson’s Inverse index for multiple samples. |
pacc |
Extrapolated species richness in a species pool. |
plot_H |
Point plot of diversity index and bootstrap confidence intervals. |
polar.ord |
Polar Ordination (aka Bray-Curtis ordination). |
qstar |
Plot Tsallis entropy profile using q*. |
rad_aic |
Compute AIC values from multiple RAD models. |
rad_test |
A significance test of differences in deviance of RAD models. |
spec.rich |
Species richness for a community dataset. |
species_assoc |
Pairwise chi squared species association. |
species_ind |
Indicator species analysis by chi-squared. |
sr_est |
Species richness estimation with accumulating sites. |
A
Model-building: add single terms to a model.
aic
This command is a wrapper for add1()
. It computes all the single terms that can be added to the model and returns AIC values.
Usage
aic(model, scope, decreasing = FALSE, n = "all", ...)
Arguments
model |
a fitted model object. |
scope |
a formula giving the terms to be considered for adding. |
decreasing |
logical. The order to display the results decreasing = FALSE (default) shows results in order of increasing AIC. |
n |
how many variables to display (default: n = "all" ). Use integer value to return n items. |
... |
additional parameters to pass to add1() . |
Details
This command is a wrapper for add1()
. It computes all the single terms in the scope argument that can be added to the model, fit those models and compute a table of the changes in fit. The add1()
command produces results with variables listed in alphabetical order, whilst aic()
allows results to be ordered by AIC value. Additionally, you can specify the number of variables to display.
Value
An object of class "anova"
summarizing the differences in fit between the models.
See Also
The add1()
command in the stats package.
Examples
> library(vegan) # Load vegan package > data(varespec) # Biological dataset > data(varechem) # Environmental variables > H <- diversity(varespec, index = "shannon") # Shannon index for samples > vlm <- lm(H ~ 1, data = varechem) # A blank model > aic(vlm, scope = varechem, n = 3, test = "F") # display first 3 lowest AIC values Single term additions Model: H ~ 1 Df Sum of Sq RSS AIC F value Pr(>F) Baresoil 1 1.95786 1.0884 -70.240 39.5746 2.486e-06 *** Humdepth 1 0.75534 2.2909 -52.378 7.2536 0.01328 * Fe 1 0.48645 2.5598 -49.715 4.1808 0.05303 . Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1
Compare to add1()
result (not shown)
> add1(vlm, scope = varechem, n = 3, test = "F")
C
Group community data using grouping variable.
comm_grp
This command can be used instead of rowsum()
and essentially does the same thing.
Usage
comm_grp(comm, groups, data)
Arguments
comm |
a community dataset as a data.frame , matrix or table . |
groups |
a grouping variable. |
data |
an optional data.frame where the grouping variable can be found. |
Details
The command uses a grouping variable to combine samples in a community dataset.
Value
Returns a matrix
with rows as samples grouped by the variable in group. The columns are the species as per the original community dataset.
See Also
The rowsum()
command in base, which does exactly the same thing.
Examples
Samples in a community dataset
> rownames(gb.biol)
[1] "E1" "E2" "E3" "E4" "E5" "E6" "G1" "G2" "G3" "G4" "G5" "G6"
[13] "W1" "W2" "W3" "W4" "W5" "W6"
Some of the original community data
> gb.biol[1:5, 1:5] Aba.par Acu.dub Ago.afr Ago.ful Ago.mue E1 388 0 0 0 0 E2 325 0 0 0 0 E3 295 0 0 0 0 E4 350 0 0 0 2 E5 407 0 0 0 1
A grouping variable
> gb.site$Habitat [1] Edge Edge Edge Edge Edge Edge Grass Grass Grass Grass [11] Grass Grass Wood Wood Wood Wood Wood Wood Levels: Edge Grass Wood
Group data according to grouping variable
> comm_grp(gb.biol, groups = Habitat, data = gb.site) Aba.par Acu.dub Ago.afr Ago.ful Ago.mue Ago.vid Ama.aen Edge 2146 0 0 0 3 0 0 Grass 709 3 20 3 0 2 2 Wood 2013 0 0 0 0 0 0
D
Bootstrap confidence intervals for Simpson’s index.
Dapp
This is intended to be used with apply()
; allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Usage
Dapp(x)
Arguments
x |
a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument. |
Details
Use this command as a FUN
argument in apply()
to return bootstrap confidence intervals for Simpson’s diversity index from multiple samples.
Value
97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply()
the result is a matrix
with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.
See Also
See apply()
for applying a function to rows (or columns) of a data.frame
or matrix
. See H_boot()
and H_ci()
for computing bootstrap confidence intervals for a single sample using Simpson’s, Shannon or Inverse Simpson’s indices. See Happ()
for confidence intervals using Shannon entropy and Iapp()
for Simpson’s Inverse index. Bootstrapping is carried out using the boot()
command.
Examples
Use on single sample
> DeVries[1,1:6] Hisache Panprol Neshewi Morachi Taysp.1 Coldirc 1882 1028 19 5 8 273 > DV.ci <- Dapp(DeVries[1,]) > DV.ci 2.5% 97.5% 0.8418572 0.8553752
Use on rows of a dataset with multiple samples
> apply(DeVries, MARGIN = 1, FUN = Dapp)
canopy under
2.5% 0.8417613 0.9169560
97.5% 0.8557949 0.9224317
Chi-squared association for community analysis
dist_chi
Calculates pairwise chi squared statistics for a community dataset. The result has a class "distchi"
, which has plot
and summary
methods. The results include Pearson residuals as a dissimilarity matrix. The plot
method produces an hierarchical cluster object, and plots the dendrogram.
Usage
dist_chi(data) summary(x) plot(x, method = "complete", ...)
Arguments
data |
a community dataset (usually a matrix or data.frame ) with columns as samples. |
x |
the result of a distchi() command. |
method |
the cluster joining algorithm to be used by hclust() , use one of "ward" , "single" , "complete" (the default), "average" , "mcquitty" , "median" , or "centroid" . |
... |
other parameters to pass to the plotting command. |
Details
Chi squared association can be used to analyse species co-occurrence across many samples. Species that tend to associate together will be more likeley to be from the same community. Species that tend to be negatively associated will be form different communities. So, this approach can be used to identify groups of species and communities using data from many samples. This command carries out the association tests and used the Pearson residuals as the basis for a dissimilarity matrix (all residuals are rescaled to make the smallest zero). This is used by the plot method to produce a dendrogram showing the species groupings.
Value
An object of class "distchi"
, which has various components:
chi |
The individual chi square values for the pairwise species co-occurrences. |
expected |
The expected values for the pairwise species co-occurrences. |
residuals |
The Pearson residuals for the pairwise species co-occurrences. |
distance |
The dissimilarity matrix based on the Pearson residuals. |
p.val |
The p-value for the overall chi squared association. |
data |
The name of the data used in the analysis. |
spp |
The number of species in the dataset. |
See Also
The hclust()
command in stats for hierarchical clustering algorithms. See designdist()
in vegan for calculating dissimilarities (used for calculating co-occurrence). Use species_assoc()
for association analysis of a pair of species from a dataset. See species_ind()
for indicator species analysis using chi squared.
Examples
Use a dataset with rows as species and columns as samples
> hsere.cs <- dist_chi(hsereT)
> summary(hsere.cs)
Available components:
[1] "chi" "expected" "residuals" "distance" "p.val"
[6] "data" "spp"
Total.Chi df p.val
hsereT 632.9612 22 1.027601e-119
A dendrogram of the result
> plot(hsere.cs, method = "complete")
Diversity index comparison by permutation.
diversity_comp
Computes significance of diversity index between two samples by permutation method (oecosimu
). You can use any of the diversity indices used by the diversity()
command in vegan.
Usage
diversity_comp(comm, index = "shannon", perm = 1999)
Arguments
comm |
a community dataset of exactly two samples. |
index |
a diversity index; options are "shannon" (default), "simpson" and "invsimpson" . |
perm |
the number of permutations, default perm = 1999 . |
Details
This command uses the vegan::oecosimu()
command and is essentially a wrapper for it. It uses a permutation process to repeatedly sample two communities. The difference in the diversity index is then assessed from the permutation results.
Value
The result is an object that holds two classes: "oecosimu"
and "list"
. There are two components: statistic
, the calculated difference between the community indices and oecosimu
, the results from the simulation. The latter holds the main results, which can be summarised using the print
routine (i.e. simply typing the name of the result). Results include z-statistic, 95% CI and p-value (from simulation).
See Also
The oecosiumu()
command in vegan carries out the analysis. See also the diversity()
command for calculation of diversity indices. See H_boot()
for general confidence intervals of diversity indices by bootstrapping.
You must use a community dataset with exactly two samples. You can easily subset a larger dataset. If you have separate data then you’ll have to combine them first.
Examples
A data.frame with only two rows
> diversity_comp(DeVries, index = "simpson", perm = 2000) oecosimu with 2000 simulations simulation method r2dtable alternative hypothesis: true mean is not equal to the statistic statistic z 2.5% 50% 97.5% statistic 7.0746e-02 5.5458e+01 6.6880e-05 1.3523e-03 0.0045 Pr(sim.) statistic 0.0004998 ***
Use a data.frame with many rows and select two for comparison
> diversity_comp(gb.biol[c(1,7), ]) oecosimu with 1999 simulations simulation method r2dtable alternative hypothesis: true mean is not equal to the statistic statistic z 2.5% 50% 97.5% Pr(sim.) statistic 0.668067 10.135002 0.003174 0.063465 0.218 5e-04
E
Estimated species accumulation.
eacc
This command is intended to be used with lapply()
so that you can compute estimated species accumulation for multiple sample groups.
Usage
eacc(x, data)
Arguments
x |
An index value from a community dataset. |
data |
A community dataset, usually a data.frame or matrix . |
Details
This function is intended to be used in lapply()
rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species accumulation groupwise, using the index you specify and the vegan::estaccumR()
command.
Value
The result is a "list"
with a component for each grouping variable specified. Each component contains the result of the estaccumR()
command, that is estimated species accumulation using various methods.
See Also
The estaccumR()
command carries out the estimated species accumulation using various estimators. The pacc()
command is a custom function that carries out a similar job for estimated species richness using the vegan::poolaccum()
command. See lapply()
for details of applying a function to list
elements.
Examples
## Make an index from rows of community dataset
> index <- list(Edge = 1:6, Grass = 7:12, Wood = 13:18)
> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1] 7 8 9 10 11 12
$Wood
[1] 13 14 15 16 17 18
Use index value to extract species accumulation by group
> lapply(index, FUN = eacc, data = gb.biol)
$Edge
N S Chao ACE
1 18.19 20.49214 21.44664
2 21.47 23.23650 23.67794
3 23.80 25.20936 26.06676
4 25.30 26.48750 27.45683
5 26.21 27.51250 28.03773
6 27.00 28.00000 28.75118
$Grass
N S Chao ACE
1 24.04 31.43000 32.11511
2 29.51 33.98068 36.43205
3 32.47 34.64050 37.42315
4 34.38 35.53885 37.24407
5 35.71 36.55452 37.74246
6 37.00 38.00000 38.72006
$Wood
N S Chao ACE
1 11.45 12.320 14.03674
2 12.18 12.700 13.25899
3 12.59 12.915 13.62097
4 12.89 13.170 13.79830
5 13.00 13.000 13.39019
6 13.00 13.000 13.00000
Plot results from Rényi or Tsallis calculations.
entropy_plot
This command allows you to visualize the results from the renyi()
or tsallis()
commands (in vegan). Usually you’ll use it as an alternative to the usual lattice
plotting command to overlay multiple results in a single plot window.
Usage
entropy_plot(H, type = "l", ylab = "Diversity", xlab = "Scale", legend = TRUE, col = palette(), lty = 1:6, cex = 1, pch = NULL, cex.legend = 1, ...)
Arguments
H |
the result from a renyi() or tsallis() command from the vegan package. |
type |
the type of plot, default "l" produces lines. Alternatives are "p" , points, "b" , both, "o" overplotted, "n" none. |
ylab
|
axis labels for y and x-axes |
legend |
Logical. The default, TRUE , displays a legend. |
col |
the default colours to use on the plot, the default uses the current colour palette. Colours are recycled as required. |
lty |
the line style(s) used, the default uses types 1:6 and these are recycled as needed. |
cex |
character expansion for plotted points, default = 1 . |
pch |
plotting symbols, default is NULL . |
cex.legend |
character expansion for legend labels, default = 1 . |
... |
other graphical parameters to pass to matplot() . |
Details
This command provides an alternative way to visualise results from Rényi or Tsallis entropy calculations. The command takes the result of a tsallis()
or renyi()
command (from vegan) and produces a line plot for each sample – showing the entropy at the various scales.
The tsallis()
and renyi()
commands usually show multiple samples using the lattice
package, with separate panel for each sample.
The entropy_plot()
command uses matplot()
to overlay the various samples in one plot. You can choose to add a legend
(defaults to the "topright"
position). If the input data is not an entropy result the command stops, with an error. You could mimic the result of a tsallis()
or renyi()
command, the result must hold a class "renyi"
.
Value
Creates a new graphic from the result of a tsallis()
or renyi()
command (from vegan). Each sample is shown as a separate line (without markers by default) showing the entropy at the various scales.
See Also
Entropy calculations in the vegan package by tsallis()
and renyi()
commands. The matplot()
command is used to draw the multiple sample lines.
Examples
## Community dataset with two samples
> rownames(DeVries)
[1] "canopy" "under"
## Make a Tsallis entropy result
> DV.te <- tsallis(DeVries)
## Plot result using lines with points
> entropy_plot(DV.te, pch = 1:2, type = "b")
F
Ecological diversity: Fisher’s alpha
fisher_alpha
Calculates Fisher’s alpha, a measure of ecological diversity.
Usage
fisher_alpha(x, MARGIN = 1, se = FALSE, ...)
Arguments
x |
Community data, a matrix -like object or a vector . |
MARGIN | Margin for which the index is computed. Default 1 = rows. |
se |
With option se = TRUE , function fisher_alpha() returns a data.frame with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by nlm on the success of estimation. See Details. |
... |
Other parameters to pass to nlm . |
Details
The fisher_alpha()
command estimates the α parameter of Fisher’s logarithmic series (see fisher_fit()
command). The estimation is possible only for genuine counts of individuals.
This command restores the behaviour of the old command fisher.alpha()
in vegan. This is what the vegan team had to say:
“fisherfit: completely rewritten and estimates of standard error removed: I could not find no justification for these. Actually, it seems that the value of Fisher alpha as estimated in the function was independent of the abundance distribution of species, but will be defined by the number of species (S) and number of individuals (N).”
“Now the Fisher alpha is estimated from the relationship S = alpha*(1 + log(N/alpha)) using function uniroot(). Because of this, standard errors cannot be estimated and they were removed. In addition, functions confint.fisherfit, profile.fisherfit and plot.profile.fisherfit were removed. The estimation of standard errors was also removed in function fisher.alpha (that only calls fisherfit).”
So, use with some caution – I replaced the function solely to allow you to get the same results as the exercise in the book.
Value
With option se = TRUE
, function fisher_alpha()
returns a data.frame
with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by nlm on the success of estimation.
See Also
Function fisher.alpha()
in later versions of package vegan does not compute se. Function fisher_fit()
computes the logseries and returns the alpha values. The newer versions of package vegan use fisherfit()
which does not permit profiling or calculation of se.
Other commands restored from earlier version of vegan:
confint.fisherfit() plot.profile.fisherfit() profile.fisherfit()
Examples
This version computes se
> fisher_alpha(DeVries, se = TRUE) alpha se df.residual code canopy 8.601215 1.249295 55 1 under 10.212885 1.379707 64 1
Newer version in vegan does not compute se even though help suggests it will
> fisher.alpha(DeVries, se = TRUE) canopy under 8.60122 10.21290
Fit Fisher’s logseries to abundance data
fisher_fit confint.fisherfit plot.profile.fisherfit profile.fisherfit
Function fisher_fit()
fits Fisher’s logseries to abundance data.
Usage
fisher_fit(x, ...)
Arguments
x |
Community data, a matrix -like object or a vector . |
... |
Other parameters to pass to nlm . |
Details
Fits Fisher’s logseries to ecological abundance data using non-linear modelling. There is a plot
method. In addition you can use confint()
(from MASS) to determine approximate confidence intervals. There are also profile
and plot.profile
methods.
Newer versions of vegan have removed the calculations for se (see Details in fisher_alpha()
) so treat these with caution. I only restored the commands to allow you to get the same results as the exercises in the book.
Value
Returns an object of class "fisherfit"
, which is the result of nlm()
.
See Also
Function fisher.alpha() in later versions of package vegan does not compute se. Function fisher_fit() computes the logseries and returns the alpha values. The newer versions of package vegan use fisherfit() which does not permit profiling or calculation of se.
Examples
Fit Fisher logseries to sample of data
> ff <- fisher_fit(DeVries[1,]) > ff Fisher log series model No. of species: 56 Fisher alpha: 8.601215
Plot the result (not shown)
> plot(ff)
Get approx CI (uses MASS)
> confint(ff)
2.5 % 97.5 %
6.396521 11.313053
Make a profile result
> profile(ff)
$alpha
tau par.vals
1 -3.3649073 5.094402
2 -2.5788594 5.795765
3 -1.8602976 6.497127
4 -1.1967304 7.198490
5 -0.5789605 7.899852
6 0.0000000 8.601215
7 0.5455796 9.302577
8 1.0621142 10.003940
9 1.5531015 10.705303
10 2.0214191 11.406665
11 2.4694633 12.108028
12 2.8992526 12.809390
attr(,"original.fit")
attr(,"original.fit")$coefficients
alpha
8.601215
attr(,"original.fit")$std.err
[1] 1.249295
attr(,"class")
[1] "profile.fisherfit" "profile.glm" "profile"
Plot the profile (not shown)
> plot(profile(ff))
H
Bootstrapping diversity indices.
H_boot
Carries out a bootstrap resampling using the boot
package. The results compute the confidence intervals. You can specify any diversity index used by the diversity()
command in vegan for a single sample but the vegan
package is not required.
Usage
H_boot(x, index = "shannon", R = 1000, ci = 95)
Arguments
x |
a vector of community data. |
index |
a diversity index; use one of "shannon" (default), "simpson" or "invsimpson" . |
R |
the number of bootstrap resamples to use, defaults to 1000. |
ci |
the confidence interval to return, the default 95, returns 2.5% and 97.5% quantiles. |
Details
A sample of community data can be randomised by repeatedly resampling. The H_boot()
command uses the boot()
command in the boot
package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.
Value
Produces a "list"
with several components: boot.stats
gives the results from the boot()
command, CI
shows the confidence intervals (based on quantiles from the bootstrap resampling), and index
gives the calculated diversity index. The command displays the index and CI directly.
See Also
The boot()
command in the boot
package carries out the resampling. Diversity indices are computed de-novo but see diversity()
in the vegan
package for alternative computation of Shannon, Simpson’s and Inverse Simpson’s indices. See H_bss()
for bootstrap comparison between two samples.
The H_ci()
command can be used with apply()
to get bootstrap confidence intervals for multiple samples.
Examples
Use defaults for one sample from a dataset
> H_boot(DeVries[1,]) H = 2.640651 2.5% 97.5% 2.596435 2.675158
Use different index, CI and number of resamples
> H_boot(DeVries[2,], index = "simpson", ci = 99, R = 2000) H = 0.9198993 0.5% 99.5% 0.9158286 0.9233032
Bootstrap significance between two diversity indices.
H_bss
This command carries out a bootstrap analysis of the diversity indices of two samples. The result gives a p-value for the permutation as well as a calculated z-score. You can use any of the diversity indices used by the diversity()
command in vegan
. The result holds a class, "Hbss"
, which has print
and summary
methods.
Usage
H_bss(comm, R = 2000, index = "shannon") print(x, ...) summary(x)
Arguments
comm |
a community dataset containing exactly two samples. |
x |
the result of an H_bss() command. |
R |
the number of bootstrap resamples to use, defaults to 2000. |
index |
a diversity index; use one of "shannon" (default), "simpson" or "invsimpson" . |
... |
other parameters to pass to print , such as digits . |
Details
The statistical significance of the difference in diversity index between two samples is estimated using a randomization method. The two samples are repeatedly randomized and the difference in the diversity index recalculated. The significance is assessed by comparing the difference between the resamples and the original difference. Confidence intervals are estimated using the quantiles of the randomization.
A z-score is also computed from the randomizations and the p-value estimated from that.
The command produces a result holding a class "Hbss"
, which has print
and summary
routines.
Value
A list object with custom class "Hbss"
, which has print
and summary methods
. The result object contains several components:
H.orig |
The calculated diversity index for the two original samples. |
statistic |
The difference in the diversity index between the two original samples. |
z.score |
The z-score calculated using the original difference and the difference between the bootstrap resamples as well as the standard deviation of the bootstrap samples. |
CI |
The 95% confidence interval for the difference between the indices of the two samples. |
p.sim |
The p-value resulting from the bootstrap resampling. |
p.z |
The p-value estimated using the z-score. |
df |
The degrees of freedom (the number of species -1 for each sample). |
Index |
The name of the diversity index used in the calculations. |
See Also
See the diversity()
command in vegan
for computations of diversity indices. See the r2dtable()
command for randomization of data tables with fixed marginal totals. See H_boot()
for bootstrap confidence intervals of a single sample. See also H_ci()
for a function to use with apply()
to give boostrap confidence intervals for multiple samples.
Examples
Use 2 samples from a larger dataset
> gb.bs <- H_bss(gb.biol[c(7,13), ])
Print result
> print(gb.bs, digits = 3)
H.orig.1 H.orig.2 Statistic Z.score 2.5% 97.5% P(sim)
shannon 1.94 1.40 0.538 9.57 0.00335 0.184 0
P(z) df
shannon 1.13e-11 38
Summary method
> summary(gb.bs)
Original shannon Index: 1.935163 1.397251
Difference in index: 0.5379125
z-score: 9.573475
Confidence interval: 0.003348132 0.1837998
Degrees of Freedom: 38
Simulated P-value: 0
P-value (z): 1.130406e-11
Bootstrap confidence intervals for a single sample.
H_ci
This command carries out a bootstrap resampling to estimate confidence intervals for a single sample. It is intended to be used with the apply()
command to provide bootstrap confidence intervals for multiple samples.
Usage
H_ci(x, index = "shannon", R = 1000, ci = 95)
Arguments
x |
a vector of community data. |
index |
a diversity index; use one of "shannon" (default), "simpson" or "invsimpson" . |
R |
the number of bootstrap resamples to use, defaults to 1000. |
ci |
the confidence interval to use, defaults to 95 (i.e. 2.5% and 97.5% quantiles). |
Details
A sample of community data can be randomised by repeatedly resampling. The H_ci()
command uses the boot()
command in the boot
package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.
This command is intended to be used as the function (FUN
argument) in apply()
, allowing you to calculate confidence intervals for multiple samples. The results can be used for example, to make error bars in plots of diversity indices for multiple samples.
Value
The command returns (invisibly) the confidence intervals as a simple vector with two values, one for the lower CI and one for the upper CI.
See Also
The H_boot()
command for calculating the diversity index and confidence intervals for a single sample. The boot()
command (in package boot
) carries out the bootstrap resampling. See the H_bss()
command for a significance test for two samples based on randomization. See the apply()
command for conducting computations repeatedly over the rows (or columns) of a dataset.
Examples
Use a dataset with two rows (samples)
> apply(DeVries, MARGIN = 1, FUN = H_ci, index = "simpson")
canopy under
2.5% 0.8418946 0.9170899
97.5% 0.8554198 0.9224856
Confidence interval for Shannon or Simpson’s t-test.
H_CI
This command computes the confidence interval for the t-test used with Shannon or Simpson’s indices.
Usage
H_CI(x, index = "shannon", ci = 95)
Arguments
x |
a vector of community data. |
index |
a diversity index; use one of "shannon" (default), or "simpson" . |
ci |
the confidence interval to use the default is 95 and works out 95% CI. |
Details
A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson’s index between two samples. The H_CI()
command computes the confidence interval for either index using the t-test formula as a starting point. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss()
command).
The command can be used with apply()
to get the confidence intervals for multiple samples.
Value
Returns a simple numeric value representing the confidence interval at the chosen level (default 95%).
See Also
The H_sig()
command for calculating the t-test to compare Shannon or Simpson’s index between two samples. See the apply()
command for using H_CI()
over rows of a dataset to calculate CI for multiple samples. See H_ci()
for bootstrap confidence intervals for multiple samples and H_bss()
for a significance test based on randomization using bootstrapping.
Examples
Get CI for Simpson’s index for single sample
> H_CI(DeVries[1,], index = "simpson") [1] 0.007075684
Get CI for Shannon for rows of dataset
> apply(DeVries, MARGIN = 1, FUN = H_CI)
canopy under
0.07670451 0.09524768
Special t-test for Shannon or Simpson’s index.
H_sig
This command carries out a version of the t-test for either Shannon or Simpson’s indices. The result object has a class "Hsig"
, which has plot
and summary
routines.
Usage
H_sig(x, y, index = "shannon", alpha = 0.05)
summary(x)
plot(x, ylab = "Index Value", xlab = "Samples", ...)
Arguments
x , y |
Vectors of community data for comparison; these can be of different length. For plot and summary methods, x is the result of an H_sig() command. |
index |
a diversity index; use one of "shannon" (default), or "simpson" . |
alpha |
the significance level, the default 0.05 computes critical values for p = 0.05. |
xlab , ylab |
labels for the x and y-axes. |
... |
additional graphical parameters to pass to plot() . |
Details
A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson’s index between two samples. The H_sig()
command computes the t-test for two samples. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss()
command).
The summary
routine provides an overview of the test statistics and the plot
routine creates a point plot with error bars based on confidence intervals.
Value
The result is a list with a custom class "Hsig"
, which has summary
and plot
methods. There are various components:
p.val |
The p-value for the calculated t-test. |
df |
The degrees of freedom. |
t.statistic |
The calculated value of t. |
var |
The variance for each sample. |
critical |
The critical value for the degrees of freedom and the chosen alpha level. |
H |
The value of the diversity index for each sample. |
index |
The name of the diversity index used. |
alpha |
The chosen level of significance. |
sites |
The site names. |
richness |
The number of species in each sample. |
individuals |
The number of individuals in each sample. |
The plot
method produces a point plot with error bars showing confidence intervals (using the alpha level from the original t-test).
See Also
The H_CI()
command for computing confidence intervals using the t-test formula as a starting point. See the H_bss()
command for an alternative significance test using randomization and bootstrapping.
Examples
Determine significance of t-test between Shannon index of 2 samples
DVh <- H_sig(DeVries[1,], DeVries[2,]) [1] 0.0002608309
Get a summary of the results
> summary(DVh)
Site: DeVries[1, ]
Index: shannon = 2.640651
Variance = 0.0003865389
df = 2.231875
No. Species = 56
Individuals = 5774
Site: DeVries[2, ]
Index: shannon = 2.982633
Variance = 0.0002548043
df = 1.508951
No. Species = 65
Individuals = 5922
t-statistic = 13.50384
df = 3.740253
critical value at 0.05 = 2.854166
Significance = 0.0002608309
Plot the result
> plot(DVh, pch = 16, cex = 2, col = "darkgreen", lty = 1)
Bootstrap confidence intervals for the Shannon index.
Happ
This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Usage
Happ(x)
Arguments
x |
a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument. |
Details
Use this command as a FUN
argument in apply()
to return bootstrap confidence intervals for Simpson’s diversity index from multiple samples.
Value
When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply()
the result is a matrix
with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.
See Also
See apply()
for applying a function to rows (or columns) of a data.frame
or matrix
. See H_boot()
and H_ci()
for computing bootstrap confidence intervals for a single sample using Simpson’s, Shannon or Inverse Simpson’s indices. See Dapp()
for confidence intervals using Simpson’s index and Iapp()
for Simpson’s Inverse Index. Bootstrapping is carried out using the boot()
command.
Examples
Use as FUN
in apply command
> apply(DeVries, MARGIN = 1, FUN = Happ)
canopy under
2.5% 2.595170 2.943928
97.5% 2.674168 3.009593
Bootstrap confidence intervals for the Inverse Simpson index.
Iapp
This is intended to be used with apply()
; allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Usage
Iapp(x)
Arguments
x |
a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument. |
Details
Use this command as a FUN
argument in apply()
to return bootstrap confidence intervals for Inverse Simpson’s diversity index from multiple samples.
Value
When called on a single sample (a vector
of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply()
the result is a matrix
with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.
See Also
See apply()
for applying a function to rows (or columns) of a data.frame or matrix. See H_boot()
and H_ci()
for computing bootstrap confidence intervals for a single sample using Simpson’s, Shannon or Inverse Simpson’s indices. See Dapp()
for confidence intervals using Simpson’s index and Happ()
for Shannon. Bootstrapping is carried out using the boot()
command.
Examples
Use as FUN
in apply command
> apply(DeVries, MARGIN = 1, FUN = Iapp)
canopy under
2.5% 6.333271 12.05552
97.5% 6.950071 12.90245
P
Extrapolated species richness in a species pool.
pacc
This command is intended to be used with lapply()
for use with a community dataset and a grouping variable. It calls the poolaccum()
command from vegan
and returns various estimators of species richness.
Usage
pacc(x, data)
Arguments
x |
a vector of index values. |
data |
the data.frame holding the community data. |
Details
This function is intended to be used in lapply()
rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species richness groupwise, using the index you specify and the vegan::poolaccum()
command.
Value
The result is a "list"
with a component for each grouping variable specified. Each component contains the result of the poolaccum()
command, that is estimated species accumulation using various methods.
See Also
The poolaccum()
command carries out the estimated species accumulation using various estimators. The eacc()
command is a custom function that carries out a similar job for groupwise estimated species richness using the vegan::estaccumR()
command. See lapply()
for details of applying a function to list
elements.
Examples
Make an index from rows of community dataset
index <- list(Edge = 1:6, Grass = 7:12, Wood = 13:18)
> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1] 7 8 9 10 11 12
$Wood
[1] 13 14 15 16 17 18
Use index to get extrapolated species richness for groups
> lapply(index, FUN = pacc, data = gb.biol)
$Edge
N S Chao Jackknife 1 Jackknife 2 Bootstrap
3 23.29 34.94850 27.38333 28.90000 25.22704
4 24.94 33.60742 29.44750 31.33917 27.05879
5 26.13 34.32750 30.99400 33.28750 28.38051
6 27.00 33.00000 32.00000 34.40000 29.30678
$Grass
N S Chao Jackknife 1 Jackknife 2 Bootstrap
3 32.21 38.03772 37.81000 39.45167 34.95630
4 34.10 38.48675 39.38750 40.78250 36.74922
5 35.79 43.29300 41.72600 44.46350 38.58681
6 37.00 53.00000 43.66667 47.93333 39.94114
$Wood
N S Chao Jackknife 1 Jackknife 2 Bootstrap
3 12.62 13.325 13.47333 13.66500 13.05148
4 12.91 13.160 13.76500 14.07833 13.32430
5 13.00 13.325 13.52000 13.30250 13.31851
6 13.00 13.000 13.00000 11.93333 13.17563
Point plot of diversity index and bootstrap confidence intervals.
plot_H
This command carries out a bootstrap resampling of a community dataset and plots the results as a point plot that includes error bars showing the 95% confidence intervals. You can use any diversity index used by the diversity()
command in vegan
.
Usage
plot_H(comm, index = "shannon", xlab = "Site Names", ylab = "Diversity Index", pch = 18, cex = 1.2, cex.axis = 0.75, boot = TRUE, ...)
Arguments
comm |
community dataset, usually a matrix or data.frame . |
index |
a diversity index; use one of "shannon" (default), "simpson" or "invsimpson" . |
xlab , ylab |
labels for x and y axes. |
pch |
plotting symbol, default = 18 (a filled diamond). |
cex |
character expansion, default = 1.2. |
cex.axis |
character expansion for axes, default = 0.75. |
boot = TRUE |
logical, use bootstrap CI (default = TRUE ) or variance estimators (FALSE ). |
... |
other graphical parameters to pass to plot() . |
Details
Bootstrapping is a way to repeatedly resample a community dataset. The confidence interval of the diversity index is assessed from the quantiles of the resamples. The plot_H()
command uses the diversity()
command from vegan
to calculate the diversity index for multiple samples in a community dataset. The H_ci()
command is called internally, which computes the confidence intervals by calling boot()
from the boot
package. The diversity indices are then plotted (as a point plot), using the confidence intervals as error bars. This provides a visual representation of differences in diversity index between samples.
Alternatively you can specify boot = FALSE
to use variance estimation for CI. This calls the H_CI()
command internally.
Note: if you specify boot = FALSE
and index = "invsimpson"
an error is returned.
Value
The command produces a list
object invisibly; this contains the diversity index, $H
and confidence intervals, $CI
. These are used to create a point plot with 95% error bars.
See Also
See the diversity()
command in vegan
for details of the diversity indices. The boot()
command from package boot
carries out the resampling; it is called from H_ci()
, which calculates the confidence intervals. See the apply()
command, which is used internally to obtain a statistics over the rows of a dataset. If boot = FALSE
the CI are determined using H_CI()
via a variance estimation process.
Examples
Make a plot and save results
> gbh = plot_H(gb.biol, ylab = "Shannon Entropy")
View the results
> gbh $H E1 E2 E3 E4 E5 E6 G1 1.267096 1.424505 1.412179 1.737146 1.636923 1.557477 1.935163 G2 G3 G4 G5 G6 W1 W2 2.000272 1.929426 2.360250 2.139567 2.205175 1.397251 1.582102 W3 W4 W5 W6 1.539511 1.601073 1.616333 1.552470 $CI E1 E2 E3 E4 E5 E6 2.5% 1.167435 1.339836 1.320449 1.639057 1.552808 1.477362 97.5% 1.341344 1.487790 1.484144 1.802874 1.698681 1.614796 G1 G2 G3 G4 G5 G6 2.5% 1.741574 1.805502 1.704016 2.192101 1.975294 2.028207 97.5% 2.047899 2.109588 2.046776 2.449227 2.230595 2.288512 W1 W2 W3 W4 W5 W6 2.5% 1.337487 1.519955 1.470808 1.534223 1.560459 1.488700 97.5% 1.443202 1.635484 1.597767 1.654974 1.658593 1.601543
Polar Ordination (aka Bray-Curtis ordination)
polar.ord
This command carries out Polar Ordination, a method of ordination (also known as Bray-Curtis ordination). The command works out polar ordination site scores and works out axes as follows: Axis 1 is the combination of sites that gives the maximum separation (not necessarily the maximum variance explained). Axis 2 is the combination of sites that maximises the orthogonality to the 1st axis.
There are print
, plot
and summary
methods and an identify
method for use with plots.
Usage
polar.ord(diss) print.polar.ord(ord, digits = getOption("digits")) summary.polar.ord(ord, k = 2, cor = TRUE, var.exp = TRUE, digits = getOption("digits")) plot.polar.ord(ord, x.axis = 1, y.axis = 2, type = c("t", "p", "n"), ...) identify.plot.polar.ord(pord, labels = pord$labels, ...)
Arguments
diss |
A dissimilarity matrix e.g. as produced by dist() or vegdist() . |
digits |
The number of digits to display, defaults to current settings. |
ord |
The result of a polar.ord command. |
k |
The number of axes to display. |
cor |
If TRUE , shows the correlation between ordination axes. |
var.exp |
If TRUE , shows the variance explained. |
x.axis |
Which ordination axis to plot on the x-axis. |
y.axis |
Which ordination axis to plot on the y-axis. |
type |
The type of plot, "t" text, "p" points or "n" none. |
pord |
The result of a plot.polar.ord command i.e. a polar ordination plot result. |
labels |
A vector of labels to use for the sites. |
... |
Other plotting parameters to pass to plot or identify . |
Details
Polar ordination is a method of indirect gradient analysis. The starting point is a dissimilarity matrix
, any method of producing a dissimilarity can be used as the starting point for the polar.ord()
command. Typical commands would be dist()
or vegdist()
(from package: vegan
). The analysis seeks to maximise the separation between sites along an axis. Subsequent axes are calculated to give the best separation but also to maximise the orthogonality, that is to minimise correlation between the axes.
The method is “simple” enough to be able to carry out using a spreadsheet but identification of subsequent axes can be a labour-intensive and iterative process. Note that in this application of Polar Ordination I have maximised separation, which was the original intention of Bray-Curtis. However, you can also go for a maximise variance explained approach as the summary method displays additional variance explained by each axis.
The print
method allows you to view all the generated axes, their separation (the dissimilarity) and variance explained.
The summary
method provides an overview that includes site socres, correlation and variance explained.
The plot
method allows you to visualise the results from any combination of axes.
The identify
method allows you to interactively label points on the plot.
Value
The polar.ord()
command produces an object (essentially a list
) holding the class "polar.ord"
, which has print
, summary
, plot
and identify
methods. The components are:
sites
– the site scores for all the computed axes.dist
– the distance (separation) for each axis, in the same scale as the original dissimilarity.cor
– the correlation (Pearson) between the principal axis (the one with maximum separation) and each of the other axes.exp
– the variance explained by each axis.add
– the additional variance explained by each axis in addition to the original principal axis.samples
– avector
of site names.axes
– avector
of axis names.diss
– the original dissimilarity result (a"dist"
object).method
– a character vector giving the dissimilarity metric employed.
These components are used by the print, summary, plot and identify methods.
See Also
Various methods of ordination from the vegan
package e.g. metaMDS()
for non-metric multidimensional scaling. Also see cmdscale()
, which is part of the basic stats
package.
Examples
The data (columns are sites) from CERE.RData
moss
Transpose sites/species and make a dissimilarity matrix
# Make dissimilarity matrix
d <- dist(t(moss), method = "euclidean")
Run the PO
> po <- polar.ord(d) > po # Standard print > print(po, digits = 4) # Use print method > summary(po, k = 4) # Show summary of first 4 axes
Plot
> plot(po) # Basic plot > plot(po, x.axis = 1, y.axis = 3) # Plot different axes
Species scores
require(vegan) # Weighted average species scores > wa <- wascores(scores(po), t(moss))
Quick plot
> plot(po) > text(wa, labels = rownames(wa), col = "red", cex = 0.8)
ID plot
> op <- plot(po, type = "p") # Plot points only > identify(op) # Click points to identify sites (Esc to cancel)
> points(wa, col = "red", pch = "+") # Add species points > identify(wa, labels = rownames(wa)) # Add labels interactively
Q
Plot Tsallis entropy profile using q*.
qstar
This command uses the eventstar()
command from vegan
to calculate q*
. The profile is then plotted as three plots showing, evenness, diversity and effective species for the Tsallis scales selected.
Usage
qstar(x, q = seq(0, 2, 0.05), qmax = 5, type = "l", lcol = "red", llty = 1, llwd = 1, ...)
Arguments
x |
a vector of community data. |
q |
the scales to use for the Tsallis entropy (passed to tsallis ). |
qmax |
the maximum q value to use (passed to eventstar ), default = 5. |
type |
the type of plot, default = "l" for lines. |
lcol llty llty |
colour, type and width for the lines showing q* value and diversity values at q*. |
... |
other graphical parameters to pass to plot() . |
Details
The function eventstar()
finds a characteristic value of the scale parameter q
of the Tsallis entropy corresponding to minimum of the evenness (equitability) profile based on Tsallis entropy. This is called q*
. Use of q*
and corresponding diversity, evenness, and Hill numbers (effective species), is recommended because it is a unique value representing the diversity profile. It is also positively associated with rare species in the community, thus it is a potentially useful indicator of certain relative abundance distributions of the communities.
The qstar()
command uses` eventstar()` from vegan
to calculate q*
and then plots three graphs to show evenness, diversity and effective species at the chosed scales, and their relationship to q
.
Value
The command returns the result from the eventstar()
command, i.e. a data.frame
with several columns showing q
(qstar
), the value of evenness at q*
(Estar
), the value of Tsallis entropy at q*
(Hstar
), and the effective species at q*
(Dstar
). See eventstar()
for details.
See Also
See tsallis()
in vegan
for details of Tsallis entropy and the inspiration for this plotting routine. The eventstar()
command from vegan
is called to calculate q*
.
Examples
Select a single sample from a community dataset
> qstar(DeVries[1,]) qstar Estar Hstar Dstar 1 0.6847494 0.1113006 5.062579 20.61567
R
Compute AIC values from multiple RAD models.
rad_aic
This command takes the result of multiple RAD model-fitting (from radfit()
in vegan
) and computes the “best” AIC value for each sample. The result shows the best AIC value and the corresponding model name for each sample.
Usage
rad_aic(x)
Arguments
x |
the result of radfit() from vegan . The result must hold the class "radfit.frame" , i.e. have been run on a community dataset with multiple samples. |
Details
Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit()
command in vegan
can compute various models for community datasets. The rad_aic()
command takes the result of a radfit()
command and, for each sample, shows the name of the model with the lowest AIC value as well as displaying the AIC.
Value
Returns a data frame with two columns giving the “best” RAD model for each sample in the original community dataset. The columns are: AIC
: the AIC value, and Model
: the name of the model. Each row corresponds to a sample in the original community dataset.
See Also
See radfit()
in vegan
for details of the model fitting. The rad_test()
command carries out a significance test of differences between RAD models for a community sample.
Examples
Compute RAD models for a community dataset
> gb.rad <- radfit(gb.biol)
Explore the radfit() result
> rad_aic(gb.rad) AIC Model E1 106.29094 Mandelbrot E2 134.58026 Preemption E3 108.41663 Preemption E4 160.24125 Mandelbrot E5 183.46446 Preemption E6 146.97826 Preemption G1 106.00117 Zipf G2 83.88072 Mandelbrot G3 68.55087 Mandelbrot G4 104.48570 Mandelbrot G5 100.04912 Mandelbrot G6 98.59426 Mandelbrot W1 154.01455 Preemption W2 81.28201 Preemption W3 74.92607 Preemption W4 98.54388 Preemption W5 104.03335 Preemption W6 82.63346 Preemption
A significance test of differences in deviance of RAD models.
rad_test
This command takes the result of a radfit()
command (from vegan
) and carries out a test of significance (ANOVA) of the various RAD models, examining their deviance. The result holds a class, "rad.htest"
that has print
, summary
and plot
methods.
Usage
rad_test(x, conf.level = 0.95, log = TRUE)
print(x, ...)
summary(x)
plot(x, which = "original", las = 1, ...)
Arguments
x |
the result of a radfit() command from vegan . For summary and plot methods x should be the result of the rad_test() command. |
conf.level |
the confidence level, the default is 0.95. |
log |
Logical. If TRUE (the default) the log of the model deviance is used as the response variable in ANOVA. |
which |
the type of plot, "original" (default) shows boxplot of model deviance, "post.hoc" displays Tukey post-hoc result. |
las |
the orientation of the axis annotations, the default produces labels all horizontal. |
... |
other parameters to pass to plot() or print() . |
Details
Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit()
command in vegan
can compute various models for community datasets. The rad_test()
command takes the result of a radfit()
command and carries out a test of model deviance using ANOVA. By default the log of model deviance is used, which improves the normality for the ANOVA.
For a meaningful result you should use replicated samples (see Examples).
Value
The rad_test()
command returns invisibly a list
with a custom class "rad.htest"
, which has print
, summary
and plot
methods.
The print
method returns a matrix
showing Mean deviance, Standard Error of deviance and Mean AIC for the RAD models.
The summary
method shows the ANOVA table ($ANOVA
) and a summary of the Tukey Post-Hoc test ($Tukey
).
The plot
method produces a boxplot that shows model deviance for each RAD model. Models are re-ordered by decreasing mean deviance. Optionally the plot
routine can display the post-hoc results.
See Also
See radfit() in vegan for details of the model fitting. The ANOVA is carried out using aov() and the post-hoc via the TukeyHSD() command. The boxplot() command is used to plot the results (see also reorder() for re-ordering of samples for plotting).
See rad_aic() for an alternative command that shows the “best” RAD model for each sample using results from radfit().
Examples
Make RAD models for six samples of dataset these are replicates (from same habitat type).
> gb.rad <- radfit(gb.biol[1:6,])
Carry out ANOVA
> gb.rt <- rad_test(gb.rad)
Print result
> print(gb.rt, digits = 3)
Null Preemption Lognormal Zipf Mandelbrot
Mean deviance 730 90.4 129.29 163.9 68.15
SE of deviance 66 14.7 8.43 14.0 9.48
Mean AIC 799 161.7 202.61 237.2 143.47
Summary of result shows ANOVA and Post-Hoc
> summary(gb.rt)
$ANOVA
Df Sum Sq Mean Sq F value Pr(>F)
model 4 20.9193 5.2298 65.545 6.937e-13 ***
Residuals 25 1.9948 0.0798
Signif. codes: 0 '' 0.001 '' 0.01 '' 0.05 '.' 0.1 ' ' 1
$Tukey
diff lwr upr p adj
Pre-Mb 0.2700865 -0.20887308 0.7490461 4.776096e-01
Log-Mb 0.6773801 0.19842055 1.1563397 2.816974e-03
Zip-Mb 0.9053575 0.42639796 1.3843171 8.202652e-05
BS-Mb 2.3975406 1.91858106 2.8765002 8.252288e-13
Log-Pre 0.4072936 -0.07166594 0.8862532 1.233203e-01
Zip-Pre 0.6352710 0.15631147 1.1142306 5.339678e-03
BS-Pre 2.1274541 1.64849457 2.6064137 1.145184e-11
Zip-Log 0.2279774 -0.25098216 0.7069370 6.345799e-01
BS-Log 1.7201605 1.24120094 2.1991201 1.043389e-09
BS-Zip 1.4921831 1.01322353 1.9711427 1.790295e-08
$Logarithmic
[1] TRUE
$Confidence.Level
[1] 0.95
Print the results
# 2 columns in plot window > opar <- par(mfrow = c(1,2)) # Box-plot > plot(gb.rt, ylab = "Log of deviance", main = "RAD deviance") # Tukey result > plot(gb.rt, which = "post.hoc") > abline(v = 0, lty = 3) # reset plot window > par(opar)
S
Species richness for a community dataset.
spec.rich
This command computes species richness for a community dataset.
Usage
spec.rich(data)
Arguments
data |
a community dataset as a data.frame or matrix . Rows as samples. |
Details
The number of species in a community sample (the species richness) is a basic measure of diversity.
Value
A numeric vector
containing the species richness for each sample of the original dataset. The vector
has a names
attribute to match the sample names of the original dataset.
See Also
The specnumber()
command in the vegan
package can also compute richness (including groupwise since version 2.0-0). See also the which()
command that is used to select elements > 0 in the calculations, and length()
.
The sr_est()
and specaccum()
commands for modelling species richness with accumulating sites.
Examples
Species richness using a 2-row matrix
> spec.rich(DeVries) canopy under 56 65
Richness from a 16-row data frame
> spec.rich(gb.biol) E1 E2 E3 E4 E5 E6 G1 G2 G3 G4 G5 G6 W1 W2 W3 W4 W5 W6 17 14 15 25 21 17 28 22 18 28 26 24 12 11 11 12 12 10
Pairwise chi squared species association.
species_assoc
This command takes two species that you specify from a community dataset and carries out a pairwise chi squared association test.
Usage
species_assoc(spA, spB, data)
Arguments
spA spB |
Index values for the two species to be analysed, corresponding to their position in the rownames attributes, i.e. a numerical value corresponding to the row number. |
data |
the data.frame (or matrix ) containing the species for analysis; rows as species. |
Details
A chi squared approach can be used to explore species co-occurrence. Species that are positively associated will tend to be from the same community, whereas species that are negatively associated will be from different communities. The species_assoc()
command takes a dataset and computes the chi squared statistics of species co-occurrence for two species that you specify.
Value
A list
object (returned invisibly) with four components:
spA |
The name of the first species (the name will be taken from the rownames attribute of the data). |
spB |
The name of the second species. |
chi.sq |
The chi squared test result – includes the observed, expected and Pearson residuals for the pair of selected species. The “standard” chi squared result is returned to screen. |
co.occur |
The co-occurrence matrix . |
See Also
See the chisq.test()
for general details of chi squared testing. The designdist()
command from vegan
is used to compute the co-occurrence matrix. See also the species_ind()
command for using chi squared in indicator species analysis.
Examples
Look at the species available
> rownames(hsereT)
[1] "Carex rostrata" "Fillipendula ulmaria"
[3] "Caltha palustris" "Bryophyte"
[5] "Galium palustre" "Valeriana officianalis"
[7] "Ranunculus repens" "Cochlearia spp"
[9] "Epilobium palustre" "Myosotis scorpioides"
[11] "Mentha rostrat" "Rumex acetosa"
[13] "Succisa pratensis" "Molinea caerulea"
[15] "Juncus rostrat" "Eriophorum vaginatum"
[17] "Eriophorum angustifloium" "Sphagnum spp."
[19] "Calluna vulgaris" "Vaccinium myrtillus"
[21] "Polytrichum spp." "Vaccinium vitis-idea"
Carry out pairwise analysis for species in rows 1,2
> hsere.co <- species_assoc(1, 2, data = hsereT) Pearson's Chi-squared test with Yates' continuity correction data: mat X-squared = 15.1532, df = 1, p-value = 9.913e-05 Warning message: In chisq.test(mat) : Chi-squared approximation may be incorrect
View the co-occurrence matrix
hsere.co$co.occur Fillipendula ulmaria Carex rostrata + - + 11 11 - 0 28
Look at the Pearson residuals
> hsere.co$chi.sq$residuals Fillipendula ulmaria Carex rostrata + - + 2.800000 -1.487038 - -2.481935 1.318118
Indicator species analysis by chi squared.
species_ind
This command takes a community dataset and carries out a chi squared test for indicator species. The species can be arranged as the rows or columns. The samples are used as the groupings for the analysis, so you may need to group the dataset using rowsum()
before analysis. The result shows only those species that have a significant association (positive or negative) for all samples/groups.
Usage
species_ind(x, rows = c("species", "samples"))
Arguments
x |
A community dataset, usually a data.frame or matrix . |
rows |
Should the rows be treated as “species” (the default) or “samples”? |
Details
An indicator species is one that is positively associated with a particular site or habitat and negatively associated with other sites/habitats. Ideally an indicator species will also be found only rarely in other sites/habitats. The chi squared approach is one way to explore species associations with sites/habitats.
Value
A list
with two components (the indicators component is returned directly and the entire list returned invisibly):
chi.test |
The results of the chi squared test, which includes the observed and expected values as well as the Pearson residuals. |
indicators |
A matrix with the names of the indicator species and the grouping; the data are the Pearson residuals. Only species with all Pearson residuals >2 are shown (i.e. only significant ones). |
See Also
See chisq.test()
for details of the chi squared process. See also species_assoc()
for pairwise species association. The indval()
command from labdsv
is probably a more robust alternative, which uses a quite different approach.
Examples
Make a dataset by grouping community samples (by habitat)
> gb <- rowsum(gb.biol, group = gb.site$Habitat)
Indicator species based on habitat
> gb.ind <- species_ind(gb, rows = "samples") Aba.par Ago.afr Bem.big Bem.man Cal.fus Cli.fos Edge 4.092740 -2.841032 -3.192145 -4.170376 -2.841032 -2.979700 Grass -2.034203 9.506136 11.182116 18.561500 9.506136 9.970119 Wood -2.706151 -2.963855 -3.629967 -7.107075 -2.963855 -3.108518 Lei.ruf Neb.bre Poe.cup Edge -7.245067 -3.209243 -2.682207 Grass -5.450846 -6.097483 17.636905 Wood 10.205860 6.724140 -7.980425
Plot Pearson residuals for Edge habitat
> barplot(gb.ind$ind[1,]) > title(main = "Habitat: Edge", ylab = "Pearson Residual") > abline(h = 0) > abline(h = c(2, -2), lty = 3)
Species richness estimation with accumulating sites.
sr_est
This command uses accumulating species richness and a log model to determine the estimated species richness for a complete dataset. You can use a community dataset or the result of the specaccum()
command from vegan
. The result contains the estimated species richness and the log model. The result holds a class "srest"
, which has print
, plot
and summary
methods.
Usage
sr_est(accum) print(srest, ...) summary(srest, digits = getOption("digits")) plot(srest, col.points = "black", col.line = "black", lwd.line = 1, lwd.points = 1, ...)
Arguments
accum |
A vector of values corresponding to accumulating species richness or, a community dataset (rows as samples) or, the result of a specaccum() command from vegan . |
srest |
The result of an sr_est() command. |
digits |
The number of significant figures to display. |
col.points lwd.points |
The colour and width of the points used in the base plot. |
col.line lwd.line |
The colour and width used for the best-fit line (produced via abline ). |
... |
Additional parameters to pass to plot() or print() . |
Details
Species accumulation curves (SAC) are used to compare diversity properties of community data sets using different accumulator functions. The sr_est()
command uses accumulating species richness as a starting point – the data can be a vector of accumulating values or the result of a vegan::specaccum()
command. Alternatively you can use a community dataset as a data.frame
or matrix
. A log-linear model is then computed to show the estimated species richness from the accumulated sites.
Value
A list
with custom class "srest"
, which has print
, summary
and plot
methods. The components are:
data.model |
The log model data; a data frame with cumulative species, number of new species and log of cumulative species corresponding to the samples in the original dataset. |
coefficients |
The intercept and slope from the log model. |
species.estimate |
The estimated species richness based on the log model. |
See Also
See the specaccum()
command in vegan
for details of methods of estimating species richness using various collector methods, and specpool()
for estimated richness in a species pool. See spec.rich()
and specnumber()
for simple calculation of species richness. The log model in sr_est()
is computed using the lm()
command.
Examples
Load vegan and dune data
> require(vegan) > data(dune)
Estimated richness
> specpool(dune) Species chao chao.se jack1 jack1.se jack2 boot All 30 32.25 3.395769 32.85 1.645448 33.84474 31.54040 boot.se n All 1.153121 20
Use log model for estimate
> dune.sr <- sr_est(dune) > dune.sr Estimated Species Richness 29.42548
Visualise the model
> plot(dune.sr, lty = 2)
Use species accumulation curve for first 6 rows (all same habitat).
> gb.sa <- specaccum(gb.biol[1:6,])
Summary of log model
> summary(sr_est(gb.sa))
Model Data:
cum.spp new.spp log.cs
1 18.16667 18.166667 1.259275
2 21.26667 3.100000 1.327699
3 23.30000 2.033333 1.367356
4 24.80000 1.500000 1.394452
5 26.00000 1.200000 1.414973
6 27.00000 1.000000 1.431364
Model coefficients:
Intercept Slope
131.0205 -92.6311
Estimated Species Richness
25.96767
My Publications
I have written several books on ecology and data analysis
Register your interest for our Training Courses
We run training courses in data management, visualisation and analysis using Excel and R: The Statistical Programming Environment. Courses will be held at one of our training centres in London. Alternatively we can come to you and provide the training at your workplace. Training Courses are also available via an online platform.
Get In Touch Now
for any information regarding our training courses, publications or help with a data project