Custom R Commands

Home » R Programming, Excel & Data Management Books » Community Ecology Book » Custom R Commands

On this page find a summary of the custom R commands developed during production of the book. These are mentioned throughout the text and are available as part of the associated data file (see: Support Files). See also the outline and Table of Contents.

Each entry is laid out in a similar manner to standard R-help entries. I’ve included links to other R commands as appropriate. Most of the examples link to data used in the book (see: Support Files).

See the News page for details about updated files, new custom commands and so on.

See our sister site, DataAnalytics: Ecology Matters, for resources for Ecology Students & Teachers. Including: data examples to use for practise and demonstration, and Custom Functions for R: The Statistical Programming Language.

Buy this book online

Buy Now

Custom Commands Index

A | C | D | E | F | H | I | P | Q | R | S

Table: Custom R functions in Community Ecology support file.

`aic`	Model-building: add single terms to a model.
`comm_grp`	Group community data using grouping variable.
`Dapp`	Bootstrap confidence intervals for Simpson’s diversity index for multiple samples.
`dist_chi`	Calculates pairwise chi squared statistics for a community dataset.
`diversity_comp`	Significance of diversity between two samples using permutation.
`eacc`	Estimated species accumulation for multiple sample groups.
`entropy_plot`	Visualize the results from the `renyi()` or `tsallis()` commands (in `vegan`).
`fisher_alpha`	Calculates Fisher’s alpha, a measure of ecological diversity.
`fisher_fit`	Fits Fisher’s logseries to abundance data.
`H_boot`	Bootstrap confidence intervals for ecological diversity indices.
`H_bss`	Significance between diversity index of two samples using bootstrapping.
`H_ci`	bootstrap resampling to estimate confidence intervals for a single sample. It is intended to be used with the `apply()` command to provide bootstrap confidence intervals for multiple samples.
`H_CI`	computes the confidence interval for the t-test used with Shannon or Simpson’s indices.
`H_sig`	carries out a version of the t-test for either Shannon or Simpson’s indices.
`Happ`	Bootstrap confidence intervals to be computed for Shannon entropy for multiple samples.
`Iapp`	Bootstrap confidence intervals to be computed for Simpson’s Inverse index for multiple samples.
`pacc`	Extrapolated species richness in a species pool.
`plot_H`	Point plot of diversity index and bootstrap confidence intervals.
`polar.ord`	Polar Ordination (aka Bray-Curtis ordination).
`qstar`	Plot Tsallis entropy profile using q*.
`rad_aic`	Compute AIC values from multiple RAD models.
`rad_test`	A significance test of differences in deviance of RAD models.
`spec.rich`	Species richness for a community dataset.
`species_assoc`	Pairwise chi squared species association.
`species_ind`	Indicator species analysis by chi-squared.
`sr_est`	Species richness estimation with accumulating sites.

A

Model-building: add single terms to a model.

aic

This command is a wrapper for add1(). It computes all the single terms that can be added to the model and returns AIC values.

Usage

aic(model, scope, decreasing = FALSE, n = "all", ...)

Arguments

`model`	a fitted model object.
`scope`	a formula giving the terms to be considered for adding.
`decreasing`	logical. The order to display the results `decreasing = FALSE` (default) shows results in order of increasing AIC.
`n`	how many variables to display (default: `n = "all"`). Use integer value to return n items.
`...`	additional parameters to pass to `add1()`.

Details

This command is a wrapper for add1(). It computes all the single terms in the scope argument that can be added to the model, fit those models and compute a table of the changes in fit. The add1() command produces results with variables listed in alphabetical order, whilst aic() allows results to be ordered by AIC value. Additionally, you can specify the number of variables to display.

Value

An object of class "anova" summarizing the differences in fit between the models.

Examples

> library(vegan) # Load vegan package

> data(varespec) # Biological dataset
> data(varechem) # Environmental variables
> H <- diversity(varespec, index = "shannon") # Shannon index for samples
> vlm <- lm(H ~ 1, data = varechem) # A blank model
> aic(vlm, scope = varechem, n = 3, test = "F") # display first 3 lowest AIC values

Single term additions
Model:
H ~ 1
Df Sum of Sq    RSS     AIC F value    Pr(>F)
Baresoil  1   1.95786 1.0884 -70.240 39.5746 2.486e-06 ***
Humdepth  1   0.75534 2.2909 -52.378  7.2536   0.01328 *
Fe        1   0.48645 2.5598 -49.715  4.1808   0.05303 .
Signif. codes:  0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

Compare to add1() result (not shown)

> add1(vlm, scope = varechem, n = 3, test = "F")

C

Group community data using grouping variable.

comm_grp

This command can be used instead of rowsum() and essentially does the same thing.

Usage

comm_grp(comm, groups, data)

Arguments

`comm`	a community dataset as a `data.frame`, `matrix` or `table`.
`groups`	a grouping variable.
`data`	an optional `data.frame` where the grouping variable can be found.

Details

The command uses a grouping variable to combine samples in a community dataset.

Value

Returns a matrix with rows as samples grouped by the variable in group. The columns are the species as per the original community dataset.

Examples

Samples in a community dataset

> rownames(gb.biol)
[1] "E1" "E2" "E3" "E4" "E5" "E6" "G1" "G2" "G3" "G4" "G5" "G6"
[13] "W1" "W2" "W3" "W4" "W5" "W6"

Some of the original community data

> gb.biol[1:5, 1:5]
   Aba.par Acu.dub Ago.afr Ago.ful Ago.mue
E1     388       0       0       0       0
E2     325       0       0       0       0
E3     295       0       0       0       0
E4     350       0       0       0       2
E5     407       0       0       0       1

A grouping variable

> gb.site$Habitat
[1] Edge  Edge  Edge  Edge  Edge  Edge  Grass Grass Grass Grass
[11] Grass Grass Wood  Wood  Wood  Wood  Wood  Wood
Levels: Edge Grass Wood

Group data according to grouping variable

> comm_grp(gb.biol, groups = Habitat, data = gb.site)
      Aba.par Acu.dub Ago.afr Ago.ful Ago.mue Ago.vid Ama.aen
Edge     2146       0       0       0       3       0       0
Grass     709       3      20       3       0       2       2
Wood     2013       0       0       0       0       0       0

D

Bootstrap confidence intervals for Simpson’s index.

Dapp

This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.

Usage

Dapp(x)

Arguments

`x`	a `vector` of values; a single sample from a community dataset. The intention is to use this as the `FUN` argument of `apply` (or similar) without a separate argument.

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Simpson’s diversity index from multiple samples.

Value

97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

Examples

Use on single sample

> DeVries[1,1:6]
Hisache Panprol Neshewi Morachi Taysp.1 Coldirc
   1882    1028      19       5       8     273

> DV.ci <- Dapp(DeVries[1,])
> DV.ci
     2.5%     97.5%
0.8418572 0.8553752

Use on rows of a dataset with multiple samples

> apply(DeVries, MARGIN = 1, FUN = Dapp)
          canopy     under
2.5%  0.8417613 0.9169560
97.5% 0.8557949 0.9224317

Chi-squared association for community analysis

dist_chi

Calculates pairwise chi squared statistics for a community dataset. The result has a class "distchi", which has plot and summary methods. The results include Pearson residuals as a dissimilarity matrix. The plot method produces an hierarchical cluster object, and plots the dendrogram.

Usage

dist_chi(data)

summary(x)

plot(x, method = "complete", ...)

Arguments

`data`	a community dataset (usually a `matrix` or `data.frame`) with columns as samples.
`x`	the result of a `distchi()` command.
`method`	the cluster joining algorithm to be used by `hclust()`, use one of `"ward"`, `"single"`, `"complete"` (the default), `"average"`, `"mcquitty"`, `"median"`, or `"centroid"`.
`...`	other parameters to pass to the plotting command.

Details

Chi squared association can be used to analyse species co-occurrence across many samples. Species that tend to associate together will be more likeley to be from the same community. Species that tend to be negatively associated will be form different communities. So, this approach can be used to identify groups of species and communities using data from many samples. This command carries out the association tests and used the Pearson residuals as the basis for a dissimilarity matrix (all residuals are rescaled to make the smallest zero). This is used by the plot method to produce a dendrogram showing the species groupings.

Value

An object of class "distchi", which has various components:

`chi`	The individual chi square values for the pairwise species co-occurrences.
`expected`	The expected values for the pairwise species co-occurrences.
`residuals`	The Pearson residuals for the pairwise species co-occurrences.
`distance`	The dissimilarity matrix based on the Pearson residuals.
`p.val`	The p-value for the overall chi squared association.
`data`	The name of the data used in the analysis.
`spp`	The number of species in the dataset.

Examples

Use a dataset with rows as species and columns as samples

> hsere.cs <- dist_chi(hsereT)
> summary(hsere.cs)

Available components:
[1] "chi"       "expected"  "residuals" "distance"  "p.val"
[6] "data"      "spp"

Total.Chi df         p.val
hsereT  632.9612 22 1.027601e-119

A dendrogram of the result

> plot(hsere.cs, method = "complete")

Diversity index comparison by permutation.

diversity_comp

Computes significance of diversity index between two samples by permutation method (oecosimu). You can use any of the diversity indices used by the diversity() command in vegan.

Usage

diversity_comp(comm, index = "shannon", perm = 1999)

Arguments

`comm`	a community dataset of exactly two samples.
`index`	a diversity index; options are `"shannon"` (default), `"simpson"` and `"invsimpson"`.
`perm`	the number of permutations, default `perm = 1999`.

Details

This command uses the vegan::oecosimu() command and is essentially a wrapper for it. It uses a permutation process to repeatedly sample two communities. The difference in the diversity index is then assessed from the permutation results.

Value

The result is an object that holds two classes: "oecosimu" and "list". There are two components: statistic, the calculated difference between the community indices and oecosimu, the results from the simulation. The latter holds the main results, which can be summarised using the print routine (i.e. simply typing the name of the result). Results include z-statistic, 95% CI and p-value (from simulation).

Examples

A data.frame with only two rows

> diversity_comp(DeVries, index = "simpson", perm = 2000)

oecosimu with 2000 simulations
simulation method r2dtable
alternative hypothesis: true mean is not equal to the statistic
           statistic          z       2.5%        50%  97.5%
statistic 7.0746e-02 5.5458e+01 6.6880e-05 1.3523e-03 0.0045
           Pr(sim.)
statistic 0.0004998 ***

Use a data.frame with many rows and select two for comparison

> diversity_comp(gb.biol[c(1,7), ])

oecosimu with 1999 simulations
simulation method r2dtable
alternative hypothesis: true mean is not equal to the statistic
          statistic        z      2.5%       50% 97.5% Pr(sim.)
statistic  0.668067 10.135002  0.003174  0.063465 0.218    5e-04

E

Estimated species accumulation.

eacc

This command is intended to be used with lapply() so that you can compute estimated species accumulation for multiple sample groups.

Usage

eacc(x, data)

Arguments

`x`	An index value from a community dataset.
`data`	A community dataset, usually a `data.frame` or `matrix`.

Details

This function is intended to be used in lapply() rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species accumulation groupwise, using the index you specify and the vegan::estaccumR() command.

Value

The result is a "list" with a component for each grouping variable specified. Each component contains the result of the estaccumR() command, that is estimated species accumulation using various methods.

Examples

## Make an index from rows of community dataset
> index <- list(Edge = 1:6, Grass = 7:12, Wood = 13:18)

> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1]  7  8  9 10 11 12
$Wood
[1] 13 14 15 16 17 18

Use index value to extract species accumulation by group

> lapply(index, FUN = eacc, data = gb.biol)

$Edge
N     S     Chao      ACE
1 18.19 20.49214 21.44664
2 21.47 23.23650 23.67794
3 23.80 25.20936 26.06676
4 25.30 26.48750 27.45683
5 26.21 27.51250 28.03773
6 27.00 28.00000 28.75118

$Grass
N     S     Chao      ACE
1 24.04 31.43000 32.11511
2 29.51 33.98068 36.43205
3 32.47 34.64050 37.42315
4 34.38 35.53885 37.24407
5 35.71 36.55452 37.74246
6 37.00 38.00000 38.72006

$Wood
N     S   Chao      ACE
1 11.45 12.320 14.03674
2 12.18 12.700 13.25899
3 12.59 12.915 13.62097
4 12.89 13.170 13.79830
5 13.00 13.000 13.39019
6 13.00 13.000 13.00000

Plot results from Rényi or Tsallis calculations.

entropy_plot

This command allows you to visualize the results from the renyi() or tsallis() commands (in vegan). Usually you’ll use it as an alternative to the usual lattice plotting command to overlay multiple results in a single plot window.

Usage

entropy_plot(H, type = "l",
             ylab = "Diversity",
             xlab = "Scale",
             legend = TRUE,
             col = palette(),
             lty = 1:6, cex = 1,
             pch = NULL,
             cex.legend = 1, ...)

Arguments

`H`	the result from a `renyi()` or `tsallis()` command from the vegan package.
`type`	the type of plot, default `"l"` produces lines. Alternatives are `"p"`, points, `"b"`, both, `"o"` overplotted, `"n"` none.
`ylab` `xlab`	axis labels for y and x-axes
`legend`	Logical. The default, `TRUE`, displays a legend.
`col`	the default colours to use on the plot, the default uses the current colour palette. Colours are recycled as required.
`lty`	the line style(s) used, the default uses types 1:6 and these are recycled as needed.
`cex`	character expansion for plotted points, default = `1`.
`pch`	plotting symbols, default is `NULL`.
`cex.legend`	character expansion for legend labels, default = `1`.
`...`	other graphical parameters to pass to `matplot()`.

Details

This command provides an alternative way to visualise results from Rényi or Tsallis entropy calculations. The command takes the result of a tsallis() or renyi() command (from vegan) and produces a line plot for each sample – showing the entropy at the various scales.

The tsallis() and renyi() commands usually show multiple samples using the lattice package, with separate panel for each sample.

The entropy_plot() command uses matplot() to overlay the various samples in one plot. You can choose to add a legend (defaults to the "topright" position). If the input data is not an entropy result the command stops, with an error. You could mimic the result of a tsallis() or renyi() command, the result must hold a class "renyi".

Value

Creates a new graphic from the result of a tsallis() or renyi() command (from vegan). Each sample is shown as a separate line (without markers by default) showing the entropy at the various scales.

Examples

## Community dataset with two samples
> rownames(DeVries)
[1] "canopy" "under"

## Make a Tsallis entropy result
> DV.te <- tsallis(DeVries)

## Plot result using lines with points
> entropy_plot(DV.te, pch = 1:2, type = "b")

F

Ecological diversity: Fisher’s alpha

fisher_alpha

Calculates Fisher’s alpha, a measure of ecological diversity.

Usage

fisher_alpha(x, MARGIN = 1, se = FALSE, ...)

Arguments

`x`	Community data, a `matrix`-like object or a `vector`.
MARGIN	Margin for which the index is computed. Default `1` = rows.
`se`	With option `se = TRUE`, function `fisher_alpha()` returns a `data.frame` with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by `nlm` on the success of estimation. See Details.
`...`	Other parameters to pass to `nlm`.

Details

The fisher_alpha() command estimates the α parameter of Fisher’s logarithmic series (see fisher_fit() command). The estimation is possible only for genuine counts of individuals.

This command restores the behaviour of the old command fisher.alpha() in vegan. This is what the vegan team had to say:

“fisherfit: completely rewritten and estimates of standard error removed: I could not find no justification for these. Actually, it seems that the value of Fisher alpha as estimated in the function was independent of the abundance distribution of species, but will be defined by the number of species (S) and number of individuals (N).”

“Now the Fisher alpha is estimated from the relationship S = alpha*(1 + log(N/alpha)) using function uniroot(). Because of this, standard errors cannot be estimated and they were removed. In addition, functions confint.fisherfit, profile.fisherfit and plot.profile.fisherfit were removed. The estimation of standard errors was also removed in function fisher.alpha (that only calls fisherfit).”

So, use with some caution – I replaced the function solely to allow you to get the same results as the exercise in the book.

Value

With option se = TRUE, function fisher_alpha() returns a data.frame with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by nlm on the success of estimation.

Examples

This version computes se

> fisher_alpha(DeVries, se = TRUE)
           alpha       se df.residual code
canopy  8.601215 1.249295          55    1
under  10.212885 1.379707          64    1

Newer version in vegan does not compute se even though help suggests it will

> fisher.alpha(DeVries, se = TRUE)
 canopy    under
8.60122 10.21290

Fit Fisher’s logseries to abundance data

fisher_fit

confint.fisherfit

plot.profile.fisherfit

profile.fisherfit

Function fisher_fit() fits Fisher’s logseries to abundance data.

Usage

fisher_fit(x, ...)

Arguments

`x`	Community data, a `matrix`-like object or a `vector`.
`...`	Other parameters to pass to `nlm`.

Details

Fits Fisher’s logseries to ecological abundance data using non-linear modelling. There is a plot method. In addition you can use confint() (from MASS) to determine approximate confidence intervals. There are also profile and plot.profile methods.

Newer versions of vegan have removed the calculations for se (see Details in fisher_alpha()) so treat these with caution. I only restored the commands to allow you to get the same results as the exercises in the book.

Value

Returns an object of class "fisherfit", which is the result of nlm().

Examples

Fit Fisher logseries to sample of data

> ff <- fisher_fit(DeVries[1,])
> ff
Fisher log series model
No. of species: 56
Fisher alpha:   8.601215

Plot the result (not shown)

> plot(ff)

Get approx CI (uses MASS)

> confint(ff)
   2.5 %    97.5 %
6.396521 11.313053

Make a profile result

> profile(ff)
$alpha
          tau  par.vals
1  -3.3649073  5.094402
2  -2.5788594  5.795765
3  -1.8602976  6.497127
4  -1.1967304  7.198490
5  -0.5789605  7.899852
6   0.0000000  8.601215
7   0.5455796  9.302577
8   1.0621142 10.003940
9   1.5531015 10.705303
10  2.0214191 11.406665
11  2.4694633 12.108028
12  2.8992526 12.809390

attr(,"original.fit")
attr(,"original.fit")$coefficients
   alpha
8.601215

attr(,"original.fit")$std.err
[1] 1.249295

attr(,"class")
[1] "profile.fisherfit" "profile.glm"       "profile"

Plot the profile (not shown)

> plot(profile(ff))

H

Bootstrapping diversity indices.

H_boot

Carries out a bootstrap resampling using the boot package. The results compute the confidence intervals. You can specify any diversity index used by the diversity() command in vegan for a single sample but the vegan package is not required.

Usage

H_boot(x, index = "shannon", R = 1000, ci = 95)

Arguments

`x`	a `vector` of community data.
`index`	a diversity index; use one of `"shannon"` (default), `"simpson"` or `"invsimpson"`.
`R`	the number of bootstrap resamples to use, defaults to 1000.
`ci`	the confidence interval to return, the default 95, returns 2.5% and 97.5% quantiles.

Details

A sample of community data can be randomised by repeatedly resampling. The H_boot() command uses the boot() command in the boot package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.

Value

Produces a "list" with several components: boot.stats gives the results from the boot() command, CI shows the confidence intervals (based on quantiles from the bootstrap resampling), and index gives the calculated diversity index. The command displays the index and CI directly.

Examples

Use defaults for one sample from a dataset

> H_boot(DeVries[1,])
H =  2.640651
   2.5%    97.5%
2.596435 2.675158

Use different index, CI and number of resamples

> H_boot(DeVries[2,], index = "simpson", ci = 99, R = 2000)
H =  0.9198993
     0.5%     99.5%
0.9158286 0.9233032

Bootstrap significance between two diversity indices.

H_bss

This command carries out a bootstrap analysis of the diversity indices of two samples. The result gives a p-value for the permutation as well as a calculated z-score. You can use any of the diversity indices used by the diversity() command in vegan. The result holds a class, "Hbss", which has print and summary methods.

Usage

H_bss(comm, R = 2000, index = "shannon")

print(x, ...)

summary(x)

Arguments

`comm`	a community dataset containing exactly two samples.
`x`	the result of an H_bss() command.
`R`	the number of bootstrap resamples to use, defaults to 2000.
`index`	a diversity index; use one of `"shannon"` (default), `"simpson"` or `"invsimpson"`.
`...`	other parameters to pass to `print`, such as `digits`.

Details

The statistical significance of the difference in diversity index between two samples is estimated using a randomization method. The two samples are repeatedly randomized and the difference in the diversity index recalculated. The significance is assessed by comparing the difference between the resamples and the original difference. Confidence intervals are estimated using the quantiles of the randomization.

A z-score is also computed from the randomizations and the p-value estimated from that.

The command produces a result holding a class "Hbss", which has print and summary routines.

Value

A list object with custom class "Hbss", which has print and summary methods. The result object contains several components:

`H.orig`	The calculated diversity index for the two original samples.
`statistic`	The difference in the diversity index between the two original samples.
`z.score`	The z-score calculated using the original difference and the difference between the bootstrap resamples as well as the standard deviation of the bootstrap samples.
`CI`	The 95% confidence interval for the difference between the indices of the two samples.
`p.sim`	The p-value resulting from the bootstrap resampling.
`p.z`	The p-value estimated using the z-score.
`df`	The degrees of freedom (the number of species -1 for each sample).
`Index`	The name of the diversity index used in the calculations.

Examples

Use 2 samples from a larger dataset

> gb.bs <- H_bss(gb.biol[c(7,13), ])

Print result

> print(gb.bs, digits = 3)
        H.orig.1 H.orig.2 Statistic Z.score    2.5% 97.5% P(sim)
shannon     1.94     1.40     0.538    9.57 0.00335 0.184      0
P(z) df
shannon 1.13e-11 38

Summary method

> summary(gb.bs)
Original shannon Index:  1.935163 1.397251
Difference in index:     0.5379125
z-score:                 9.573475
Confidence interval:     0.003348132 0.1837998
Degrees of Freedom:      38
Simulated P-value:       0
P-value (z):             1.130406e-11

Bootstrap confidence intervals for a single sample.

H_ci

This command carries out a bootstrap resampling to estimate confidence intervals for a single sample. It is intended to be used with the apply() command to provide bootstrap confidence intervals for multiple samples.

Usage

H_ci(x, index = "shannon", R = 1000, ci = 95)

Arguments

`x`	a `vector` of community data.
`index`	a diversity index; use one of `"shannon"` (default), `"simpson"` or `"invsimpson"`.
`R`	the number of bootstrap resamples to use, defaults to 1000.
`ci`	the confidence interval to use, defaults to 95 (i.e. 2.5% and 97.5% quantiles).

Details

A sample of community data can be randomised by repeatedly resampling. The H_ci() command uses the boot() command in the boot package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.

This command is intended to be used as the function (FUN argument) in apply(), allowing you to calculate confidence intervals for multiple samples. The results can be used for example, to make error bars in plots of diversity indices for multiple samples.

Value

The command returns (invisibly) the confidence intervals as a simple vector with two values, one for the lower CI and one for the upper CI.

Examples

Use a dataset with two rows (samples)

> apply(DeVries, MARGIN = 1, FUN = H_ci, index = "simpson")
         canopy     under
2.5%  0.8418946 0.9170899
97.5% 0.8554198 0.9224856

Confidence interval for Shannon or Simpson’s t-test.

H_CI

This command computes the confidence interval for the t-test used with Shannon or Simpson’s indices.

Usage

H_CI(x, index = "shannon", ci = 95)

Arguments

`x`	a `vector` of community data.
`index`	a diversity index; use one of `"shannon"` (default), or `"simpson"`.
`ci`	the confidence interval to use the default is 95 and works out 95% CI.

Details

A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson’s index between two samples. The H_CI() command computes the confidence interval for either index using the t-test formula as a starting point. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss() command).

The command can be used with apply() to get the confidence intervals for multiple samples.

Value

Returns a simple numeric value representing the confidence interval at the chosen level (default 95%).

Examples

Get CI for Simpson’s index for single sample

> H_CI(DeVries[1,], index = "simpson")
[1] 0.007075684

Get CI for Shannon for rows of dataset

> apply(DeVries, MARGIN = 1, FUN = H_CI)
    canopy      under
0.07670451 0.09524768

Special t-test for Shannon or Simpson’s index.

H_sig

This command carries out a version of the t-test for either Shannon or Simpson’s indices. The result object has a class "Hsig", which has plot and summary routines.

Usage

H_sig(x, y, index = "shannon", alpha = 0.05)

summary(x)

plot(x, ylab = "Index Value", xlab = "Samples", ...)

Arguments

`x`, `y`	Vectors of community data for comparison; these can be of different length. For `plot` and `summary` methods, `x` is the result of an `H_sig()` command.
`index`	a diversity index; use one of `"shannon"` (default), or `"simpson"`.
`alpha`	the significance level, the default `0.05` computes critical values for p = 0.05.
`xlab`, `ylab`	labels for the x and y-axes.
`...`	additional graphical parameters to pass to `plot()`.

Details

A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson’s index between two samples. The H_sig() command computes the t-test for two samples. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss() command).

The summary routine provides an overview of the test statistics and the plot routine creates a point plot with error bars based on confidence intervals.

Value

The result is a list with a custom class "Hsig", which has summary and plot methods. There are various components:

`p.val`	The p-value for the calculated t-test.
`df`	The degrees of freedom.
`t.statistic`	The calculated value of t.
`var`	The variance for each sample.
`critical`	The critical value for the degrees of freedom and the chosen alpha level.
`H`	The value of the diversity index for each sample.
`index`	The name of the diversity index used.
`alpha`	The chosen level of significance.
`sites`	The site names.
`richness`	The number of species in each sample.
`individuals`	The number of individuals in each sample.

The plot method produces a point plot with error bars showing confidence intervals (using the alpha level from the original t-test).

Examples

Determine significance of t-test between Shannon index of 2 samples

DVh <- H_sig(DeVries[1,], DeVries[2,])
[1] 0.0002608309

Get a summary of the results

> summary(DVh)
Site: DeVries[1, ]
Index: shannon = 2.640651
Variance = 0.0003865389
df = 2.231875
No. Species = 56
Individuals = 5774
Site: DeVries[2, ]
Index: shannon = 2.982633
Variance = 0.0002548043
df =  1.508951
No. Species = 65
Individuals = 5922
t-statistic = 13.50384
df = 3.740253
critical value at 0.05 = 2.854166
Significance = 0.0002608309

Plot the result

> plot(DVh, pch = 16, cex = 2, col = "darkgreen", lty = 1)

Bootstrap confidence intervals for the Shannon index.

Happ

This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.

Usage

Happ(x)

Arguments

`x`	a vector of values; a single sample from a community dataset. The intention is to use this as the `FUN` argument of `apply` (or similar) without a separate argument.

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Simpson’s diversity index from multiple samples.

Value

When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

Examples

Use as FUN in apply command

> apply(DeVries, MARGIN = 1, FUN = Happ)
        canopy    under
2.5%  2.595170 2.943928
97.5% 2.674168 3.009593

Bootstrap confidence intervals for the Inverse Simpson index.

Iapp

This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.

Usage

Iapp(x)

Arguments

`x`	a `vector` of values; a single sample from a community dataset. The intention is to use this as the `FUN` argument of `apply` (or similar) without a separate argument.

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Inverse Simpson’s diversity index from multiple samples.

Value

When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

Examples

Use as FUN in apply command

> apply(DeVries, MARGIN = 1, FUN = Iapp)
        canopy    under
2.5%  6.333271 12.05552
97.5% 6.950071 12.90245

P

Extrapolated species richness in a species pool.

pacc

This command is intended to be used with lapply() for use with a community dataset and a grouping variable. It calls the poolaccum() command from vegan and returns various estimators of species richness.

Usage

pacc(x, data)

Arguments

`x`	a `vector` of index values.
`data`	the `data.frame` holding the community data.

Details

This function is intended to be used in lapply() rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species richness groupwise, using the index you specify and the vegan::poolaccum() command.

Value

The result is a "list" with a component for each grouping variable specified. Each component contains the result of the poolaccum() command, that is estimated species accumulation using various methods.

Examples

Make an index from rows of community dataset

index <- list(Edge = 1:6, Grass = 7:12, Wood = 13:18)
> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1]  7  8  9 10 11 12
$Wood
[1] 13 14 15 16 17 18

Use index to get extrapolated species richness for groups

> lapply(index, FUN = pacc, data = gb.biol)
$Edge
N     S     Chao Jackknife 1 Jackknife 2 Bootstrap
3 23.29 34.94850    27.38333    28.90000  25.22704
4 24.94 33.60742    29.44750    31.33917  27.05879
5 26.13 34.32750    30.99400    33.28750  28.38051
6 27.00 33.00000    32.00000    34.40000  29.30678
$Grass
N     S     Chao Jackknife 1 Jackknife 2 Bootstrap
3 32.21 38.03772    37.81000    39.45167  34.95630
4 34.10 38.48675    39.38750    40.78250  36.74922
5 35.79 43.29300    41.72600    44.46350  38.58681
6 37.00 53.00000    43.66667    47.93333  39.94114
$Wood
N     S   Chao Jackknife 1 Jackknife 2 Bootstrap
3 12.62 13.325    13.47333    13.66500  13.05148
4 12.91 13.160    13.76500    14.07833  13.32430
5 13.00 13.325    13.52000    13.30250  13.31851
6 13.00 13.000    13.00000    11.93333  13.17563

Point plot of diversity index and bootstrap confidence intervals.

plot_H

This command carries out a bootstrap resampling of a community dataset and plots the results as a point plot that includes error bars showing the 95% confidence intervals. You can use any diversity index used by the diversity() command in vegan.

Usage

plot_H(comm,
       index = "shannon",
       xlab = "Site Names",
       ylab = "Diversity Index",
       pch = 18,
       cex = 1.2,
       cex.axis = 0.75,
       boot = TRUE, ...)

Arguments

`comm`	community dataset, usually a `matrix` or `data.frame`.
`index`	a diversity index; use one of `"shannon"` (default), `"simpson"` or `"invsimpson"`.
`xlab`, `ylab`	labels for x and y axes.
`pch`	plotting symbol, default = `18` (a filled diamond).
`cex`	character expansion, default = 1.2.
`cex.axis`	character expansion for axes, default = 0.75.
`boot = TRUE`	logical, use bootstrap CI (default = `TRUE`) or variance estimators (`FALSE`).
`...`	other graphical parameters to pass to `plot()`.

Details

Bootstrapping is a way to repeatedly resample a community dataset. The confidence interval of the diversity index is assessed from the quantiles of the resamples. The plot_H() command uses the diversity() command from vegan to calculate the diversity index for multiple samples in a community dataset. The H_ci() command is called internally, which computes the confidence intervals by calling boot() from the boot package. The diversity indices are then plotted (as a point plot), using the confidence intervals as error bars. This provides a visual representation of differences in diversity index between samples.

Alternatively you can specify boot = FALSE to use variance estimation for CI. This calls the H_CI() command internally.

Note: if you specify boot = FALSE and index = "invsimpson" an error is returned.

Value

The command produces a list object invisibly; this contains the diversity index, $H and confidence intervals, $CI. These are used to create a point plot with 95% error bars.

Examples

Make a plot and save results

> gbh = plot_H(gb.biol, ylab = "Shannon Entropy")

View the results

> gbh
$H
      E1       E2       E3       E4       E5       E6       G1
1.267096 1.424505 1.412179 1.737146 1.636923 1.557477 1.935163
      G2       G3       G4       G5       G6       W1       W2
2.000272 1.929426 2.360250 2.139567 2.205175 1.397251 1.582102
      W3       W4       W5       W6
1.539511 1.601073 1.616333 1.552470

$CI
            E1       E2       E3       E4       E5       E6
2.5%  1.167435 1.339836 1.320449 1.639057 1.552808 1.477362
97.5% 1.341344 1.487790 1.484144 1.802874 1.698681 1.614796
            G1       G2       G3       G4       G5       G6
2.5%  1.741574 1.805502 1.704016 2.192101 1.975294 2.028207
97.5% 2.047899 2.109588 2.046776 2.449227 2.230595 2.288512
            W1       W2       W3       W4       W5       W6
2.5%  1.337487 1.519955 1.470808 1.534223 1.560459 1.488700
97.5% 1.443202 1.635484 1.597767 1.654974 1.658593 1.601543

Polar Ordination (aka Bray-Curtis ordination)

polar.ord

This command carries out Polar Ordination, a method of ordination (also known as Bray-Curtis ordination). The command works out polar ordination site scores and works out axes as follows: Axis 1 is the combination of sites that gives the maximum separation (not necessarily the maximum variance explained). Axis 2 is the combination of sites that maximises the orthogonality to the 1st axis.

There are print, plot and summary methods and an identify method for use with plots.

Usage

polar.ord(diss)

print.polar.ord(ord, digits = getOption("digits"))

summary.polar.ord(ord, k = 2, cor = TRUE, var.exp = TRUE, digits = getOption("digits"))

plot.polar.ord(ord, x.axis = 1, y.axis = 2, type = c("t", "p", "n"), ...)

identify.plot.polar.ord(pord, labels = pord$labels, ...)

Arguments

`diss`	A dissimilarity `matrix` e.g. as produced by `dist()` or `vegdist()`.
`digits`	The number of digits to display, defaults to current settings.
`ord`	The result of a `polar.ord` command.
`k`	The number of axes to display.
`cor`	If `TRUE`, shows the correlation between ordination axes.
`var.exp`	If `TRUE`, shows the variance explained.
`x.axis`	Which ordination axis to plot on the x-axis.
`y.axis`	Which ordination axis to plot on the y-axis.
`type`	The type of plot, `"t"` text, `"p"` points or `"n"` none.
`pord`	The result of a `plot.polar.ord` command i.e. a polar ordination plot result.
`labels`	A `vector` of labels to use for the sites.
`...`	Other plotting parameters to pass to `plot` or `identify`.

Details

Polar ordination is a method of indirect gradient analysis. The starting point is a dissimilarity matrix, any method of producing a dissimilarity can be used as the starting point for the polar.ord() command. Typical commands would be dist() or vegdist() (from package: vegan). The analysis seeks to maximise the separation between sites along an axis. Subsequent axes are calculated to give the best separation but also to maximise the orthogonality, that is to minimise correlation between the axes.

The method is “simple” enough to be able to carry out using a spreadsheet but identification of subsequent axes can be a labour-intensive and iterative process. Note that in this application of Polar Ordination I have maximised separation, which was the original intention of Bray-Curtis. However, you can also go for a maximise variance explained approach as the summary method displays additional variance explained by each axis.

The print method allows you to view all the generated axes, their separation (the dissimilarity) and variance explained.

The summary method provides an overview that includes site socres, correlation and variance explained.

The plot method allows you to visualise the results from any combination of axes.

The identify method allows you to interactively label points on the plot.

Value

The polar.ord() command produces an object (essentially a list) holding the class "polar.ord", which has print, summary, plot and identify methods. The components are:

sites – the site scores for all the computed axes.
dist – the distance (separation) for each axis, in the same scale as the original dissimilarity.
cor – the correlation (Pearson) between the principal axis (the one with maximum separation) and each of the other axes.
exp – the variance explained by each axis.
add – the additional variance explained by each axis in addition to the original principal axis.
samples – a vector of site names.
axes – a vector of axis names.
diss – the original dissimilarity result (a "dist" object).
method – a character vector giving the dissimilarity metric employed.

These components are used by the print, summary, plot and identify methods.

Examples

The data (columns are sites) from CERE.RData

moss

Transpose sites/species and make a dissimilarity matrix

# Make dissimilarity matrix
d <- dist(t(moss), method = "euclidean")

Run the PO

> po <- polar.ord(d)

> po # Standard print
> print(po, digits = 4) # Use print method
> summary(po, k = 4) # Show summary of first 4 axes

Plot

> plot(po) # Basic plot
> plot(po, x.axis = 1, y.axis = 3) # Plot different axes

Species scores

require(vegan)
# Weighted average species scores
> wa <- wascores(scores(po), t(moss))

Quick plot

> plot(po)
> text(wa, labels = rownames(wa), col = "red", cex = 0.8)

ID plot

> op <- plot(po, type = "p") # Plot points only
> identify(op) # Click points to identify sites (Esc to cancel)

> points(wa, col = "red", pch = "+") # Add species points
> identify(wa, labels = rownames(wa)) # Add labels interactively

Q

Plot Tsallis entropy profile using q*.

qstar

This command uses the eventstar() command from vegan to calculate q*. The profile is then plotted as three plots showing, evenness, diversity and effective species for the Tsallis scales selected.

Usage

qstar(x,
      q = seq(0, 2, 0.05),
      qmax = 5,
      type = "l",
      lcol = "red",
      llty = 1,
      llwd = 1, ...)

Arguments

`x`	a `vector` of community data.
`q`	the scales to use for the Tsallis entropy (passed to `tsallis`).
`qmax`	the maximum q value to use (passed to `eventstar`), default = 5.
`type`	the type of plot, default = `"l"` for lines.
`lcol` `llty` `llty`	colour, type and width for the lines showing q* value and diversity values at q*.
`...`	other graphical parameters to pass to `plot()`.

Details

The function eventstar() finds a characteristic value of the scale parameter q of the Tsallis entropy corresponding to minimum of the evenness (equitability) profile based on Tsallis entropy. This is called q*. Use of q* and corresponding diversity, evenness, and Hill numbers (effective species), is recommended because it is a unique value representing the diversity profile. It is also positively associated with rare species in the community, thus it is a potentially useful indicator of certain relative abundance distributions of the communities.

The qstar() command uses` eventstar()` from vegan to calculate q* and then plots three graphs to show evenness, diversity and effective species at the chosed scales, and their relationship to q.

Value

The command returns the result from the eventstar() command, i.e. a data.frame with several columns showing q (qstar), the value of evenness at q* (Estar), the value of Tsallis entropy at q* (Hstar), and the effective species at q* (Dstar). See eventstar() for details.

Examples

Select a single sample from a community dataset

> qstar(DeVries[1,])
      qstar     Estar    Hstar    Dstar
1 0.6847494 0.1113006 5.062579 20.61567

R

Compute AIC values from multiple RAD models.

rad_aic

This command takes the result of multiple RAD model-fitting (from radfit() in vegan) and computes the “best” AIC value for each sample. The result shows the best AIC value and the corresponding model name for each sample.

Usage

rad_aic(x)

Arguments

`x`	the result of `radfit()` from `vegan`. The result must hold the class `"radfit.frame"`, i.e. have been run on a community dataset with multiple samples.

Details

Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit() command in vegan can compute various models for community datasets. The rad_aic() command takes the result of a radfit() command and, for each sample, shows the name of the model with the lowest AIC value as well as displaying the AIC.

Value

Returns a data frame with two columns giving the “best” RAD model for each sample in the original community dataset. The columns are: AIC: the AIC value, and Model: the name of the model. Each row corresponds to a sample in the original community dataset.

Examples

Compute RAD models for a community dataset

> gb.rad <- radfit(gb.biol)

Explore the radfit() result

> rad_aic(gb.rad)
         AIC      Model
E1 106.29094 Mandelbrot
E2 134.58026 Preemption
E3 108.41663 Preemption
E4 160.24125 Mandelbrot
E5 183.46446 Preemption
E6 146.97826 Preemption
G1 106.00117       Zipf
G2  83.88072 Mandelbrot
G3  68.55087 Mandelbrot
G4 104.48570 Mandelbrot
G5 100.04912 Mandelbrot
G6  98.59426 Mandelbrot
W1 154.01455 Preemption
W2  81.28201 Preemption
W3  74.92607 Preemption
W4  98.54388 Preemption
W5 104.03335 Preemption
W6  82.63346 Preemption

A significance test of differences in deviance of RAD models.

rad_test

This command takes the result of a radfit() command (from vegan) and carries out a test of significance (ANOVA) of the various RAD models, examining their deviance. The result holds a class, "rad.htest" that has print, summary and plot methods.

Usage

rad_test(x, conf.level = 0.95, log = TRUE)

print(x, ...)

summary(x)

plot(x, which = "original", las = 1, ...)

Arguments

`x`	the result of a `radfit()` command from `vegan`. For `summary` and `plot` methods `x` should be the result of the `rad_test()` command.
`conf.level`	the confidence level, the default is 0.95.
`log`	Logical. If `TRUE` (the default) the log of the model deviance is used as the response variable in ANOVA.
`which`	the type of plot, `"original"` (default) shows boxplot of model deviance, `"post.hoc"` displays Tukey post-hoc result.
`las`	the orientation of the axis annotations, the default produces labels all horizontal.
`...`	other parameters to pass to `plot()` or `print()`.

Details

Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit() command in vegan can compute various models for community datasets. The rad_test() command takes the result of a radfit() command and carries out a test of model deviance using ANOVA. By default the log of model deviance is used, which improves the normality for the ANOVA.

For a meaningful result you should use replicated samples (see Examples).

Value

The rad_test() command returns invisibly a list with a custom class "rad.htest", which has print, summary and plot methods.

The print method returns a matrix showing Mean deviance, Standard Error of deviance and Mean AIC for the RAD models.

The summary method shows the ANOVA table ($ANOVA) and a summary of the Tukey Post-Hoc test ($Tukey).

The plot method produces a boxplot that shows model deviance for each RAD model. Models are re-ordered by decreasing mean deviance. Optionally the plot routine can display the post-hoc results.

Examples

Make RAD models for six samples of dataset these are replicates (from same habitat type).

> gb.rad <- radfit(gb.biol[1:6,])

Carry out ANOVA

> gb.rt <- rad_test(gb.rad)

Print result

> print(gb.rt, digits = 3)
               Null Preemption Lognormal  Zipf Mandelbrot
Mean deviance   730       90.4    129.29 163.9      68.15
SE of deviance   66       14.7      8.43  14.0       9.48
Mean AIC        799      161.7    202.61 237.2     143.47

Summary of result shows ANOVA and Post-Hoc

> summary(gb.rt)
$ANOVA
            Df  Sum Sq Mean Sq F value    Pr(>F)
model        4 20.9193  5.2298  65.545 6.937e-13 ***
Residuals   25  1.9948  0.0798

Signif. codes:  0 '' 0.001 '' 0.01 '' 0.05 '.' 0.1 ' ' 1
$Tukey
             diff         lwr       upr        p adj
Pre-Mb  0.2700865 -0.20887308 0.7490461 4.776096e-01
Log-Mb  0.6773801  0.19842055 1.1563397 2.816974e-03
Zip-Mb  0.9053575  0.42639796 1.3843171 8.202652e-05
BS-Mb   2.3975406  1.91858106 2.8765002 8.252288e-13
Log-Pre 0.4072936 -0.07166594 0.8862532 1.233203e-01
Zip-Pre 0.6352710  0.15631147 1.1142306 5.339678e-03
BS-Pre  2.1274541  1.64849457 2.6064137 1.145184e-11
Zip-Log 0.2279774 -0.25098216 0.7069370 6.345799e-01
BS-Log  1.7201605  1.24120094 2.1991201 1.043389e-09
BS-Zip  1.4921831  1.01322353 1.9711427 1.790295e-08

$Logarithmic
[1] TRUE

$Confidence.Level
[1] 0.95

Print the results

# 2 columns in plot window
> opar <- par(mfrow = c(1,2))
# Box-plot
> plot(gb.rt, ylab = "Log of deviance", main = "RAD deviance")
# Tukey result
> plot(gb.rt, which = "post.hoc")
> abline(v = 0, lty = 3)
# reset plot window
> par(opar)

S

Species richness for a community dataset.

spec.rich

This command computes species richness for a community dataset.

Usage

spec.rich(data)

Arguments

data a community dataset as a data.frame or matrix. Rows as samples.

Details

The number of species in a community sample (the species richness) is a basic measure of diversity.

Value

A numeric vector containing the species richness for each sample of the original dataset. The vector has a names attribute to match the sample names of the original dataset.

Examples

Species richness using a 2-row matrix

> spec.rich(DeVries)
canopy  under
    56     65

Richness from a 16-row data frame

> spec.rich(gb.biol)
E1 E2 E3 E4 E5 E6 G1 G2 G3 G4 G5 G6 W1 W2 W3 W4 W5 W6
17 14 15 25 21 17 28 22 18 28 26 24 12 11 11 12 12 10

Pairwise chi squared species association.

species_assoc

This command takes two species that you specify from a community dataset and carries out a pairwise chi squared association test.

Usage

species_assoc(spA, spB, data)

Arguments

`spA` `spB`	Index values for the two species to be analysed, corresponding to their position in the `rownames` attributes, i.e. a numerical value corresponding to the row number.
`data`	the `data.frame` (or `matrix`) containing the species for analysis; rows as species.

Details

A chi squared approach can be used to explore species co-occurrence. Species that are positively associated will tend to be from the same community, whereas species that are negatively associated will be from different communities. The species_assoc() command takes a dataset and computes the chi squared statistics of species co-occurrence for two species that you specify.

Value

A list object (returned invisibly) with four components:

`spA`	The name of the first species (the name will be taken from the `rownames` attribute of the data).
`spB`	The name of the second species.
`chi.sq`	The chi squared test result – includes the observed, expected and Pearson residuals for the pair of selected species. The “standard” chi squared result is returned to screen.
`co.occur`	The co-occurrence `matrix`.

Examples

Look at the species available

> rownames(hsereT)
[1] "Carex rostrata"           "Fillipendula ulmaria"
[3] "Caltha palustris"         "Bryophyte"
[5] "Galium palustre"          "Valeriana officianalis"
[7] "Ranunculus repens"        "Cochlearia spp"
[9] "Epilobium palustre"       "Myosotis scorpioides"
[11] "Mentha rostrat"          "Rumex acetosa"
[13] "Succisa pratensis"        "Molinea caerulea"
[15] "Juncus rostrat"           "Eriophorum vaginatum"
[17] "Eriophorum angustifloium" "Sphagnum spp."
[19] "Calluna vulgaris"         "Vaccinium myrtillus"
[21] "Polytrichum spp."         "Vaccinium vitis-idea"

Carry out pairwise analysis for species in rows 1,2

> hsere.co <- species_assoc(1, 2, data = hsereT)

Pearson's Chi-squared test with Yates' continuity
correction
data:  mat
X-squared = 15.1532, df = 1, p-value = 9.913e-05
Warning message:
In chisq.test(mat) : Chi-squared approximation may be incorrect

View the co-occurrence matrix

hsere.co$co.occur
              Fillipendula ulmaria
Carex rostrata  +  -
             + 11 11
             -  0 28

Look at the Pearson residuals

> hsere.co$chi.sq$residuals
              Fillipendula ulmaria
Carex rostrata         +         -
             +  2.800000 -1.487038
             - -2.481935  1.318118

Indicator species analysis by chi squared.

species_ind

This command takes a community dataset and carries out a chi squared test for indicator species. The species can be arranged as the rows or columns. The samples are used as the groupings for the analysis, so you may need to group the dataset using rowsum() before analysis. The result shows only those species that have a significant association (positive or negative) for all samples/groups.

Usage

species_ind(x, rows = c("species", "samples"))

Arguments

`x`	A community dataset, usually a `data.frame` or `matrix`.
`rows`	Should the rows be treated as “species” (the default) or “samples”?

Details

An indicator species is one that is positively associated with a particular site or habitat and negatively associated with other sites/habitats. Ideally an indicator species will also be found only rarely in other sites/habitats. The chi squared approach is one way to explore species associations with sites/habitats.

Value

A list with two components (the indicators component is returned directly and the entire list returned invisibly):

`chi.test`	The results of the chi squared test, which includes the observed and expected values as well as the Pearson residuals.
`indicators`	A `matrix` with the names of the indicator species and the grouping; the data are the Pearson residuals. Only species with all Pearson residuals >2 are shown (i.e. only significant ones).

Examples

Make a dataset by grouping community samples (by habitat)

> gb <- rowsum(gb.biol, group = gb.site$Habitat)

Indicator species based on habitat

> gb.ind <- species_ind(gb, rows = "samples")
        Aba.par   Ago.afr   Bem.big   Bem.man   Cal.fus   Cli.fos
Edge   4.092740 -2.841032 -3.192145 -4.170376 -2.841032 -2.979700
Grass -2.034203  9.506136 11.182116 18.561500  9.506136  9.970119
Wood  -2.706151 -2.963855 -3.629967 -7.107075 -2.963855 -3.108518
        Lei.ruf   Neb.bre   Poe.cup
Edge  -7.245067 -3.209243 -2.682207
Grass -5.450846 -6.097483 17.636905
Wood  10.205860  6.724140 -7.980425

Plot Pearson residuals for Edge habitat

> barplot(gb.ind$ind[1,])
> title(main = "Habitat: Edge", ylab = "Pearson Residual")
> abline(h = 0)
> abline(h = c(2, -2), lty = 3)

Species richness estimation with accumulating sites.

sr_est

This command uses accumulating species richness and a log model to determine the estimated species richness for a complete dataset. You can use a community dataset or the result of the specaccum() command from vegan. The result contains the estimated species richness and the log model. The result holds a class "srest", which has print, plot and summary methods.

Usage

sr_est(accum)

print(srest, ...)

summary(srest, digits = getOption("digits"))

plot(srest, col.points = "black", col.line = "black",

lwd.line = 1, lwd.points = 1, ...)

Arguments

`accum`	A `vector` of values corresponding to accumulating species richness or, a community dataset (rows as samples) or, the result of a `specaccum()` command from `vegan`.
`srest`	The result of an `sr_est()` command.
`digits`	The number of significant figures to display.
`col.points` `lwd.points`	The colour and width of the points used in the base plot.
`col.line` `lwd.line`	The colour and width used for the best-fit line (produced via `abline`).
`...`	Additional parameters to pass to `plot()` or `print()`.

Details

Species accumulation curves (SAC) are used to compare diversity properties of community data sets using different accumulator functions. The sr_est() command uses accumulating species richness as a starting point – the data can be a vector of accumulating values or the result of a vegan::specaccum() command. Alternatively you can use a community dataset as a data.frame or matrix. A log-linear model is then computed to show the estimated species richness from the accumulated sites.

Value

A list with custom class "srest", which has print, summary and plot methods. The components are:

`data.model`	The log model data; a data frame with cumulative species, number of new species and log of cumulative species corresponding to the samples in the original dataset.
`coefficients`	The intercept and slope from the log model.
`species.estimate`	The estimated species richness based on the log model.

Examples

Load vegan and dune data

> require(vegan)
> data(dune)

Estimated richness

> specpool(dune)
    Species  chao  chao.se jack1 jack1.se    jack2     boot
All      30 32.25 3.395769 32.85 1.645448 33.84474 31.54040
     boot.se  n
All 1.153121 20

Use log model for estimate

> dune.sr <- sr_est(dune)
> dune.sr
Estimated Species Richness
                  29.42548

Visualise the model

> plot(dune.sr, lty = 2)

Use species accumulation curve for first 6 rows (all same habitat).

> gb.sa <- specaccum(gb.biol[1:6,])

Summary of log model

> summary(sr_est(gb.sa))
Model Data:
   cum.spp   new.spp   log.cs
1 18.16667 18.166667 1.259275
2 21.26667  3.100000 1.327699
3 23.30000  2.033333 1.367356
4 24.80000  1.500000 1.394452
5 26.00000  1.200000 1.414973
6 27.00000  1.000000 1.431364
Model coefficients:
Intercept     Slope
 131.0205  -92.6311
Estimated Species Richness
                  25.96767

My Publications

I have written several books on ecology and data analysis

An Introduction to R

Data Analysis and Visualisation

£35.00

Buy Now

Beginning R: The Statistical

Programming Language

£26.99

Buy now

Statistics for Ecologists

Using R and Excel

£34.99

Buy now

The Essential R

Reference

£44.99

Buy now

Community

Ecology

£39.99

Buy now

Managing Data

Using Excel

£24.99

Buy now

Register your interest for our Training Courses

We run training courses in data management, visualisation and analysis using Excel and R: The Statistical Programming Environment. Courses will be held at one of our training centres in London. Alternatively we can come to you and provide the training at your workplace. Training Courses are also available via an online platform.

Get In Touch Now

for any information regarding our training courses, publications or help with a data project

Custom R Commands

Custom Commands Index

A

Usage

Arguments

Details

Value

See Also

Examples

C

Usage

Arguments

Details

Value

See Also

Examples

D

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples

E

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples

F

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples

H

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples

Usage

Arguments

Details

Value

See Also

Examples