Dr. Mark Gardener

Home
About

Home > Publications > Community Ecology: Custom R commands

Community Ecology

Analytical Methods Using R and Excel

by: Mark Gardener

Available now from Pelagic Publishing.

On this page find a summary of the custom R commands developed during production of the book. These are mentioned throughout the text and are available as part of the associated data file (see: Support Files). Some of the commands are already illustrated on my Writer's Bloc page. See also the outline and Table of Contents.

Each entry is laid out in a similar manner to standard R-help entries. I've included links to other R commands as appropriate. Most of the examples link to data used in the book (see: Support Files).

See the News page for details about updated files, new custom commands and so on.


 

Top

Custom R Commands

A | C | D | E | F | H | I | P | Q | R | S


 

-- A --

aic
aic

Model-building: add single terms to a model.

  This command is a wrapper for add1(). It computes all the single terms that can be added to the model and returns AIC values.
 

Usage

aic(model, scope, decreasing = FALSE, n = "all", ...)

 

Arguments

  model a fitted model object.
  scope a formula giving the terms to be considered for adding.
  decreasing logical. The order to display the results decreasing = FALSE (default) shows results in order of increasing AIC.
  n how many variables to display (default: n = "all"). Use integer value to return n items.
  ... additional parameters to pass to add1().
 

Details

This command is a wrapper for add1(). It computes all the single terms in the scope argument that can be added to the model, fit those models and compute a table of the changes in fit. The add1() command produces results with variables listed in alphabetical order, whilst aic() allows results to be ordered by AIC value. Additionally you can specify the number of variables to display.

The result has class "anova"

Value

An object of class "anova" summarizing the differences in fit between the models.

See add1() in stats package

See Also

The add1() command in the stats package.

 

Examples

Use aic() instead of add1() when there are many variables to consider. This makes the output more targetted and easier to manage.

Top

> library(vegan) # Load vegan package
> data(varespec) # Biological dataset
> data(varechem) # Environmental variables
> H = diversity(varespec, index = "shannon") # Shannon index for samples
> vlm = lm(H ~ 1, data = varechem) # A blank model
> aic(vlm, scope = varechem, n = 3, test = "F") # display first 3 lowest AIC values

Single term additions
Model:
H ~ 1
         Df Sum of Sq    RSS     AIC F value    Pr(>F) 
Baresoil  1   1.95786 1.0884 -70.240 39.5746 2.486e-06 ***
Humdepth  1   0.75534 2.2909 -52.378  7.2536   0.01328 * 
Fe        1   0.48645 2.5598 -49.715  4.1808   0.05303 . 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Compare to add1() result (not shown)
add1(vlm, scope = varechem, n = 3, test = "F")
 
 

-- C --

comm_grp
comm_grp

Group community data using grouping variable.

  This command can be used instead of rowsum() and essentially does the same thing.
 

Usage

comm_grp(comm, groups, data)

 

Arguments

  comm a community dataset as a data.frame, matrix or table.
  groups a grouping variable.
  data an optional data.frame where the grouping variable can be found.
 

Details

The command uses a grouping variable to combine samples in a community dataset.

The result is a matrix.

Value

Returns a matrix with rows as samples grouped by the variable in group. The columns are the species as per the original community dataset.

See rowsum() in the base package.

See Also

The rowsum() command in base, which does exactly the same thing.

 

Examples

Generally the rowsum() command will be adequate; I wrote comm_grp() as an exercise to help me work out how to do the grouping.

Top

## Samples in a community dataset
> rownames(gb.biol)
[1] "E1" "E2" "E3" "E4" "E5" "E6" "G1" "G2" "G3" "G4" "G5" "G6"
[13] "W1" "W2" "W3" "W4" "W5" "W6"
## Some of the original community data > gb.biol[1:5, 1:5]
Aba.par Acu.dub Ago.afr Ago.ful Ago.mue
E1 388 0 0 0 0
E2 325 0 0 0 0
E3 295 0 0 0 0
E4 350 0 0 0 2
E5 407 0 0 0 1
## A grouping variable > gb.site$Habitat
[1] Edge Edge Edge Edge Edge Edge Grass Grass Grass Grass
[11] Grass Grass Wood Wood Wood Wood Wood Wood
Levels: Edge Grass Wood
## Group data according to grouping variable > comm_grp(gb.biol, groups = Habitat, data = gb.site)
Aba.par Acu.dub Ago.afr Ago.ful Ago.mue Ago.vid Ama.aen
Edge 2146 0 0 0 3 0 0
Grass 709 3 20 3 0 2 2
Wood 2013 0 0 0 0 0 0
 
Top

-- D --

Dapp
Dapp

Bootstrap confidence intervals for Simpson's index.

  This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Requires package: boot

Usage

Dapp(x)

 

Arguments

  x a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument.
Computes bootstrap CI for Simpson's index.

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Simpson's diversity index from multiple samples.

2.5% and 97.5% bootstrap quantiles used as 95% Confidence Interval.

Value

When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

 

See Also

See apply() for applying a function to rows (or columns) of a data.frame or matrix. See H_boot() and H_ci() for computing bootstrap confidence intervals for a single sample using Simpson's, Shannon or Inverse Simpson's indices. See Happ() for confidence intervals using Shannon entropy and Iapp() for Simpson's Inverse index. Bootstrapping is carried out using the boot() command.

 

Examples

Use Dapp() as a FUN in the apply() command to get bootstrap CI for multiple samples.

Top

## Use on single sample
> DeVries[1,1:6]
Hisache Panprol Neshewi Morachi Taysp.1 Coldirc
1882 1028 19 5 8 273
> DV.ci <- Dapp(DeVries[1,])
> DV.ci
2.5% 97.5%
0.8418572 0.8553752
## Use on rows of a dataset with multiple samples > apply(DeVries, MARGIN = 1, FUN = Dapp)
canopy under
2.5% 0.8417613 0.9169560
97.5% 0.8557949 0.9224317
 
dist_chi
dist_chi

Chi squared association for community analysis.

  Calculates pairwise chi squared statistics for a community dataset. The result has a class "distchi", which has plot and summary methods. The results include Pearson residuals as a dissimilarity matrix. The plot method produces an hierarchical cluster object, and plots the dendrogram.
Class "distchi" with summary and plot methods.

Usage

dist_chi(data)
summary(x)
plot(x, method = "complete", ...)

 

Arguments

  data a community dataset (usually a matrix or data.frame) with columns as samples.
  x the result of a distchi() command.
  method the cluster joining algorithm to be used by hclust(), use one of "ward", "single", "complete" (the default), "average", "mcquitty", "median", or "centroid".
  ... other parameters to pass to the plotting command.
Requires package: vegan

Details

Chi squared association can be used to analyse species co-occurrence across many samples. Species that tend to associate together will be more likeley to be from the same community. Species that tend to be negatively associated will be form different communities. So, this approach can be used to identify groups of species and communities using data from many samples. This command carries out the association tests and used the Pearson residuals as the basis for a disssimilarity matrix (all residuals are rescaled to make the smallest zero). This is used by the plot method to produce a dendrogram showing the speices groupings.

 

Value

An object of class "distchi", which has various components:

  chi The individual chi square values for the pairwise species co-occurrences.
  expected The expected values for the pairwise species co-occurrences.
  residuals The Pearson residuals for the pairwise species co-occurrences.
  distance The dissimilarity matrix based on the Pearson residuals.
  p.val The p-value for the overall chi squared association.
  data The name of the data used in the analysis.
  spp The number of species in the dataset.
 

See Also

The hclust() command in stats for hierarchical clustering algorithms. See designdist() in vegan for calculating dissimilarities (used for calculating co-occurrence). Use species_assoc() for association analysis of a pair of species from a dataset. See species_ind() for indicator species analysis using chi squared.

 

Examples

You can specify the cluster joining algorithm as part of the plot method for results of the dist_chi() command.

Top

## Use a dataset with rows as species and columns as samples
> hsere.cs <- dist_chi(hsereT)
> summary(hsere.cs)
Available components:
[1] "chi"       "expected"  "residuals" "distance"  "p.val" 
[6] "data"      "spp" 
       Total.Chi df         p.val
hsereT  632.9612 22 1.027601e-119

## A dendrogram of the result
> plot(hsere.cs, method = "complete")
dendrogram from dist_chi command
 
diversity_comp diversity_comp

Diversity index comparison by permutation.

  Computes significance of diversity index between two samples by permutation method (oecosimu). You can use any of the diversity indices used by the diversity() command in vegan.
 

Usage

diversity_comp(comm, index = "shannon", perm = 1999)

 

Arguments

  comm a community dataset of exactly two samples.
  index a diversity index; options are "shannon" (default), "simpson" and "invsimpson".
  perm the number of permutations, default = 1999.
Requires the vegan package

Details

This command uses the vegan::oecosimu() command and is essentially a wrapper for it. It uses a permutation process to repeatedly sample two communities. The difference in the diversity index is then assessed from the permutation results.

 

Value

The result is an object that holds two classes: "oecosimu" and "list". There are two components: statistic, the calculated difference between the community indices and oecosimu, the results from the simulation. The latter holds the main results, which can be summarised using the print routine (i.e. simply typing the name of the result). Results include z-statistic, 95% CI and p-value (from simulation).

 

See Also

The oecosiumu() command carries out the analysis. See also the diversity() command for calculation of diversity indices. See H_boot() for general confidence intervals of diversity indices by bootstrapping.

You must use a community dataset with exactly two samples. You can easily subset a larger dataset. If you have separate data then you'll have to combine them first.

Examples

## A data.frame with only two rows
> diversity_comp(DeVries, index = "simpson", perm = 2000)

oecosimu with 2000 simulations
simulation method r2dtable
alternative hypothesis: true mean is not equal to the statistic
           statistic          z       2.5%        50%  97.5%
statistic 7.0746e-02 5.5458e+01 6.6880e-05 1.3523e-03 0.0045
           Pr(sim.) 
statistic 0.0004998 ***
## Use a data.frame with many rows and select two for comparison
> diversity_comp(gb.biol[c(1,7), ])

oecosimu with 1999 simulations
simulation method r2dtable
alternative hypothesis: true mean is not equal to the statistic
           statistic        z      2.5%       50% 97.5% Pr(sim.)
statistic  0.668067 10.135002  0.003174  0.063465 0.218    5e-04
 
Top

-- E --

eacc eacc

Estimated species accumulation.

Requires package: vegan This command is intended to be used with lapply() so that you can compute estimated species accumulation for multiple sample groups.
Use as FUN in an lappy() command

Usage

eacc(x, data)

 

Arguments

  x An index value from a community dataset
  data A community dataset, usually a data.frame or matrix.
eacc() command is used as the FUN in lapply()

Details

This function is intended to be used in lapply() rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species accumulation groupwise, using the index you specify and the vegan::estaccumR() command.

 

Value

The result is a "list" with a component for each grouping variable specified. Each component contains the result of the estaccumR() command, that is estimated species accumulation using various methods.

 

See Also

The estaccumR() command carries out the estimated species accumulation using various estimators. The pacc() command is a custom function that carries out a similar job for estimated species richness using the vegan::poolaccum() command. See lapply() for details of applying a function to list elements.

Make an index from rows of a dataset. Then use lapply() to get the accumulating species for groups based on this index.

Examples

## Make an index from rows of community dataset
> index = list(Edge = 1:6, Grass = 7:12, Wood = 13:18)

> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1]  7  8  9 10 11 12
$Wood
[1] 13 14 15 16 17 18

## Use index value to extract species accumulation by group
> lapply(index, FUN = eacc, data = gb.biol)

$Edge
N     S     Chao      ACE
1 18.19 20.49214 21.44664
2 21.47 23.23650 23.67794
3 23.80 25.20936 26.06676
4 25.30 26.48750 27.45683
5 26.21 27.51250 28.03773
6 27.00 28.00000 28.75118
$Grass
N     S     Chao      ACE
1 24.04 31.43000 32.11511
2 29.51 33.98068 36.43205
3 32.47 34.64050 37.42315
4 34.38 35.53885 37.24407
5 35.71 36.55452 37.74246
6 37.00 38.00000 38.72006
$Wood
N     S   Chao      ACE
1 11.45 12.320 14.03674
2 12.18 12.700 13.25899
3 12.59 12.915 13.62097
4 12.89 13.170 13.79830
5 13.00 13.000 13.39019
6 13.00 13.000 13.00000
 
entropy_plot entropy_plot

Plot results from Rényi or Tsallis calculations.

  This command allows you to visualise the results from the renyi() or tsallis() commands (in vegan). Usually you'll use it as an alternative to the usual lattice plotting command to overlay multiple results in a single plot window.
Uses the result of a tsallis() or renyi() command (package:vegan)

Usage

entropy_plot(H, type = "l", ylab = "Diversity", xlab = "Scale",
             legend = TRUE, col = palette(), lty = 1:6, cex = 1,
             pch = NULL, cex.legend = 1, ...)

 

Arguments

  H the result from a renyi() or tsallis() command from the vegan package.
  type the type of plot, default "l" produces lines. Alternatives are "p", points, "b", both, "o" overplotted, "n" none.
  ylab
xlab
axis labels for y and x-axes.
  legend Logical. The default, TRUE, displays a legend.
  col the default colours to use on the plot, the default uses the current colour palette. Colours are recycled as required.
  lty the line style(s) used, the default uses types 1:6 and these are recycled as needed.
  cex character expansion for plotted points, default = 1.
  pch plotting symbols, default is NULL.
  cex.legend character expansion for legend labels, default = 1.
  ... other graphical parameters.
Requires the vegan package to make entropy results to plot

Details

This command provides an alternative way to visualise results from Rényi or Tsallis entropy calculations. The command takes the result of a tsallis() or renyi() command (from vegan) and produces a line plot for each sample – showing the entropy at the various scales.

The tsallis() and renyi() commands usually show multiple samples using the lattice package, with separate panel for each sample.

The entropy_plot() command uses matplot() to overlay the various samples in one plot. You can choose to add a legend (defaults to the "topright" position). If the input data is not an entropy result the command stops, with an error. You could mimic the result of a tsallis() or renyi() command, the result must hold a class "renyi".

 

Value

Creates a new graphic from the result of a tsallis() or renyi() command (from vegan). Each sample is shown as a separate line (without markers by default) showing the entropy at the various scales.

Uses matplot() to produce graphic

See Also

Entropy calculations in the vegan package by tsallis() and renyi() commands. The matplot() command is used to draw the multiple sample lines.

 

Examples

## Community dataset with two samples
> rownames(DeVries)
[1] "canopy" "under"

## Make a Tsallis entropy result
> DV.te = tsallis(DeVries)

## Plot result using lines with points
> entropy_plot(DV.te, pch = 1:2, type = "b")

Result from entropy_plot command

 
Top

-- F --

fisher_alpha
fisher_alpha

Ecological diversity: Fisher's alpha

  Calculates Fisher's alpha, a measure of ecological diversity.
 

Usage

fisher_alpha(x, MARGIN = 1, se = FALSE, ...)

 

Arguments

  x Community data, a matrix-like object or a vector.
  MARGIN Margin for which the index is computed. Default 1 = rows.
  se

With option se = TRUE, function fisher_alpha() returns a data frame with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by nlm on the success of estimation. See Details.

  ... Other parameters to pass to nlm.
Copy of old fisher.alpha() command from vegan package. Computes se but treat these with caution!

Details

The fisher_alpha() command estimates the α parameter of Fisher's logarithmic series (see fisher_fit() command). The estimation is possible only for genuine counts of individuals.

This command restores the behaviour of the old command fisher.alpha() in vegan. This is what the vegan team had to say:

fisherfit: completely rewritten and estimates of standard error removed: I could not find no justification for these. Actually, it seems that the value of Fisher alpha as estimated in the function was independent of the abundance distribution of species, but will be defined by the number of species (S) and number of individuals (N).

Now the Fisher alpha is estimated from the relationship S = alpha*(1 + log(N/alpha)) using function uniroot(). Because of this, standard errors cannot be estimated and they were removed. In addition, functions confint.fisherfit, profile.fisherfit and plot.profile.fisherfit were removed. The estimation of standard errors was also removed in function fisher.alpha (that only calls fisherfit).

So, use with some caution - I replaced the function solely to allow you to get the same results as the exercise in the book.

 

Value

With option se = TRUE, function fisher_alpha() returns a data frame with items for α (alpha), its approximate standard errors (se), residual degrees of freedom (df.residual), and the code returned by nlm on the success of estimation.

 

See Also

Function fisher.alpha() in later versions of package vegan does not compute se. Function fisher_fit() computes the logseries and returns the alpha values. The newer versions of package vegan use fisherfit() which does not permit profiling or calculation of se.

Other commands restored from earlier version of vegan:
confint.fisherfit()
plot.profile.fisherfit()
profile.fisherfit()

Top

Examples

## This version computes se
> fisher_alpha(DeVries, se = TRUE)
alpha se df.residual code
canopy 8.601215 1.249295 55 1
under 10.212885 1.379707 64 1
## Newer version in vegan does not compute se even though help suggests it will > fisher.alpha(DeVries, se = TRUE)
canopy under
8.60122 10.21290
 
fisher_fit

fisher_fit

confint.fisherfit()
plot.profile.fisherfit()
profile.fisherfit()

Fit Fisher's logseries to abundance data

  Function fisher_fit() fits Fisher's logseries to abundance data.
 

Usage

fisher_fit(x, ...)

 

Arguments

  x Community data, a matrix-like object or a vector.
  ... Other parameters to pass to nlm.
Copy of old fisherfit() command from package vegan. Allows se and profiling but treat with caution!

Details

Fits Fisher's logseries to ecological abundance data using non-linear modelling. There is a plot method. In addition you can use confint() (from MASS) to determine approximate confidence intervals. There are also a profile and plot.profile methods.

Newer versions of vegan have removed the calculations for se (see Details in fisher_alpha()) so treat these with caution. I only restored the commands to allow you to get the same results as the exercises in the book.

 

Value

Returns an object of class "fisherfit", which is the result of nlm().

 

See Also

Function fisher.alpha() in later versions of package vegan does not compute se. Function fisher_fit() computes the logseries and returns the alpha values. The newer versions of package vegan use fisherfit() which does not permit profiling or calculation of se.

Top

Examples

## Fit Fisher logseries to sample of data
> ff <- fisher_fit(DeVries[1,])
> ff
Fisher log series model
    No. of species: 56 
         Fisher alpha:   8.601215 
## Plot the result (not shown)
> plot(ff)

## Get approx CI (uses MASS)
> confint(ff)
2.5 %    97.5 % 
6.396521 11.313053 

## Make a profile result
> profile(ff)
$alpha
          tau  par.vals
1  -3.3649073  5.094402
2  -2.5788594  5.795765
3  -1.8602976  6.497127
4  -1.1967304  7.198490
5  -0.5789605  7.899852
6   0.0000000  8.601215
7   0.5455796  9.302577
8   1.0621142 10.003940
9   1.5531015 10.705303
10  2.0214191 11.406665
11  2.4694633 12.108028
12  2.8992526 12.809390
attr(,"original.fit")
attr(,"original.fit")$coefficients
   alpha 
8.601215 
attr(,"original.fit")$std.err
[1] 1.249295
attr(,"class")
[1] "profile.fisherfit" "profile.glm"       "profile" 

## Plot the profile (not shown)
> plot(profile(ff))    
 
Top

-- H --

H_boot H_boot

Bootstrapping diversity indices.

Uses the boot package Carries out a bootstrap resampling using the boot package. The results compute the confidence intervals. You can specify any diversity index used by the diversity() command in vegan for a single sample but the vegan package is not required..
 

Usage

H_boot(x, index = "shannon", R = 1000, ci = 95)

 

Arguments

  x a vector of community data.
  index a diversity index; use one of "shannon" (default), "simpson" or "invsimpson".
  R the number of bootstrap resamples to use, defaults to 1000.
  ci the confidence interval to return, the default 95, returns 2.5% and 97.5% quantiles.
 

Details

A sample of community data can be randomised by repeatedly resampling. The H_boot() command uses the boot() command in the boot package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.

Returns the diversity index and confidence intervals based on quantiles from the bootstrap resampling

Value

Produces a "list" with several components: boot.stats gives the results from the boot() command, CI shows the confidence intervals (based on quantiles from the bootstrap resampling), and index gives the calculated diversity index. The command displays the index and CI directly.

 

See Also

The boot() command in the boot package carries out the resampling. Diversity indices are computed de-novo but see diversity() in the vegan package for alternative computation of Shannon, Simpson's and Inverse Simpson's indices. See H_bss() for bootstrap comparison between two samples.

The H_ci() command can be used with apply() to get bootstrap confidence intervals for multiple samples.

 

Examples

## Use defaults for one sample from a dataset
> H_boot(DeVries[1,])
H =  2.640651 
    2.5%    97.5% 
2.596435 2.675158

## Use different index, CI and number of resamples
> H_boot(DeVries[2,], index = "simpson", ci = 99, R = 2000)
H =  0.9198993
     0.5%     99.5%
0.9158286 0.9233032

H_bss H_bss

Bootstrap significance between two diversity indices.

Requires the vegan package. Has print and summary methods This command carries out a bootstrap analysis of the diversity indices of two samples. The result gives a p-value for the permutation as well as a calculated z-score. You can use any of the diversity indices used by the diversity() command in vegan. The result holds a class, "Hbss", which has print and summary methods.
Requires dataset with exactly 2 samples as input

Usage

H_bss(comm, R = 2000, index = "shannon")
print(x, ...)
summary(x)

 

Arguments

  comm a community dataset containing exactly two samples.
  x the result of an H_bss() command.
  R the number of bootstrap resamples to use, defaults to 2000.
  index a diversity index; use one of "shannon" (default), "simpson" or "invsimpson".
  ... other parameters to pass to print, such as digits.
 

Details

The statistical significance of the difference in diversity index between two samples is estimated using a randomization method. The two samples are repeatedly randomized and the difference in the diversity index recalculated. The significance is assessed by comparing the difference between the resamples and the original difference. Confidence intervals are estimated using the quantiles of the randomization.

A z-score is also computed from the ramdomizations and the p-value estimated from that.

The command produces a result holding a class "Hbss", which has print and summary routines.

Result holds class "Hbss" with print and summary routines

Value

A list object with custom class "Hbss", which has print and summary methods. The result object contains several components:

  H.orig The calculated diveristy index for the two original samples.
  statistic The difference in the diversity index between the two original samples.
  z.score The z-score calculated using the original difference and the difference between the bootstrap resamples as well as the standard deviation of the bootstrap samples.
  CI The 95% confidence interval for the difference between the indices of the two samples.
  p.sim The p-value resulting from the bootstrap resampling.
  p.z The p-value estimated using the z-score.
  df The degrees of freedom (the number of species -1 for each sample).
  Index The name of the diversity index used in the calculations.
 

See Also

See the diversity() command in vegan for computations of diversity indices. See the r2dtable() command for randomization of data tables with fixed marginal totals. See H_boot() for bootstrap confidence intervals of a single sample. See also H_ci() for a function to use with apply() to give boostrap confidence intervals for multiple samples.

 

Examples

## Use 2 samples from a larger dataset
> gb.bs = H_bss(gb.biol[c(7,13), ])

## Print result
> print(gb.bs, digits = 3)
        H.orig.1 H.orig.2 Statistic Z.score    2.5% 97.5% P(sim)
shannon     1.94     1.40     0.538    9.57 0.00335 0.184      0
            P(z) df
shannon 1.13e-11 38

## Summary method
> summary(gb.bs)
Original shannon Index:  1.935163 1.397251
Difference in index:     0.5379125
z-score:                 9.573475
Confidence interval:     0.003348132 0.1837998
Degrees of Freedom:      38
Simulated P-value:       0
P-value (z):             1.130406e-11

H_ci H_ci

Bootstrap confidence intervals for a single sample.

Requires the boot package This command carries out a bootstrap resampling to estimate confidence intervals for a single sample. It is intended to be used with the apply() command to provide bootstrap confidence intervals for multiple samples.
 

Usage

H_ci(x, index = "shannon", R = 1000, ci = 95)

 

Arguments

  x a vector of community data.
  index a diversity index; use one of "shannon" (default), "simpson" or "invsimpson".
  R the number of bootstrap resamples to use, defaults to 1000.
  ci the confidence interval to use, defaults to 95 (i.e. 2.5% and 97.5% quantiles).
Use as FUN command in apply()

Details

A sample of community data can be randomised by repeatedly resampling. The H_ci() command uses the boot() command in the boot package to conduct the resampling. The command allows you to specify various diversity index measures and computes the confidence intervals based on quantiles from the resampling.

This command is intended to be used as the function (FUN argument) in apply(), allowing you to calculate confidence intervals for multiple samples. The results can be used for example, to make error bars in plots of diversity indices for multiple samples.

 

Value

The command returns (invisibly) the confidence intervals as a simple vector with two values, one for the lower CI and one for the upper CI.

 

See Also

The H_boot() command for calculating the diversity index and confidence intervals for a single sample. The boot() command (in package boot) carries out the bootstrap resampling. See the H_bss() command for a significance test for two samples based on randomization. See the apply() command for conducting computations repeatedly over the rows (or columns) of a dataset.

 

Examples

## Use a dataset with two rows (samples)
> apply(DeVries, MARGIN = 1, FUN = H_ci, index = "simpson")
canopy under
2.5% 0.8418946 0.9170899
97.5% 0.8554198 0.9224856

H_CI H_CI

Confidence interval for Shannon or Simpson's t-test.

  This command computes the confidence interval for the t-test used with Shannon or Simpson's indices.
 

Usage

H_CI(x, index = "shannon", ci = 95)

 

Arguments

  x a vector of community data.
  index a diversity index; use one of "shannon" (default), or "simpson".
  ci the confidence interval to use the default is 95 and works out 95% CI.
Use as a separate command or with apply() to get CI for multiple samples

Details

A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson's index between two samples. The H_CI() command computes the confidence interval for either index using the t-test formula as a starting point. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss() command).

The command can be used with apply() to get the confidence intervals for multiple samples.

 

Value

Returns a simple numeric value representing the confidence interval at the chosen level (default 95%).

 

See Also

The H_sig() command for calculating the t-test to compare Shannon or Simpson's index between two samples. See the apply() command for using H_CI() over rows of a dataset to calculate CI for multiple samples. See H_ci() for bootstrap confidence intervals for multiple samples and H_bss() for a significance test based on randomization using bootstrapping.

 

Examples

## Get CI for Simpson's index for single sample
> H_CI(DeVries[1,], index = "simpson")
[1] 0.007075684 ## Get CI for Shannon for rows of dataset
> apply(DeVries, MARGIN = 1, FUN = H_CI)
canopy under
0.07670451 0.09524768

H_sig H_sig

Special t-test for Shannon or Simpson's index.

  This command carries out a version of the t-test for either Shannon or Simpson's indices. The result object has a class "Hsig", which has plot and summary routines.
 

Usage

H_sig(x, y, index = "shannon", alpha = 0.05)
summary(x)
plot(x, ylab = "Index Value", xlab = "Samples", ...)

 

Arguments

  x, y vectors of community data for comparison; these can be of different length. For plot and summary methods, x is the result of an H_sig() command.
  index a diversity index; use one of "shannon" (default), or "simpson".
  alpha the significance level, the default 0.05 computes critical values for p = 0.05.
  xlab, ylab labels for the x and y-axes.
  ... additional graphical parameters to pass to plot().
 

Details

A variant of the t-test can be used to determine the statistical significance of differences in Shannon entropy or Simpson's index between two samples. The H_sig() command computes the t-test for two samples. The t-test is sensitive to degrees of freedom (sample size) and bootstrapping methods are becoming more popular (see H_bss() command).

The summary routine provides an overview of the test statistics and the plot routine creates a point plot with error bars based on confidence intervals.

The result holds a class "Hsig", which has plot and summary routines

Value

The result is a list with a custom class "Hsig", which has summary and plot methods. There are various components:

  p.val The p-value for the calculated t-test.
  df The degrees of freedom.
  t.statistic The calculated value of t.
  var The variance.
  critical The critical value for df degrees of freedom and the chosen alpha level.
  H The value of the diversity index for each sample.
  index The name of the index used.
  alpha The chosen level of significance.
  sites The site names.
  richness The number of species in each sample.
  individuals The number of individuals in each sample.
  The plot method produces a point plot with error bars showing confidence intervals (using the alpha level from the original t-test).
 

See Also

The H_CI() command for computing confidence intervals using the t-test formula as a starting point. See the H_bss() command for an alternative significance test using randomization and bootstrapping.

 

Examples

## Determine significance of t-test between Shannon index of 2 samples
> DVh = H_sig(DeVries[1,], DeVries[2,])
[1] 0.0002608309

## Get a summary of the results
> summary(DVh)
Site: DeVries[1, ] 
Index: shannon = 2.640651 
Variance = 0.0003865389 
df = 2.231875 
No. Species = 56 
Individuals = 5774
Site: DeVries[2, ] 
Index: shannon = 2.982633 
Variance = 0.0002548043 
df =  1.508951 
No. Species = 65 
Individuals = 5922
t-statistic = 13.50384 
df = 3.740253 
critical value at 0.05 = 2.854166 
Significance = 0.0002608309 
## Plot the result
> plot(DVh, pch = 16, cex = 2, col = "darkgreen", lty = 1)
Plot of H_sig result

Happ Happ

Bootstrap confidence intervals for the Shannon index.

  This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Requires package: boot

Usage

Happ(x)

 

Arguments

  x a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument.
 

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Simpson's diversity index from multiple samples.

2.5% and 97.5% bootstrap quantiles used as 95% Confidence Interval.

Value

When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

 

See Also

See apply() for applying a function to rows (or columns) of a data.frame or matrix. See H_boot() and H_ci() for computing bootstrap confidence intervals for a single sample using Simpson's, Shannon or Inverse Simpson's indices. See Dapp() for confidence intervals using Simpson's index and Iapp() for Simpson's Inverse Index. Bootstrapping is carried out using the boot() command.

Use Happ() as a FUN in the apply() command to get bootstrap CI for multiple samples.

Examples

## Use as FUN in apply command
> apply(DeVries, MARGIN = 1, FUN = Happ)
canopy under
2.5% 2.595170 2.943928

97.5% 2.674168 3.009593
 
Top

-- I --

Iapp
Iapp

Bootstrap confidence intervals for the Inverse Simpson index.

  This is intended to be used with apply(); allowing bootstrap confidence intervals to be computed for multiple samples. The default is 1000 resamples and 95% CI.
Requires package: boot

Usage

Iapp(x)

 

Arguments

  x a vector of values; a single sample from a community dataset. The intention is to use this as the FUN argument of apply (or similar) without a separate argument.
 

Details

Use this command as a FUN argument in apply() to return bootstrap confidence intervals for Inverse Simpson's diversity index from multiple samples.

2.5% and 97.5% bootstrap quantiles used as 95% Confidence Interval.

Value

When called on a single sample (a vector of values) the command returns invisibly 2.5% and 97.5% quantiles as bootstrap confidence intervals (as a simple numeric vector). When used with apply() the result is a matrix with columns corresponding to samples and two rows, one for the 2.5% and one for the 97.5% bootstrap quantile.

 

See Also

See apply() for applying a function to rows (or columns) of a data.frame or matrix. See H_boot() and H_ci() for computing bootstrap confidence intervals for a single sample using Simpson's, Shannon or Inverse Simpson's indices. See Dapp() for confidence intervals using Simpson's index and Happ() for Shannon. Bootstrapping is carried out using the boot() command.

Use Iapp() as a FUN in the apply() command to get bootstrap CI for multiple samples.

Examples

## Use as FUN in apply command
> apply(DeVries, MARGIN = 1, FUN = Iapp)
canopy under
2.5% 6.333271 12.05552
97.5% 6.950071 12.90245
 
   
Top

-- P --

pacc pacc Extrapolated species richness in a species pool.
Requires package: vegan This command is intended to be used with lapply() for use with a community dataset and a grouping variable. It calls the poolaccum() command from vegan and returns various estimators of species richness.
Use as FUN in lapply() command

Usage

pacc(x, data)

 

Arguments

  x a vector of index values.
  data the data.frame holding the community data.
pacc() command is used as the FUN in lapply()

Details

This function is intended to be used in lapply() rather than as a standalone command. It allows you to take a community dataset and split it by a grouping variable (an index you define). Then the function carries out estimated species richness groupwise, using the index you specify and the vegan::poolaccum() command.

 

Value

The result is a "list" with a component for each grouping variable specified. Each component contains the result of the poolaccum() command, that is estimated species accumulation using various methods.

 

See Also

The poolaccum() command carries out the estimated species accumulation using various estimators. The eacc() command is a custom function that carries out a similar job for groupwise estimated species richness using the vegan::estaccumR() command. See lapply() for details of applying a function to list elements.

Make an index from rows of a dataset. Then use lapply() to get the extraplolated species richness for groups based on this index.

Examples

## Make an index from rows of community dataset
> index = list(Edge = 1:6, Grass = 7:12, Wood = 13:18)
> index
$Edge
[1] 1 2 3 4 5 6
$Grass
[1]  7  8  9 10 11 12
$Wood
[1] 13 14 15 16 17 18
## Use index to get extrapolated species richness for groups
> lapply(index, FUN = pacc, data = gb.biol)
$Edge
N     S     Chao Jackknife 1 Jackknife 2 Bootstrap
3 23.29 34.94850    27.38333    28.90000  25.22704
4 24.94 33.60742    29.44750    31.33917  27.05879
5 26.13 34.32750    30.99400    33.28750  28.38051
6 27.00 33.00000    32.00000    34.40000  29.30678
$Grass
N     S     Chao Jackknife 1 Jackknife 2 Bootstrap
3 32.21 38.03772    37.81000    39.45167  34.95630
4 34.10 38.48675    39.38750    40.78250  36.74922
5 35.79 43.29300    41.72600    44.46350  38.58681
6 37.00 53.00000    43.66667    47.93333  39.94114
$Wood
N     S   Chao Jackknife 1 Jackknife 2 Bootstrap
3 12.62 13.325    13.47333    13.66500  13.05148
4 12.91 13.160    13.76500    14.07833  13.32430
5 13.00 13.325    13.52000    13.30250  13.31851
6 13.00 13.000    13.00000    11.93333  13.17563

plot_H plot_H

Point plot of diversity index and bootstrap confidence intervals.

Requires packages: vegan and boot This command carries out a bootstrap resampling of a community dataset and plots the results as a point plot that includes error bars showing the 95% confidence intervals. You can use any diversity index used by the diversity() command in vegan.
 

Usage

plot_H(comm, index = "shannon",
       xlab = "Site Names", ylab = "Diversity Index",
       pch = 18, cex = 1.2, cex.axis = 0.75, boot = TRUE, ...)

 

Arguments

  comm community dataset, usually a matrix or data.frame.
  index a diversity index; use one of "shannon" (default), "simpson" or "invsimpson".
  xlab, ylab labels for x and y-axes.
  pch plotting symbol, default = 18 (a filled diamond).
  cex character expansion, default = 1.2.
  cex.axis character expansion for axes, default = 0.75.
  boot = TRUE logical, use bootstrap CI (TRUE, default) or variance estimators (FALSE).
  ... other graphical parameters to pass to plot().
Uses bootstrapping to assess confidence intervals, plotted as error bars

Details

Bootstrapping is a way to repeatedly resample a community dataset. The confidence interval of the diversity index is assessed from the quantiles of the resamples. The plot_H() command uses the diversity() command from vegan to calculate the diversity index for multiple samples in a community dataset. The H_ci() command is called internally, which computes the confidence intervals by calling boot() from the boot package. The diversity indices are then plotted (as a point plot), using the confidence intervals as error bars. This provides a visual representation of differences in diversity index between samples.

Alternatively you can specify boot = FALSE to use variance estimation for CI. This calls the H_CI() command internally.

Note: if you specify boot = FALSE and index = "invsimpson" an error is returned.

Produces a point plot with error bars

Value

The command produces a list object invisibly; this contains the diversity index, $H and confidence intervals, $CI. These are used to create a point plot with 95% error bars.

 

See Also

See the diversity() command in vegan for details of the diversity indices. The boot() command from package boot carries out the resampling; it is called from H_ci(), which calculates the confidence intervals. See the apply() command, which is used internally to obtain a statistics over the rows of a dataset. If boot = FALSE the CI are determined using H_CI() via a variance estimation process.

Plots Shannon entropy by default. You can also specify "simpson" or "invsimpson"

Examples

## Make a plot and save results
> gbh = plot_H(gb.biol, ylab = "Shannon Entropy")

Point plot from plot_H command
## View the results
> gbh
$H
E1 E2 E3 E4 E5 E6 G1
1.267096 1.424505 1.412179 1.737146 1.636923 1.557477 1.935163
G2 G3 G4 G5 G6 W1 W2
2.000272 1.929426 2.360250 2.139567 2.205175 1.397251 1.582102
W3 W4 W5 W6
1.539511 1.601073 1.616333 1.552470 $CI
E1 E2 E3 E4 E5 E6
2.5% 1.167435 1.339836 1.320449 1.639057 1.552808 1.477362
97.5% 1.341344 1.487790 1.484144 1.802874 1.698681 1.614796
G1 G2 G3 G4 G5 G6
2.5% 1.741574 1.805502 1.704016 2.192101 1.975294 2.028207
97.5% 2.047899 2.109588 2.046776 2.449227 2.230595 2.288512
W1 W2 W3 W4 W5 W6
2.5% 1.337487 1.519955 1.470808 1.534223 1.560459 1.488700
97.5% 1.443202 1.635484 1.597767 1.654974 1.658593 1.601543

polar.ord polar.ord

Polar Ordination (aka Bray-Curtis ordination)

 

This command carries out Polar Ordination, a method of ordination (also known as Bray-Curtis ordination). The command works out polar ordination site scores and works out axes as follows: Axis 1 is the combination of sites that gives the maximum separation (not necessarily the maximum variance explained). Axis 2 is the combination of sites that maximises the orthogonality to the 1st axis.

There are print, plot and summary methods and an identify method for use with plots.

polar.ord() has print, plot and summary methods and an identify method for use with plots

Usage

polar.ord(diss)

print.polar.ord(ord, digits = getOption("digits"))

summary.polar.ord(ord, k = 2, cor = TRUE, var.exp = TRUE, digits = getOption("digits"))

plot.polar.ord(ord, x.axis = 1, y.axis = 2, type = c("t", "p", "n"), ...)

identify.plot.polar.ord(pord, labels = pord$labels, ...)

 

Arguments

 

diss

A dissimilarity matrix e.g. as produced by dist() or vegdist()

 

digits

The number of digits to display, defaults to current settings

 

ord

The result of a polar.ord command

 

k

The number of axes to display

 

cor

If TRUE, shows the correlation between ordination axes

 

var.exp

If TRUE, shows the variance explained

 

x.axis

Which ordination axis to plot on the x-axis

 

y.axis

Which ordination axis to plot on the y-axis

 

type

The type of plot, "t" text, "p" points or "n" none

 

pord

The result of a plot.polar.ord command i.e. a polar ordination plot result

 

labels

A vector of labels to use for the sites

 

...

Other plotting parameters to pass to plot or identify

Polar Ordination maximises site separation. Subsequent axes maximise orthogonality.

Polar ordination is an indirect gradient method of ordination

Details

Polar ordination is a method of indirect gradient analysis. The starting point is a dissimilarity matrix, any method of producing a dissimilarity can be used as the starting point for the polar.ord() command. Typical commands would be dist() or vegdist() (from package: vegan). The analysis seeks to maximise the separation between sites along an axis. Subsequent axes are calculated to give the best separation but also to maximise the orthogonality, that is to minimise correlation between the axes.

The method is "simple" enough to be able to carry out using a spreadsheet but identification of subsequent axes can be a labour-intensive and iterative process. Note that in this application of Polar Ordination I have maximised separation, which was the original intention of Bray-Curtis. However, you can also go for a maximise variance explained approach as the summary method displays additional variance explained by each axis.

The print method allows you to view all the generated axes, their separation (the dissimilarity) and variance explained.

The summary method provides an overview that includes site socres, correlation and variance explained.

The plot method allows you to visualise the results from any combination of axes.

polar.ord() produces a list with custom class "polar.ord", which is used by the print, summary, plot and identify mehtods

Value

The polar.ord() command produces an object (essentially a list) holding the class "polar.ord", which has print, summary, plot and identify mehtods. The components are:

  • sites – the site scores for all the computed axes.
  • dist – the distance (separation) for each axis, in the same scale as the original dissimilarity.
  • cor – the correlation (Pearson) between the principal axis (the one with maximum separation) and each of the other axes.
  • var.exp – the variance explained by each axis.
  • var.add – the additional variance explained by each axis in addition ot the original principal axis.
  • samples – a vector of site names.
  • axes – a vector of axis names.
  • diss – the original dissimilarity result (a "dist" object).
  • method – a character vector giving the dissimilarity metric employed.

These components are used by the print, summary, plot and identify mehtods.

 

See Also

Various methods of ordination from the vegan package e.g. metaMDS for non-metric multidimensional scaling. Also see cmdscale(), which is part of the basic stats package.

polar.ord() command in CERE.RData file

see code also in Polar Ordination.R

Examples

moss # The data (columns are sites) from CERE.RData

## Transpose sites/species and make a dissimilarity matrix
d <- dist(t(moss), method = "euclidean") # Make dissimilarity matrix

## Run the PO
po <- polar.ord(d)

po # Standard print
print(po, digits = 4) # Use print method
summary(po, k = 4) # Show summary of first 4 axes

## Plot
plot(po) # Basic plot
plot(po, x.axis = 1, y.axis = 3) # Plot different axes

## Species scores
require(vegan) # Load vegan package
wa = wascores(scores(po), t(moss)) # Weighted average species scores

# Quick plot
plot(po)
text(wa, labels = rownames(wa), col = "red", cex = 0.8)

# ID plot
op <- plot(po, type = "p") # Plot points only
identify(op) # Click points to identify sites (Esc to cancel)

points(wa, col = "red", pch = "+") # Add species points
identify(wa, labels = rownames(wa)) # Add labels interactively

 
Top

-- Q --

qstar qstar

Plot Tsallis entropy profile using q*.

Requires package: vegan This command uses the eventstar() command from vegan to calculate q*. The profile is then plotted as three plots showing, evenness, diversity and effective species for the Tsallis scales selected.
 

Usage

qstar(x, q = seq(0, 2, 0.05), qmax = 5, type = "l",
      lcol = "red", llty = 1, llwd = 1, ...)

 

Arguments

  x a vector of community data.
  q the scales to use for the Tsallis entropy (passed to tsallis).
  qmax the maximum q value to use (passed to eventstar), default = 5.
  type the type of plot, default = "l" for lines.
  lcol
llty
llwd
colour, type and width for the lines showing q* value and diversity values at q*.
  ... other graphical parameters to pass to plot().
q* is the scale parameter of Tsallis entropy corresponding to minimum evenness

Details

The function eventstar finds a characteristic value of the scale parameter q of the Tsallis entropy corresponding to minimum of the evenness (equitability) profile based on Tsallis entropy. This is called q*. Use of q* and corresponding diversity, evenness, and Hill numbers (effective species), is recommended because it is a unique value representing the diversity profile. It is also positively associated with rare species in the community, thus it is a potentially useful indicator of certain relative abundance distributions of the communities.

The qstar() command uses eventstar() from vegan to calculate q* and then plots three graphs to show evenness, diversity and effective species at the chosed scales, and their relationship to q*.

Produces three plots in one graphic window

Value

The command returns the result from the eventstar() command, i.e. a data frame with several columns showing q* (qstar), the value of evenness at q* (Estar), the value of Tsallis entropy at q* (Hstar), and the effective species at q* (Dstar). See eventstar() for details.

The command also draws (in one graphic window) line plots of evenness, diversity, and effective species at the selected scales. The value of q* is displayed on the graphs as a line, which can be altered in appearence.

 

See Also

See tsallis() in vegan for details of Tsallis entropy and the inspiration for this plotting routine. The eventstar() command from vegan is called to calculate q*.

Produces the result from eventstar() and a graphic window containing three plots: evenness, diversity and effectice species

Examples

## Select a single sample from a community dataset
> qstar(DeVries[1,])
qstar Estar Hstar Dstar
1 0.6847494 0.1113006 5.062579 20.61567
Plot of q* based on Tsallis entropy
 
Top

-- R --

rad_aic rad_aic

Compute AIC values from multiple RAD models.

  This command takes the result of multiple RAD model-fitting (from radfit() in vegan) and computes the "best" AIC value for each sample. The result shows the best AIC value and the corresponding model name for each sample.
 

Usage

rad_aic(x)

 

Arguments

  x the result of radfit() from vegan. The result must hold the class "radfit.frame", i.e. have been run on a community dataset with multiple samples.
Requires the result of a vegan::radfit() command

Details

Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit() command in vegan can compute various models for community datasets. The rad_aic() command takes the result of a radfit() command and, for each sample, shows the name of the model with the lowest AIC value as well as displaying the AIC.

Shows the AIC and name of "best" RAD model

Value

Returns a data frame with two columns giving the "best" RAD model for each sample in the original community dataset. The columns are: AIC: the AIC value, and Model: the name of the model. Each row corresponds to a sample in the original community dataset.

 

See Also

See radfit() in vegan for details of the model fitting. The rad_test() command carries out a significance test of differences between RAD models for a community sample.

 

Examples

## Compute RAD models for a community dataset
> gb.rad = radfit(gb.biol)
## Explore the radfit() result
> rad_aic(gb.rad)
AIC Model
E1 106.29094 Mandelbrot
E2 134.58026 Preemption
E3 108.41663 Preemption
E4 160.24125 Mandelbrot
E5 183.46446 Preemption
E6 146.97826 Preemption
G1 106.00117 Zipf
G2 83.88072 Mandelbrot
G3 68.55087 Mandelbrot
G4 104.48570 Mandelbrot
G5 100.04912 Mandelbrot
G6 98.59426 Mandelbrot
W1 154.01455 Preemption
W2 81.28201 Preemption
W3 74.92607 Preemption
W4 98.54388 Preemption
W5 104.03335 Preemption
W6 82.63346 Preemption

rad_test rad_test

A significance test of differences in deviance of RAD models.

Requires the result of a vegan::radfit() command This command takes the result of a radfit() command (from vegan) and carries out a test of significance (ANOVA) of the various RAD models, examining their deviance. The result holds a class, "rad.htest" that has print, summary and plot methods.
Class "rad.htest" has print, summary and plot methods

Usage

rad_test(x, conf.level = 0.95, log = TRUE)
print(x, ...)
summary(x)
plot(x, which = "original", las = 1, ...)

 

Arguments

  x the result of a radfit() command from vegan. For summary and plot methods x should be the result of the rad_test() command.
  conf.level the confidence level, the default is 0.95.
  log Logical. If TRUE (the default) the log of the model deviance is used as the response variable in ANOVA.
  which the type of plot, "original" (default) shows boxplot of model deviance, "post.hoc" displays Tukey post-hoc result.
  las the orientation of the axis annotations, the default produces labels all horizontal.
  ... other parameters to be passed to plot() or print().
Use replicated samples for a meaningful test of RAD models

Details

Rank – Abundance Dominance (RAD) or Dominance/Diversity plots display logarithmic species abundances against species rank order. These plots are supposed to be effective in analysing types of abundance distributions in communities. The radfit() command in vegan can compute various models for community datasets. The rad_test() command takes the result of a radfit() command and carries out a test of model deviance using ANOVA. By default the log of model deviance is used, which improves the normality for the ANOVA.

For a meaningful result you should use replicated samples (see Examples).

 

Value

The rad_test() command returns invisibly a list with a custom class "rad.htest", which has print, summary and plot methods.

The print method returns a matrix showing Mean deviance, Standard Error of deviance and Mean AIC for the RAD models.
The summary method shows the ANOVA table ($ANOVA) and a summary of the Tukey Post-Hoc test ($Tukey).
The plot method produces a boxplot that shows model deviance for each RAD model. Models are re-ordered by decreasing mean deviance. Optionally the plot routine can display the post-hoc results.

 

See Also

See radfit() in vegan for details of the model fitting. The ANOVA is carried out using aov() and the post-hoc via the TukeyHSD() command. The boxplot() command is used to plot the results (see also reorder() for re-ordering of samples for plotting).

See rad_aic() for an alternative command that shows the "best" RAD model for each sample using results from radfit().

The plot method can produce the "original" plot (boxplot of model deviance) or "post.hoc" (the Tukey post-hoc) result

Examples

## Made RAD models for six samples of dataset
## these are replicates (from same habitat type)
> gb.rad = radfit(gb.biol[1:6,])
## Carry out ANOVA > gb.rt = rad_test(gb.rad) ## Print result > print(gb.rt, digits = 3) Null Preemption Lognormal Zipf Mandelbrot Mean deviance 730 90.4 129.29 163.9 68.15 SE of deviance 66 14.7 8.43 14.0 9.48 Mean AIC 799 161.7 202.61 237.2 143.47 ## Summary of result shows ANOVA and Post-Hoc > summary(gb.rt) $ANOVA Df Sum Sq Mean Sq F value Pr(>F) model 4 20.9193 5.2298 65.545 6.937e-13 *** Residuals 25 1.9948 0.0798 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$Tukey
             diff         lwr       upr        p adj
Pre-Mb  0.2700865 -0.20887308 0.7490461 4.776096e-01
Log-Mb  0.6773801  0.19842055 1.1563397 2.816974e-03
Zip-Mb  0.9053575  0.42639796 1.3843171 8.202652e-05
BS-Mb   2.3975406  1.91858106 2.8765002 8.252288e-13
Log-Pre 0.4072936 -0.07166594 0.8862532 1.233203e-01
Zip-Pre 0.6352710  0.15631147 1.1142306 5.339678e-03
BS-Pre  2.1274541  1.64849457 2.6064137 1.145184e-11
Zip-Log 0.2279774 -0.25098216 0.7069370 6.345799e-01
BS-Log  1.7201605  1.24120094 2.1991201 1.043389e-09
BS-Zip  1.4921831  1.01322353 1.9711427 1.790295e-08
$Logarithmic
[1] TRUE
$Confidence.Level
[1] 0.95

## Print the results
> opar = par(mfrow = c(1,2)) # 2 columns in plot window
> plot(gb.rt, ylab = "Log of deviance", main = "RAD deviance") # boxplot
> plot(gb.rt, which = "post.hoc") # Tukey result
> abline(v = 0, lty = 3)
> par(opar) # reset plot window
Plots from rad_htest command
 
Top

-- S --

spec.rich spec.rich

Species richness for a community dataset.

  This command computes species richness for a community dataset.
 

Usage

spec.rich(data)

 

Arguments

  data a community dataset as a data.frame or matrix. Rows as samples.
Computes species richness

Details

The number of species in a community sample (the species richness) is a basic measure of diversity.

 

Value

A numeric vector containing the species richness for each sample of the original dataset. The vector has a names attribute to match the sample names of the original dataset.

 

See Also

The specnumber() command in the vegan package can also compute richness (including groupwise since version 2.0-0). See also the which() command that is used to select elements > 0 in the calculations, and length().

The sr_est() and specaccum() commands for modelling species richness with accumulating sites.

 

Examples

## Species richness using a 2-row matrix
> spec.rich(DeVries)
canopy under
56 65
## Richness from a 16-row data frame > spec.rich(gb.biol)
E1 E2 E3 E4 E5 E6 G1 G2 G3 G4 G5 G6 W1 W2 W3 W4 W5 W6
17 14 15 25 21 17 28 22 18 28 26 24 12 11 11 12 12 10

species_assoc species_assoc

Pairwise chi squared species association.

  This command takes two species that you specify from a community dataset and carries out a pairwise chi squared association test.
Requires package vegan

Usage

species_assoc(spA, spB, data)

 

Arguments

  spA, spB index values for the two species to be analysed, corresponding to their position in the rownames attributes, i.e. a numerical value corresponding to the row number.
  data the data.frame containing the species for analysis; rows as species.
Uses a chi squared approach to explore species co-occurrence

Details

A chi squared approach can be used to explore species co-occurrence. Species that are positively associated will tend to be from the same community, whereas species that are negatively associated will be from different communities. The species_assoc() command takes a dataset and computes the chi squared statistics of species co-occurrence for two species that you specify.

 

Value

A list object (returned invisibly) with four components:

  spA The name of the first species.
  spB The name of the second species.
  chi.sq The chi squared test result – includes the observed, expected and Pearson residuals for the pair of selected species. The "standard" chi squared result is returned to screen.
  co.occur The co-occurrence matrix.
 

See Also

See the chisq.test() for general details of chi squared testing. The designdist() command from vegan is used to compute the co-occurrence matrix. See also the species_ind() command for using chi squared in indicator species analysis.

Specify the species to explore by their position in the rownames()

Examples

## Look at the species available
> rownames(hsereT)
 [1] "Carex rostrata"           "Fillipendula ulmaria" 
 [3] "Caltha palustris"         "Bryophyte" 
 [5] "Galium palustre"          "Valeriana officianalis" 
 [7] "Ranunculus repens"        "Cochlearia spp" 
 [9] "Epilobium palustre"       "Myosotis scorpioides" 
[11] "Mentha aquatica"          "Rumex acetosa" 
[13] "Succisa pratensis"        "Molinea caerulea" 
[15] "Juncus effusus"           "Eriophorum vaginatum" 
[17] "Eriophorum angustifloium" "Sphagnum spp." 
[19] "Calluna vulgaris"         "Vaccinium myrtillus" 
[21] "Polytrichum spp."         "Vaccinium vitis-idea" 


## Carry out pairwise analysis for species in rows 1,2
> hsere.co = species_assoc(1, 2, data = hsereT)
  Pearson's Chi-squared test with Yates' continuity
  correction
data:  mat 
X-squared = 15.1532, df = 1, p-value = 9.913e-05
Warning message:
In chisq.test(mat) : Chi-squared approximation may be incorrect

## View the co-occurrence matrix
> hsere.co$co.occur
              Fillipendula ulmaria
Carex rostrata  +  -
             + 11 11
             -  0 28

## Look at the Pearson residuals
> hsere.co$chi.sq$residuals
               Fillipendula ulmaria
 Carex rostrata         +         -
              +  2.800000 -1.487038
              - -2.481935  1.318118

species_ind species_ind

Indicator species analysis by chi squared.

  This command takes a community dataset and carries out a chi squared test for indicator species. The species can be arranged as the rows or columns. The samples are used as the groupings for the analysis, so you may need to group the dataset using rowsum() before analysis. The result shows only those species that have a significant association (positive or negative) for all samples/groups.
 

Usage

species_ind(x, rows = c("species", "samples"))

 

Arguments

  x a community dataset, usually a data.frame or matrix.
  rows should the rows be treated as "species" (the default) or "samples"?
 

Details

An indicator species is one that is positively associated with a particular site or habitat and negatively associated with other sites/habitats. Ideally an indicator species will also be found only rarely in other sites/habitats. The chi squared approach is one way to explore species associations with sites/habitats.

 

Value

A list with two components (the indicators component is returned directly and the entire list returned invisibly):

  chi.test The results of the chi squared test, which includes the observed and expected values as well as the Pearson residuals.
  indicators A matrix with the names of the indicator species and the grouping; the data are the Pearson residuals. Only species with all Pearson residuals >2 are shown (i.e. only significant ones).
 

See Also

See chisq.test() for details of the chi squared process. See also species_assoc() for pairwise species association. The indval() command from labdsv is probably a more robust alternative, which uses a quite different approach.

Group samples before analysis.

Species can be arranged by row or by column

Examples

## Make a dataset by grouping community samples (by habitat)
> gb = rowsum(gb.biol, group = gb.site$Habitat)

## Indicator species based on habitat
> gb.ind = species_ind(gb, rows = "samples")
Aba.par Ago.afr Bem.big Bem.man Cal.fus Cli.fos
Edge 4.092740 -2.841032 -3.192145 -4.170376 -2.841032 -2.979700
Grass -2.034203 9.506136 11.182116 18.561500 9.506136 9.970119
Wood -2.706151 -2.963855 -3.629967 -7.107075 -2.963855 -3.108518
Lei.ruf Neb.bre Poe.cup
Edge -7.245067 -3.209243 -2.682207
Grass -5.450846 -6.097483 17.636905
Wood 10.205860 6.724140 -7.980425 ## Plot Pearson residuals for Edge habitat > barplot(gb.ind$ind[1,])
> title(main = "Habitat: Edge", ylab = "Pearson Residual")
> abline(h = 0)
> abline(h = c(2, -2), lty = 3)
Barplot of pearson residuals from species_ind results

sr_est sr_est

Species richness estimation with accumulating sites.

Can use accumulating species richness or the result of vegan::specaccum() or regular community dataset This command uses accumulating species richness and a log model to determine the estimated species richness for a complete dataset. You can use a community dataset or the result of the specaccum() command from vegan. The result contains the estimated species richness and the log model. The result holds a class "srest", which has print, plot and summary methods.
Class "srest" has print, summary and plot methods

Usage

sr_est(accum)
print(x, ...)
summary(srest, digits = getOption("digits"))
plot(srest, col.points = "black", col.line = "black",
     lwd.line = 1, lwd.points = 1, ...)

 

Arguments

  accum a vector of values corresponding to accumulating species richness or, a community dataset (rows as samples) or, the result of a specaccum() command from vegan.
  x, srest the result of an sr_est() command.
  digits the number of significant figures to display.
  col.points
lwd.points
the colour and width of the points used in the base plot.
  col.line
lwd.line
the colour and width used for the best-fit line (produced via abline).
  ... additional parameters to pass to plot() or print().
Requires the vegan package

Details

Species accumulation curves (SAC) are used to compare diversity properties of community data sets using different accumulator functions. The sr_est() command uses accumulating species richness as a starting point – the data can be a vector of accumulating values or the result of a vegan::specaccum() command. Alternatively you can use a community dataset as a data.frame or matrix. A log-linear model is then computed to show the estimated species richness from the accumulated sites.

 

Value

A list with custom class "srest", which has print, summary and plot methods. The components are:

  data.model the log model data; a data frame with cumulative species, number of new species and log of sumulative species corresponding to the samples in the original dataset.
  coefficients the intercept and slope from the log model.
  species.estimate the estimated species richness based on the log model.
 

The print method shows simply the estimated species richness. The summary method shows the model data, coefficients and richness. The plot method produces a scatter plot of new species against log of cumulative species, with a best-fit line.

 

See Also

See the specaccum() command in vegan for details of methods of estimating species richness using various collector methods, and specpool() for estimated richness in a species pool. See spec.rich() and specnumber() for simple calculation of species richness. The log model in sr_est() is computed using the lm() command.

Class "srest" has print, summary and plot methods.

Input can be vector of accumulating species richness, can be a community dataset (as a data frame or matrix), or can be the result of a vegan::specaccum() command

Examples

## Load vegan and dune data
> require(vegan)
> data(dune)
## Estimated richness
> specpool(dune)
Species chao chao.se jack1 jack1.se jack2 boot
All 30 32.25 3.395769 32.85 1.645448 33.84474 31.54040
boot.se n
All 1.153121 20
## Use log model for estimate > dune.sr = sr_est(dune)
> dune.sr
Estimated Species Richness
29.42548
## Visualise the model > plot(dune.sr, lty = 2)

Plot of log model from sr_est

## Use species accumulation curve
## for first 6 rows (all same habitat)
> gb.sa = specaccum(gb.biol[1:6,])

## Summary of log model
> summary(sr_est(gb.sa))
Model Data:
   cum.spp   new.spp   log.cs
1 18.16667 18.166667 1.259275
2 21.26667  3.100000 1.327699
3 23.30000  2.033333 1.367356
4 24.80000  1.500000 1.394452
5 26.00000  1.200000 1.414973
6 27.00000  1.000000 1.431364
Model coefficients:
Intercept     Slope 
 131.0205  -92.6311 
Estimated Species Richness 
                  25.96767
 
Top See: Support Files for downloads (RData, Excel and CSV files) to accompany the book
My Publications

R Tips & Tricks
MonogRaphs
Writer's Bloc
An Introduction to R

The R Project Homepage

Follow me...
Facebook Twitter Google+ Linkedin Amazon
Top DataAnalytics Home
Publications
Contact GardenersOwn Homepage