On this page you will find a table of contents and a chapter-by-chapter outline.
Other links:
- See the custom R commands I wrote during production.
- See the support files page, for data, spreadsheets and RData.
- See the News page for details about updated files, new custom commands and so on.
See our sister site, DataAnalytics: Ecology Matters, for resources for Ecology Students & Teachers, including data examples to use for practice and demonstration, and Custom Functions for R: The Statistical Programming Language.
Table of Contents
The book is split into 14 chapters; the first few are designed to get you up and running by helping you to plan your approach and become familiar with the software tools you’ll be using. The later chapters delve into various major topics of community ecology. The appendix contains answers to the end-of-chapter exercises and a list of custom R commands created during the writing of this book. Click on a heading to go to a more detailed description:
- Starting to look at communities
- Software tools for community ecology
- Recording your data
- Beginning data exploration – Using software tools
- Exploring data – Choosing your analytical method
- Exploring data – Getting insights
- Diversity – Species Richness
- Diversity – Indices
- Diversity – Comparing
- Diversity – Sampling scale
- Rank abundance or dominance models
- Similarity and Cluster Analysis
- Association analysis: Identifying Communities
- Ordination
- Appendix
What you will learn in this book
This book is intended to give you some insights into some of the analytical methods employed by ecologists in the study of communities. The book is not intended to be a mathematical or theoretical treatise but inevitably there is some maths! I’ve tried to keep this in the background and to focus on how to undertake the appropriate analysis at the right time. There are many published works concerning ecological theory; this book is intended to support them by providing a framework for learning how to analyse your data.
The book does not cover every aspect of community ecology. There are a few minor omissions – I hope to cover some of these in later works.
How this book is arranged
There are four main strands to scientific study: planning, recording, analysis and reporting. The first few chapters deal with the planning and recording aspects of study. You will see how to use the main software tools, Excel and R, to help you arrange and begin to make sense of your data. Later chapters deal more explicitly with the grand themes of community ecology, which are:
- Diversity – the study of diversity is split into several chapters covering species richness, diversity indices, beta diversity and dominance-diversity models.
- Similarity and Clustering – this is contained in one chapter covering similarity, hierarchical clustering and clustering by partitioning.
- Association Analysis – this shows how you can identify which species belong to which community by studying the associations between species. The study of associations leads into the identification of indicator species.
- Ordination – there is a wide range of methods of ordination and they all have a similar aim: to represent complicated species community data in a simpler form.
The reporting element is not covered explicitly; however, the presentation of results is shown throughout the book. A more dedicated coverage of statistical and scientific reporting can be found in my previous work, Statistics for Ecologists Using R and Excel.
Each chapter ends with a summary and several self-assessment questions – the answers are in the appendix. The appendix also contains a list of the custom R commands made during the production of this book. These commands are available as part of the CERE.RData file.
Throughout the book you will see example exercises that are intended for you to try out. In fact, they are expressly aimed at helping you on a practical level – reading how to do something is fine but you need to do it for yourself to learn it properly. The Have a Go exercises are hard to miss.
Chapter-by-chapter Outline
1 Starting to look at communities
1.1 A scientific approach
1.2 The topics of community ecology
1.3 Getting data – using a spreadsheet
1.4 Aims and hypotheses
This short chapter sets the scene and aims to get you thinking in a scientific manner. Section 1.2 provides a brief overview of the topics covered in the rest of the book.
2 Software tools for community ecology
2.1 Excel
2.2 Other spreadsheets
2.3 The R program
This chapter is also pretty short and is aimed at helping you to get the software tools you’ll need. In particular there is a section on how to download and install the R program. In general, you’ll be shown how to manage and manipulate your data using Excel, with the majority of the analyses being carried out using R. Where it is feasible to use Excel for an analysis then you’ll be shown how to do it using Excel and R. However, as analyses become more complicated you’ll see R used more exclusively.
3 Recording your data
3.1 Biological data
3.2 Arranging your data
The more complicated your data are, the more important it is that you arrange them in a “set” manner. This short chapter provides an introduction to the notion of “Biological Recording”, which is simply a way to write your data in a flexible format that allows you to access and manage it easily.
4 Beginning data exploration – Using software tools
4.1 Beginning to use R
4.2 Manipulating data in a spreadsheet
4.3 Getting data from Excel into R
This chapter is where you’ll start becoming familiar with the software tools that you’ll be using most often, your spreadsheet (probably Excel) and R, the statistical programming environment. Section 4.1 will get you started using R, so if you’ve never used R before then this will be worth working through. Section 4.2 gives you some help using Excel, including the use of Pivot Tables and lookup tables as well as more familiar things like sorting and filtering. The final section shows you how to transfer data from Excel into R.
5 Exploring data – Choosing your analytical method
5.1 Categories of study
5.2 How “classic” hypothesis testing can be used in community studies
5.3 Analytical methods for community studies
This chapter provides an overview of the analytical methods that you might use in exploration of ecological community data. There is also mention of a few approaches that are not so useful! The idea is to provide you with a sense of the analytical approaches that you can undertake so that you can plan your studies more effectively.
6 Exploring data – Getting insights
6.1 Error Checking
6.2 Adding extra information
6.3 Getting an overview of your data
This chapter is concerned with managing your data and starting to make sense of it. The first section is about error checking, an oft-neglected aspect of data analysis! You may spot some errors in the datasets used as examples in this book. I’ve deliberately left some of these in place to keep you “on your toes”. Section 6.2 is about adding extra information; this can be particularly useful to help group and re-group your data for later analysis.
The final section is about looking at your data without doing any “real” analysis. This means getting averages for groups and showing your data in graphical form. Most of this can be done in Excel and this is illustrated using Pivot Tables and chart tools. The latter part of section 6.3 shows you how to do similar things in R – if you are not familiar with R then you’ll find this material especially helpful.
7 Diversity – Species Richness
7.1 Comparing species richness
7.2 Correlating species richness over time or against an environmental variable
7.3 Species richness and sampling effort
Diversity is one of the grand themes of community ecology covered by this book. In fact it is such a large topic that it is split into five chapters. This chapter is about species richness; put another way, it is about what you can do when you only have presence-absence data. The last section covers species richness in relation to sampling effort (such as the area sampled); this includes rarefaction and estimating species richness using non-linear modelling.
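As a rough illustration of the ideas in this chapter, species richness and individual-based rarefaction can be sketched in a few lines. The book carries out these analyses in R and Excel; Python is used here only as a neutral sketch, and the function names and example counts are hypothetical. Rarefaction of this kind assumes you have abundance counts rather than presence-absence data, and uses Hurlbert's formula for the expected number of species in a random subsample.

```python
from math import comb

def species_richness(counts):
    """Number of species with at least one individual recorded."""
    return sum(1 for c in counts if c > 0)

def rarefied_richness(counts, n):
    """Expected richness in a random subsample of n individuals
    (individual-based rarefaction, Hurlbert's formula)."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of individuals")
    # For each species: 1 minus the probability that it is absent from the subsample
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts if c > 0)

sample = [5, 5]  # hypothetical abundances for two species
print(species_richness(sample))
print(rarefied_richness(sample, 1))
```

Rarefying to the full sample size simply returns the observed richness, which makes a useful sanity check.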
8 Diversity – Indices
8.1 Simpson’s Index
8.2 Shannon Index
8.3 Other Diversity indices
This is about indices of diversity, that is, ways of taking species relative abundance into account as well as the number of species in a sample. The most commonly used indices are Simpson’s and Shannon, so these are highlighted with their own sections. In section 8.3 you’ll see other indices of diversity such as Fisher’s alpha, Berger-Parker dominance and two entropies, Rényi and Tsallis.
The chapter focusses on calculating the index values but you’ll also find out about evenness and the idea of “effective species”, the latter being what is also termed “true diversity”.
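To give a flavour of what the index calculations involve, here is a minimal sketch of the Shannon index, Simpson's index and "effective species". It is written in Python rather than the R and Excel used in the book, the function names are hypothetical, and it assumes the 1 − D form of Simpson's index and natural logarithms for Shannon; other conventions exist.

```python
from math import exp, log

def proportions(counts):
    """Relative abundance of each species that is present."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def shannon(counts):
    """Shannon index H = -sum(p * ln p)."""
    return -sum(p * log(p) for p in proportions(counts))

def simpson(counts):
    """Simpson's index in its 1 - D form, where D = sum(p^2)."""
    return 1 - sum(p * p for p in proportions(counts))

def effective_species(counts):
    """Effective number of species based on Shannon: exp(H)."""
    return exp(shannon(counts))

sample = [10, 10, 10, 10]  # hypothetical abundances: four equally common species
print(shannon(sample), simpson(sample), effective_species(sample))
```

For a perfectly even sample the effective number of species equals the actual number of species, which is why four equally abundant species give a value of 4.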
9 Diversity – Comparing
9.1 Graphical comparison of diversity profiles
9.2 A test for differences in diversity based on the t-test
9.3 Graphical summary of the t-test for Shannon and Simpson indices
9.4 Bootstrap comparisons for unreplicated samples
9.5 Comparisons using replicated samples
This chapter shows you a variety of ways to set about comparing diversity between samples in a meaningful manner. There are two main approaches aside from a purely graphical approach. Section 9.2 explores versions of the t-test that have been developed for use with Simpson’s and Shannon indices. These are popular so I’ve devoted quite a bit of space to them. In sections 9.4 and 9.5 you’ll see methods for using bootstrapping to assess differences between samples. Bootstrapping is a way of repeatedly resampling your data at random (with replacement) and is becoming increasingly common as a technique, due no doubt to the increasing sophistication of computer software.
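The general idea of a bootstrap comparison can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the book (which works in R): individuals are resampled with replacement from each sample, the Shannon index is recalculated each time, and a percentile interval is taken for the difference. The function names, the number of replicates and the example counts are all hypothetical.

```python
import random
from math import log

def shannon(counts):
    """Shannon index H = -sum(p * ln p)."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

def resample(counts, rng):
    """Draw sum(counts) individuals with replacement; return new counts per species."""
    draws = rng.choices(range(len(counts)), weights=counts, k=sum(counts))
    new_counts = [0] * len(counts)
    for species in draws:
        new_counts[species] += 1
    return new_counts

def bootstrap_difference(a, b, reps=1000, seed=1):
    """Approximate 95% percentile interval for shannon(a) - shannon(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = sorted(
        shannon(resample(a, rng)) - shannon(resample(b, rng)) for _ in range(reps)
    )
    return diffs[int(0.025 * reps)], diffs[int(0.975 * reps) - 1]

even = [25, 25, 25, 25]   # hypothetical even community
dominated = [97, 1, 1, 1]  # hypothetical community dominated by one species
print(bootstrap_difference(even, dominated))
```

If the interval excludes zero, that suggests the two samples differ in diversity; an interval spanning zero gives no evidence of a difference.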
10 Diversity – Sampling scale
10.1 Calculating beta diversity
10.2 Additive diversity partitioning
10.3 Hierarchical partitioning
10.4 Group dispersion
10.5 Permutation methods
10.6 Overlap and similarity
10.7 Beta diversity using alternative dissimilarity measures
10.8 Beta diversity compared to other variables
Diversity can be measured at several scales and this chapter is mostly about beta diversity – that is, changes in diversity from sample to sample. Section 10.1 focusses on simply determining beta diversity between samples as well as some methods for visualising the results (as dendrograms or ternary plots). Sections 10.2 to 10.5 are concerned with methods for exploring changes in diversity between samples and getting an idea of the significance of these changes.
Section 10.6 looks at beta diversity from another aspect, that of overlap (or similarity). In section 10.7 you’ll see how you can calculate beta diversity using various metrics of dissimilarity. The final section looks at one way to explore beta diversity in relation to other variables – this involves use of Mantel tests.
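As a simple illustration of the starting point in section 10.1, beta diversity can be calculated from presence-absence data as total (gamma) richness divided by mean sample (alpha) richness. This is a sketch in Python, not the R code from the book; the function name and species labels are hypothetical, and it assumes the multiplicative form of Whittaker's beta, which is only one of several formulations.

```python
def whittaker_beta(samples):
    """Multiplicative beta diversity: gamma richness / mean alpha richness.

    samples is a list of sets, each holding the species found in one sample.
    """
    gamma = len(set().union(*samples))                        # species across all samples
    mean_alpha = sum(len(s) for s in samples) / len(samples)  # average per-sample richness
    return gamma / mean_alpha

site_a = {"sp1", "sp2"}  # hypothetical species lists
site_b = {"sp2", "sp3"}
print(whittaker_beta([site_a, site_b]))
```

Identical samples give a value of 1, and the value rises as samples share fewer species.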
11 Rank abundance or dominance models
11.1 Dominance models
11.2 Fisher’s Log-series
11.3 Preston’s Lognormal model
One way to look at diversity is to rank species in order of their abundance and then plot the results on a graph, usually with a log scale. The shape of the graph can be modelled and many attempts have been made to link ecological theory to the shape of these models. This chapter is about these rank abundance models (or dominance models). In section 11.1 you’ll see the main models used and how to identify the “best” model for your samples. Sections 11.2 and 11.3 focus on two further models, Fisher’s log-series and its “successor”, Preston’s lognormal.
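The first step, ranking species by abundance and taking logs ready for plotting, can be sketched like this. The example is in Python rather than the R used in the book, the function name and data are hypothetical, and it assumes base-10 logs; fitting the models themselves is not attempted here.

```python
from math import log10

def rank_abundance(abundance):
    """Return (rank, species, count, log10 count) tuples, most abundant first.

    abundance is a dict mapping species name to count; absent species are dropped.
    """
    ranked = sorted(
        ((sp, n) for sp, n in abundance.items() if n > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    return [(rank, sp, n, log10(n)) for rank, (sp, n) in enumerate(ranked, start=1)]

sample = {"sp1": 1, "sp2": 100, "sp3": 10}  # hypothetical counts
for row in rank_abundance(sample):
    print(row)
```

Plotting log abundance against rank gives the curve whose shape the dominance models attempt to describe.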
12 Similarity and Cluster Analysis
12.1 Similarity and Dissimilarity
12.2 Cluster Analysis
Similarity and cluster analysis is one of the grand themes of community ecology. In section 12.1 you’ll see how to produce indices of similarity (and conversely dissimilarity) for samples using presence-absence or abundance data. Following on from this, section 12.2 covers cluster analysis, where samples are grouped together. This allows you to visualise your samples based on their composition (i.e. their dissimilarity or similarity to one another). There are two main approaches to clustering, hierarchical and partitioning. Both are demonstrated using a range of techniques.
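Two widely used measures give a sense of what section 12.1 involves: the Jaccard index for presence-absence data and Bray-Curtis dissimilarity for abundance data. The sketch below is in Python, not the R or Excel used in the book, and the function names and example data are hypothetical.

```python
def jaccard_similarity(a, b):
    """Shared species divided by total species across both samples."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def bray_curtis_dissimilarity(x, y):
    """Sum of absolute abundance differences divided by total abundance.

    x and y are abundance lists with species in the same order.
    """
    return sum(abs(i - j) for i, j in zip(x, y)) / sum(i + j for i, j in zip(x, y))

print(jaccard_similarity({"sp1", "sp2", "sp3"}, {"sp2", "sp3", "sp4"}))
print(bray_curtis_dissimilarity([1, 2, 3], [3, 2, 1]))
```

A matrix of such values, calculated for every pair of samples, is the usual input to the cluster analysis described in section 12.2.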
13 Association analysis: Identifying Communities
13.1 Area approach to identifying communities
13.2 Transect approach to identifying communities
13.3 Using alternative dissimilarity measures for identifying communities
13.4 Indicator species
The analyses in the rest of the book assume that your samples are already “sorted” into communities. This chapter covers association analysis, which is the means by which you can identify which species tend to live together and which do not. In this way you can identify the species that make up various communities. This kind of analysis alters the focus from the samples to the species themselves. The main thrust of association analysis is the chi-squared test and this is what you’ll see in sections 13.1 and 13.2. In section 13.3 you’ll see how to use other metrics of dissimilarity to do a similar job.
The final section is about indicator species. The chi-squared test is not the only approach to the analysis of indicator species but it “fits” with the theme of the chapter. There are other ways to look at indicator species but I have not included them here. I hope to produce a monograph about TWINSPAN at some point in the future. The Dufrêne-Legendre method, IndVal, could have been included as it can be calculated using R; I will try to fit it into a later edition or add material to the website.
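The chi-squared approach to association in sections 13.1 and 13.2 can be illustrated for a single pair of species. This is a minimal Python sketch, not the book's R code: the function names and data are hypothetical, and it assumes a plain 2 × 2 test without a continuity correction.

```python
def cooccurrence_table(sp1, sp2):
    """Count samples with both species (a), only the first (b),
    only the second (c) and neither (d), from 0/1 presence lists."""
    pairs = list(zip(sp1, sp2))
    a = sum(1 for p, q in pairs if p and q)
    b = sum(1 for p, q in pairs if p and not q)
    c = sum(1 for p, q in pairs if not p and q)
    d = sum(1 for p, q in pairs if not p and not q)
    return a, b, c, d

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic for a 2 x 2 table (no continuity correction).
    Assumes every row and column total is greater than zero."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def association_direction(a, b, c, d):
    """+1 if the species co-occur more often than expected, -1 if less, 0 if neither."""
    return (a * d > b * c) - (a * d < b * c)

# Hypothetical presence (1) / absence (0) of two species across eight samples
species_1 = [1, 1, 1, 1, 0, 0, 0, 0]
species_2 = [1, 1, 1, 0, 0, 0, 0, 1]
table = cooccurrence_table(species_1, species_2)
print(table, chi_squared_2x2(*table), association_direction(*table))
```

With one degree of freedom, a statistic above about 3.84 is significant at the 5% level; the sign of the association tells you whether the species tend to occur together or apart.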
14 Ordination
14.1 Methods of ordination
14.2 Indirect gradient analysis
14.3 Direct gradient analysis
14.4 Using ordination results
The topic of ordination (or multivariate analysis) is a broad one. Section 14.1 gives an overview of the main methods of ordination and some clues as to which might be used for which occasion. There are two main strands to ordination: indirect gradient analysis and direct gradient analysis. The former is used when you have only species data; the latter is used when you wish to incorporate environmental data.
Section 14.2 covers Bray-Curtis ordination, which you’ll see illustrated using Excel. This is useful for “beginners” as it helps to work out what the main thrust of ordination is about. The other methods are not trivial to conduct using Excel so you’ll use R for them. Later in the section you’ll see MDS, NMDS, PCO, PCA, CA and DCA used to explore species composition. You’ll also see how to incorporate environmental data into the results of your ordination. In section 14.3 you’ll see CCA and RDA used as the main tools for direct gradient analysis. There is a short section on model-building, showing how you can build the “best” explanatory model for your data.
The final section looks at a few ancillary aspects of ordination, such as adding environmental information and identifying groupings on plots. You’ll also see how to carry out Procrustean rotation to compare ordination results. Finally, there is a short review of alternatives to ordination – most of which are covered elsewhere in the book (particularly in chapter 10).
15 Appendix
15.1 Answers to Exercises
15.2 Custom R commands in this book
Each chapter ends with a summary and several self-assessment questions. The answers to these questions are here in appendix 15.1. During the writing of this book I wrote a number of custom R commands that I thought were useful; they are mentioned in the text at appropriate points. Appendix 15.2 gives a complete list of the custom commands as well as notes on their use.