Welcome to the support page for my book, Beginning R: The Statistical Programming Language.
Here you will find a Table of Contents and brief outline to help you see what’s included in each section of the book.
The book includes many examples, and you can download a file containing the data and example code shown in the book.
My publisher is also hosting an Instructor Support Site, where you can download additional materials to help you teach R.
Table of Contents
Here is a complete table of contents, with a few notes about each chapter so you can see the main learning outcomes (click on a link to jump to the chapter description):
- Introducing R
- Starting out
- Working with objects
- Descriptive statistics and tabulation
- Data: distribution
- Simple hypothesis testing
- Introduction to graphics
- Formula notation
- Manipulating data
- Regression
- More graphs
- Writing your own scripts
1 Introducing R: What It Is and How to Get It
What you will learn in this chapter
- Discovering what R is
- Getting to the R program
- Installing it on your computer
- Starting to run the program
- Using the help system and finding help from other sources
- Obtaining additional libraries of commands
In this chapter you see how to get R and install it on your computer. You also learn how to access the built-in help system and find out about additional packages of useful analytical routines that you can add to R.
2 Starting Out: Becoming Familiar with R
What you will learn in this chapter
- How to use R for simple math
- How to store results of calculations for future use
- How to create data objects from the keyboard, clipboard, or external data files
- How to see the objects that are ready for use
- How to look at the different types of data objects
- How to make different types of data objects
- How to save your work
- How to use previous commands in the history
This chapter builds some familiarity with working with R, beginning with some simple math and culminating in importing and making data objects that you can work with (and saving data to disk for later use).
3 Starting Out: Working with Objects
What you will learn in this chapter
- How to manipulate data objects
- How to select and display parts of data objects
- How to sort and rearrange data objects
- How to construct data objects
- How to determine what form a data object is
- How to convert a data object from one form to another
This chapter deals with manipulating the data that you have created or imported. These are important tasks that underpin many of the later exercises. The skills you learn here will be put to use over and over again.
4 Data: Descriptive Statistics and Tabulation
What you will learn in this chapter
- How to summarize data samples
- How to use cumulative statistics
- How to create summary tables
- How to cross-tabulate
- How to test for different object types
This chapter is all about summarizing data. Here you learn about basic summary methods, including cumulative statistics. You also learn about cross-tabulation and how to create summary tables.
5 Data: Distribution
What you will learn in this chapter
- How to create histograms and other graphics of sample distribution
- How to examine various distributions
- How to test for the normal distribution
- How to generate random numbers
In this chapter you look at visualizing data using graphical methods—for example, histograms—as well as mathematical ones. This chapter also includes some notes about random numbers and different types of distribution (for example, normal and Poisson).
6 Simple Hypothesis Testing
What you will learn in this chapter
- How to carry out some basic hypothesis tests
- How to carry out the Student’s t-test
- How to conduct the U-test for non-parametric data
- How to carry out paired tests for parametric and non-parametric data
- How to produce correlation and covariance matrices
- How to carry out a range of correlation tests
- How to test for association using chi squared
- How to carry out goodness of fit tests
In this chapter you learn how to carry out some basic statistical methods such as the t-test, correlation, and tests of association. Learning how to do these is helpful for when you have to carry out more complex analyses and also illustrates a range of techniques for using R.
7 Introduction to Graphical Analysis
What you will learn in this chapter
- How to create a range of graphs to summarize your data and results
- How to create box-whisker plots
- How to create scatter plots, including multiple correlation plots
- How to create line graphs
- How to create pie charts
- How to create bar charts
- How to move graphs from R to other programs and to save graphs as files on disk
In this chapter you learn how to produce a range of graphs including bar charts, scatter plots, and pie charts. This is a “first look” at making graphs, but you return to this subject in Chapter 11, where you learn how to turn your graphs from merely adequate to simply stunning.
8 Formula Notation and Complex Statistics
What you will learn in this chapter
- How to use formula notation for simple hypothesis tests
- How to use formula notation in graphics
- How to carry out analysis of variance (ANOVA)
- How to conduct post-hoc tests
- How the formula syntax can be used to define complex analytical models
- How to carry out complex ANOVA
- How to draw summary graphs of ANOVA
- How to create interaction plots
As your analyses become more complex, you need a more complex way to tell R what you want to do. This chapter is concerned with an important element of R: how to define complex situations. The chapter has two main parts. The first part shows how the formula notation can be used in simple situations. The second part, which takes up the rest of the chapter, uses an important analytical method, analysis of variance (ANOVA), as an illustration. This is an important chapter because the ability to define complex analytical situations is something you will inevitably require at some point.
9 Manipulating Data and Extracting Components
What you will learn in this chapter
- How to create data frames and matrix objects ready for complex analyses
- How to create or set factor data
- How to add rows and columns to data objects
- How to use simple summary commands to extract column or row information
- How to extract summary statistics from complex data objects
This chapter builds on the previous one. Now that you have seen how to define more complex analytical situations, you learn how to make and rearrange your data so that it can be analyzed more easily. This also builds on knowledge gained in Chapter 3. In many cases, when you have carried out an analysis you will need to extract data for certain groups; this chapter also deals with that, giving you more tools that you will need to carry out complex analyses easily.
10 Regression (Linear Modelling)
What you will learn in this chapter
- How to carry out linear regression (including multiple regression)
- How to carry out curvilinear regression using logarithmic and polynomial models as examples
- How to build a regression model using both forward and backward stepwise processes
- How to plot regression models
- How to add lines of best-fit to regression plots
- How to determine confidence intervals for regression models
- How to plot confidence intervals
- How to draw diagnostic plots
This chapter is all about regression. It builds on earlier chapters and covers various aspects of this important analytical method. You learn how to carry out basic regression as well as complex model building and curvilinear regression. It is also important because it illustrates some useful aspects of R (for example, how to dissect results). The later parts of the chapter deal with graphical aspects of regression, such as how to add lines of best-fit and confidence intervals.
11 More About Graphs
What you will learn in this chapter
- How to add error bars to existing graphs
- How to add legends to plots
- How to add text to graphs, including superscripts and subscripts
- How to add mathematical symbols to text on graphs
- How to add additional points to existing graphs
- How to add lines to graphs, including mathematical expressions
- How to plot multiple series on a graph
- How to draw multiple graphs in one window
- How to export graphs to graphics files and other programs
This chapter builds on the earlier chapter on graphics (Chapter 7) and also on the previous chapter on regression. It shows you how to produce more customized graphs from your data. For example, you learn how to add text to plots and axes, and how to make superscript and subscript text and mathematical symbols. You learn how to add legends to plots and how to add error bars to bar charts or scatter plots. Finally, you learn how to export graphs to disk as high-quality graphics files, suitable for publication.
12 Writing Your Own Scripts - Beginning to Program
What you will learn in this chapter
- How to store series of commands as snippets to be used with copy/paste
- How to make your own help file
- How to create simple customized functions
- How to edit, store, and recall customized functions
- How to add notes/annotations to your scripts
- How to create complex program code
In this chapter you learn how to start producing customized functions and simple scripts that can automate your workflow and make complex and repetitive tasks a lot easier.
Example Data File
The book includes many examples and these are included in the Beginning.RData file.
Get the example data
You can download that file by clicking on the link (Beginning.RData). This one file contains all the example datasets and scripts you need for the whole book.
For additional data examples see our sister site, DataAnalytics: Ecology Matters, where you can find resources for ecology students and teachers, including data examples to use for practice and demonstration, and Custom Functions for R: The Statistical Programming Language.
Install the example data
Once you have the file on your computer you can load it into R by one of several methods:
- For Windows or Mac you can drag the RData file icon onto the R program icon; this will open R if it is not already running and load the data. If R is already open, the data will be appended to anything you already have in R; otherwise only the data in the file will be loaded.
If you have Windows or Macintosh you can load the file using menu commands or use a command typed into R:
- For Windows use File > Load Workspace, or type the following command in R:
load(file.choose())
- For Mac use Workspace > Load Workspace File, or type the following command in R (same as in Windows):
load(file.choose())
- If you have Linux then you can use the load() command but must specify the filename (in quotes) exactly, for example:
load("Beginning.RData")
The Beginning.RData file must be in your default working directory; if it is not, you must specify the location as part of the filename. Alternatively you can find the working directory in R by using the getwd() command:
getwd()
Then drag the Beginning.RData file into that directory and use the load() command:
load("Beginning.RData")
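As noted above, loading a workspace file adds its objects to whatever is already in memory. If you would rather keep them separate, one option is to load the file into its own environment. The following is only a minimal sketch: it uses a temporary stand-in file (demo_file) instead of Beginning.RData so that it runs anywhere, and the names demo_data, demo_fun and book_env are invented for illustration.

```r
# Create a small stand-in workspace file (hypothetical; in practice you
# would point at your downloaded Beginning.RData instead).
demo_data <- data.frame(x = 1:3, y = c(2.5, 3.1, 4.8))
demo_fun <- function(x) x^2
demo_file <- file.path(tempdir(), "demo.RData")
save(demo_data, demo_fun, file = demo_file)
rm(demo_data, demo_fun)

# Check that the file is where you think it is before loading it.
stopifnot(file.exists(demo_file))

# Load into a separate environment so nothing in your workspace is overwritten.
book_env <- new.env()
loaded <- load(demo_file, envir = book_env)

# load() returns (invisibly) the names of the objects it restored.
print(loaded)
print(ls(book_env))

# Retrieve a single object from the environment.
print(get("demo_data", envir = book_env))
```

With the real file you would replace demo_file with the path to Beginning.RData, for example the result of file.choose().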
Using the example data
R uses named objects so everything gets a name. You can see what is included in the Beginning.RData file by using the ls() command:
ls()
This will show you everything currently in the memory of R. Remember that names are case sensitive so that Qty is not the same as qty. There are four main kinds of object in the Beginning.RData file:
- Data
- Results
- One-line functions
- Complex functions/scripts
You can look at an object simply by typing its name.
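As a rough illustration of case sensitivity and of inspecting objects, the sketch below creates two objects whose names differ only in case. The values are invented and are not taken from the Beginning.RData file.

```r
# Two objects whose names differ only in case (values are invented).
Qty <- c(10, 20, 30)
qty <- "a different object"

# ls() lists both names, because R treats them as distinct.
print(ls())

# class() and str() give a quick description of an object...
print(class(Qty))
str(Qty)

# ...while typing the name (or calling print()) shows its contents.
print(qty)

# The two names really do refer to different things.
same_object <- identical(Qty, qty)
print(same_object)
```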
Data
Many of the objects in the Beginning.RData file are data. For example, the bv object shows some results for visits of bees to various colors of flower.
> bv
       ratio visit
Red     10.0   100
Blue     5.0    33
White   15.0    12
Green   10.0    16
Yellow   5.0    22
Orange   2.5     7
Pink     6.0    23
Purple  12.0    17
These data are used to carry out a goodness of fit test by comparing the observed visits to the theoretical ratio expected.
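One way to run such a test is with chisq.test(), giving the ratio as the expected proportions. This is a sketch under assumptions rather than the book's own listing: the bv data frame is retyped from the values shown above, and the name gof is invented. The rescale.p = TRUE argument tells R to scale the ratios so that they sum to 1.

```r
# Recreate the bv data from the values printed above.
bv <- data.frame(
  ratio = c(10, 5, 15, 10, 5, 2.5, 6, 12),
  visit = c(100, 33, 12, 16, 22, 7, 23, 17),
  row.names = c("Red", "Blue", "White", "Green",
                "Yellow", "Orange", "Pink", "Purple")
)

# Goodness of fit: observed visits against the theoretical ratio.
# rescale.p = TRUE converts the ratios into proportions summing to 1.
gof <- chisq.test(bv$visit, p = bv$ratio, rescale.p = TRUE)
print(gof)

# The expected counts implied by the ratio, for comparison with bv$visit.
print(gof$expected)
```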
Results
Some of the objects in the Beginning.RData file are results. For example the pw.kw object shows the results of a Kruskal-Wallis test.
> pw.kw
        Kruskal-Wallis rank sum test
data:  height by water
Kruskal-Wallis chi-squared = 15.205, df = 2, p-value = 0.0004992
The results of analyses are sometimes used for further analyses and to draw graphs.
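To give an idea of how a stored result can be reused, the sketch below runs a Kruskal-Wallis test and then pulls individual components out of the result. The data here (pw_demo) are invented purely for illustration and are not the data behind pw.kw, so the numbers will differ from those shown above.

```r
# Invented data: heights under three watering levels (not the book's data).
pw_demo <- data.frame(
  height = c(9, 11, 6, 14, 17, 19, 28, 31, 32,
             7, 6, 5, 14, 17, 15, 44, 38, 37),
  water = factor(rep(c("lo", "mid", "hi"), each = 3, times = 2))
)

# Run the test with formula notation and store the result as an object.
kw_demo <- kruskal.test(height ~ water, data = pw_demo)
print(kw_demo)

# The result is a list, so individual components can be extracted by name.
print(names(kw_demo))
print(kw_demo$statistic)   # the Kruskal-Wallis chi-squared value
print(kw_demo$parameter)   # the degrees of freedom
print(kw_demo$p.value)     # the p-value

# Group medians are a natural follow-up summary to report or plot.
group_medians <- tapply(pw_demo$height, pw_demo$water, median)
print(group_medians)
```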
One-line functions
R is very flexible and one useful aspect is the ability to create simple functions. For example, the pn object is a function that applies a polynomial formula to any numerical value.
> pn
function(x) (2.06*x)+(-0.04 * x^2)-2
In this case the polynomial formula was taken from a previous analysis and is used to draw a line of best-fit onto a graph.
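As a sketch of how such a function might be used, the code below evaluates pn for some values and adds the resulting curve to a plot. The function body is copied from the listing above; the x range of 0 to 50 and the axis labels are assumptions made for illustration, since the original analysis is not shown here.

```r
# The one-line function, as shown above.
pn <- function(x) (2.06 * x) + (-0.04 * x^2) - 2

# Apply it to a single value or to a whole vector.
print(pn(10))
x_vals <- seq(0, 50, by = 5)   # assumed range, for illustration only
y_vals <- pn(x_vals)
print(y_vals)

# Set up an empty plot, then add the fitted curve to it.
# In a real analysis you would plot your data points first.
plot(x_vals, y_vals, type = "n", xlab = "x", ylab = "pn(x)")
curve(pn, from = 0, to = 50, add = TRUE)
```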
Complex functions/scripts
If you require a more complex task or want to automate your workflow, you can create a longer “script”. The cum.fun object is an example of such a script.
> cum.fun
function(x, fun = median, ...) {
  tmp = seq_along(x)
  for(i in 1:length(tmp)) tmp[i] = fun(x[1:i], ...)
  cat('\n', deparse(substitute(fun)),'of', deparse(substitute(x)),'\n')
  print(tmp)
}
This script allows you to generate a cumulative statistic for a set of numbers. The default uses the median but you can specify any sensible function (the mean for example to create a running mean).
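A short usage sketch follows. The function definition is repeated so that the block runs on its own, with a closing brace supplied on the assumption that the listing is otherwise complete. The vals vector is invented for illustration.

```r
# cum.fun as listed above (closing brace assumed).
cum.fun <- function(x, fun = median, ...) {
  tmp = seq_along(x)
  for(i in 1:length(tmp)) tmp[i] = fun(x[1:i], ...)
  cat('\n', deparse(substitute(fun)), 'of', deparse(substitute(x)), '\n')
  print(tmp)
}

# Invented sample values.
vals <- c(3, 1, 4, 1, 5)

# Default: cumulative (running) median.
cum.fun(vals)

# Any function that returns a single number can be supplied instead.
running_mean <- cum.fun(vals, fun = mean)
running_max <- cum.fun(vals, fun = max)
```

Because the last step of the function is print(tmp), the result is both displayed and returned, so it can be stored in an object as shown. Extra arguments (for example na.rm = TRUE) are passed on to the chosen function through the ... argument.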
Instructor Support Materials
Instructors (teachers, lecturers, professors) can now access a range of support materials via the Instructor Companion Site on the Wiley Higher Education website (you have to register but it is free).
The materials include:
- An annotated syllabus split into 30 sections, each intended to take approximately 1 hour.
- A series of PowerPoint decks. Each deck is linked to a section of the annotated syllabus.
- Classroom exercises. These complement the 30 sections of the syllabus and form a structured approach to teaching R.
- Questions and Answers. Each of the 12 chapters has 12 questions (the answers are separate). The questions are in 3 forms:
- TRUE or FALSE?
- Multiple choice
- Fill in the missing word
If you are an instructor and are teaching R, then these materials can help you structure your course and provide you with additional materials that you can press into service as you like.
My Publications
I have written several books on ecology and data analysis.