An Introduction to R
Data Analysis and Visualisation
by: Mark Gardener
Available from Pelagic Publishing
Support and outline for the book An Introduction to R. The book is a guide to R: The Statistical Programming Language. You’ll learn how to use R for data analysis, data visualization, data manipulation, programming and a lot more. It will help you learn how to use R from the ground up, giving you a start in the world of Data Science.
On this page you’ll find information about additional articles that support the book An Introduction to R, along with the Table of Contents and a more detailed outline.
Overview
Our book An Introduction to R is designed to help you learn R: The Statistical Programming Language. Whatever your background or specialty (science, business, engineering, or social science), you’ll find this book a starting point for learning about Data Science, that is, Data Analysis, Data Visualization, and Data Management. The book is not aimed at any particular educational level and should be accessible to anyone who wants to learn R, a powerful and flexible analytical toolbox.
Who this book is for
An Introduction to R is designed for beginners as well as more seasoned veterans.
What you will learn from this book
In this book An Introduction to R you will learn how to use the R programming language. You’ll learn how to get started with R, in particular:
- Making and Importing Data items.
- Exporting Data.
- Managing and Manipulating Data objects.
- Summarizing and Aggregating Data.
- Visualizing Data.
- The basics of Data Analysis, including:
  - Differences tests.
  - Correlation.
  - Association.
  - Regression.
- R Programming Functions.
Support Articles and Links
- An Introduction to R will be published by Pelagic Publishing. See all my books at Pelagic Publishing.
- Be notified when the book is ready for sale via the Pelagic Notification page.
- For more articles visit the Tips and Tricks page and look for the various categories or use the search box.
- See also the Knowledge Base where there are other topics related to R and data science.
- Look at our other books on the Publications Page.
- Visit our other site at GardenersOwn for more ecological matters.
Table of Contents
The main chapter headings for An Introduction to R are as follows:
- A brief introduction to R
- Basic math
- Introduction to R objects
- Making and importing data objects
- Managing and exporting data objects
- R object types and their properties
- Working with data objects
- Manipulating data objects
- Summarizing data
- Tabulation
- Graphics: Basic Charts
- Graphics: Adding to plots
- Graphics: Advanced methods
- Analyze data: Statistical analyses
- Programming tools
- Appendix
Outline
There are 15 main chapters in the book An Introduction to R. Each chapter starts with a brief overview (the basis for this outline) and ends with a summary table. There are also some self-assessment exercises (answers in the Appendix).
1. A brief introduction to R
The R program is a free Open Source statistical environment. R is a programming language that carries out statistical computing and can produce high quality graphics. R runs on all the major operating systems, e.g. Windows, Apple Macintosh and most Linux distributions. This chapter shows you how to get R and how to get started.
What’s in this chapter
- Obtaining and installing R.
- Introduction to the R interface.
- Getting help in and about R.
- Extending R with additional command packages.
- Alternative ways to run R.
2. Basic math
R is a statistical programming environment so you’d expect it to be able to perform in the math department. This chapter is an introduction to how to carry out some mathematical operations using R.
What’s in this chapter
- Simple math.
- Not so simple math:
  - Logarithms.
  - Trigonometry.
  - Modulo math.
- Making numbers look nicer:
  - Rounding.
  - Exponent notation.
  - Significant figures.
- Complex numbers.
This chapter is about some of the more basic math; in later chapters you’ll see more about statistics, summarizing data, hypothesis testing and modelling.
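As a rough illustration of the topics listed for this chapter (a sketch that is not taken from the book), the following uses only base R functions. All object names are hypothetical.

```r
# Simple math: the usual arithmetic operators
simple <- (3 + 4) * 2 / 7            # 2

# Logarithms and trigonometry
log_e  <- log(100)                   # natural logarithm
log_10 <- log10(100)                 # 2
sin_90 <- sin(pi / 2)                # angles are given in radians

# Modulo math: remainder and integer division
remainder <- 17 %% 5                 # 2
quotient  <- 17 %/% 5                # 3

# Making numbers look nicer
rounded <- round(3.14159, digits = 2)          # 3.14
sig     <- signif(123456, digits = 2)          # 120000
expo    <- format(123456, scientific = TRUE)   # a character string

# Complex numbers
z      <- complex(real = 3, imaginary = 4)
z_size <- Mod(z)                     # 5, the modulus of 3+4i
```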
3. Introduction to R objects
This chapter is a brief introduction to R objects. R deals with objects, which usually means either data or functions (instructions to “do something”). You’ll see more about the properties (attributes) of objects in Chapter 6 R object types and their properties.
What’s in this chapter
- Naming conventions for R objects.
- Assigning results to named objects.
- Constants, datasets and built-in R objects.
- An introduction to the different kinds of R object.
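The ideas in this chapter might look something like the sketch below. The object names are invented for illustration; `pi`, `letters` and the `women` dataset are built into R.

```r
# Assign a result to a named object with the assignment operator <-
bird_count <- 12 + 7
bird.total <- bird_count * 2         # names may contain letters, digits, . and _

# Built-in constants
first_three <- letters[1:3]          # "a" "b" "c"
circle_area <- pi * 2^2

# A built-in dataset is an object like any other
data_kind <- class(women)            # "data.frame"

# Functions are objects too
fn_kind <- class(mean)               # "function"

# A few of the different kinds of R object
num_vec  <- c(1.5, 2.5, 3.5)         # numeric vector
chr_vec  <- c("low", "high")         # character vector
a_matrix <- matrix(1:6, nrow = 2)    # matrix with 2 rows and 3 columns
a_list   <- list(n = num_vec, s = chr_vec)
```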
4. Making and importing data objects
So far you’ve done some simple math but not dealt with data objects in any meaningful way. In this chapter you’ll learn how to join things together to make larger data items and how to get data into R from external sources.
What’s in this chapter
- Joining items.
- Number sequences.
- Repeated elements.
- Enter data via the keyboard.
- Import data:
  - Paste data from the clipboard.
  - Import data from a disk file (or Internet location).
See Chapter 8 Manipulating data objects for details about making random numbers and generally adding data to existing objects.
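A minimal sketch of making and importing data follows, assuming a small CSV file that the example writes to a temporary location first so that it is self-contained. The file contents and object names are hypothetical.

```r
# Joining items together with c()
joined <- c(2, 4, 6, c(8, 10))

# Number sequences and repeated elements
sequence <- seq(from = 1, to = 10, by = 3)    # 1 4 7 10
repeated <- rep(c("a", "b"), times = 2)       # "a" "b" "a" "b"

# Keyboard entry is interactive, so it is only shown as a comment here:
# typed <- scan()

# Import from a disk file: create a small CSV, then read it back in
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,count", "A,3", "B,5"), csv_path)
imported <- read.csv(csv_path)
```

`read.csv()` also accepts a URL in place of a file path, which covers the "Internet location" case.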
5. Managing and exporting data objects
This chapter is about managing your assets and exporting data and other items to disk.
What’s in this chapter
- View objects in your workspace.
- Manage and remove objects from the workspace.
- Manage files and folders on disk.
- Export different kinds of object to disk files.
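The workspace and export tasks listed above could be sketched as follows. The object and file names are hypothetical, and files are written to R's temporary folder rather than to a real project folder.

```r
temp_counts <- c(3, 5, 8)
sites <- data.frame(site = c("A", "B", "C"), count = temp_counts)

# View objects in the workspace, then remove one
ls()
rm(temp_counts)

# Manage files and folders on disk
getwd()                              # current working directory
out_dir <- tempdir()                 # temporary folder used for this sketch
list.files(out_dir)

# Export a data frame as a CSV text file
csv_file <- file.path(out_dir, "sites.csv")
write.csv(sites, csv_file, row.names = FALSE)

# Export any single R object in R's own binary format, then restore it
rds_file <- file.path(out_dir, "sites.rds")
saveRDS(sites, rds_file)
restored <- readRDS(rds_file)
```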
6. R object types and their properties
This chapter is about the properties of R objects and how you can view and alter them. So far you’ve seen that there are different kinds of R object, and had a brief introduction to them in Chapter 3 Introduction to R objects. In this chapter the focus is on the properties, such as names for rows and columns, and getting information about the size and shape of your data.
What’s in this chapter
- Object class: how to see what it is and how to change it.
- Size and shape: how to see the extent of your data object.
- Names: how to view the names for rows, columns and so on, and how to alter them.
- Preview: how to get an overview of large datasets.
See Chapter 8 Manipulating data objects for details about altering the contents of data objects.
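For illustration, the properties described in this chapter can be inspected roughly as below. The sketch uses the built-in `mtcars` dataset, which this page does not mention, and hypothetical object names.

```r
# Object class: see what it is
obj_class <- class(mtcars)           # "data.frame"

# Size and shape
size   <- dim(mtcars)                # rows, then columns
n_rows <- nrow(mtcars)
n_cols <- ncol(mtcars)

# Names: view them, then alter them on a copy
col_names <- names(mtcars)
cars <- mtcars[, 1:3]
names(cars) <- c("miles_per_gallon", "cylinders", "displacement")

# Object class: change it
cyl_factor <- as.factor(cars$cylinders)

# Preview: an overview of a larger dataset
head(cars, n = 3)
str(cars)
```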
7. Working with data objects
This chapter is about your data, specifically how to “slice and dice” objects. In other words, it’s about how to extract bits of your data that you want/need to deal with and how to rearrange and sort your data into some kind of “sensible” order.
What’s in this chapter
- Elements within R objects.
- Subsets and selecting elements.
- Missing values and complete cases.
- Sampling.
- Rearranging data.
- Sorting.
- Indexing.
- Ranking.
- Working with text (character) data.
- Working with factor objects.
If you need to add data, remove elements, merge data, or change data shape or layout, then the next chapter, Chapter 8 Manipulating data objects, is what you’ll need.
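A short sketch of "slicing and dicing" follows. It uses the built-in `airquality` dataset (chosen here because it contains missing values) and hypothetical object names.

```r
ozone <- airquality$Ozone            # a numeric vector that contains NA values

# Elements and subsets
third_value <- ozone[3]
hot_days    <- airquality[airquality$Temp > 90, ]

# Missing values and complete cases
n_missing <- sum(is.na(ozone))
complete  <- airquality[complete.cases(airquality), ]

# Sampling (the seed makes the random draw repeatable)
set.seed(42)
picked <- sample(1:10, size = 3)

# Sorting, indexing and ranking
x      <- c(30, 10, 20)
sorted <- sort(x)                    # 10 20 30
idx    <- order(x)                   # 2 3 1
rnk    <- rank(x)                    # 3 1 2

# Text (character) data and factor objects
upper <- toupper("sparrow")          # "SPARROW"
sizes <- factor(c("low", "high", "low"), levels = c("low", "high"))
levels(sizes)
```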
8. Manipulating data objects
This chapter is concerned with manipulating data objects and changing them in some manner. This might involve adding new variables or cases, or perhaps deleting them. Topics include how to merge datasets and how to alter the layout or “shape” of data objects.
What’s in this chapter
- Random variates.
- Making “empty” objects.
- Adding new elements.
- Adding variables.
- Adding rows/cases.
- Removing elements.
- Merging datasets.
- Transforming elements.
- Transforming contents.
- Altering data layout or shape.
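The manipulations listed above might be sketched like this. The two small data frames (`plots` and `soils`) are made up for the example and are not from the book.

```r
# Random variates (the seed makes the draw repeatable)
set.seed(1)
heights <- rnorm(5, mean = 170, sd = 10)

# Making "empty" objects
empty_vec <- numeric(0)
empty_df  <- data.frame()

plots <- data.frame(plot = c("p1", "p2"), count = c(4, 9))

# Adding a variable, then adding a row/case
plots$area <- c(10, 20)
plots <- rbind(plots, data.frame(plot = "p3", count = 6, area = 15))

# Removing elements
plots <- plots[plots$plot != "p2", ]

# Transforming contents
plots$log_count <- log(plots$count)

# Merging datasets on a shared key column
soils  <- data.frame(plot = c("p1", "p3"), ph = c(6.5, 7.1))
merged <- merge(plots, soils, by = "plot")

# Altering layout: stack two columns into one long column
long <- stack(merged[, c("count", "area")])
```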
9. Summarizing data
This chapter is concerned with data summary, that is, ways to simplify data and make them more easily understandable. Classic methods of summarizing data include statistics such as averages and measures of the spread of values around the average. However, there are other ways of summarizing data, such as group proportions and percentages. In this chapter you’ll see how to carry out a variety of summarizing functions that will help you deal with data in various ways.
What’s in this chapter
- Summary functions.
- Cumulative functions.
- Functions on entire elements (e.g. rows/columns) of data.
- Functions on groups in your data.
- Custom data summary routines.
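As a hedged example of these kinds of summary, the sketch below uses the built-in `mtcars` dataset. The custom `std_error()` function is a hypothetical helper written for this illustration.

```r
mpg <- mtcars$mpg

# Summary functions: average and spread
avg    <- mean(mpg)
med    <- median(mpg)
spread <- sd(mpg)

# Cumulative functions
running <- cumsum(c(1, 2, 3, 4))     # 1 3 6 10

# Functions on entire rows/columns
col_means <- colMeans(mtcars[, c("mpg", "hp")])
row_max   <- apply(mtcars[, c("mpg", "hp")], 1, max)

# Functions on groups in the data
by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
agg    <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# A custom summary routine: standard error of the mean
std_error <- function(x) sd(x) / sqrt(length(x))
se_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, std_error)
```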
10. Tabulation
This is a chapter about data tables, that is, frequency tables (also known as contingency tables). Frequency tables are a useful way of summarizing categorical data and have many uses. R contains various functions for creating and manipulating table data.
What’s in this chapter
Major types of tabulation:
- table(): basic tabulation.
- ftable(): flat tables.
- xtabs(): cross tabulation.
Also:
- Convert other R objects into table format.
- Manipulate and alter tables.
- Summary functions for table objects.
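The three tabulation functions named above can be tried out along these lines. The sketch uses the built-in `mtcars` and `Titanic` datasets plus a small made-up matrix; none of these choices come from the book.

```r
# table(): basic tabulation of one or two variables
cyl_table <- table(mtcars$cyl)
two_way   <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# ftable(): flatten a multi-way table into a compact layout
flat <- ftable(Titanic, row.vars = c("Class", "Sex"))

# xtabs(): cross tabulation using a formula
cross <- xtabs(~ cyl + am, data = mtcars)

# Convert another R object (here a matrix) into table format
counts <- matrix(c(12, 5, 7, 9), nrow = 2,
                 dimnames = list(site = c("A", "B"), species = c("x", "y")))
counts_table <- as.table(counts)

# Summary functions for table objects
row_props <- prop.table(two_way, margin = 1)   # proportions within each row
totals    <- addmargins(two_way)               # adds row and column sums
```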
11. Graphics: Basic Charts
This chapter is concerned with data visualization. R has powerful and flexible graphical capabilities, and using R you can produce a wide range of graphs and charts. In this chapter you will see how to produce a variety of graphs and charts for various purposes. You will also see how to customize your plots, allowing you to visualize data in different ways.
What’s in this chapter
- Making various graphs and charts.
- Commonly used and useful graphical parameters.
- Visualizing data distribution.
- Histogram.
- Density plot.
- QQ plots.
- Visualizing sample differences.
- Box-whisker plots.
- Bar charts.
- Visualizing relationships.
- Scatter plots.
- Line charts.
- Compositional data.
- Pie charts.
- Mosaic plots.
- Dot charts.
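A sketch of the chart types listed above, using base R graphics and the built-in `mtcars` and `pressure` datasets (an assumption of this example). The plots are sent to a temporary PDF file so that the code runs without a display.

```r
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)                       # each plot goes on a new page

# Visualizing data distribution
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon",
     col = "lightblue")
plot(density(mtcars$mpg), main = "Density of mpg")
qqnorm(mtcars$mpg)
qqline(mtcars$mpg)

# Visualizing sample differences
boxplot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "mpg")
barplot(tapply(mtcars$mpg, mtcars$cyl, mean), ylab = "Mean mpg")

# Visualizing relationships
plot(mpg ~ wt, data = mtcars, pch = 16)                      # scatter plot
plot(pressure$temperature, pressure$pressure, type = "l")    # line chart

# Compositional data and dot charts
pie(table(mtcars$gear))
mosaicplot(table(mtcars$cyl, mtcars$gear), main = "Cylinders by gears")
dotchart(mtcars$mpg[1:10], labels = rownames(mtcars)[1:10])

invisible(dev.off())                 # close the file
```

Arguments such as `main`, `xlab`, `col` and `pch` are examples of the commonly used graphical parameters.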
12. Graphics: Adding to plots
In Chapter 11 you learnt how to create a range of graphical objects. This chapter is about how to add extra content to those graphics and improve your data visualization. There are many ways to add to existing plot windows, and various functions dedicated to that purpose.
What’s in this chapter
Adding elements to existing plots:
- Adding more data.
  - as points.
  - as lines.
- Adding grid-lines.
- Adding best-fit lines.
- Adding error bars.
- Adding titles.
- Adding text.
  - to the plot.
  - in the margin.
  - text formatting.
- Adding legends.
- Adding axes.
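The additions listed above could be layered onto a single scatter plot roughly as follows. This is a sketch using the built-in `mtcars` dataset; the labels, colours and error-bar values are arbitrary choices made for the example.

```r
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

auto   <- mtcars[mtcars$am == 0, ]
manual <- mtcars[mtcars$am == 1, ]

# Base plot drawn without axes or titles so they can be added afterwards
plot(auto$wt, auto$mpg, xlim = range(mtcars$wt), ylim = range(mtcars$mpg),
     axes = FALSE, xlab = "", ylab = "")

# Adding axes
axis(side = 1)
axis(side = 2)
box()

# Adding more data, as points and as lines
points(manual$wt, manual$mpg, pch = 17, col = "red")
lines(lowess(mtcars$wt, mtcars$mpg), col = "blue")

# Adding grid-lines and a best-fit line
grid()
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, lty = 2)

# Adding an error bar: mean mpg plus/minus one standard deviation
mx <- mean(mtcars$wt)
my <- mean(mtcars$mpg)
s  <- sd(mtcars$mpg)
arrows(mx, my - s, mx, my + s, angle = 90, code = 3, length = 0.05)

# Adding titles, text in the plot, and italic text in the margin
title(main = "Fuel economy and weight", xlab = "Weight", ylab = "mpg")
text(x = 4.5, y = 30, labels = "Heavier cars use more fuel")
mtext("Data: mtcars", side = 3, line = 0.5, font = 3)

# Adding a legend
legend("topright", legend = c("Automatic", "Manual"),
       pch = c(1, 17), col = c("black", "red"))

invisible(dev.off())
```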
13. Graphics: Advanced methods
The graphical capabilities of R are extensive. In previous chapters you have seen examples of various “basic” charts, and methods for adding various useful elements to your data visualization (e.g. axis titles, error bars, legends). In this chapter you’ll see various “advanced” graphical topics, things that simply don’t fit neatly into the “basic” category.
What’s in this chapter
- Working with color.
- Tweaking the graphical system.
- Split plot windows.
- Exporting graphics to disk.
- Time-series plots.
- Multivariate plots:
  - scatter plot matrix.
  - interaction plots.
  - conditional plots.
  - matrix plots.
  - multivariate time-series.
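To give a feel for these "advanced" topics, here is a sketch that uses several built-in datasets (`mtcars`, `AirPassengers`, `iris`, `ToothGrowth`, `quakes`, `EuStockMarkets`). These datasets and the layout choices are assumptions of the example, not taken from the book.

```r
# Exporting graphics to disk: open a file device, plot, then close it
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 8, height = 6)

# Tweaking the graphical system and splitting the plot window (2 x 2 grid)
old_par <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))

# Working with color: a palette indexed by group
shades <- rainbow(3)
plot(mtcars$wt, mtcars$mpg, col = shades[factor(mtcars$cyl)], pch = 16)

# Time-series plot
plot(AirPassengers)

# Matrix plot: one line per column of a matrix
matplot(as.matrix(iris[, 1:4]), type = "l", ylab = "Measurement")

# Interaction plot
interaction.plot(ToothGrowth$dose, ToothGrowth$supp, ToothGrowth$len)

par(old_par)                         # restore the previous settings

# The plots below manage their own layout, one page each
pairs(iris[, 1:4])                           # scatter plot matrix
coplot(lat ~ long | depth, data = quakes)    # conditional plot
plot(EuStockMarkets)                         # multivariate time-series

invisible(dev.off())
```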
14. Analyze data: Statistical analyses
R has a wide range of statistical capabilities, and can compute many statistics. The base distribution of R includes functions that can carry out many different kinds of data analysis. In this chapter you’ll see some of the more widely used analytical methods.
The functions you see here will give you a flavour of the capabilities of R.
What’s in this chapter
- Tests of data distribution
- Distribution families
- Tests for differences
- T-test
- U-test
- Analysis of variance (ANOVA)
- Kruskal-Wallis
- Tests of relationships
- Correlation
- Regression
- Tests of association
- Chi-Squared
- Goodness of Fit
This is not a book about statistics, so it is not possible to include all the statistical analyses that R can conduct. This chapter includes a table giving brief details of various other statistical tests that R can conduct. See also the Tips and Tricks page to find extra information that there wasn’t room for in the book.
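The tests named in the list above are all available in base R. The sketch below shows one possible call for each, using built-in datasets (`mtcars`, `ToothGrowth`, `HairEyeColor`) and a made-up vector of counts; it makes no claim about which examples the book uses.

```r
# Test of data distribution, and a distribution family (normal)
normality <- shapiro.test(mtcars$mpg)
upper_q   <- qnorm(0.975)            # quantile of the standard normal

# Tests for differences
t_res   <- t.test(len ~ supp, data = ToothGrowth)
u_res   <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)
aov_res <- aov(len ~ factor(dose), data = ToothGrowth)
kw_res  <- kruskal.test(len ~ dose, data = ToothGrowth)

# Tests of relationships
cor_res <- cor.test(mtcars$mpg, mtcars$wt)
reg     <- lm(mpg ~ wt, data = mtcars)

# Tests of association: hair colour by eye colour
hair_eye <- margin.table(HairEyeColor, c(1, 2))
assoc    <- chisq.test(hair_eye)

# Goodness of fit: observed counts against expected proportions
gof <- chisq.test(c(20, 30, 50), p = c(0.25, 0.25, 0.5))

summary(aov_res)
summary(reg)
```

Each test returns an object whose parts, such as `t_res$p.value`, can be extracted for further use.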
15. Programming tools
In this chapter you’ll see some of the ways you can use special programming tools to help you make the most of R. This is not an exhaustive thesis about programming, but you’ll see some of the basic tools that will carry you a long way.
If you have followed some of the examples in this book then you’ll also have done some programming already.
What’s in this chapter
- R Scripts
- Functional Programming
- Input parameters
- Function results
- User intervention
- Conditional expressions
- Error trapping
- Loops
- Complex result objects
- Custom classes
- Managing Functions
- R Environments
One of the strengths of R is that you can customize it for your own solutions. You can do this in two main ways:
- Custom scripts.
  - Run using source() or from within a “helper” program.
  - Do the same thing over and over.
  - Limited user intervention.
- Custom commands/functions.
  - Allow great flexibility.
  - Allow user intervention.
  - Custom functions saved as .RData files.
  - Loaded using load() command.
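The two approaches can be sketched as below. The function `describe()`, the class name `description` and the file names are all hypothetical, invented for this illustration. The sketch also touches on input parameters, conditional expressions, error trapping, loops and custom classes.

```r
# A custom function with input parameters (one has a default value)
describe <- function(x, digits = 2) {
  # Conditional expression guarding against unsuitable input
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  # A complex result object given a custom class
  result <- list(mean = round(mean(x), digits), n = length(x))
  class(result) <- "description"
  result
}

# A print method for the custom class
print.description <- function(x, ...) {
  cat("n =", x$n, " mean =", x$mean, "\n")
  invisible(x)
}

# Loop: apply the function to several columns
results <- list()
for (column in c("mpg", "wt")) {
  results[[column]] <- describe(mtcars[[column]])
}

# Error trapping: capture the error message instead of halting
trapped <- tryCatch(describe("text"),
                    error = function(e) conditionMessage(e))

# Custom functions saved to an .RData file and loaded again with load()
fn_file <- file.path(tempdir(), "my_functions.RData")
save(describe, print.description, file = fn_file)
rm(describe, print.description)
load(fn_file)

# Custom script: write a one-line script, then run it with source()
script_file <- tempfile(fileext = ".R")
writeLines("script_total <- sum(1:10)", script_file)
source(script_file)
```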
16. Appendix
Here you can find the answers to the end-of-chapter exercises.