Dr. Mark Gardener 

Using R for statistical analyses: Graphs 2

This page is intended to be a help in getting to grips with the powerful statistical program called R. It is not intended as a course in statistics. If you have an analysis to perform I hope that you will be able to find the commands you need here and copy/paste them into R to get going.

On this page you can find out how to produce a range of graphs to illustrate your analyses, specifically scatter plots, stem-leaf plots and pie charts. To find out about bar charts, histograms and box-whisker plots go to the Graphs 1 page.

I run courses in using R; if you are interested then see our Courses page or contact us for details. See my books about R and Data Science on my Publications page. You might also like my random essays on selected R topics in MonogRaphs.

R is Open Source 
What is R?

R is an open-source (GPL) statistical environment modeled after S and S-Plus. The S language was developed at AT&T's Bell Labs, starting in the mid-1970s. The R project was started by Robert Gentleman and Ross Ihaka (hence the name, R) of the Statistics Department of the University of Auckland in 1995. It has quickly gained a widespread audience. It is currently maintained by the R core development team, a hard-working, international team of volunteer developers. The R project web page is the main site for information on R. At this site are directions for obtaining the software, accompanying packages and other sources of documentation.

R is a powerful statistical program but it is first and foremost a programming language. Many routines have been written for R by people all over the world and made freely available from the R project website as "packages". However, the basic installation (for Linux, Windows or Mac) contains a powerful set of tools for most purposes.

Because R is a programming language it can seem a bit daunting; you have to type in commands to get it to work. However, it does have a Graphical User Interface (GUI) to make things easier. You can also copy and paste text from other applications into it (e.g. word processors). So, if you have a library of these commands it is easy to pop in the ones you need for the task at hand. That is the purpose of this web page: to provide a library of basic commands that the user can copy and paste into R to perform a variety of statistical analyses.


R is not a point and click interface. However, it has
great power and versatility.

Introduction to Graphing

R has great graphical power but it is not a point and click interface. This means that you must use typed commands to get it to produce the graphs you desire. This can be a bit tedious at first but once you have the hang of it you can save a list of useful commands as text that you can copy and paste into the R command line.

The plot() command is used to make scatter plots. The plot() command accepts input as plot(x, y) or as a formula, plot(y ~ x). Use the abline() command to draw a straight line of best fit. Additional instructions covered here: xlab, ylab, col, pch, xlim, ylim and cex.
Scatter Plots

A scatter plot is used when you have two variables to plot against one another. R has a basic command to perform this task: plot(). As usual with R there are many additional parameters that you can add to customise your plots. The basic command is:

plot(x, y)

Where x is the name of your x-variable and y is the name of your y-variable. This is fine if you have two separate variables, but if they are part of a bigger data set then you have to remember to attach(data.file) your data set first. A more powerful command is:

plot(y ~ x, data= your.data)

Note the use of the model syntax. This model syntax is used widely in R, for example when setting up ANOVA and regression analyses (see also its use in the box-whisker plot).

R comes with a number of data sets built in; these are used in the examples and can be useful to 'play with'. For example the data set cars contains two variables, speed and dist. To see a basic scatter plot try the following:

> plot(dist ~ speed, data= cars)

This basic scatter plot takes the axis labels from the variables and uses open circles as the plotting symbol. As usual with R we have a wealth of additional commands at our disposal to beef up the display. A useful addition is a line of best fit. The abline() command adds to the current plot (like the title() command). For the above example you would type:

> abline(lm(dist ~ speed, data= cars))

The basic command uses abline(a, b), where a= intercept and b= slope. Here you use a linear model command to calculate the best-fit equation (try typing the lm() command separately; you get the intercept and slope). If you combine this with a couple of extra lines you can produce a better looking plot:

> plot(dist ~ speed, data= cars, xlab="Speed", ylab="Distance", col= "blue")
This illustrates several additional instructions. You have set the axis labels and the colour of the plotting symbols. You can then add a main title with the title() command, setting the font to bold italic with font.main= 4 (try other values), and draw the best-fit line in red by adding col= "red" to the abline() command.

You can alter the plotting symbol using the instruction pch= n, where n is a simple number. You can also alter the range of the x and y axes using xlim= c(lower, upper) and ylim= c(lower, upper). The size of the plotted points is manipulated using the cex= n instruction, where n is the 'magnification' factor. Here are some commands that illustrate these parameters:

> plot(dist ~ speed, data= cars, pch= 19, xlim= c(0,25), ylim= c(20, 120), cex= 2)

Here the plotting symbol is set to 19 (a solid circle) and expanded by a factor of 2. Both x and y axes have been rescaled. The axis labels have not been specified, so they default to the names of the variables (which are taken from the data set).
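The scatter plot steps described above can be collected into one short script. This is only a sketch: the title text is an illustrative assumption (it is not taken from this page), and the object names fit, intercept and slope are hypothetical. It also shows that the coefficients reported by lm() come in the order intercept, then slope, which is the order abline(a, b) expects.

```r
# Sketch of the scatter plot workflow using the built-in cars data set.
# The title text and the object names are illustrative assumptions.

# Fit the linear model once so that it can be inspected and reused
fit <- lm(dist ~ speed, data = cars)

# coef() returns the intercept first and the slope second
intercept <- coef(fit)[["(Intercept)"]]
slope     <- coef(fit)[["speed"]]

# Scatter plot with axis labels, blue solid circles, slightly enlarged
plot(dist ~ speed, data = cars,
     xlab = "Speed", ylab = "Distance",
     col = "blue", pch = 19, cex = 1.2)

# Main title in bold italic (font.main = 4)
title(main = "Stopping distance against speed", font.main = 4)

# Best-fit line in red, drawn directly from the fitted model
abline(fit, col = "red")

# The same line could be drawn from the two numbers instead:
# abline(a = intercept, b = slope, col = "red")
```

Passing the fitted model to abline() avoids retyping the coefficients; the a/b form is useful when you only have the intercept and slope as numbers.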

The stem() command makes a simple stem & leaf plot 
Stem and leaf plots

A very basic yet useful plot is a stem and leaf plot. It is a quick way to represent the distribution of a single sample. The basic command is:

stem(variable)

Here is a vector of numbers saved as the variable test.data:

[1] 2.1 2.6 2.7 3.2 4.1 4.3 5.2 5.1 4.8 1.8 1.4 2.5 2.7 3.1 2.6 2.8

To see the stem plot of these data you type:

> stem(test.data)

The decimal point is at the |
You can now see quite clearly that the data are not normally distributed. This is a useful command for moderately small samples as you can easily reconstruct the original data from the plot. For larger samples the barplot() function may be used to create a frequency plot. Alternatively a histogram may be more useful.
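If you want to try the stem and leaf plot yourself, the sample can be typed in with c(). The following is a minimal sketch using the values listed above; the scale= 2 setting and the empty histogram title are illustrative choices rather than anything prescribed on this page.

```r
# Enter the sample shown above as a numeric vector
test.data <- c(2.1, 2.6, 2.7, 3.2, 4.1, 4.3, 5.2, 5.1,
               4.8, 1.8, 1.4, 2.5, 2.7, 3.1, 2.6, 2.8)

# Basic stem and leaf plot, printed as text in the console
stem(test.data)

# The scale argument stretches the plot over more stems
stem(test.data, scale = 2)

# For comparison, a histogram of the same sample (no main title)
hist(test.data, xlab = "Value", main = "")
```

Because stem() prints text rather than drawing to a graphics window, its output can be pasted straight into a report or an email.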

The pie() command produces pie charts. Use the clockwise = TRUE instruction to alter the direction of the slices. Control colours using the col instruction.
Pie charts

Pie charts are not necessarily the most useful way of displaying data but they remain popular. You can produce pie charts easily in R using the basic command pie(). To start with, get your data organised into a .CSV file. Make a file with multiple columns; each column can then have a title and a single value (to plot). Here is a simple example file:
To produce a simple pie chart you type the following:

> pie(pie.data)

This is a basic chart; you can see that the names of the columns have been appended to each slice. You can add a title in the usual way using the title() command. By default the slices are presented in anticlockwise order; you can alter this by adding the instruction clockwise= TRUE. The colours are set to pastel shades by default; to alter them you can add a list of colours to the command in the form col= c("col1", "col2", "col3"). Here is the finished article:

> pie(pie.data, clockwise=TRUE, col= c("red", "orange", "yellow", "green", "blue", "purple"))

Now you have clockwise slices with your own selection of colours. The title was set with a separate title() command and the font set to bold italic (try other values).
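The pie chart steps can be tried from start to finish as follows. This is a sketch under stated assumptions: the column names A to F, their values, the temporary file and the title text are all made up for illustration, and the unlist() step is one possible way of preparing pie.data, not necessarily the one used for this page. It is needed here because pie() expects a vector of non-negative numbers, whereas read.csv() returns a data frame.

```r
# Hypothetical example file: six titled columns, one value in each.
# The names and values are invented purely for illustration.
csv.file <- tempfile(fileext = ".csv")
writeLines(c("A,B,C,D,E,F",
             "12,8,15,5,9,11"), csv.file)

# read.csv() gives a one-row data frame
pie.df <- read.csv(csv.file)

# Convert the single row to a named numeric vector for pie()
pie.data <- unlist(pie.df[1, ])

# Clockwise slices with a custom colour for each slice
pie(pie.data, clockwise = TRUE,
    col = c("red", "orange", "yellow", "green", "blue", "purple"))

# Title added with a separate command, in bold italic
title(main = "Example pie chart", font.main = 4)
```

The names of the vector become the slice labels, so the column titles in the file are what appear on the chart.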
