The 3 Rs – Writing
You can probably categorize the sorts of things that you can write into four types:
- Data.
- R Objects.
- Graphics.
- Scripts.
Data is a general term and by this I mean numbers and characters that you might reasonably suppose could be handled by a spreadsheet. Your data are what you use in your analyses.
R objects may be data or other things, such as custom R commands or results. They are usually stored (on disk) in a format that can only be read by R but sometimes they may be in text form.
Graphics are anything that you produce in a separate graphics window, which seems fairly obvious. These items do not appear as regular R objects and have to be treated differently.
Scripts are collections of R commands that are designed to be run “automatically”. They are generally saved (on disk) in text format.
Writing data files
It is useful to be able to write a basic dataset to disk in a standard format that allows it to be opened by different people. The basic distribution of R allows you to save basic text formats easily. If you need to write a proprietary format, such as XLSX, you’ll need to use additional command packages.
Writing basic text formats
Basic text formats are the most generally useful formats for saving datasets, since they can be handled by the widest range of programs. The comma delimited format (.csv) is the most widely used but Tab and space delimited are also commonly encountered. The workhorse is the write.table() command for this kind of work.
The write.table() command allows you to specify a range of options so you can tailor the output exactly as you want it. However, there are also two convenience commands that help to produce CSV files:
write.csv()
write.csv2()
The write.csv() command gives common defaults that produce basic CSV files, whilst the write.csv2() command produces European style CSV, with commas as decimal point characters and the semi-colon as the delimiter.
The write.table() command
This is the basic command and has a range of options.
write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ", eol = "\n", na = "NA", dec = ".", row.names = TRUE, col.names = TRUE, qmethod = c("escape", "double"))
x | The object to be written; ideally this is a data.frame or matrix. |
file = "" | The filename in quotes; if blank, the output goes to the current device (usually the screen). Filename defaults to the current working directory unless specified explicitly. Can also link to URL. For Windows and Mac OS the filename can be replaced by file.choose(), which brings up a file browser. |
append = FALSE | If the output is a file, append = TRUE adds the result to the file, otherwise the file is overwritten. |
quote = TRUE | Adds quote marks around text items if set to TRUE (the default). |
sep = " " | The separator between items (a space by default); for write.csv the default is "," whilst for write.csv2 it is ";". Specify "\t" for a Tab character. |
eol = "\n" | Sets the character(s) to print at the end of a row. The default, "\n", creates a newline only. Use "\r\n" to mimic Windows endings. |
na = "NA" | Sets the character string to use for missing values in the data. |
dec = "." | The decimal point character. For write.csv2 this is ",". |
row.names = TRUE | If set to FALSE, the row names are not written. A separate vector of values can be given to use as row names. |
col.names = TRUE | If set to FALSE, the column names are not written. A separate vector of values can be given to use as column names. If col.names = NA, a blank column name is added to the header to accommodate the row names (this is the default for write.csv and write.csv2). |
qmethod = "escape" | Specifies how to deal with embedded double quote characters. The default "escape" precedes each embedded quote with a backslash and "double" doubles the quotes. |
Using the write.table() command is quite straightforward but you need to be aware of how row names are dealt with. Here is a simple data.frame with two columns and three rows, which are named:
> dat = data.frame(col1 = 1:3, col2 = 4:6)
> rownames(dat) = c("First", "Second", "Third")
> dat
       col1 col2
First     1    4
Second    2    5
Third     3    6
The defaults assume that there are both column and row names:
> write.table(dat, file = "") # sends to screen
"col1" "col2"
"First" 1 4
"Second" 2 5
"Third" 3 6
The file may not be read correctly because there is one fewer item in the first row than in the other rows. R will generally read such files okay but your spreadsheet will not. You need to add an extra item to the column names, which you do by specifying col.names = NA like so:
> write.table(dat, file = "", col.names = NA)
"" "col1" "col2"
"First" 1 4
"Second" 2 5
"Third" 3 6
Now you get an extra item in the column headings and the spreadsheet will read things correctly.
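A quick way to check the effect of col.names = NA is to write the file and read it straight back. The following is only a sketch: the temporary file (from tempfile()) and the Tab separator are my own assumptions for illustration, not part of the example above.

```r
# Rebuild the example data.frame with named rows
dat <- data.frame(col1 = 1:3, col2 = 4:6)
rownames(dat) <- c("First", "Second", "Third")

# Temporary file, used purely for illustration
tmp <- tempfile(fileext = ".txt")

# col.names = NA writes a blank heading above the row names
write.table(dat, file = tmp, sep = "\t", col.names = NA)

# Inspect the raw header line, then read the file back,
# telling R that the first column holds the row names
header_line <- readLines(tmp, n = 1)
dat_back <- read.table(tmp, header = TRUE, sep = "\t", row.names = 1)

print(header_line)
print(dat_back)
```

The header line now holds three items, matching the three items in each data row, which is what a spreadsheet expects.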
The write.csv() command
The write.csv() and write.csv2() commands are convenience functions that provide useful defaults that you’d expect to use for writing CSV files. The defaults are set so that the separator is a comma (or semicolon for write.csv2) and the decimal point is a period (or comma for write.csv2). Importantly col.names = NA and row.names = TRUE are set. This means that row names are automatically written and an extra column added to the column names.
If you do not want to write the row names you simply set row.names = FALSE:
# Default writes row names and adds to column heading
> write.csv(dat)
"","col1","col2"
"First",1,4
"Second",2,5
"Third",3,6
> write.csv(dat, row.names = FALSE) # row names not written
"col1","col2"
1,4
2,5
3,6
In most cases the CSV file is the “go to” format for transferring data to the widest range of computer programs. However, space or Tab delimited are useful for showing data on web pages.
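To see the difference between the two convenience commands side by side, you can capture what each would write. This is a minimal sketch; the small data.frame with decimal values is made up for the purpose, and capture.output() is used here only so the lines can be compared.

```r
# A small made-up dataset containing decimal values
dat_dec <- data.frame(col1 = c(1.5, 2.25, 3), col2 = 4:6)

# Capture the text that each command would send to the screen
csv_lines  <- capture.output(write.csv(dat_dec, row.names = FALSE))
csv2_lines <- capture.output(write.csv2(dat_dec, row.names = FALSE))

print(csv_lines)   # comma separator, period as decimal point
print(csv2_lines)  # semicolon separator, comma as decimal point
```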
The write() command
The write.table() command is designed to deal with 2D objects such as data.frame and matrix items. If you have a simple vector you need a different approach.
The write() command can deal with vector or matrix objects. A matrix is essentially a vector that’s been split into rows and columns. However, the write() command cannot handle the row or column names, only the values.
write(x, file = "data", ncolumns = if(is.character(x)) 1 else 5, append = FALSE, sep = " ")
x | The object to be written, usually a vector or matrix. |
file = "data" | The filename. If you use "" the output goes to screen. |
ncolumns | The number of columns required for the output, the default is to use 1 for character data and 5 for numeric. |
append = FALSE | If TRUE, output is added to an existing file. |
sep = " " | The separator character to use between items, the default is a space. Use "\t" for Tab. |
> set.seed(1)
> vec = floor(runif(36, min = 1, max = 100))
# Defaults to 5 columns for numbers
> write(vec, file = "")
27 37 57 90 20
89 94 66 63 7
21 18 69 39 77
50 72 99 38 77
93 22 65 13 27
39 2 38 87 34
48 60 49 19 82
67
# Use Tab and 8 columns
> write(vec, file = "", sep = "\t", ncolumns = 8)
27	37	57	90	20	89	94	66
63	7	21	18	69	39	77	50
72	99	38	77	93	22	65	13
27	39	2	38	87	34	48	60
49	19	82	67
The write() command is most useful for vector objects without name attributes.
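Because a matrix is stored as a vector column by column, write() sends the values out in that order. If you want the output to look like the matrix you see on screen you generally need to transpose it first. Here is a small sketch with a made-up 2 x 3 matrix (capture.output() is used only so the results can be compared):

```r
# A made-up matrix whose rows are 1 2 3 and 4 5 6
mat <- matrix(1:6, nrow = 2, byrow = TRUE)

# Written as-is the values come out in column order
as_is <- capture.output(write(mat, file = "", ncolumns = ncol(mat)))

# Transposing first gives output laid out like the matrix itself
transposed <- capture.output(write(t(mat), file = "", ncolumns = ncol(mat)))

print(as_is)       # "1 4 2" "5 3 6"
print(transposed)  # "1 2 3" "4 5 6"
```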
Writing special format files
There are occasions when you want to write a data file in a “special” format. The most commonly “requested” format is Excel but you can also write some other formats.
Excel files
To write an Excel file you’ll need the xlsx package, which also uses the xlsxjars and rJava packages.
install.packages("xlsx")
If you use the install.packages() command the default will be to get the xlsxjars and rJava packages as well.
The write.xlsx() command is what you’ll use most of the time. You specify the object you require and the filename. You can also specify a name for the worksheet and if you use append = TRUE the new worksheet will be added to an existing file.
write.xlsx(x, file, sheetName = "Sheet1", col.names = TRUE, row.names = TRUE, append = FALSE, showNA = TRUE)
x | The object to be written as an Excel file, usually a data.frame or matrix. |
file | The filename to use. The output will default to the working directory unless an explicit filepath is used. |
sheetName = “Sheet1” | The name to give to the worksheet. |
col.names = TRUE | By default, the column names are written to the Excel file. |
row.names = TRUE | By default, the row names are written to the Excel file. |
append = FALSE | If append = TRUE, a new worksheet is added to an existing file. |
showNA = TRUE | By default NA items appear as #NA in Excel. If showNA = FALSE, then NA items appear as blank. |
Try the following and then open the Excel file to see the results:
> library(xlsx)
> dat2 <- data.frame(col1 = c(1, NA, 3), col2 = 4:6) # data with NA entry
> rownames(dat2) <- c("First", "Second", "Third") # set row names
> dat2
       col1 col2
First     1    4
Second   NA    5
Third     3    6
# Write as Excel
> write.xlsx(dat2, file = "X1.xlsx", sheetName = "First")
# append worksheet and use blanks for NA
> write.xlsx(dat2, file = "X1.xlsx", sheetName = "Second", append = TRUE, showNA = FALSE)
The xlsx package contains other commands to help prepare and write Excel files. I won’t deal with them at this point because I want to keep things as simple as possible (I may do a separate monogRaph on the subject). Look at the help index for the package for more details.
Other file formats
The foreign package allows you to read various file formats. It also allows some to be written back to disk. The package is quite old and probably doesn’t support some of the later versions of SPSS for example. You are probably better off saving data as CSV and then using the target program to read the files.
Writing objects
There are several sorts of object you might want to write.
- General R objects like lists, vectors and so on.
- Results objects, which can be in form of list, matrix and so on.
- Custom functions
- The entire console.
Mostly you’ll want to write the objects to disk but there are some useful commands that allow you to write things to the screen.
Writing objects to screen
Generally speaking you can view an R object by typing its name! This shows the “contents” on the screen in a basic form.
The print() command
Typing the object name is really a shortcut for print(object_name). If the object has a class attribute and a print method exists for it, then the object is displayed using the commands in the print method.
Different print methods will have different parameters but the print.default() command will come into operation if no other class attribute is found. Here are the essentials:
print.default(x, digits = NULL, quote = TRUE, print.gap = NULL, right = FALSE)
x | The object to be printed. |
digits = NULL | The number of significant digits to show. The default will depend on the options. |
quote = TRUE | If TRUE items are shown with quotes. |
print.gap = NULL | Sets the gap between columns, NULL equates to 1, any integer up to 1024 can be used. |
right = FALSE | By default, text items are left justified. |
These parameters give some basic control over the look of the output.
> set.seed(1)
> dat = data.frame(col1 = runif(3), col2 = runif(3))
> rownames(dat) = c("First", "Second", "Third")
> dat # Default output
            col1      col2
First  0.2655087 0.9082078
Second 0.3721239 0.2016819
Third  0.5728534 0.8983897
# Significant figures
> print(dat, digits = 4)
         col1   col2
First  0.2655 0.9082
Second 0.3721 0.2017
Third  0.5729 0.8984
# Widen space between columns
> print(dat, digits = 4, print.gap = 4)
            col1      col2
First     0.2655    0.9082
Second    0.3721    0.2017
Third     0.5729    0.8984
# Left justify text
> print(dat, digits = 4, print.gap = 4, right = FALSE)
          col1      col2
First     0.2655    0.9082
Second    0.3721    0.2017
Third     0.5729    0.8984
The print() command gives you some control over the output. It’s most important in allowing you to take an object holding a particular class attribute and define a print.xxxx method for that class.
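As a rough sketch of what defining your own print method might look like, here is a tiny example. The class name "reading", the constructor new_reading() and the fields value and unit are all hypothetical, invented purely to illustrate the idea.

```r
# Constructor for a hypothetical "reading" class (illustration only)
new_reading <- function(value, unit) {
  structure(list(value = value, unit = unit), class = "reading")
}

# The print method: R calls this whenever a "reading" object is displayed
print.reading <- function(x, digits = 3, ...) {
  cat("Reading:", format(x$value, digits = digits), x$unit, "\n")
  invisible(x)  # print methods conventionally return their argument invisibly
}

r1 <- new_reading(12.34567, "cm")
print(r1)              # uses the default of 3 significant digits
print(r1, digits = 5)  # the extra parameter is passed to the method
```

Typing r1 at the console would give the same result as print(r1), since typing the name is a shortcut for print().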
The format() command
Use the format() command to get finer control over the display of objects. The command provides a wider range of options that give you more choice over the result. The command is linked to the class attribute of an object so you can define your own format.xxxx method.
The essentials of the format() command are:
format(x, digits = NULL, justify = "left", width = NULL, scientific = NA)
x | The object to be formatted and displayed. |
digits = NULL | The number of significant figures to display. |
justify = "left" | How to justify character vectors, "left" (the default), "right" or "centre". |
width = NULL | The minimum width to use for the columns. |
scientific = NA | If TRUE, the number is displayed in scientific format. |
> format(dat, digits = 4, width = 6)
         col1   col2
First  0.2655 0.9082
Second 0.3721 0.2017
Third  0.5729 0.8984
# Make columns wider
> format(dat, digits = 4, width = 8)
           col1     col2
First    0.2655   0.9082
Second   0.3721   0.2017
Third    0.5729   0.8984
# Force scientific number format
> format(dat, digits = 4, width = 8, scientific = TRUE)
            col1      col2
First  2.655e-01 9.082e-01
Second 3.721e-01 2.017e-01
Third  5.729e-01 8.984e-01
When you have character items you have a bit more control over justification. Note that “centre” is spelt in UK style!
> txt = data.frame(Colour = c("Red", "Blue", "Green"), Size = c("Large", "Medium", "Small"))
> format(txt) # The defaults
  Colour   Size
1  Red   Large
2  Blue  Medium
3  Green Small
# Wide columns and justification options
> format(txt, width = 13, justify = "centre")
         Colour          Size
1      Red          Large
2     Blue         Medium
3     Green         Small
> format(txt, width = 13, justify = "left")
         Colour          Size
1 Red           Large
2 Blue          Medium
3 Green         Small
> format(txt, width = 13, justify = "right")
         Colour          Size
1           Red         Large
2          Blue        Medium
3         Green         Small
There are other options available but these essentials will be suitable for many purposes. See the help entry for all the details. See also the prettyNum() command, where you can get much finer control over the display of numbers.
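The following sketch shows a few of these options applied to simple vectors rather than a data.frame. The values are made up; big.mark is one of the extra parameters that prettyNum() accepts.

```r
# Thousands separator via prettyNum()
big <- prettyNum(1234567, big.mark = ",")

# Scientific notation with 3 significant digits
sci <- format(0.000012345, digits = 3, scientific = TRUE)

# Pad character items to a minimum width, right justified
padded <- format(c("Red", "Green"), width = 7, justify = "right")

print(big)     # "1,234,567"
print(sci)     # "1.23e-05"
print(padded)  # "    Red" "  Green"
```

Note that format() and prettyNum() return character vectors, so the results are ready to be joined to other text.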
The cat() and paste() commands
The cat() command can be used to join items together, which are then printed. Unlike format() or print() the cat() command cannot deal with 2D objects, so you can only use it with vectors.
The strength of the cat() command is in being able to join items together; this allows you to use it to make output messages in custom commands and scripts.
cat(... , sep = " ", fill = FALSE, labels = NULL)
... | R objects (including text strings) to be concatenated and printed. |
sep = " " | The separator character to use between items, the default is a space. |
fill = FALSE | The width of the output to use. If FALSE only "\n" will create newlines. If TRUE, the output is split according to the current width option. If set to a number, this overrides any global width setting. |
labels = NULL | A vector of labels to use for lines of the output. |
> cat(dat$col1, dat$col2)
0.2655087 0.3721239 0.5728534 0.9082078 0.2016819 0.8983897
> cat(dat$col1, dat$col2, fill = 30, sep = "-")
0.2655087-0.3721239-0.5728534-
0.9082078-0.2016819-0.8983897
> cat(dat$col1, dat$col2, fill = 20, sep = ",", labels = letters[1:5])
a 0.2655087,
b 0.3721239,
c 0.5728534,
d 0.9082078,
e 0.2016819,
a 0.8983897
Use “\n” to generate explicit newlines. If you want to use the name of an R object you must wrap it in a deparse(substitute()) command, otherwise the command will attempt to output the object, rather than its name:
> cat("\n", "Your data:\n", deparse(substitute(dat)))
 Your data:
 dat
> cat(dat)
Error in cat(list(...), file, sep, fill, labels, append) : argument 1 (type ‘list’) cannot be handled by ‘cat’
The paste() command joins items together but doesn’t do anything else with the object other than converting to a character vector. You can use it in conjunction with cat() or other commands to produce output.
paste(... , sep = " ", collapse = NULL)
... | R objects to be concatenated. |
sep = " " | The character to use in separating the items, the default is a space. |
collapse = NULL | If you specify a character instead of NULL, the output is collapsed to form a single character string, with the elements separated by that character. |
The items are combined element by element; here is a data.frame as an example:
> dat
       col1 col2
First     1    4
Second    2    5
Third     3    6
> paste(letters[1:3], dat$col1)
[1] "a 1" "b 2" "c 3"
> paste(dat$col1, dat$col2, sep = "**")
[1] "1**4" "2**5" "3**6"
> paste(dat$col1, dat$col2, sep = "-", collapse = " ")
[1] "1-4 2-5 3-6"
You can also use the write() command to send output to the screen, see the details from the earlier section. The only difference is that you specify file = "" instead of an explicit filename.
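Here is a sketch of how cat(), paste() and deparse(substitute()) might be combined in a custom command to produce an output message. The function name report_mean() and the heights vector are hypothetical, made up for this illustration.

```r
# A hypothetical custom command that reports the mean of a vector
report_mean <- function(x) {
  obj_name <- deparse(substitute(x))  # the name of the object, as typed
  cat("Mean of ", obj_name, ": ", format(mean(x), digits = 3), "\n", sep = "")
  invisible(mean(x))                  # return the value without printing it
}

heights <- c(1.52, 1.68, 1.75)
report_mean(heights)

# paste() builds the labels; collapse joins them into one string
labels_txt <- paste("Sample", 1:3, sep = "_", collapse = ", ")
cat("Labels:", labels_txt, "\n")
```

The first message uses the name of the object (heights) rather than its contents, which is exactly what deparse(substitute()) is for.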
Writing objects to disk
Any R object can be saved onto disk in a format that allows R to open it later. Some R objects can be saved in text format and retrieved later.
Writing binary objects
Any R object can be saved to disk. The basic command to do this is save(). You simply provide the names of the objects you want to save (separated by commas) and the filename for the target file.
save(..., file)
You can also use a list of names instead of specifying them explicitly. This means you could use another command to make your list, for example:
save(list = ls(), file = "my_objects.RData")
The save.image() command is a convenience command that essentially uses list = ls() to save all the objects.
save.image(file = "my_stuff.RData")
If you leave the filename empty and use save.image() this is essentially what you get when you quit R and say “yes” when asked if you want to save the workspace.
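A minimal sketch of a save-and-restore round trip follows. The object names and the temporary file are made up, and the load() command (which reads the file back) is not covered in this section, so treat its use here as an assumption for illustration. Loading into a separate environment simply avoids overwriting anything in your workspace.

```r
# Two made-up objects to save
x_vec <- 1:5
x_lst <- list(a = "text", b = c(TRUE, FALSE))

# Save both objects to a temporary binary file
rdata <- tempfile(fileext = ".RData")
save(x_vec, x_lst, file = rdata)

# Restore into a fresh environment; load() returns the object names
restored <- new.env()
loaded_names <- load(rdata, envir = restored)

print(loaded_names)
print(identical(restored$x_lst, x_lst))
```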
Writing text objects
R objects can also be saved in text form. You can see how to save data files, such as data.frame and matrix objects using write.table(). You can save vector and matrix objects using the write() command. Other objects can be trickier to represent as text. R has a couple of commands that make ASCII representations of objects (as far as possible), which can be read by humans and restored to R.
dput()
dump()
The main difference between the two commands is that dput() writes a single object, whilst dump() can write several objects and append them to an existing file.
dput(x, file = "")
The dput() command attempts to write an ASCII representation of the object. This is human-readable, but not in a spreadsheet like form. To get an object back to R use dget().
> dat
       col1 col2
First     1    4
Second    2    5
Third     3    6
> dput(dat, file = "")
structure(list(col1 = 1:3, col2 = 4:6), .Names = c("col1", "col2"
), row.names = c("First", "Second", "Third"), class = "data.frame")
You can see that the object looks more like a set of R commands (which essentially, it is).
dump(list, file = "dumpdata.R", append = FALSE, control = "all")
list | An object containing the names of the objects to be written. You can also use a command that produces a vector of names. |
file = “dumpdata.R” | The filename to use. To send to screen use file = “”. |
append = FALSE | If TRUE, the objects (as text) are appended to an existing file. |
control = “all” | Sets deparsing control. Use control = NULL to skip many of the object attributes. |
The dump() command requires a list of names as a character vector; you can use a command that will produce a character vector (such as the ls() command) instead of explicit names.
> dat
       col1 col2
First     1    4
Second    2    5
Third     3    6
> dump("dat", file = "")
dat <-
structure(list(col1 = 1:3, col2 = 4:6), .Names = c("col1", "col2"
), row.names = c("First", "Second", "Third"), class = "data.frame")
> dump(ls(pattern = "dat"), file = "", control = NULL)
dat <-
list(col1 = 1:3, col2 = 4:6)
> dump(c("mat", "dat"), file = "")
mat <-
structure(c(27, 37, 57, 90, 20, 89, 94, 66, 63, 7, 21, 18, 69,
39, 77, 50, 72, 99, 38, 77, 93, 22, 65, 13, 27, 39, 2, 38, 87,
34, 48, 60, 49, 19, 82, 67), .Dim = c(6L, 6L))
dat <-
structure(list(col1 = 1:3, col2 = 4:6), .Names = c("col1", "col2"),
row.names = c("First", "Second", "Third"), class = "data.frame")
Note that using control = NULL strips out most of the attributes. However, if you want to read the object into R (using the source() command) you’ll need to preserve as many attributes as possible.
You can also use the cat() command to join items together and then send the result to disk. You simply supply an explicit filename to the file parameter.
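To round things off, here is a sketch of writing objects as text and getting them back again: dput() pairs with dget(), whilst dump() pairs with source(). The temporary files and the vec object are assumptions made for the example.

```r
# Example objects
dat <- data.frame(col1 = 1:3, col2 = 4:6,
                  row.names = c("First", "Second", "Third"))
vec <- c(2.5, 7, 11)

# dput() writes a single object; dget() reads it back
f1 <- tempfile(fileext = ".R")
dput(dat, file = f1)
dat_copy <- dget(f1)

# dump() writes several named objects; source() recreates them
f2 <- tempfile(fileext = ".R")
dump(c("dat", "vec"), file = f2)
restored <- new.env()
source(f2, local = restored)  # objects are created inside 'restored'

print(all.equal(dat_copy, dat))
print(ls(restored))
```

Note that dget() returns the object, so you choose the name it is assigned to, whereas the file made by dump() contains the assignments and so recreates the objects under their original names.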
Divert console output to disk
Sometimes it is useful to be able to divert the output that would normally appear on screen to a disk file. For example, results of analyses such as analysis of variance and regression produce a table-like output. These results can be “ported” to disk files with the sink() or capture.output() commands.
The sink() command allows you to send anything that would have gone to the console (your screen) to a disk file instead.
sink(file = NULL, append = FALSE, split = FALSE)
You need to supply the filename, setting file = NULL closes the connection and stops sink()ing. To add to an existing file use append = TRUE. If you set split = TRUE the output goes to the console and the file you specified.
When you issue the command a file is created, ready to accept the output. If you set append = FALSE and the file already exists, it will be overwritten. Once you supply a filename a connection is opened and subsequent output goes to the file.
# Send output to screen and file
> sink(file = "Out1.txt", split = TRUE, append = FALSE)
> summary(lm(Fertility ~ . , data = swiss))
Call:
lm(formula = Fertility ~ ., data = swiss)
Residuals:
     Min       1Q   Median       3Q      Max
-15.2743  -5.2617   0.5032   4.1198  15.3213
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
Agriculture      -0.17211    0.07030  -2.448  0.01873 *
Examination      -0.25801    0.25388  -1.016  0.31546
Education        -0.87094    0.18303  -4.758 2.43e-05 ***
Catholic          0.10412    0.03526   2.953  0.00519 **
Infant.Mortality  1.07705    0.38172   2.822  0.00734 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.165 on 41 degrees of freedom
Multiple R-squared:  0.7067, Adjusted R-squared:  0.671
F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10
> sink(file = NULL) # Stop sending output to file
Note that even if you set append = FALSE subsequent output is appended to the file. Once you issue the command sink(file = NULL) output stops and you can see your file using any kind of text editor.
If you only want to send a single “result” to a disk file you can use the capture.output() command instead.
capture.output(..., file = NULL, append = FALSE)
You provide the commands that will produce the output and the filename. If you set append = TRUE and the target file exists, the output will be added to the file. If you set append = FALSE (the default) the file will be “blanked” and the output will therefore overwrite the original contents.
Note that in older versions of R there is no equivalent of the split argument, so all output goes to the file and cannot be “mirrored” to the console (more recent versions of capture.output() do accept split = TRUE). You can supply several commands, separated by commas.
> capture.output(ls(), search(), file = "Out1.txt")
This example sent the ls() command followed by search(), with the results being output to the disk file.
Once you have your output in a text file you can transfer it to your word processor, possibly with a little pre-processing via Excel.
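The two approaches can be sketched together as follows. The temporary file is an assumption for illustration. Note that if you leave out the file argument, capture.output() returns the output as a character vector, which you can then process or write with writeLines().

```r
out_file <- tempfile(fileext = ".txt")

# With no file given, capture.output() returns the lines as text
model_txt <- capture.output(summary(lm(Fertility ~ ., data = swiss)))
writeLines(model_txt, out_file)

# sink() diverts everything until it is switched off again
sink(out_file, append = TRUE)  # add to the file made above
print(head(swiss, 3))
sink()                         # same effect as sink(file = NULL)

# Read the file back to confirm that both results are there
saved <- readLines(out_file)
print(length(saved))
```

It is easy to forget to switch sink() off, in which case nothing appears in your console, so it is worth pairing every sink(file) with a closing sink().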
Writing the console
You can save the entire console output from the GUI if you have Windows or Mac OS:
- Mac: File > Save
- Windows: File > Save to File
The result is a plain text file that mimics the console; whatever appears in your console will end up in the file.
Writing Graphics
R has extensive graphical capabilities and there are many commands that will create graphics, which appear in graphical windows. These graphics are separate from the console. You can write an R graphical object into a disk file in one of several ways:
- Copy and Paste to a different program (such as a word processor).
- Save the graphic from the GUI, as a graphics file (e.g. PNG, JPG, PDF).
- Use the device drivers to copy a graphic from the graphic window to a disk file.
- Use the device drivers to channel graphical commands directly to a disk file.
The route you take will depend largely on the quality of the final graphic you want. Copy and Paste will work quite well for many purposes but for high quality images you’ll need to use the device drivers. It is possible to save a graphic direct to disk from Windows or Mac GUI but the quality is limited to 72dpi.
Copy & Paste Graphics
You can simply select the graphics window in R and copy to the clipboard. The clipboard can be pasted into most programs and be recognized as a graphic.
- On a Mac the graphic will be copied as a PDF object.
- On Windows you can choose to copy the graphic as a bitmap (Ctrl+C, the default) or as a metafile (Ctrl+W).
In any event the graphic is transferred to your current application as a graphic. The quality of image will depend somewhat on your computer settings but is generally suitable for most daily purposes.
Save graphics from the GUI
You can also save a graphic directly from R using the GUI (assuming you are using Windows or Mac). A browser window opens, allowing you to send the file to a location of your choosing.
- On a Mac the file will be saved as a PDF.
- On Windows you can select a file type, there are several options.
The quality of your image (the image size) will depend upon your system settings but you’ll only achieve 72 dpi as a resolution. If you need high-quality images, then you need to use the device drivers.
Device Drivers
The device drivers enable you to send graphics commands directly to a file, rather than the screen. In this way you are able to produce graphics in various formats, with much higher resolution. You can use the device drivers in two main ways:
- To write an existing graphics window to a file.
- To write graphics commands direct to a file.
The most commonly used device drivers correspond to popular graphics formats, here are the essentials:
bmp(width = 480, height = 480, units = "px", bg = "white", res)
jpeg(width = 480, height = 480, units = "px", quality = 75, bg = "white", res)
png(width = 480, height = 480, units = "px", bg = "white", res)
tiff(width = 480, height = 480, units = "px", compression, bg = "white", res)
You specify the size of the graphic as width and height. The default is to treat these measurements as pixels, but you can specify units as pixels, inches, centimetres or millimetres. The resolution can be specified, so setting res = 300 will give 300 dpi.
For jpeg you can specify the quality; this sets the quality of the image as a percentage, so 25 gives a smaller file with more compression than 75.
For tiff you can specify a compression, the options are, “none”, “rle”, “lzw”, “jpeg” or “zip”.
You can also make pdf using the pdf() device driver:
pdf(height = 7, width = 7, onefile, paper, colormodel)
For pdf you specify the size in inches. The onefile parameter allows multiple plots to be sent to one file (as separate pages). You can also specify the target page size. The colormodel parameter allows you to specify the colour encoding, the default is “srgb” but you can specify “gray” or “cmyk”.
Copy a graphics device to disk
If you have produced a graphic, in a regular graphic window, and want to save it as a high quality file you can use one of two commands:
dev.copy()
dev.print()
You specify a filename and the type of device you want to make, for example:
> set.seed(22) # set random number seed
> ## Now make a plot
> boxplot(rnorm(20), rpois(20, 1), names = c("norm", "poisson"), las = 1, col = "gray90")
> ## Send to file as PNG
> dev.print(device = png, file = "Myplot.png", height = 512, width = 512)
quartz
     2
If you use dev.print() the file is written and immediately “closed”. If you use dev.copy() the file is written but not “closed”, which allows you to send additional commands to the file, which must then be “closed” using the dev.off() command.
> set.seed(11) # Set random number seed
> ## Make a plot
> boxplot(rnorm(20), rpois(20, 1), names = c("norm", "poisson"), las = 1, col = "gray90")
> ## Copy to a file as PNG
> dev.copy(device = png, file = "Myplot.png", height = 512, width = 512, res = 150)
quartz_off_screen
                3
> ## Add more graphics commands, which go to file not on-screen graphic
> title("Title added later")
> ## Close graphic file and finish
> dev.off()
quartz
     2
Note that in the preceding example the resolution was set to 150, which affected the size of the text relative to the graphic elements. If you wanted to keep the same relative size as before, set height and width to 512*150/72.
Note that PNG files generally have the background set to “transparent”. If you want to have a plain white background you will need to specify this explicitly in the original graphical command(s) before you use dev.print() or dev.copy(). The simplest way is to set the default:
par(bg = "white")
Now any PNG files you produce will have a white background. Reset to transparent in the same manner.
Send graphics commands direct to a file
If you want to send graphics direct to disk as files you simply issue the appropriate device instruction, which you follow with the graphics commands. Close out the file with dev.off().
> jpeg(file = "MyJpeg.jpg") # Prepare a jpeg file using the defaults
> set.seed(33) # Set random number seed
> ## Make a boxplot, the graphics go direct to file
> boxplot(rnorm(20), rpois(20,1), names = c("norm", "poisson"), las = 1, col = "cornsilk")
> dev.off() # Finish and close the file
quartz
     2
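The size and resolution parameters can be combined in the same way. The sketch below sends a PNG direct to a temporary file (my assumption, to avoid cluttering the working directory) at 150 dpi, scaling the pixel dimensions by 150/72 so that text keeps the same relative size as a 512 pixel graphic at the default resolution. The capabilities() check is there because not every R build supports the png device.

```r
res_dpi <- 150
size_px <- round(512 * res_dpi / 72)  # 1067 pixels
png_file <- tempfile(fileext = ".png")

if (capabilities("png")) {
  # White background set explicitly, size in pixels, resolution in dpi
  png(filename = png_file, width = size_px, height = size_px,
      units = "px", res = res_dpi, bg = "white")
  set.seed(33)
  boxplot(rnorm(20), rpois(20, 1), names = c("norm", "poisson"),
          las = 1, col = "cornsilk")
  dev.off()  # finish and close the file
}
```

You could equally give the size in inches, for example width = 7, height = 7, units = "in", in which case res determines the number of pixels.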
PDFs are handled generally in the same manner but the parameters are slightly different. Resolution is not an issue so you specify the height and width in inches.
By default, multiple plots are sent to a single file, as separate pages. The default page size (“special”) is set to the same as the graphic size (height and width both 7″ default) but you can specify alternatives using the paper parameter:
- “a4” or “A4”
- “letter”
- “legal” or “us”
- “executive”
Landscape orientation can be achieved with:
- “A4r”
- “USr”
These can all be capitalised.
It is possible to change the font(s) used in the file by setting the family parameter. By default, fonts are not embedded so it is best to stick to basic ones e.g.
- “Helvetica” – the default
- “AvantGarde”
- “Bookman”
- “Courier”
- “Helvetica-Narrow”
- “NewCenturySchoolbook”
- “Palatino”
- “Times”
## Set PDF to single file using Bookman font family
> pdf(file = "MyPDF.pdf", family = "Bookman")
## Make a couple of plots
> boxplot(rnorm(20), rpois(20,1), names = c("norm", "poisson"), las = 1, col = "cornsilk")
> boxplot(rnorm(20), rpois(20, 1), names = c("norm", "poisson"), las = 1, col = "gray90")
## Close the file and finish
> dev.off()
pdf
  3
The preceding example should produce two boxplots in a single file. The size of the paper will be the default (7”). The Bookman font family was used.
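If you want a particular page size you might try something like the following sketch, which sends two plots to a landscape A4 page using the Times font family. The temporary file, the plot contents and the 10 x 7 inch graphic size are all assumptions for illustration.

```r
pdf_file <- tempfile(fileext = ".pdf")

# Landscape A4 page; the 10 x 7 inch graphic fits within the page
pdf(file = pdf_file, width = 10, height = 7, paper = "a4r", family = "Times")

set.seed(1)
hist(rnorm(100), main = "Page 1", col = "gray90")  # first page
plot(1:10, main = "Page 2")                        # second page

dev.off()  # close the file and finish

print(file.exists(pdf_file))
```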
Multiple plot files
If you want to produce multiple plots it is not necessary to issue a separate filename for each plot. You can simply add an index to the filename; something like %03d will produce three-digit integer values in the filename:
## Start a jpeg device with default size but an indexed name
> jpeg(file = "MyJpeg%03d.jpg")
## Make a plot
> boxplot(rnorm(20), rpois(20, 1), names = c("norm", "poisson"), las = 1, col = "gray90")
## Add a title
> title(main = "Fig. 1")
## Start a new plot, this closes the previous
> boxplot(rnorm(20), rpois(20,1), names = c("norm", "poisson"), las = 1, col = "cornsilk")
> title(main = "Fig. 2")
## Close the last device and finish the file
> dev.off()
null device
          1
The preceding example should produce two plots, one called MyJpeg001.jpg and another MyJpeg002.jpg. Note that the first file is “closed” when you issue a graphical command that would create a new plot. So if you want to add titles or similar, then you should do it before starting the next graphic, you cannot go back.
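The same indexing trick can be used in a loop. In this sketch I have swapped in the pdf() device with onefile = FALSE (so that each plot goes to its own file), simply because the pdf device is available on every R build; the folder name and the plots are made up for the example.

```r
# A made-up output folder inside the temporary directory
out_dir <- file.path(tempdir(), "multi_plots")
dir.create(out_dir, showWarnings = FALSE)

# onefile = FALSE plus an index gives one file per plot
pdf(file = file.path(out_dir, "Fig%03d.pdf"), onefile = FALSE)

for (i in 1:3) {
  plot(1:10, (1:10)^i)             # each new plot starts a new file
  title(main = paste("Fig.", i))   # add the title before the next plot
}

dev.off()  # close the last file

made <- list.files(out_dir, pattern = "^Fig[0-9]{3}\\.pdf$")
print(made)
```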
Writing scripts
A script is simply a text file containing a series of R commands. You store the file and run it using the source() command. You have two main choices for writing of script files: use the built-in script editor or an outside editor. The GUI for Windows and Mac incorporates a script editor but only the Mac supports syntax highlighting.
To start a new script:
- Win: File > New script
- Mac: File > New Document
You can open a script in any text editor of course. In Windows the Notepad++ program is a simple editor with syntax highlighting. On the Mac the BBEdit program is highly recommended. On Linux there are many options; the default text editor will often support syntax highlighting. Geany is one IDE that not only has syntax highlighting but also integrates with the terminal.
The RStudio IDE is very capable and makes a good platform for using R for any OS. The script editor has syntax highlighting.
If you start to get serious about R coding, then the Sublime Text editor is worth a look. This has versions for all OS and syntax highlighting for R and many other languages.
Working Directory
R uses a working directory, where it stores files and where it looks for items to read. You can see the current working directory using getwd():
# On a Mac
> getwd()
[1] "/Users/markgardener"
# On Windows
> getwd()
[1] "C:/Users/Mark/Documents"
So, whenever you specify a filename it will be output to the working directory unless you specify a “complete” location, that is the full directory path. There is more about the working directory in the page about Reading.
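One way to build a “complete” location without worrying about the path separator on your system is the file.path() command. The folder and file names in this sketch are hypothetical, and the file is actually written to the temporary directory so as not to clutter your working directory.

```r
# Build a full path from the working directory (names are hypothetical)
wd <- getwd()
full_path <- file.path(wd, "results", "mydata.csv")
print(basename(full_path))  # the filename part
print(dirname(full_path))   # the directory part

# Write to an explicit location rather than the working directory
tmp_csv <- file.path(tempdir(), "mydata.csv")
write.csv(data.frame(a = 1:2), file = tmp_csv, row.names = FALSE)
print(file.exists(tmp_csv))
```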