The 3 Rs – Arithmetic
You can think of R as a giant calculator. It has a great capacity for mathematical operations. In this article I’ll look at some of the more basic ones, to give a flavour of the arithmetic you can do.
Operators
In the main your arithmetic is going to involve adding, subtracting, multiplying and dividing. These things are all carried out using basic operators that are familiar to everyone: + - * / and ().
The order of the operators is important. R will evaluate your maths in a set order. Multiplication (*) and division (/) are evaluated first, then addition (+) and subtraction (-). As well as this basic order, things inside parentheses () will be evaluated before anything “outside” the parentheses (still in the */+- order). So, remember this running order when you type your maths e.g.
7 + 4 * 11
[1] 51
(7 + 4) * 11
[1] 121
As well as the basic operators there are some “extras”.
Powers (i.e. exponents) are designated using the caret (^) character. These are evaluated before any of the other operators. For example:
2^2 ; 2^3 ; 2^4
[1] 4
[1] 8
[1] 16
3 * 2^3 + 1
[1] 25
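Two consequences of powers being evaluated first are easy to trip over, so here is a small sketch of them. The object names are only for illustration.

```r
## Powers bind more tightly than a leading minus sign
a <- -2^2      # read as -(2^2)
b <- (-2)^2    # parentheses force the negative base
## A chain of powers is evaluated right to left
d <- 2^3^2     # read as 2^(3^2), i.e. 2^9
e <- (2^3)^2   # 8^2
a ; b ; d ; e
```

The four results are -4, 4, 512 and 64. If in doubt, add parentheses to make the intended order explicit.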
If you want a fractional power, you enclose it in parentheses (otherwise 64^1/2 is read as (64^1)/2). Parentheses also make a negative power easier to read e.g.
64^(1/2)
[1] 8
64^(-2)
[1] 0.0002441406
A square root is simply a power of 0.5 e.g. x^0.5 but there is a special command sqrt() to deal with this.
x <- c(9, 16, 25, 36, 49, 59)
sqrt(x)
[1] 3.000000 4.000000 5.000000 6.000000 7.000000 7.681146
You can use the % symbol to compute modulo (%%) and integer division (%/%) like so:
16 %/% 3
[1] 5
16 %% 3
[1] 1
So, 16 ÷ 3 gives 5 with a remainder of 1.
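As a quick check on how the two operators fit together, the sketch below rebuilds the original value from the quotient and remainder, and then tries a negative value. The object names are illustrative only.

```r
x <- 16
y <- 3
q <- x %/% y     # integer quotient
r <- x %% y      # remainder
q * y + r == x   # the two parts rebuild the original value
## With a negative value the quotient is rounded down (towards -Inf)
qn <- (-16) %/% 3
rn <- (-16) %% 3
qn ; rn
```

The negative case gives -6 and 2, because -6 * 3 + 2 = -16. The remainder takes the sign of the divisor, which is not what every programming language does.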
Matrix operations
There are a bunch of commands associated with matrix math.
Matrix multiplication
Multiply two matrices using the %*% operator.
## Make some matrices
(x <- cbind(1, 1:3, c(2, 0, 1)))
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    1    2    0
[3,]    1    3    1
(y <- c(1, 3, 2))
[1] 1 3 2
(y1 <- matrix(1:3, nrow = 1))
     [,1] [,2] [,3]
[1,]    1    2    3
(z <- matrix(3:1, ncol = 1))
     [,1]
[1,]    3
[2,]    2
[3,]    1
## Various multiplications
x %*% y
     [,1]
[1,]    8
[2,]    7
[3,]   12
x %*% y1 # Order can be important
Error in x %*% y1 : non-conformable arguments
y1 %*% x
     [,1] [,2] [,3]
[1,]    6   14    5
x %*% z
     [,1]
[1,]    7
[2,]    7
[3,]   10
z %*% x # Order is important
Error in z %*% x : non-conformable arguments
y %*% y # Matrix multiplication of two vectors
     [,1]
[1,]   14
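It is easy to confuse %*% with the ordinary * operator, which multiplies matrices element by element. The following sketch, using a small made-up matrix, shows the difference.

```r
m <- matrix(1:4, nrow = 2)
elementwise <- m * m   # each element times its counterpart
matprod <- m %*% m     # rows of the first times columns of the second
elementwise
matprod
```

Here elementwise holds the squares 1, 4, 9, 16 (filled by column) whilst matprod holds 7, 10, 15, 22.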
Matrix Cross Products
You can compute a matrix cross product (that is, t(x) %*% y) using the %*% operator and the t() command, which transposes a matrix. Alternatively the crossprod() and tcrossprod() commands can do the job:
(x <- cbind(1, 1:3, c(2, 0, 1)))
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    1    2    0
[3,]    1    3    1
(y <- c(1, 3, 2))
[1] 1 3 2
(y1 <- matrix(1:3, nrow = 1))
     [,1] [,2] [,3]
[1,]    1    2    3
## Cross products
crossprod(x, y) # Same as t(x) %*% y
     [,1]
[1,]    6
[2,]   13
[3,]    4
crossprod(x, y1) # Gives error as y1 wrong "shape"
Error in crossprod(x, y1) : non-conformable arguments
tcrossprod(x, y1) # Same as x %*% t(y1)
     [,1]
[1,]    9
[2,]    5
[3,]   10
Other matrix maths
There are various other commands associated with matrix math. They are not really operators (except %o%) as such but I’ll list them here:
backsolve, forwardsolve | These commands solve a system of linear equations where the coefficient matrix is upper (“right”, “R”) or lower (“left”, “L”) triangular. |
chol | This command computes the Choleski factorization of a real matrix. The matrix must be symmetric, positive definite. The result returned is the upper triangular factor of the Choleski decomposition, that is, the matrix R such that R’R = x. |
det, determinant | The determinant command calculates the modulus of the determinant (optionally as a logarithm) and the sign of the determinant. The det command calculates the determinant of a matrix (it is a wrapper for the determinant command). |
diag | Matrix diagonals. This command has several uses; it can extract or replace the diagonal of a matrix. Alternatively, the command can construct a diagonal matrix. |
eigen | Computes eigenvalues and eigenvectors for matrix objects; that is, carries out spectral decomposition. The result is a list containing $values and $vectors. |
outer, %o% | The outer command calculates the outer product of arrays and matrix objects. The %o% symbol is a convenience wrapper for outer(X, Y, FUN = “*”). |
qr | This command computes the QR decomposition of a matrix. It provides an interface to the techniques used in the LINPACK routine DQRDC or the LAPACK routines DGEQP3 and (for complex matrices) ZGEQP3. The result holds a class attribute “qr”. |
solve | Solves a system of equations. This command solves the equation a %*% x = b for x, where b can be either a vector or a matrix. |
svd | The svd command computes the singular value decomposition of a rectangular matrix. The result is a list containing $d, the singular values of x. If nu > 0 and nv > 0, the result also contains $u and $v, the left singular and right singular vectors of x. |
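To give a feel for a few of these commands, here is a minimal sketch that solves a pair of simultaneous equations (2x + y = 3 and x + 3y = 5). The matrix and values are made up for illustration.

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2)  # coefficient matrix
b <- c(3, 5)                          # right-hand side
x <- solve(a, b)                      # solves a %*% x = b
x
a %*% x                               # should reproduce b
det(a)                                # determinant
diag(a)                               # diagonal elements
a_inv <- solve(a)                     # with no b, solve() returns the inverse
a_inv %*% a                           # close to the identity matrix
```

The solution is x = 0.8 and y = 1.4. Results from solve() are floating point values, so compare them with all.equal() rather than ==.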
Logical Operators
Some operators are used to provide a logical result (i.e. TRUE or FALSE).
Logical AND
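The & operator is used for the logical AND in an elementwise manner. The doubled form && works on single values only: older versions of R silently used just the first element of a longer vector, but since R 4.3.0 supplying a vector longer than one is an error.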
Logical OR
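The vertical bar | is used for the logical OR in an elementwise manner. The doubled form || works on single values only: older versions of R silently used just the first element of a longer vector, but since R 4.3.0 supplying a vector longer than one is an error.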
Logical NOT
The ! operator is used for the logical NOT.
Logical Exclusive OR
The command xor(x, y) acts as an exclusive OR operator.
The logical operators work in conjunction with other operators (comparisons) to produce results. So, look at the examples in the next section.
Comparisons
Comparison operators allow you to compare elements. These are often used in conjunction with the logical operators.
== | Equal to. Note that the single = denotes assignment so == is used as a comparison operator. |
!= | Not equal to. |
< | Less than. |
<= | Less than or equal to. |
>= | Greater than or equal to. |
> | Greater than. |
Here are some simple examples:
## Make some vectors
yy <- c(2, 3, 6, 5, 3)
zz <- c(4, 2, 4, 7, 4)
yy ; zz
[1] 2 3 6 5 3
[1] 4 2 4 7 4
yy > 4 # Elements greater than 4
[1] FALSE FALSE TRUE TRUE FALSE
zz != 4 # Elements not equal to 4
[1] FALSE TRUE FALSE TRUE FALSE
zz == 4 # Elements that are equal to 4
[1] TRUE FALSE TRUE FALSE TRUE
yy > 3 & zz > 4 # Two conditions AND
[1] FALSE FALSE FALSE TRUE FALSE
yy[1] > 3 && zz[1] > 4 # Test first element only (&& needs single values in R 4.3.0 and later)
[1] FALSE
yy == 3 | zz == 4 # Two conditions OR
[1] TRUE TRUE TRUE FALSE TRUE
xor(yy > 3, zz > 4) # Exclusive OR
[1] FALSE FALSE TRUE FALSE FALSE
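The logical vectors that these comparisons produce are useful in their own right. The sketch below (same yy and zz values, other names purely illustrative) uses one to pick out and count elements, and then collapses conditions to single values before using && inside if().

```r
yy <- c(2, 3, 6, 5, 3)
zz <- c(4, 2, 4, 7, 4)
both <- yy > 3 & zz > 4   # elementwise AND gives a logical vector
yy[both]                  # keep the elements where the test is TRUE
sum(both)                 # TRUE counts as 1, so this counts the matches
## Collapse to single values before using && in a condition
if (any(yy > 5) && all(zz > 1)) {
  msg <- "condition met"
} else {
  msg <- "condition not met"
}
msg
```

Only the fourth element passes both tests, so yy[both] is 5 and sum(both) is 1.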
Selection and Matching
The comparison and logical operators are used to obtain TRUE/FALSE results. However, they are not always the best choice for selection. There are times when you are looking for a single logical result but using regular operators either fails, or produces more than one. In these cases, the selection commands are helpful. They aren’t strictly mathematical operators but it is helpful to be aware of them. Additionally the isTRUE() command can “force” a single logical as a result.
Match any item in an object
Use the any() command to return a single TRUE result if any element meets the specified criteria.
Match all items in an object
Use the all() command to return a single TRUE result if all elements meet the specified criteria.
Compare objects
The identical() command compares two items and returns a TRUE if they are exactly equal. The all.equal() command is similar but allows a small numerical tolerance (about 1.5e-8 by default); when the objects differ it returns a description of the difference rather than FALSE.
(yy <- c(2, 3, 6, 5, 3)) # Make a vector
[1] 2 3 6 5 3
any(yy > 5) # Are any elements > 5?
[1] TRUE
all(yy < 5) # Are all elements < 5?
[1] FALSE
all(yy >= 2) # Are all elements >= 2?
[1] TRUE
## Some simple values
x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
## Are results the same?
x1 == x2
[1] FALSE
## Set tolerance to 0 to show actual difference
all.equal(x1, x2, tolerance = 0)
[1] "Mean relative difference: 1.387779e-16"
## Use default tolerance
all.equal(x1, x2)
[1] TRUE
## Wrap command in isTRUE() for logical
isTRUE(all.equal(x1, x2))
[1] TRUE
isTRUE(all.equal(x1, x2, tolerance = 0))
[1] FALSE
## Character vectors
pp <- c("a", "f", "h", "q", "r")
qq <- c("d", "e", "x", "c", "s")
rr <- pp
## Test for equality
all.equal(pp, qq)
[1] "5 string mismatches"
identical(x1, x2)
[1] FALSE
identical(pp, qq)
[1] FALSE
identical(pp, rr)
[1] TRUE
Note that you can use identical(TRUE, x) in lieu of isTRUE(x), where x is the condition to test.
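Floating point surprises like this crop up often, so it can be handy to wrap a tolerant comparison in a small function. The sketch below is one possible approach; nearly_equal() is a made-up helper, not a built-in command, and the tolerance of 1e-8 is an arbitrary choice.

```r
## nearly_equal() is a hypothetical helper, not a built-in command
nearly_equal <- function(a, b, tol = 1e-8) {
  abs(a - b) < tol   # elementwise test of the absolute difference
}
s <- sqrt(2)^2
s == 2                      # exact comparison fails
isTRUE(all.equal(s, 2))     # tolerant comparison succeeds
nearly_equal(c(0.1 + 0.2, 1), c(0.3, 2))
```

Unlike all.equal(), this helper returns one logical per element (TRUE then FALSE here), so it can be combined with any() or all().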
Selection with non-logical result
You can use the which() command to obtain an “index” instead of a logical result. The command works in conjunction with the comparison and logical operators but returns a result that indicates which elements match your criteria.
yy <- c(2, 3, 6, 5, 3)
yy
[1] 2 3 6 5 3
which(yy > 3)
[1] 3 4
which(yy == 3)
[1] 2 5
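The index from which() can be used to extract the matching values, and the related which.max() and which.min() commands give the position of the largest and smallest items. A short sketch using the same vector:

```r
yy <- c(2, 3, 6, 5, 3)
idx <- which(yy > 3)     # positions of matching elements
yy[idx]                  # the matching values themselves
which.max(yy)            # position of the (first) largest value
which.min(yy)            # position of the (first) smallest value
length(which(yy == 3))   # how many elements equal 3
```

The results are 6 and 5 for the values, positions 3 and 1 for the largest and smallest, and a count of 2.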
Complex numbers
Complex numbers are those with “imaginary” parts. You can make complex numbers using the complex() and as.complex() commands, whilst the is.complex() command provides a quick logical test to see if an object has the class “complex”. R has several commands that can deal with complex numbers.
Arg | Returns the argument of a complex number. |
Conj | Displays the complex conjugate for a complex number. |
Im | Shows the imaginary part of a complex number. |
Mod | Shows the modulus of a complex number. |
Re | Shows the real part of complex numbers. |
Here are some simple examples:
## Make some complex numbers
z0 <- complex(real = 1:8, imaginary = 8:1)
z1 <- complex(real = 4, imaginary = 3)
## If modulus or argument is given, real and imaginary are ignored
z2 <- complex(real = 4, imaginary = 3, argument = 2)
z3 <- complex(real = 4, imaginary = 3, modulus = 4, argument = 2)
z0 ; z1 ; z2 ; z3
[1] 1+8i 2+7i 3+6i 4+5i 5+4i 6+3i 7+2i 8+1i
[1] 4+3i
[1] -0.4161468+0.9092974i
[1] -1.664587+3.63719i
## Get the real and imaginary parts of a complex object
Re(z0)
[1] 1 2 3 4 5 6 7 8
Im(z0)
[1] 8 7 6 5 4 3 2 1
## Get the Argument and Modulus
Arg(z1)
[1] 0.6435011
Mod(z1)
[1] 5
## Display the complex conjugate
Conj(z2)
[1] -0.4161468-0.9092974i
## Get the modulus and argument
Mod(z3)
[1] 4
Arg(z3)
[1] 2
Besides these special commands, the regular math operators work on complex numbers:
z0 + z1
[1] 5+11i 6+10i 7+ 9i 8+ 8i 9+ 7i 10+ 6i 11+ 5i 12+ 4i
z0 * z1
[1] -20+35i -13+34i -6+33i 1+32i 8+31i 15+30i 22+29i 29+28i
z2 / z3
[1] 0.25+0i
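A few relationships between these commands can be checked directly. This sketch uses a made-up value, 3+4i, and also shows that sqrt() needs a complex input before it will return a complex result.

```r
z <- complex(real = 3, imaginary = 4)
Mod(z)                         # modulus
sqrt(Re(z)^2 + Im(z)^2)        # the same thing by hand
z * Conj(z)                    # equals Mod(z)^2, imaginary part zero
## sqrt() of a negative real gives NaN (with a warning) ...
suppressWarnings(sqrt(-1))
## ... but works once the value is complex
sqrt(as.complex(-1))
## Rebuild z from its modulus and argument
complex(modulus = Mod(z), argument = Arg(z))
```

The modulus is 5, so z * Conj(z) is 25+0i, and the last line should return 3+4i (to within floating point error).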
Rounding
There are various commands that deal generally with precision and rounding.
abs | This command returns the absolute magnitude of a numeric value (that is, ignores the sign). If it is used on a logical object the command produces 1 or 0 for TRUE or FALSE, respectively. |
sign | This command returns the sign of elements in a vector. If negative an item is assigned a value of -1; if positive, +1; and if zero, 0. |
floor | This command rounds values down to the nearest integer value. |
ceiling | This command rounds up a value to the nearest integer. |
trunc | Creates integer values by truncating items at the decimal point. |
round | Rounds numeric values to a specified number of decimal places. |
signif | This command returns a value rounded to the specified number of significant figures. |
These are all fairly obvious; here are some examples:
## Make some values
yy <- log(2^(2:6))
yy
[1] 1.386294 2.079442 2.772589 3.465736 4.158883
floor(yy)
[1] 1 2 2 3 4
ceiling(yy)
[1] 2 3 3 4 5
trunc(yy)
[1] 1 2 2 3 4
round(yy, digits = 3)
[1] 1.386 2.079 2.773 3.466 4.159
signif(yy, digits = 3)
[1] 1.39 2.08 2.77 3.47 4.16
## Include negative numbers
xx <- -3:3
xx
[1] -3 -2 -1 0 1 2 3
sign(xx)
[1] -1 -1 -1 0 1 1 1
abs(xx)
[1] 3 2 1 0 1 2 3
These commands work on most numeric objects (e.g. data.frame, vector, matrix, table). If you have logical objects, TRUE is treated as 1 and FALSE as 0.
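The differences between the rounding commands only really show up with negative values and with values that end in exactly .5, so here is a small sketch using some made-up numbers.

```r
vals <- c(-2.5, -1.5, -0.5, 0.5, 1.5, 2.5)
floor(vals)     # always down (towards -Inf)
ceiling(vals)   # always up (towards +Inf)
trunc(vals)     # towards zero
round(vals)     # a trailing 5 goes to the even digit
round(1234.567, digits = -2)   # negative digits round to tens, hundreds, ...
```

Note that round(0.5) is 0 and round(2.5) is 2, following the "round half to even" convention described in the R help for round(). The last line gives 1200.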
Scientific Format
You can enter numbers using an exponent to make it easier to deal with very large or very small values. The exponent is indicated by the letter e or E. You can use the - sign to indicate a negative exponent. The + sign can be omitted. You must not leave a space between the value and the exponent. You can only add an exponent to a numeric value and not to a named object.
1e3
[1] 1000
1E4
[1] 10000
# No spaces "allowed"
1 e-2
Error: unexpected symbol in "1 e"
1e-2
[1] 0.01
Values are generally “printed” by R in regular format but sometimes they will appear in scientific format. This makes no difference to your calculations but sometimes you want the result to be displayed in scientific format and at other times not. There are two ways to achieve the result you want.
The simplest way to present your result objects in an appropriate format is to use the format() command. You simply set scientific = TRUE to prepare an object in that format (set FALSE to use regular format). The downside to this is that the object is prepared as a text result, which might be inconvenient.
The other way is to alter the options() and to set the scipen option. The default is 0. Negative values tend to produce scientific notation and positive values are less likely to do so.
## Make some values
yy <- c(1, 10, 100, 1000, 10000, 100000)
zz <- c(1, 12, 123, 1234, 12345, 123456)
yy ; zz
[1] 1e+00 1e+01 1e+02 1e+03 1e+04 1e+05
[1] 1 12 123 1234 12345 123456
## Use format() to force scientific or fixed format
format(yy, scientific = FALSE)
[1] "     1" "    10" "   100" "  1000" " 10000" "100000"
format(zz, scientific = TRUE, digits = 3)
[1] "1.00e+00" "1.20e+01" "1.23e+02" "1.23e+03" "1.23e+04" "1.23e+05"
## Check the scipen option
options("scipen")
$scipen
[1] 0
## Set scipen
options(scipen = 1)
yy
[1] 1 10 100 1000 10000 100000
options(scipen = -6)
zz
[1] 1.00000e+00 1.20000e+01 1.23000e+02 1.23400e+03 1.23450e+04 1.23456e+05
print(zz, digits = 3)
[1] 1.00e+00 1.20e+01 1.23e+02 1.23e+03 1.23e+04 1.23e+05
options(scipen = 0) # Reset scipen
You may need to tweak the values of scipen but in general the number of digits in the result is your guideline.
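If you change scipen it is worth putting the old value back afterwards. One way to do that, sketched below, relies on options() returning the previous settings when you change them. The object names are illustrative, and the scipen value of 10 is an arbitrary choice.

```r
x <- 100000
old <- options(scipen = 10)   # options() returns the previous settings
fixed <- format(x)            # large penalty, so fixed notation
options(old)                  # put the original setting back
sci <- format(x, scientific = TRUE)
num <- as.numeric(sci)        # convert the text back to a number
fixed ; sci ; num == x
```

Here fixed is "100000" and sci is "1e+05". Converting the text back with as.numeric() recovers the value in this case, although formatting to a limited number of digits can lose precision in general.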
Extremes
The largest and smallest items can be extracted using max() and min() commands respectively. These commands produce a single value as the result.
The range() command produces two values, the smallest and largest, in that order.
## Set random number generator
set.seed(99)
## Make some values
xx <- runif(n = 10, max = 100, min = -100)
max(xx)
[1] 98.50176
min(xx)
[1] -77.24366
range(xx)
[1] -77.24366 98.50176
range(xx)[1] # Just the min value
[1] -77.24366
range(xx)[2:1] # Display max then min
[1] 98.50176 -77.24366
If you want the 2nd largest, or the 3rd smallest (for example) then you need to use the order() command to get an “index”. Set the sort order to decreasing = FALSE (the default) to get the smallest values, set decreasing = TRUE to get the largest values.
xx
[1] 16.942370 -77.243665 36.852949 98.501755 6.998717 93.322813 34.285512
[8] -41.084458 -28.327403 -64.937049
order(xx) # Index in ascending order
[1] 2 10 8 9 5 1 7 3 6 4
xx[order(xx)[1]] # 1st smallest (min)
[1] -77.24366
xx[order(xx)[2]] # 2nd smallest
[1] -64.93705
xx[order(xx, decreasing = TRUE)[1]] # 1st largest (max)
[1] 98.50176
xx[order(xx, decreasing = TRUE)[2]] # 2nd largest
[1] 93.32281
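An alternative to order() is to sort the values and pick the one you want. The sketch below wraps this in two small functions; nth_largest() and nth_smallest() are made-up names, not built-in commands, and the values are fixed rather than random so the results are easy to check.

```r
## nth_largest() and nth_smallest() are hypothetical helpers
nth_largest <- function(x, n = 1) sort(x, decreasing = TRUE)[n]
nth_smallest <- function(x, n = 1) sort(x)[n]
vals <- c(16, -77, 36, 98, 7, 93)
nth_largest(vals, 2)    # 2nd largest
nth_smallest(vals, 3)   # 3rd smallest
head(sort(vals), 3)     # the three smallest values together
```

This gives 93, 16 and then -77, 7, 16. Bear in mind that sort() drops NA items by default, whereas order() puts them last.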
Logarithms
Logarithms and their reverse (antilogs) are dealt with using the log() and exp() commands respectively. The default is the natural logarithm (base e) but you can specify the base explicitly. There are also several convenience commands:
log(x, base = exp(1)) | The basic log command. The default base is natural. Use base parameter to specify alternative (as long as it works out as a numeric). |
log10 | A convenience function that computes log base 10. |
log2 | Computes log base 2. |
log1p | Computes log(x + 1). See also expm1(). |
exp | The antilog for base e. |
expm1 | The antilog for base e, minus 1, i.e. exp(x) - 1. See also log1p(). |
Using the commands is fairly simple.
log(1:4) # Natural log
[1] 0.0000000 0.6931472 1.0986123 1.3862944
log(1:4, base = 3) # Log base 3
[1] 0.0000000 0.6309298 1.0000000 1.2618595
log10(1:4) # Log base 10
[1] 0.0000000 0.3010300 0.4771213 0.6020600
log2(1:4) # Log base 2
[1] 0.000000 1.000000 1.584963 2.000000
log(0:3) # Regular log gives -Inf for 0
[1] -Inf 0.0000000 0.6931472 1.0986123
log1p(0:3) # Add 1 to values then log
[1] 0.0000000 0.6931472 1.0986123 1.3862944
10^0.6 # The antilog (base 10) of 0.6
[1] 3.981072
3^0.63 # The antilog (base 3) of 0.63
[1] 1.997958
exp(1.098) # The natural antilog of 1.098
[1] 2.998164
expm1(1.098) # The natural antilog of 1.098 then minus 1
[1] 1.998164
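Two points are worth a quick sketch: logs in any base are related by a simple ratio, and log1p() exists because log(1 + x) loses precision when x is tiny. The values here are made up for illustration.

```r
x <- c(2, 10, 50)
log(x, base = 3)
log(x) / log(3)   # change of base gives the same values
exp(log(x))       # exp() undoes log()
10^log10(x)       # and 10^ undoes log10()
## For tiny values log1p() keeps precision that log(1 + x) loses
tiny <- 1e-20
log(1 + tiny)     # 1 + tiny is stored as exactly 1, so the result is 0
log1p(tiny)       # very close to tiny itself
```

The same reasoning applies in reverse to expm1(), which is more accurate than exp(x) - 1 for values of x close to zero.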
Trigonometry
R has a suite of commands that carry out trigonometric functions.
Function | Regular | Hyperbolic |
Sine | sin() asin() | sinh() asinh() |
Cosine | cos() acos() | cosh() acosh() |
Tangent | tan() atan() | tanh() atanh() |
The commands work out the basic functions and also hyperbolic equivalents. Angles are in radians (a right angle is pi/2 radians).
Here are some simple examples using the cosine:
cos(45) # Angle in radians
[1] 0.525322
cos(45 * pi/180) # Convert 45 degrees to radians
[1] 0.7071068
acos(1/2) * 180/pi # To get result in degrees
[1] 60
acos(sqrt(3)/2) * 180/pi # To get result in degrees
[1] 30
cosh(0.5) # Hyperbolic function
[1] 1.127626
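If you work in degrees a lot, a pair of small conversion functions saves typing pi/180 each time. The sketch below is one way to do it; deg2rad() and rad2deg() are made-up names, not built-in commands.

```r
## deg2rad() and rad2deg() are hypothetical helpers, not built-in commands
deg2rad <- function(deg) deg * pi / 180
rad2deg <- function(rad) rad * 180 / pi
cos(deg2rad(60))
sin(deg2rad(c(0, 30, 90)))
rad2deg(acos(0.5))
rad2deg(atan2(1, 1))   # atan2(y, x) gives the angle of the point (x, y)
```

The results are (to within floating point error) 0.5, then 0, 0.5 and 1, then 60 and 45 degrees.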
Summation
There are various commands associated with lists of values (that is list in a general sense, not an R list object).
Adding things
The sum() command returns the sum of all the numbers specified. This works for most R objects, including data.frame, matrix and vectors. Logical values are treated as 1 (TRUE) or 0 (FALSE). If any item is NA the result is NA. You can “ignore” NA items using na.rm = TRUE as a parameter in the command.
Multiplying things
The prod() command returns the product of all the numbers specified, that is each value multiplied by the “next”. This works for most R objects, including data.frame, matrix and vectors (items are taken columnwise for data.frame objects). Logical values are treated as 1 (TRUE) or 0 (FALSE). If any item is NA the result is NA. You can “ignore” NA items using na.rm = TRUE as a parameter in the command.
v <- 1:9
m <- matrix(1:9, ncol = 3)
d <- data.frame(a = 1:3, b = 4:6, c = 7:9)
v ; m ; d
[1] 1 2 3 4 5 6 7 8 9
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
  a b c
1 1 4 7
2 2 5 8
3 3 6 9
## sum for vector, matrix and data.frame
sum(v)
[1] 45
sum(m)
[1] 45
sum(d) # Reads data columnwise
[1] 45
## product for vector, matrix and data.frame
prod(v)
[1] 362880
prod(m)
[1] 362880
prod(d) # Reads data columnwise
[1] 362880
## Make a logical vector
z <- c(TRUE, TRUE, FALSE, TRUE)
z
[1] TRUE TRUE FALSE TRUE
## Results for logicals
sum(z)
[1] 3
prod(z)
[1] 0
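Missing values deserve a quick sketch of their own, since a single NA makes the result of sum() or prod() NA unless you remove it. The values are made up for illustration.

```r
vals <- c(1, NA, 3, 4)
sum(vals)                  # NA: the missing value makes the total unknown
sum(vals, na.rm = TRUE)    # drop the NA first
prod(vals, na.rm = TRUE)
## Logical vectors: sum() counts the TRUE items, mean() gives the proportion
z <- c(TRUE, TRUE, FALSE, TRUE)
sum(z)
mean(z)
```

With na.rm = TRUE the sum is 8 and the product is 12. For the logical vector the count is 3 and the proportion is 0.75.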
The factorial() command is similar to prod(). With prod() you specify a sequence, e.g. prod(1:y), whilst in factorial() you specify only the end value, factorial(y). If you provide more than one value, you end up with multiple results:
factorial(3) # i.e. 3 * 2 * 1
[1] 6
factorial(5) # i.e. 5 * 4 * 3 * 2 * 1
[1] 120
factorial(v)
[1] 1 2 6 24 120 720 5040 40320 362880
factorial(m)
     [,1] [,2]   [,3]
[1,]    1   24   5040
[2,]    2  120  40320
[3,]    6  720 362880
You can also use the gamma() command; gamma(x) equates to factorial(x-1). Essentially prod(1:y) = gamma(y+1) = factorial(y).
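The equivalence is easy to confirm, as the sketch below shows. It also includes a partial product and the choose() command (not covered above), which counts combinations and is built from factorials.

```r
n <- 5
factorial(n)
gamma(n + 1)
prod(1:n)
prod(3:5)                     # a partial product: 3 * 4 * 5
factorial(5) / factorial(2)   # the same thing via factorials
choose(5, 2)                  # number of ways to pick 2 items from 5
```

The first three lines all give 120, the next two give 60, and choose(5, 2) gives 10. Since gamma() works in floating point, use all.equal() rather than == if you compare the results in your own code.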
Cumulative functions
There are several cumulative functions built into R. These determine sum, product, maximum and minimum.
cumsum | Determines the cumulative sum |
cumprod | Cumulative product |
cummax | Cumulative maximum |
cummin | Cumulative minimum |
The commands operate on numeric objects, usually vector or matrix objects. The commands work on data.frame objects but compute results per column. The commands do not work directly on lists (but you can use the lapply() command).
v <- 1:9
m <- matrix(1:9, ncol = 3)
d <- data.frame(a = 1:3, b = 4:6, c = 7:9)
v ; m ; d
[1] 1 2 3 4 5 6 7 8 9
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
  a b c
1 1 4 7
2 2 5 8
3 3 6 9
cumsum(v)
[1] 1 3 6 10 15 21 28 36 45
cumprod(m)
[1] 1 2 6 24 120 720 5040 40320 362880
# For data.frame calculations are carried out column by column
cummax(d)
  a b c
1 1 4 7
2 2 5 8
3 3 6 9
It is possible to make custom functions that calculate other cumulative results, but that is another story.
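As a taster, here is a minimal sketch of two ways you might build your own cumulative results: by combining the built-in commands, or by using Reduce() with accumulate = TRUE. The object names and values are illustrative only.

```r
v <- c(2, 4, 6, 8)
## Cumulative mean from the built-in cumsum()
cummean <- cumsum(v) / seq_along(v)
cummean
## Reduce() with accumulate = TRUE keeps every intermediate result
Reduce(`+`, v, accumulate = TRUE)                    # same as cumsum(v)
Reduce(function(a, b) a * b, v, accumulate = TRUE)   # same as cumprod(v)
## Column-by-column cumulative sums for a matrix
m <- matrix(1:9, ncol = 3)
colsums <- apply(m, 2, cumsum)
colsums
```

The cumulative mean is 2, 3, 4, 5. The apply() call is one way to get per-column results for a matrix, since cumsum(m) on its own treats the matrix as one long vector.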
There are many other mathematical functions in R. This has been a brief overview of some of the “simpler” arithmetic (although the foray into logic may not count as math).