Introduction

Please work through the commands below and look up any unfamiliar function with help(). A plain R script, Intro1.R, contains most of the same commands in case you prefer to run them from a script.

Help

Try out the following:

  • ?help
  • ?variance
  • ??variance
  • help.search("variance")
  • help(var)
  • ?&&
  • ?"&&"
  • help.start()
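
Two items in the list above are meant to fail: ?variance reports that no documentation exists under that name (which is why ??variance, the search form, is tried next), and ?&& does not parse because operators have to be quoted. As a small complement, and not part of the original list, apropos() searches the names of visible objects rather than the help pages:

```r
# apropos() returns the names of visible objects matching a regular expression.
hits <- apropos("^var")
print(hits)

# The variance function is called var(), which is why ?variance finds nothing.
is.function(var)
```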

A brief demo

Try out the following commands.

In [1]:
2 + 3

x = 2 + 3

exp(-4 * 4 / 2) / sqrt(2 * pi)

dnorm(4, 0, 1) ## or dnorm(0, 4, 1)
5
0.000133830225764885
0.000133830225764885
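
The last two results agree because the hand-written expression is the standard normal density, exp(-x^2 / 2) / sqrt(2 * pi), evaluated at x = 4. A minimal check, using the illustrative names x0, by_hand and built_in (not from the original):

```r
x0 <- 4

# Standard normal density written out by hand.
by_hand <- exp(-x0^2 / 2) / sqrt(2 * pi)

# The same value from the built-in density function.
built_in <- dnorm(x0, mean = 0, sd = 1)

all.equal(by_hand, built_in)

# The density depends only on (x - mean)^2, so swapping x and mean changes nothing.
all.equal(dnorm(4, 0, 1), dnorm(0, 4, 1))
```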

Note that "=" and "<-" are interchangeable for assignment at the top level in R; inside a function call, "=" names an argument instead of assigning. The examples in this tutorial use both forms, so use whichever you prefer. You can also use "->" to assign in the opposite direction, but this is rarely used.

In [2]:
x =  2 + 3
y <- 2 + 3
2 + 3 -> z
x
y
z
5
5
5
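
The one place where "=" and "<-" behave differently is inside a function call. The sketch below illustrates this in a scratch environment so that it does not touch the variables defined above; demo_env and the variable name y are illustrative choices, not part of the original.

```r
demo_env <- new.env()

# "=" inside a call names the argument x of mean(); no variable is created.
evalq(mean(x = c(1, 2, 3)), demo_env)
exists("x", envir = demo_env, inherits = FALSE)   # FALSE

# "<-" inside a call assigns y first, then passes its value to mean().
evalq(mean(y <- c(1, 2, 3)), demo_env)
exists("y", envir = demo_env, inherits = FALSE)   # TRUE
```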

Data entry from keyboard

In [3]:
ages <- c(30, 40, 55, 46, 57)
ages
  1. 30
  2. 40
  3. 55
  4. 46
  5. 57
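
Besides c(), the scan() function can read values typed at the keyboard: called with no arguments in an interactive session, it reads numbers until an empty line is entered. The sketch below passes the same values through the text argument so that it also runs non-interactively; ages_typed is an illustrative name.

```r
# Read whitespace-separated numbers; text= stands in for keyboard input here.
ages_typed <- scan(text = "30 40 55 46 57", quiet = TRUE)

# The result is the same numeric vector as c(30, 40, 55, 46, 57).
identical(ages_typed, c(30, 40, 55, 46, 57))
length(ages_typed)
```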

Input data from spreadsheet

Download the file chickwt.csv and put it in the folder where you are running R.

In [4]:
mydata <- read.csv("chickwt.csv")
summary(mydata)
str(mydata)
write.table(mydata, file="mydata.txt", sep=";") # write out as ASCII text with ; as the delimiter
     weight             feed   
 Min.   :108.0   casein   :12  
 1st Qu.:204.5   horsebean:10  
 Median :258.0   linseed  :12  
 Mean   :261.3   meatmeal :11  
 3rd Qu.:323.5   soybean  :14  
 Max.   :423.0   sunflower:12  
'data.frame':	71 obs. of  2 variables:
 $ weight: int  179 160 136 227 217 168 108 124 143 140 ...
 $ feed  : Factor w/ 6 levels "casein","horsebean",..: 2 2 2 2 2 2 2 2 2 2 ...
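
The str() output above shows feed as a factor. From R 4.0.0 onwards read.csv() keeps text columns as character unless stringsAsFactors = TRUE is given, so newer installations may show chr instead. The sketch below round-trips a small toy data frame (three made-up rows, not the real file) through temporary files to show both the CSV and the ";"-delimited forms; all object names are illustrative.

```r
# Toy stand-in for chickwt.csv, written to a temporary file.
toy <- data.frame(weight = c(179L, 160L, 309L),
                  feed = c("horsebean", "horsebean", "linseed"))
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Ask explicitly for factors so the result does not depend on the R version.
back <- read.csv(csv_path, stringsAsFactors = TRUE)
str(back)

# Write with ";" as the delimiter and read it back with the matching sep.
txt_path <- tempfile(fileext = ".txt")
write.table(back, file = txt_path, sep = ";")
again <- read.table(txt_path, header = TRUE, sep = ";", stringsAsFactors = TRUE)
str(again)
```
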
In [5]:
mydata$weight
  1. 179
  2. 160
  3. 136
  4. 227
  5. 217
  6. 168
  7. 108
  8. 124
  9. 143
  10. 140
  11. 309
  12. 229
  13. 181
  14. 141
  15. 260
  16. 203
  17. 148
  18. 169
  19. 213
  20. 257
  21. 244
  22. 271
  23. 243
  24. 230
  25. 248
  26. 327
  27. 329
  28. 250
  29. 193
  30. 271
  31. 316
  32. 267
  33. 199
  34. 171
  35. 158
  36. 248
  37. 423
  38. 340
  39. 392
  40. 339
  41. 341
  42. 226
  43. 320
  44. 295
  45. 334
  46. 322
  47. 297
  48. 318
  49. 325
  50. 257
  51. 303
  52. 315
  53. 380
  54. 153
  55. 263
  56. 242
  57. 206
  58. 344
  59. 258
  60. 368
  61. 390
  62. 379
  63. 260
  64. 404
  65. 318
  66. 352
  67. 359
  68. 216
  69. 222
  70. 283
  71. 332

Statistical summaries

In [6]:
mean(mydata$weight)

var(mydata$weight)
261.30985915493
6095.50261569416
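
var() computes the sample variance, which divides the sum of squared deviations by n - 1 rather than n. A short check on the small ages vector from earlier (re-created here so the block stands alone; by_hand is an illustrative name):

```r
ages <- c(30, 40, 55, 46, 57)

# Sum of squared deviations from the mean, divided by n - 1.
by_hand <- sum((ages - mean(ages))^2) / (length(ages) - 1)

by_hand
var(ages)
all.equal(by_hand, var(ages))
```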

Fit a linear model

In [7]:
(mylm <- lm(weight ~ feed, mydata))
Call:
lm(formula = weight ~ feed, data = mydata)

Coefficients:
  (Intercept)  feedhorsebean    feedlinseed   feedmeatmeal    feedsoybean  
      323.583       -163.383       -104.833        -46.674        -77.155  
feedsunflower  
        5.333  
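
With R's default treatment contrasts, the intercept of this model is the mean weight of the first factor level and every other coefficient is the difference between that level's mean and the baseline mean. The sketch below checks this, assuming R's built-in chickwts dataset as a stand-in for chickwt.csv (the source does not say the two are the same data); fit, group_means and baseline are illustrative names.

```r
# Built-in dataset used as an assumed stand-in for the CSV file.
fit <- lm(weight ~ feed, data = chickwts)

# Mean weight within each feed group.
group_means <- tapply(chickwts$weight, chickwts$feed, mean)
baseline <- levels(chickwts$feed)[1]

# Intercept equals the baseline group mean.
all.equal(unname(coef(fit)["(Intercept)"]), unname(group_means[baseline]))

# Each remaining coefficient is a difference from the baseline mean.
all.equal(unname(coef(fit)["feedhorsebean"]),
          unname(group_means["horsebean"] - group_means[baseline]))
```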

Graphics

In [8]:
plot(density(mydata$weight)) # plot a kernel density estimate of the weight distribution
par(mfrow=c(2,2)) # make next 4 plots appear in 2x2 subplot layout
plot(mylm) # produce 4 plots showing diagnostic information from the linear model
par(mfrow=c(1,1)) # reset to regular full sized plot layout

# boxplots of tooth growth against dose for ascorbic acid and orange juice, using the 'ToothGrowth' dataset
boxplot(len ~ dose, data = ToothGrowth,
        boxwex = 0.25, at = 1:3 - 0.2,
        subset = supp == "VC", col = "yellow",
        main = "Guinea Pigs' Tooth Growth",
        xlab = "Vitamin C dose mg",
        ylab = "tooth length",
        xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = "i")
boxplot(len ~ dose, data = ToothGrowth, add = TRUE,
        boxwex = 0.25, at = 1:3 + 0.2,
        subset = supp == "OJ", col = "orange")
legend(2, 9, c("Ascorbic acid", "Orange juice"),
       fill = c("yellow", "orange"))
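
Resetting the layout with par(mfrow=c(1,1)) works when the previous layout was the default. A slightly more general pattern, sketched here, is to keep the old settings that par() returns and restore them afterwards. The block assumes the built-in chickwts dataset as a stand-in for the CSV data and opens a null PDF device so that it runs without a display; in an interactive session the pdf(NULL) and dev.off() lines can be dropped.

```r
pdf(NULL)                          # null device: nothing is written to disk

old_par <- par(mfrow = c(2, 2))    # par() returns the previous settings
fit <- lm(weight ~ feed, data = chickwts)
plot(fit)                          # four diagnostic plots fill the 2x2 grid
par(old_par)                       # restore whatever layout was active before

restored <- par("mfrow")
dev.off()
```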

Matrices

In [9]:
mymat <- data.matrix(mydata) #save data in matrix format

colMeans(mymat) #print mean of matrix columns
mymat[70, 1] <- NA  #insert an NA value into column 1

colMeans(mymat) #the weight mean is now NA (not an error) because of the missing value
colMeans(mymat, na.rm=TRUE) #use na.rm flag to ignore entries that are NA
weight
261.30985915493
feed
3.57746478873239
weight
<NA>
feed
3.57746478873239
weight
261
feed
3.57746478873239
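
The feed column has a mean at all because data.matrix() replaces each factor by its integer level codes, so that value is an average of codes and not a meaningful quantity. A sketch on a made-up three-row data frame (toy and m are illustrative names):

```r
toy <- data.frame(weight = c(10, 20, 30),
                  feed = factor(c("soybean", "casein", "soybean")))
m <- data.matrix(toy)

# Levels are sorted alphabetically: casein = 1, soybean = 2.
m[, "feed"]

# A single NA makes that column's mean NA unless na.rm = TRUE is set.
m[3, "weight"] <- NA
colMeans(m)
colMeans(m, na.rm = TRUE)
```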

Saving and restoring specific objects

In [10]:
save(mymat, file = "mymat.RData")

load("mymat.RData")
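
load() restores objects under the names they were saved with and invisibly returns those names. The sketch below makes this visible by removing the object before loading it again; it uses a small made-up matrix and a temporary file, and the names m, path and loaded_names are illustrative.

```r
m <- matrix(1:4, nrow = 2)
path <- tempfile(fileext = ".RData")

save(m, file = path)          # store the object together with its name
rm(m)                         # remove it from the workspace

loaded_names <- load(path)    # recreates m and returns the restored names
loaded_names
m
```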

Install, load and explore the package faraway:

In [11]:
install.packages("faraway")
also installing the dependencies 'minqa', 'nloptr', 'statmod', 'RcppEigen', 'lme4'

package 'minqa' successfully unpacked and MD5 sums checked
package 'nloptr' successfully unpacked and MD5 sums checked
package 'statmod' successfully unpacked and MD5 sums checked
package 'RcppEigen' successfully unpacked and MD5 sums checked
package 'lme4' successfully unpacked and MD5 sums checked
package 'faraway' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\td314\AppData\Local\Temp\Rtmpq6pVfD\downloaded_packages
In [12]:
library(faraway) # attaches the package to the search path
library(help=faraway) # gives you a list of all available objects and functions in the package
Warning message:
"package 'faraway' was built under R version 3.6.3"