INTRODUCTION TO R
ECO 3007, 2016 SPRING
INSTRUCTOR: JUNGMO YOON
HANYANG UNIVERSITY
1. Download and Install R
Go to
http://cran.r-project.org/index.html
and follow the instructions there. We will do this together in class.
References:
1. An Introduction to R, from the main R website, CRAN,
2. Introductory Statistics with R, by Peter Dalgaard,
3. Using R for Introductory Statistics, by John Verzani,
4. For an advanced user, read The R Inferno, which is available at http://www.burns-stat.com/documents/books/the-r-inferno/
2. R is a Calculator
Type an expression and press Enter:
> 2 + 2
> exp(-2)
> log(1)
Exercises: Type each of the following commands at the prompt, then press Enter.
> 10+20
> 10-20
> 10*20
> 10/20
> 10^2
> 1-2*(3^2)
> (1-2)*(3^2)
> sqrt(4)
> sqrt(-4)
> abs(-4)
> pi
> sin(pi)
> cos(pi)
> exp(1)
> log(10) # Notice that the result is not 1
> log(exp(1))
As the last two examples show, log() computes the natural logarithm (base e) by default. To change the base of the logarithm, look up the details with the help() function.
> help(log) # or ?log
> log(10,base=10)
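The base can be changed in several ways. The lines below are a minimal sketch using only base R functions; the variable names are chosen here for illustration and do not come from the handout.

```r
# Natural logarithm (the default base is e)
natural <- log(10)

# Base-10 logarithm, three equivalent ways
b10_arg   <- log(10, base = 10)   # explicit base argument
b10_fun   <- log10(10)            # dedicated base-10 function
b10_ratio <- log(10) / log(10)    # change-of-base formula: log(x) / log(b)

# Base-2 logarithm of 8
b2 <- log2(8)

print(c(natural, b10_arg, b10_fun, b10_ratio, b2))
```

The change-of-base formula works for any base, which is useful when no dedicated function exists.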
3. R is a random number generator
> rnorm(10)
> rnorm(10,mean=2,sd=0.5)
Exercise: Type each of the following commands at the prompt, then press Enter.
> runif(8)
> runif(8,-1,1) # what is the difference?
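One way to explore what the arguments do is to inspect the draws. The sketch below is only an illustration: set.seed() is not used in the handout and is added here, as an assumption, so that the same numbers appear each time the code is run.

```r
set.seed(1)                  # fix the seed so the draws are reproducible (an addition, not in the handout)

u01 <- runif(8)              # default arguments: uniform draws between 0 and 1
u11 <- runif(8, -1, 1)       # min = -1, max = 1: uniform draws between -1 and 1

range(u01)                   # smallest and largest draw
range(u11)

z <- rnorm(1000, mean = 2, sd = 0.5)
mean(z)                      # should be close to 2
sd(z)                        # should be close to 0.5
```

With a larger sample, the sample mean and standard deviation typically move closer to the values passed to rnorm().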
4. Assignment
You can assign a value to a variable.
> x <- 2 # Method 1. Assign the value of 2 to x. Assignment is quiet.
> x
> x+x
> x*x
> x = 5 # Method 2
> 2*x+3
Remark (acceptable variable names): Variable names may consist of letters, numbers, and the dot or underscore characters. A name must start with a letter, or with a dot that is not followed by a number. R is case sensitive. For details, type
> ?make.names
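The naming rules can be tried out directly. The following is a small sketch; all variable names in it are hypothetical and chosen only for illustration.

```r
# Valid names: letters, digits, dots and underscores
my.var  <- 1
my_var2 <- 2
.hidden <- 3          # starts with a dot followed by a letter

# R is case sensitive: rate and Rate are two different variables
rate <- 10
Rate <- 20
same <- (rate == Rate)
same                  # FALSE

# make.names() shows how R would repair names that are not valid
fixed <- make.names(c("my var", "2x", "ok_name"))
fixed                 # "my.var" "X2x" "ok_name"
```

The space is replaced by a dot, and a name that starts with a digit gets the prefix X.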
Quiz 1: Find the value of a function at x=2
> y = x + x^2 + sin(x)
Quiz 2: Find the value of a function at x=-1
> y = exp(x) + x^4
5. Use a function c( ) to enter data
> x = c(74, 122, 235, 111, 292, 111, 133)
> y = c(44, 124, 333, 234, 144, 323, 222)
> plot(x,y)
> c(x,y) #c( ) can also combine data vectors
> simpsons = c("homer", "lisa", "maggie") # data can be non-numeric
6. Use functions on data vectors
> sum(x)
> length(x) # length of data vector
> sum(x)/length(x) # what is this?
> mean(x)
> sort(x) #the sorted values
> min(x)
> max(x)
You can mix functions and variables
> x + y # adds up each corresponding entry
> x - mean(x) # R repeats values from one vector so that its length matches the other.
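The repetition of values mentioned in the last comment can be written out by hand. The sketch below reuses the vector x from the handout; the other variable names are introduced here for illustration.

```r
x <- c(74, 122, 235, 111, 292, 111, 133)   # data vector from the handout

# mean(x) is a single number; R repeats it to match the length of x
centered <- x - mean(x)

# The same computation with the repetition written out explicitly
centered_explicit <- x - rep(mean(x), length(x))

all.equal(centered, centered_explicit)      # TRUE

# A short vector is repeated in the same way: c(1, 2) is used as 1, 2, 1, 2
recycled <- c(10, 20, 30, 40) + c(1, 2)
recycled                                    # 11 22 31 42
```

When the longer length is not a multiple of the shorter one, R still computes a result but issues a warning.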
Exercise: Type the following commands at the prompt
> weight <- c(65,45,67,78,56)
> weight
> height <- c(1.7,1.8,1.76,1.65,1.74)
> height
> length(weight)
> length(height)
> bmi <- weight/height^2
> bmi
> sum(weight)
> xbar <- sum(weight)/length(weight)
> mean(weight)
7. Plots
> plot(height,weight)
> plot(height,weight,pch=2)
> plot(height,weight,pch=3)
> abline(a=45,b=10,col="blue")
Line plot
> od = order(height)
> plot(height[od],weight[od],ylim=c(15,80),type="l")
> lines(height[od],bmi[od],col="red")
8. Create structured data
Three useful tools for creating a variable (that is, a vector) are the colon operator :, seq(), and rep().
Some examples are
> 1:10
> 10:1
> seq(1,9, by=2) #odd numbers
> rep(1,10)
> rep(1:3,3)
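Both seq() and rep() accept further arguments beyond those shown above. The following sketch illustrates a few of them; the variable names are chosen here for illustration.

```r
evens   <- seq(2, 10, by = 2)          # step of 2: 2 4 6 8 10
grid    <- seq(0, 1, length.out = 5)   # 5 equally spaced points: 0.00 0.25 0.50 0.75 1.00
repeats <- rep(1:3, times = 3)         # repeat the whole vector: 1 2 3 1 2 3 1 2 3
blocks  <- rep(1:3, each = 2)          # repeat each element: 1 1 2 2 3 3

evens
grid
repeats
blocks
```

Here rep(1:3, times = 3) gives the same result as rep(1:3, 3) in the examples above.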
9. Matrix and Array
A dataset is often structured as a matrix. A matrix consists of several columns. One convention in data analysis is that one column in a dataset represents one variable.
One row on the other hand represents one observation (one individual, one firm, one country, etc.).
> x <- 1:12
> x
> dim(x) <- c(3,4)
> x
> matrix(1:12,nrow=3,byrow=TRUE)
> matrix(1:12,nrow=3,byrow=FALSE)
> rownames(x) <- LETTERS[1:3]
> x
> t(x) # what does it do?
> cbind(weight,height)
> rbind(weight,height)
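The convention described at the start of this section (one column per variable, one row per observation) can be checked on the weight and height data. This is a minimal sketch; the name dat and the use of colMeans() are additions made here for illustration.

```r
weight <- c(65, 45, 67, 78, 56)
height <- c(1.7, 1.8, 1.76, 1.65, 1.74)

dat <- cbind(weight, height)   # each column holds one variable
dat

nrow(dat)                      # 5 observations (rows)
ncol(dat)                      # 2 variables (columns)
colnames(dat)                  # "weight" "height"

col_means <- colMeans(dat)     # mean of each variable
col_means
```

By contrast, rbind(weight, height) puts each variable in a row, which is the transpose of this layout.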
10. Indexing
Type the following commands and figure out how indexing works.
> x[,1]
> x[1,]
> x[,1:2]
> x[,c(2,4)]
> weight[height>1.7]
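The last command selects elements with a condition rather than with positions. The sketch below breaks it into two steps and adds one matrix example; the name tall is introduced here for illustration, and the matrix is rebuilt with matrix() so that the code runs on its own.

```r
weight <- c(65, 45, 67, 78, 56)
height <- c(1.7, 1.8, 1.76, 1.65, 1.74)

# Step 1: the comparison returns one TRUE or FALSE per element
tall <- height > 1.7
tall                          # FALSE TRUE TRUE FALSE TRUE

# Step 2: a logical vector inside [ ] keeps the elements where it is TRUE
selected <- weight[tall]
selected                      # 45 67 56

# For a matrix, the index before the comma selects rows, the one after selects columns
x <- matrix(1:12, nrow = 3)   # same layout as dim(x) <- c(3,4) applied to 1:12
x[2, 3]                       # row 2, column 3: 8
x[1, ]                        # the whole first row: 1 4 7 10
```

Leaving an index empty, as in x[1, ], selects everything along that dimension.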