| | First_Name | Second_Name | Full_Name | Sex | Age | Weight | Consent |
---|---|---|---|---|---|---|---|
1 | Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE |
2 | Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
3 | John | Evans | John Evans | Male | 35 | 75.3 | FALSE |
4 | Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
5 | Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE |
6 | Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
7 | Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE |
8 | Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
9 | David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE |
10 | Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE |
age <- c(50, 21, 35, 45, 28, 31, 42, 33, 57, 62)
weight <- c(70.8, 67.9, 75.3, 61.9, 72.4, 69.9,
63.5, 71.5, 73.2, 64.8)
firstName <- c("Adam", "Eve", "John", "Mary",
"Peter", "Paul", "Joanna", "Matthew",
"David", "Sally")
secondName <- c("Jones", "Parker", "Evans", "Davis",
"Baker","Daniels", "Edwards", "Smith",
"Roberts", "Wilson")
Notice how a single line of R code can be typed over multiple lines. R won’t execute the code until it sees the closing bracket ) that matches the initial bracket (. We often use this trick to make our code easier to read.
Logical vectors hold the values TRUE and FALSE:
consent <- c(TRUE, TRUE, FALSE, TRUE, FALSE,
FALSE, FALSE, TRUE, FALSE, TRUE)
c(20, "a string", TRUE)
[1] "20" "a string" "TRUE"
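The output above shows that a vector can only hold one type of value: when types are mixed inside c(), R converts every element to a common type. The following is a small sketch of that behaviour; the variable names mixed and numbers are made up for illustration.

```r
# Mixing a number, a string and a logical: everything becomes character.
mixed <- c(20, "a string", TRUE)
class(mixed)      # "character"

# Without a string, the logical value is converted to a number instead.
numbers <- c(20, TRUE)
numbers           # 20 1
class(numbers)    # "numeric"
```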
The type of a vector can be checked with the class() function:
class(firstName)
[1] "character"
class(age)
[1] "numeric"
class(weight)
[1] "numeric"
class(consent)
[1] "logical"
sex <- c("Male", "Female", "Male", "Female", "Male",
"Male", "Female", "Male", "Male", "Female")
sex
[1] "Male" "Female" "Male" "Female" "Male" "Male" "Female" "Male" "Male"
[10] "Female"
factor(sex)
[1] Male Female Male Female Male Male Female Male Male Female
Levels: Female Male
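To see what a factor stores, it can help to look at its levels and the integer codes behind them. This is a minimal sketch using the same sex values; sexFactor is a name chosen here for illustration.

```r
sex <- c("Male", "Female", "Male", "Female", "Male",
         "Male", "Female", "Male", "Male", "Female")
sexFactor <- factor(sex)

levels(sexFactor)       # the distinct categories, sorted alphabetically
as.integer(sexFactor)   # the integer code stored for each element
table(sexFactor)        # how many times each level occurs
```

Because the levels are sorted alphabetically by default, "Female" is level 1 and "Male" is level 2.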
The vectors can now be combined into a data frame (the paste() function joins character vectors together):
patients <- data.frame(firstName, secondName,
paste(firstName, secondName),
sex, age, weight, consent)
patients
firstName <fctr> | secondName <fctr> | paste.firstName..secondName. <fctr> | sex <fctr> | age <dbl> | weight <dbl> | consent <lgl> |
---|---|---|---|---|---|---|
Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE |
Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
John | Evans | John Evans | Male | 35 | 75.3 | FALSE |
Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE |
Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE |
Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE |
Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE |
Individual columns can be extracted with the ‘$’ operator:
patients$age
[1] 50 21 35 45 28 31 42 33 57 62
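A column extracted this way is an ordinary vector, so it can be passed straight to other functions. The sketch below uses a small made-up data frame called demo rather than the full patients data.

```r
# A small stand-in data frame (hypothetical, three rows only).
demo <- data.frame(age = c(50, 21, 35),
                   weight = c(70.8, 67.9, 75.3))

demo$age          # the age column as a numeric vector
mean(demo$age)    # vectors from a data frame work in any function
demo[["age"]]     # double square brackets extract the same column
```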
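Some of the default column names are unwieldy (in particular the one generated by the paste() command). Column names can be changed with the names() function, and we can use the same function to see the names:
names(patients) <- c("First_Name", "Second_Name",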
"Full_Name", "Sex", "Age",
"Weight", "Consent")
names(patients)
[1] "First_Name" "Second_Name" "Full_Name" "Sex" "Age"
[6] "Weight" "Consent"
patients <- data.frame(First_Name = firstName,
Second_Name = secondName,
Full_Name = paste(firstName,
secondName),
Sex = sex,
Age = age,
Weight = weight,
Consent = consent)
names(patients)
[1] "First_Name" "Second_Name" "Full_Name" "Sex" "Age"
[6] "Weight" "Consent"
patients$First_Name
[1] Adam Eve John Mary Peter Paul Joanna Matthew David Sally
Levels: Adam David Eve Joanna John Mary Matthew Paul Peter Sally
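By default the names have been stored as factors. To keep them as character vectors, set stringsAsFactors = FALSE and convert only the Sex column explicitly with factor():
patients <- data.frame(First_Name = firstName,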
Second_Name = secondName,
Full_Name = paste(firstName,
secondName),
Sex = factor(sex),
Age = age,
Weight = weight,
Consent = consent,
stringsAsFactors = FALSE)
patients
First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> |
---|---|---|---|---|---|---|
Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE |
Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
John | Evans | John Evans | Male | 35 | 75.3 | FALSE |
Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE |
Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE |
Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE |
Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE |
patients$Sex
[1] Male Female Male Female Male Male Female Male Male Female
Levels: Female Male
patients$First_Name
[1] "Adam" "Eve" "John" "Mary" "Peter" "Paul" "Joanna" "Matthew"
[9] "David" "Sally"
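The factor output shown earlier comes from a version of R in which data.frame() converted strings to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so a recent installation keeps strings as character unless told otherwise. One way to check what you actually got is to ask for the class of every column; the sketch below uses a cut-down, hypothetical data frame called demo.

```r
firstName <- c("Adam", "Eve", "John")
sex <- c("Male", "Female", "Male")

demo <- data.frame(First_Name = firstName,
                   Sex = factor(sex),
                   stringsAsFactors = FALSE)

# Report the class of each column in turn.
columnTypes <- sapply(demo, class)
columnTypes
```

Setting stringsAsFactors explicitly, as done here, gives the same result on old and new versions of R.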
Now that we are happy with our data frame, we no longer have any use for the vectors that were used to create it. R has a function, rm, that will allow us to remove variables:
rm(age)
Once something has been removed, we can no longer use it; asking for it produces an error
age
Multiple objects can be removed at the same time (age has already been removed above, so it is left out of the list to avoid a warning)
rm(list = c("firstName","secondName","sex","weight","consent"))
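If you are unsure whether a variable is still defined, exists() can be used to check without triggering an error. A minimal sketch, using a throwaway variable named scratchValue (a name made up for this example):

```r
scratchValue <- 42
existedBefore <- exists("scratchValue")   # TRUE: the variable is defined

rm(scratchValue)
existsAfter <- exists("scratchValue")     # FALSE: it has been removed

existedBefore
existsAfter
```

Calling ls() lists everything that is currently defined in the workspace.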
Recall that we can create a new variable using an assignment operator and specifying a name that R isn’t currently using as a variable name
myNewVariable <- 42
myNewVariable
[1] 42
We use a similar trick to define new columns in the data frame. The value you assign should contain one element per row of the data frame (a single value is recycled to fill every row).
patients$ID
NULL
patients$ID <- paste("Patient", 1:10)
patients
First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> | ID <chr> |
---|---|---|---|---|---|---|---|
Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE | Patient 1 |
Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE | Patient 2 |
John | Evans | John Evans | Male | 35 | 75.3 | FALSE | Patient 3 |
Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE | Patient 4 |
Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE | Patient 5 |
Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE | Patient 6 |
Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE | Patient 7 |
Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE | Patient 8 |
David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE | Patient 9 |
Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE | Patient 10 |
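New columns are often computed from existing ones. The sketch below works on a small made-up data frame; the column names Age_Months and Study are hypothetical and only there to show the two cases (one value per row, and a single recycled value).

```r
demo <- data.frame(Age = c(50, 21, 35))

# One value per row, calculated from an existing column.
demo$Age_Months <- demo$Age * 12

# A single value is recycled so that every row gets the same entry.
demo$Study <- "Pilot"

demo
```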
Data frames are subset using the form object[rows, columns]
patients[2,1]
[1] "Eve"
patients[1,2]
[1] "Jones"
patients[1,1:3]
| | First_Name <chr> | Second_Name <chr> | Full_Name <chr> |
---|---|---|---|
1 | Adam | Jones | Adam Jones |
patients[1,]
| | First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> | ID <chr> |
---|---|---|---|---|---|---|---|---|
1 | Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE | Patient 1 |
Rows or columns can be omitted by putting a - in front of the index:
patients[,-1]
Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> | ID <chr> |
---|---|---|---|---|---|---|
Jones | Adam Jones | Male | 50 | 70.8 | TRUE | Patient 1 |
Parker | Eve Parker | Female | 21 | 67.9 | TRUE | Patient 2 |
Evans | John Evans | Male | 35 | 75.3 | FALSE | Patient 3 |
Davis | Mary Davis | Female | 45 | 61.9 | TRUE | Patient 4 |
Baker | Peter Baker | Male | 28 | 72.4 | FALSE | Patient 5 |
Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE | Patient 6 |
Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE | Patient 7 |
Smith | Matthew Smith | Male | 33 | 71.5 | TRUE | Patient 8 |
Roberts | David Roberts | Male | 57 | 73.2 | FALSE | Patient 9 |
Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE | Patient 10 |
patients[-c(5,7),]
| | First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> | ID <chr> |
---|---|---|---|---|---|---|---|---|
1 | Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE | Patient 1 |
2 | Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE | Patient 2 |
3 | John | Evans | John Evans | Male | 35 | 75.3 | FALSE | Patient 3 |
4 | Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE | Patient 4 |
6 | Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE | Patient 6 |
8 | Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE | Patient 8 |
9 | David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE | Patient 9 |
10 | Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE | Patient 10 |
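Columns can also be selected by name rather than by position, which tends to be easier to read. One point worth knowing is that selecting a single column this way returns a plain vector unless drop = FALSE is given. The example below is a sketch on a small made-up data frame called demo.

```r
demo <- data.frame(First_Name = c("Adam", "Eve", "John"),
                   Age = c(50, 21, 35),
                   Weight = c(70.8, 67.9, 75.3),
                   stringsAsFactors = FALSE)

twoColumns <- demo[, c("First_Name", "Age")]   # still a data frame
ageVector  <- demo[, "Age"]                    # a numeric vector
ageFrame   <- demo[, "Age", drop = FALSE]      # a one-column data frame

twoColumns
ageVector
ageFrame
```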
letters is a built-in vector containing all the lowercase letters of the English alphabet
letters
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[21] "u" "v" "w" "x" "y" "z"
s <- letters[1:5]
s
[1] "a" "b" "c" "d" "e"
So far we have seen how to extract values by position, for example the first and third values in the vector
s[c(1,3)]
[1] "a" "c"
R can perform the same operation using a vector of logical values. Only indices with a TRUE value will get returned
s[c(TRUE, FALSE, TRUE, FALSE, FALSE)]
[1] "a" "c"
A logical test can be used to generate the TRUE and FALSE values to subset the vector
a <- 1:5
a < 3
[1] TRUE TRUE FALSE FALSE FALSE
s[a < 3]
[1] "a" "b"
Comparison operators: <, >, <=, >=, ==, !=
Logical operators: !, &, |, xor (these combine logical values, TRUE and FALSE)
s[a > 1 & a < 3]
[1] "b"
s[a == 2]
[1] "b"
The vector that you use to perform the logical test could be extracted from a data frame
patients$First_Name == "Peter"
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
patients[patients$First_Name == "Peter",]
| | First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> | ID <chr> |
---|---|---|---|---|---|---|---|---|
5 | Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE | Patient 5 |
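Tests on different columns can be combined with & and | to pick out rows that satisfy several conditions at once. The following sketch uses a small made-up data frame called demo; olderWomen is a name chosen for illustration.

```r
demo <- data.frame(First_Name = c("Adam", "Eve", "Mary", "Sally"),
                   Sex = c("Male", "Female", "Female", "Female"),
                   Age = c(50, 21, 45, 62),
                   stringsAsFactors = FALSE)

# Keep rows where both conditions are TRUE; the empty column index keeps all columns.
olderWomen <- demo[demo$Sex == "Female" & demo$Age > 40, ]
olderWomen

# which() converts a logical vector into the positions that are TRUE.
over40 <- which(demo$Age > 40)
over40
```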
| | First_Name | Second_Name |
---|---|---|
1 | Adam | Jones |
2 | Eve | Parker |
HINT: you can use the seq function that we saw earlier to define a vector of even numbers
| | First_Name | Second_Name | Full_Name | Sex | Age | Weight | Consent |
---|---|---|---|---|---|---|---|
2 | Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
4 | Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
6 | Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
8 | Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
10 | Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE |
HINT: the nrow function will give the number of rows in the data frame
| | First_Name | Second_Name | Full_Name | Sex | Age | Weight | Consent |
---|---|---|---|---|---|---|---|
1 | Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE |
2 | Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
3 | John | Evans | John Evans | Male | 35 | 75.3 | FALSE |
4 | Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
5 | Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE |
6 | Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
7 | Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE |
8 | Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
9 | David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE |
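As a reminder of how the two functions mentioned in the hints behave, here is a short sketch on a made-up data frame called demo (deliberately not the patients data, so the exercise is left for you to solve).

```r
demo <- data.frame(id = 1:6,
                   value = c(10, 20, 30, 40, 50, 60))

rowCount <- nrow(demo)                  # number of rows in the data frame
evenRows <- seq(2, rowCount, by = 2)    # 2, 4, 6

evenOnly <- demo[evenRows, ]            # rows at even positions
allButLast <- demo[-rowCount, ]         # a negative index drops that row

evenOnly
allButLast
```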
age <- c(50, 21, 35, 45, 28, 31, 42, 33, 57, 62)
weight <- c(70.8, 67.9, 75.3, 61.9, 72.4, 69.9,
63.5, 71.5, 73.2, 64.8)
firstName <- c("Adam", "Eve", "John", "Mary",
"Peter", "Paul", "Joanna", "Matthew",
"David", "Sally")
secondName <- c("Jones", "Parker", "Evans", "Davis",
"Baker","Daniels", "Edwards", "Smith",
"Roberts", "Wilson")
consent <- c(TRUE, TRUE, FALSE, TRUE, FALSE,
FALSE, FALSE, TRUE, FALSE, TRUE)
sex <- c("Male", "Female", "Male", "Female", "Male",
"Male", "Female", "Male", "Male", "Female")
patients <- data.frame(First_Name = firstName,
Second_Name = secondName,
Full_Name = paste(firstName,
secondName),
Sex = factor(sex),
Age = age,
Weight = weight,
Consent = consent,
stringsAsFactors = FALSE)
rm(list = c("firstName","secondName","sex","weight","consent"))
patients
First_Name <chr> | Second_Name <chr> | Full_Name <chr> | Sex <fctr> | Age <dbl> | Weight <dbl> | Consent <lgl> |
---|---|---|---|---|---|---|
Adam | Jones | Adam Jones | Male | 50 | 70.8 | TRUE |
Eve | Parker | Eve Parker | Female | 21 | 67.9 | TRUE |
John | Evans | John Evans | Male | 35 | 75.3 | FALSE |
Mary | Davis | Mary Davis | Female | 45 | 61.9 | TRUE |
Peter | Baker | Peter Baker | Male | 28 | 72.4 | FALSE |
Paul | Daniels | Paul Daniels | Male | 31 | 69.9 | FALSE |
Joanna | Edwards | Joanna Edwards | Female | 42 | 63.5 | FALSE |
Matthew | Smith | Matthew Smith | Male | 33 | 71.5 | TRUE |
David | Roberts | David Roberts | Male | 57 | 73.2 | FALSE |
Sally | Wilson | Sally Wilson | Female | 62 | 64.8 | TRUE |
### Your Answer ###
e <- matrix(1:10, nrow=5, ncol=2)
e
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
rowMeans(e)
[1] 3.5 4.5 5.5 6.5 7.5
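rowMeans has several close relatives that summarise a matrix by row or by column. The sketch below applies a few of them to the same matrix e; apply() is the general-purpose version, where the second argument is 1 for rows and 2 for columns.

```r
e <- matrix(1:10, nrow = 5, ncol = 2)

colMeans(e)         # mean of each column
rowSums(e)          # sum of each row
apply(e, 1, max)    # largest value in each row
```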
Matrices (and indeed data frames) can be joined together using the functions cbind and rbind: cbind joins them side by side as extra columns, and rbind stacks them as extra rows.
Let’s first create some example data
mat1 <- matrix(11:20, nrow=5,ncol=2)
mat1
[,1] [,2]
[1,] 11 16
[2,] 12 17
[3,] 13 18
[4,] 14 19
[5,] 15 20
mat2 <- matrix(21:30, nrow=5, ncol=2)
mat2
[,1] [,2]
[1,] 21 26
[2,] 22 27
[3,] 23 28
[4,] 24 29
[5,] 25 30
mat3 <- matrix(31:40,nrow=5,ncol=2)
mat3
[,1] [,2]
[1,] 31 36
[2,] 32 37
[3,] 33 38
[4,] 34 39
[5,] 35 40
Now try out these functions:
cbind(mat1,mat2)
[,1] [,2] [,3] [,4]
[1,] 11 16 21 26
[2,] 12 17 22 27
[3,] 13 18 23 28
[4,] 14 19 24 29
[5,] 15 20 25 30
rbind(mat1,mat3)
[,1] [,2]
[1,] 11 16
[2,] 12 17
[3,] 13 18
[4,] 14 19
[5,] 15 20
[6,] 31 36
[7,] 32 37
[8,] 33 38
[9,] 34 39
[10,] 35 40
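For the join to work, the pieces need compatible shapes: the same number of rows for cbind and the same number of columns for rbind. Checking dim() on the result is a quick way to confirm what happened. The sketch below repeats the matrices from above and adds two small made-up data frames, df1 and df2, to show that rbind works on data frames when the column names match.

```r
mat1 <- matrix(11:20, nrow = 5, ncol = 2)
mat2 <- matrix(21:30, nrow = 5, ncol = 2)
mat3 <- matrix(31:40, nrow = 5, ncol = 2)

dim(cbind(mat1, mat2))   # 5 rows, 4 columns
dim(rbind(mat1, mat3))   # 10 rows, 2 columns

# Hypothetical data frames with identical column names.
df1 <- data.frame(id = 1:2, score = c(3.5, 4.0))
df2 <- data.frame(id = 3:4, score = c(2.5, 5.0))
combined <- rbind(df1, df2)
combined
```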