12 Exercise Solutions by Module
12.1 Module 1
12.1.1 General exercises
Exercise: Which of the following will give a different output from the other 3? Hint: Look at the help file for the log() function.
1. log(x=6, base=4)
2. log(4, 6)
3. log(base=4, x=6)
4. log(6, 4)
Solution: Looking at ?log, the answer is 2. log(4, 6)
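The difference comes from how R matches arguments: named arguments are matched first, and unnamed ones then fill the remaining parameters in order (x, then base). A minimal sketch comparing the four calls; the names results and same_as_first are chosen here for illustration only:

```r
# The four calls from the exercise, stored in a named vector
results <- c(
  a = log(x = 6, base = 4),
  b = log(4, 6),             # positional: x = 4, base = 6
  c = log(base = 4, x = 6),  # named arguments can be given in any order
  d = log(6, 4)              # positional: x = 6, base = 4
)
print(results)

# Compare every result with the first call
same_as_first <- sapply(results, function(r) isTRUE(all.equal(r, results[["a"]])))
print(same_as_first)
```

Only the second call swaps the roles of x and base, which is why it is the odd one out.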
Exercise: Create a vector that goes from 5 to 25 by increments of 1.
Solution: 5:25 (or some variation)
Exercise: Create another vector that goes from 5 to 25 by increments of 1, but using a different method. Call this vector V1.
Solution: V1 <- seq(5,25,by=1) (or some variation)
Exercise: Extract the 5th element of V1.
Solution: V1[5]
Exercise: Remove the 7th element from the vector V1 and call this new vector V2.
Solution: V2 <- V1[-7]
Exercise: How many elements in your vector, V1, are greater than 11?
Solution: sum(V1>11) = 14
Exercise: Create a matrix that contains the sequence of numbers from 1 to 16, going up in increments of 1. Let the matrix have 4 rows and have the matrix elements fill by row. Call this matrix M1.
Solution: M1 <- matrix(data=1:16, nrow=4, byrow=T)
Exercise: Extract the 3rd and 4th rows and the 1st and 2nd columns of your matrix, M1. Call this matrix M2.
Solution: M2 <- M1[c(3,4), c(1,2)]
Exercise: Find the row sums of the matrix M2.
Solution: apply(M2, 1, sum) (or rowSums(M2))
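As a quick check that both approaches agree, the sketch below rebuilds M1 and M2 and computes the row sums both ways. The names sums_apply and sums_rowsums are introduced here for illustration.

```r
# Rebuild M1 and M2 as in the exercises above
M1 <- matrix(data = 1:16, nrow = 4, byrow = TRUE)
M2 <- M1[c(3, 4), c(1, 2)]

# apply() with MARGIN = 1 calls sum() on each row
sums_apply <- apply(M2, 1, sum)

# rowSums() is the dedicated equivalent
sums_rowsums <- rowSums(M2)

print(M2)
print(sums_apply)
print(sums_rowsums)
```

Because the matrix is filled by row, M2 holds 9 and 10 in its first row and 13 and 14 in its second, so the row sums are 19 and 27.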
12.1.2 End of module exercises
Exercise 1: Use R to find the value of the square root of 25. (\(\sqrt{25}\))
Hint: Look at the help file for the function sqrt().
Solution 1: sqrt(25)
Exercise 2: Use R to find the value of the exponential of \(6\times 14 -3\). (\(\exp(6\times 14-3)\))
Solution 2: exp(6*14-3)
Exercise 3: Use R to find the value of the absolute value of \(7\times 3 - 4\times 9 - 30\). (\(|7 \times 3 - 4 \times 9 - 30|\))
Solution 3: abs(7*3-4*9-30)
Exercise 4: Use R to find the value of \(\frac{73-42}{3}+2\times\left(\frac{36}{4}-17\right)\). Give your answer to 2 decimal places.
Solution 4: round((73-42)/3+(2*((36/4)-17)), 2) = -5.67
Exercise 5: How many elements does the following vector have?
seq(from = 0, to = 100, by = 3.14)
Solution 5: length(seq(0,100,by=3.14))
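To see why the length comes out as it does, the sketch below compares length() with a direct count: the sequence starts at 0 and keeps adding 3.14 while the value stays at or below 100. The variable names are illustrative only.

```r
# Generate the sequence and count its elements directly
v <- seq(from = 0, to = 100, by = 3.14)
n_direct <- length(v)

# Number of whole steps of 3.14 that fit into 100, plus the starting value 0
n_formula <- floor((100 - 0) / 3.14) + 1

print(n_direct)
print(n_formula)
print(max(v))  # last element, the largest multiple of 3.14 not exceeding 100
```

Both counts give 32, since 31 steps of 3.14 reach 97.34 and a 32nd step would overshoot 100.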
Exercise 6: Look at the help file for the function rnorm() and use this function to generate 100 random numbers with mean 10 and standard deviation 5.
Solution 6: rnorm(n = 100, mean = 10, sd = 5)
12.2 Module 2
12.2.1 General exercises
Exercise: What are the defaults for the header argument in the functions read.table() and read.csv()?
1. header = T in read.table() and header = T in read.csv()
2. header = T in read.table() and header = F in read.csv()
3. header = F in read.table() and header = T in read.csv()
4. header = F in read.table() and header = F in read.csv()
Solution: 3. header = F in read.table() and header = T in read.csv()
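Besides reading the help files, the defaults can be inspected programmatically. A minimal sketch using formals(), which returns a function's argument list together with any default values; the variable names are illustrative:

```r
# Look up the default value of the header argument in each function
default_table <- formals(read.table)$header
default_csv <- formals(read.csv)$header

print(default_table)
print(default_csv)
```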
12.2.2 End of module exercises
Exercise 1: How many cars have a miles per gallon (mpg) of less than 15 in the mtcars data?
Solution 1:
library(datasets)
cars <- mtcars
sum(cars$mpg<15)
= 5
Exercise 2: How many cars have exactly 4 cylinders (cyl) in the mtcars data?
Solution 2: sum(cars$cyl==4)
= 11
Exercise 3: What is the mean value of horsepower (hp) to 2 decimal places in the mtcars data?
Solution 3: round(mean(cars$hp), 2)
= 146.69
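Exercise 4: What car has the lowest miles per gallon (mpg) value for cars with 8 cylinders (cyl) in the mtcars data?
Solution 4:
cars_8cyl <- cars[cars$cyl==8,]
cars_8cyl[which.min(cars_8cyl$mpg),]
= Cadillac Fleetwood (the Lincoln Continental ties at 10.4 mpg; which.min() returns the first match)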
Exercise 5: What is the median miles per gallon (mpg) value for cars with 8 cylinders (cyl) in the mtcars data?
Solution 5: median(cars$mpg[cars$cyl==8])
= 15.2
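Exercise 6: What car has the highest weight (wt) for each number of cylinders (cyl) in the mtcars data?
Solution 6:
unique(cars$cyl)
#subset first, so that which.max() indexes rows of the subset rather than of the full data
cars_4cyl <- cars[cars$cyl==4,]
cars_4cyl[which.max(cars_4cyl$wt),]
= Merc 240D
cars_6cyl <- cars[cars$cyl==6,]
cars_6cyl[which.max(cars_6cyl$wt),]
= Valiant
cars_8cyl <- cars[cars$cyl==8,]
cars_8cyl[which.max(cars_8cyl$wt),]
= Lincoln Continental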
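An alternative sketch that avoids repeating the subsetting for each cylinder count: split the data frame by cyl and take the heaviest row within each piece. The names by_cyl and heaviest are introduced here for illustration and are not part of the course material.

```r
cars <- mtcars

# One data frame per cylinder count (4, 6 and 8)
by_cyl <- split(cars, cars$cyl)

# Within each group, look up the row name of the car with the largest weight
heaviest <- sapply(by_cyl, function(d) rownames(d)[which.max(d$wt)])
print(heaviest)
```

Because which.max() is applied inside each group, the index it returns refers to rows of that group rather than to rows of the full data set.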
Exercise 7: Create a bar plot that shows the number of cars with each gear type (gear) in the mtcars data.
Solution 7:
gear_types <- table(cars$gear)
barplot(gear_types, names.arg=c("Type 3", "Type 4", "Type 5"),
xlab = "Gear types", ylab = "Total count",
main = "Distribution of gear types")
Exercise 8: Create a bar plot that shows the number of cars with each gear type (gear) with the distribution of cylinders (cyl) for each type in the mtcars data. Add a legend for the cylinders.
Solution 8:
cyl_gear_types <- table(cars$cyl, cars$gear)
colours <- c("#004C92", "#FF0000", "#FFC600")
barplot(cyl_gear_types, names.arg=c("Type 3", "Type 4", "Type 5"),
xlab = "Gear types", ylab = "Total count",
main = "Distribution of gear types vs cylinders",
col = colours)
legend("topright", rownames(cyl_gear_types), cex = 1.3, fill = colours)
Exercise 9: Create a scatter plot to show the relationship between weight (wt) and miles per gallon (mpg) in the mtcars data.
Solution 9:
#using base R
plot(x = cars$wt, y = cars$mpg,
     xlab = "Weight",
     ylab = "Miles per gallon",
     xlim = c(1,6),
     ylim = c(10,35),
     main = "Weight vs miles per gallon")
#alternatively, using ggplot2
library(ggplot2)
ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(col="#004C92")+
  labs(x="Weight", y="Miles per gallon")+
  ggtitle("Weight vs miles per gallon")
Exercise 10: Create a scatter plot to show the relationship between weight (wt) and miles per gallon (mpg) by cylinder (cyl) in the mtcars data.
Solution 10:
#using base R
plot(cars$wt, cars$mpg, col = cars$cyl,
     xlab="Weight", ylab="Miles per gallon",
     main="Weight vs miles per gallon by cylinder")
#add one fitted line per cylinder count (loop over the unique values only)
for (x in unique(cars$cyl)) {
  abline(lm(mpg ~ wt, cars, subset = (cyl == x)), col = x)
}
legend("topright", legend = sort(unique(cars$cyl)),
       col = c(4,6,8), pch = 1, bty = "n")
#alternatively, using ggplot2
library(ggplot2)
ggplot(cars, aes(x = wt, y = mpg, group=cyl, col=cyl)) +
  geom_point() +
  geom_smooth(method="lm", se=FALSE)+
  labs(x="Weight", y="Miles per gallon")+
  ggtitle("Weight vs miles per gallon by cylinder")
12.3 Module 3
12.3.1 General exercises
Exercise: How can you add multiple layers, such as polygons for states and points for health facilities, to a single map using tmap?
1. By using tm_shape() and tm_fill()
2. By using tm_shape() and tm_dots()
3. By using multiple tm_shape() functions with different elements
4. By using tm_shape() and tm_lines()
Solution: 3. By using multiple tm_shape() functions with different elements
12.3.2 End of module exercises
Exercise 1: What is the primary purpose of the .prj file in a shapefile set, and which package is commonly used to handle this file type in R?
1. Stores map projection information, handled by raster package
2. Stores attribute information, handled by sp package
3. Stores projection information, handled by sf package
4. Stores metadata information, handled by terra package
Solution 1: 3. Stores projection information, handled by sf package
Exercise 2: Create a map of Nigeria that combines the raster layer for building count and the vector layer for region.
Solution 2:
library(terra)
library(sf)
library(tmap)
#load data (file names must be quoted strings)
building_raster <- rast("nga_building_count.tif") #raster data
regions_vector <- st_read("nga_regions.shp") #vector data
#make sure both layers are in the same CRS; reproject the vector layer if not
if (st_crs(regions_vector)$epsg != crs(building_raster, describe=TRUE)$code) {
  regions_vector <- st_transform(regions_vector, crs(building_raster))
}
#plot tmap
tmap_mode("plot")
tm_shape(building_raster) +
tm_raster(style = "quantile", palette = "YlOrRd", title = "Building Count") +
tm_shape(regions_vector) +
tm_borders(lwd = 1, col = "black")
Exercise 3: Create a choropleth map that shows the variation in population density for Nigeria at region level. Export the results as a .png file.
Solution 3:
library(sf)
library(tmap)
library(dplyr)
#load data
nigeria_regions <- st_read("nga_regions.shp")
#calculate population density
nigeria_regions <- nigeria_regions %>%
mutate(pop_density = nga_pop / nga_area)
#create choropleth map
tmap_mode("plot")
choropleth_map <- tm_shape(nigeria_regions) +
tm_polygons("pop_density",
style = "quantile",
palette = "Blues") +
tm_layout(title = "Nigeria: Population Density by Region")
#save as PNG
tmap_save(choropleth_map, filename = "nigeria_population_density.png",
width = 2000, height = 1500, dpi = 300)
Exercise 4: How do you plot a raster dataset using tmap, ensuring it can handle large datasets efficiently?
1. tm_shape(raster_data) + tm_fill()
2. tm_shape(raster_data) + tm_raster()
3. tm_shape(raster_data) + tm_polygons()
4. tm_shape(raster_data) + tm_dots()
Solution 4: 2. tm_shape(raster_data) + tm_raster()
12.4 Module 4
12.4.1 General exercises
Exercise: Identify the dependent and independent variables in the following sentence:
‘A researcher investigates the effects of school density on the grades of its pupils.’
Solution: Dependent variable: the grades of the pupils, independent variable: school density
Exercise: Fit a Poisson GLM using the ccancer dataset exploring the relationship between the count of cancer deaths and the covariates for the cancer site and region in Canada. Is the relationship between the dependent and independent variables significant?
Solution:
ccancer_glm_exercise <- glm(Count ~ Site + Region, family = "poisson", data = ccancer)
summary(ccancer_glm_exercise)
Yes, the relationships are significant.
Exercise: What is the expected birth weight for a baby boy at a gestational age of 39.5 weeks?
Solution: \[Weight = -1447.24 -163.04\times 0+120.89\times 39.5 = 3327.92g\]
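The arithmetic can be checked in R. In the sketch below the coefficients are copied from the equation above; the variable names are hypothetical, and the sex indicator is set to 0 for a boy, as in the solution.

```r
# Coefficients as they appear in the fitted equation above
intercept <- -1447.24
coef_sex <- -163.04  # multiplied by 0 for a boy in the solution above
coef_age <- 120.89   # grams per week of gestational age

sex <- 0
gestational_age <- 39.5

# Plug the values into the linear predictor
expected_weight <- intercept + coef_sex * sex + coef_age * gestational_age
print(expected_weight)
```

In practice the same value would normally be obtained by calling predict() on the fitted model object, which avoids copying rounded coefficients by hand.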
12.5 Module 5
12.5.1 General exercises
Exercise: Following on from the example above, what is the probability that an individual who is vaccinated but has a high fever is infected with disease A?
Solution: 0.19
Exercise: Following on from the example above, what is the conditional probability of drawing a purple button given that the button drawn is not blue?
Solution:
\[ \begin{aligned} P(A|B)&=\frac{P(A\cap B)}{P(B)}\\ &= \frac{\text{number of purple buttons}/\text{total number of buttons}}{\text{number of non-blue buttons}/\text{total number of buttons}}\\ &=\frac{4/10}{6/10}\\ &=\frac{2}{3}\approx 0.67. \end{aligned} \]
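The same conditional probability can be reproduced by counting. The sketch below assumes the composition implied by the fractions in the solution: 10 buttons, of which 4 are purple and 6 are not blue. The label "other" for the two remaining non-blue buttons is an assumption made for illustration.

```r
# Bag of buttons implied by the fractions above; "other" is a placeholder colour
buttons <- c(rep("purple", 4), rep("blue", 4), rep("other", 2))

# Event B: the button drawn is not blue
not_blue <- buttons != "blue"
p_not_blue <- mean(not_blue)

# Event A and B: the button is purple (every purple button is also non-blue)
p_purple_and_not_blue <- mean(buttons == "purple" & not_blue)

# Conditional probability P(A | B) = P(A and B) / P(B)
p_purple_given_not_blue <- p_purple_and_not_blue / p_not_blue
print(p_purple_given_not_blue)
```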
Exercise: Following on from the example above, what would the probability be of selecting 2 men from the island without replacement?
Solution: \[ \begin{aligned} P(M_1 \cap M_2)&=P(M_1)P(M_2|M_1)\\ &= \frac{\text{number of men}}{\text{total number of people}}\times\frac{\text{number of men left}}{\text{total number of people left}}\\ &=\frac{4}{10}\times\frac{3}{9}\\ &=\frac{2}{15}. \end{aligned} \]
12.6 Module 6
12.6.1 General exercises
Exercise: Following on from the example above, what is the probability that someone who is vaccinated is from region C?
Solution: \[P(C|Vaccinated) = \frac{P(C)P(Vaccinated|C)}{P(Vaccinated)} = \frac{0.3 \times 0.74}{0.728} = 0.305\]
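The Bayes' rule calculation can be reproduced in a few lines of R. The three probabilities are taken from the solution above; the variable names are illustrative only.

```r
# Inputs taken from the solution above
p_region_c <- 0.3       # P(C): prior probability of being from region C
p_vacc_given_c <- 0.74  # P(Vaccinated | C)
p_vacc <- 0.728         # P(Vaccinated): overall vaccination probability

# Bayes' rule: P(C | Vaccinated) = P(C) * P(Vaccinated | C) / P(Vaccinated)
posterior_c <- p_region_c * p_vacc_given_c / p_vacc
print(round(posterior_c, 3))
```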
Exercise: Create a histogram of posterior values for hospital D.
Solution:
#histogram of posterior values for hospital D
ggplot(data = predicted_values_df , aes(x = D)) +
geom_histogram(bins = 20, color = "#00b4d8", alpha = 1, fill = "#1d3557") +
theme_bw() +
labs(title = "Posterior Predicted values for D",
x = "Posteriors",
y = "Frequency")
Exercise: Create a density plot of the posterior values for hospital G.
Solution: