rm(list = ls()) # clean-up workspace
library("tidyverse")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Q1. What’s gone wrong with this code?
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, colour = "blue"))
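As a point of comparison for Q1, here is a minimal sketch of one way to get blue points, assuming that was the intent: the constant is passed to geom_point() directly instead of inside aes(). The name p_fixed is hypothetical.

```r
library(ggplot2)

# Inside aes(), "blue" is treated as data (a one-level categorical variable).
# Outside aes(), it is taken as a literal colour for every point.
p_fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), colour = "blue")

p_fixed
```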
Q2. Which variables in mpg are categorical? Which variables are continuous? (Hint: type ?mpg to read the documentation for the dataset.) How can you see this information when you run mpg?
Q3. Map a continuous variable to color, size, and shape. How do these aesthetics behave differently for categorical vs. continuous variables?
Q4. What happens if you map the same variable to multiple aesthetics?
Q5. What happens if you map an aesthetic to something other than a variable name, like aes(colour = displ < 5)? Note, you’ll also need to specify x and y.
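A minimal sketch of the kind of call Q5 describes, so you can run it and inspect the result yourself. The name p_logical is hypothetical; the expression displ < 5 is taken from the question.

```r
library(ggplot2)

# The expression inside aes() is evaluated row by row against mpg,
# so the colour aesthetic receives a logical vector instead of a column.
p_logical <- ggplot(mpg, aes(x = displ, y = hwy, colour = displ < 5)) +
  geom_point()

p_logical
```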
Q6. What geom would you use to draw a line chart? A boxplot? A histogram? An area chart?
Q7. Will these two graphs look different? Why/why not?
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
ggplot() +
geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_smooth(data = mpg, mapping = aes(x = displ, y = hwy))
In this part, we want to replicate the plots in the reference.
set.seed(42)
n <- 1000
x <- runif(n) * 3
y <- x * sin(1/x) + rnorm(n) / 25
df <- tibble(x = x, y = y)
p1 <- ggplot(df, aes(x, y)) +
geom_point(alpha = 0.3) +
geom_smooth(se = FALSE) +
theme_bw()
p1
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
Let’s take a closer look at the function \(f(x) = x \sin(\frac{1}{x})\).
Create a new variable \(z\) to store the \(f(x)\) values at each \(x\) point.
df.with.z <- df %>%
add_column(z = x * sin(1 / x))
Now plot another figure with two layers:
Scatter plot of the points (x, y)
A line through the points (x, z)
Put your code inside the code block, and explain your findings outside the code block (so that they render as text in the HTML file).
You may want to set eval = TRUE, or remove the option, for the code to be executed.
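One possible sketch of the two-layer figure, not the only valid answer. It rebuilds the data with the same seed as above so that it runs on its own; the red line colour is an arbitrary choice.

```r
library(ggplot2)
library(tibble)

# Recreate the simulated data from earlier in this document.
set.seed(42)
n <- 1000
x <- runif(n) * 3
y <- x * sin(1 / x) + rnorm(n) / 25
df.with.z <- tibble(x = x, y = y, z = x * sin(1 / x))

# Layer 1: noisy observations (x, y). Layer 2: the noise-free curve (x, z).
p2 <- ggplot(df.with.z, aes(x = x)) +
  geom_point(aes(y = y), alpha = 0.3) +
  geom_line(aes(y = z), colour = "red") +
  theme_bw()
```

Because geom_line() connects the points in order of x, the curve is drawn correctly even though x was sampled at random.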
p2
Now we see that the default smoother does not capture the behaviour of the points near \(x = 0\).
Let’s zoom in (with clipping) on the region \(x \in [0, 0.5]\) and make another plot.
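A minimal sketch of the zoomed plot, assuming "with clipping" means setting scale limits with xlim(): points outside the range are dropped (ggplot2 warns about the removed rows) and the smoother is refit on what remains. By contrast, coord_cartesian(xlim = ...) would zoom without dropping any data. The data are rebuilt here so that the block runs on its own.

```r
library(ggplot2)
library(tibble)

# Recreate the simulated data and p1 from earlier in this document.
set.seed(42)
n <- 1000
x <- runif(n) * 3
y <- x * sin(1 / x) + rnorm(n) / 25
df <- tibble(x = x, y = y)

p1 <- ggplot(df, aes(x, y)) +
  geom_point(alpha = 0.3) +
  geom_smooth(se = FALSE) +
  theme_bw()

# xlim() sets the x scale limits, so out-of-range points are removed
# before the smoother is fitted.
p3 <- p1 + xlim(0, 0.5)
```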
p3
Now embed p3 inside p1.
Please add appropriate titles and subtitles to them.
p1 + annotation_custom(ggplotGrob(p3), xmin = 1, xmax = 3, ymin = -0.3, ymax = 0.6)
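Titles and subtitles can be added with labs(). The sketch below shows the pattern on a rebuilt p1; the title and subtitle strings are placeholders, and the same call can be applied to p3 before it is converted with ggplotGrob().

```r
library(ggplot2)
library(tibble)

# Recreate the simulated data and p1 from earlier in this document.
set.seed(42)
n <- 1000
x <- runif(n) * 3
y <- x * sin(1 / x) + rnorm(n) / 25
df <- tibble(x = x, y = y)

p1 <- ggplot(df, aes(x, y)) +
  geom_point(alpha = 0.3) +
  geom_smooth(se = FALSE) +
  theme_bw()

# Placeholder wording: replace with titles that describe your own findings.
p1_titled <- p1 +
  labs(
    title = "Noisy samples of x * sin(1 / x)",
    subtitle = "Default smoother over the full range"
  )
```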
Now please follow the second half of the reference, but instead create a UK map similar to this one and save it as uk_map.pdf.
Hint: it’s time to practice your self-learning (google?) skills.
Figure: Target UK map. A bad map your instructor created.
# Your code for the UK map
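A rough starting point, not the reference solution: it assumes the maps package is installed (map_data() relies on it) and uses the low-resolution "world" outlines, so it will not match the target map in detail. The object names, colours and page size are arbitrary choices.

```r
library(ggplot2)

# Assumption: the `maps` package is installed; map_data() needs it to
# turn the "world" database into a data frame of polygon outlines.
uk <- map_data("world", region = "UK")

uk_map <- ggplot(uk, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey85", colour = "grey30") +
  coord_quickmap() +  # approximate aspect-ratio correction for small areas
  labs(title = "United Kingdom", x = "Longitude", y = "Latitude") +
  theme_bw()

# Write the map to a PDF in the working directory.
ggsave("uk_map.pdf", plot = uk_map, width = 6, height = 8)
```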