These exercise cover the scales, statistics and themes of ggplot2 for Plotting in R.

Exercise 2

In these exercises we look at adjusting the scales and themes of our plots.

Scales

  1. Using the patient dataset from earlier, generate a scatter plot of BMI versus Weight
library(ggplot2)

patients_clean <- read.delim("./data/patients_clean_ggplot2.txt",sep="\t")

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight))+geom_point()
plot

  1. With the plot above, from exercise 1, adjust the BMI axis to show only labels for 20, 30, 40 and the weight axis to show breaks for 60 to 100 in steps of 5 as well as to specify units in y axis label.
plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight))+geom_point()
plot+scale_x_continuous(breaks=c(20,30,40),label=c(20,30,40),limits=c(20,40))+
  scale_y_continuous(breaks=seq(60,100,by=5),label=seq(60,100,by=5),
                     name="Weight (kilos)")

  1. Create a violin plot of BMI by Age where violins are filled using a sequential colour palette.
plot <- ggplot(data=patients_clean,
               mapping=aes(x=factor(Age),y=BMI))+geom_violin(aes(fill=factor(Age)))+
               scale_fill_brewer(palette="Blues", na.value="black")
  1. Create a scatterplot of BMI versus Weight and add a continuous colour scale for the height. Make the colour scale with a midpoint (set to mean point) colour of gray and extremes of blue (low) and yellow (high).
library(ggplot2)

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=Height))+geom_point()+
  scale_colour_gradient2(low="blue",high="yellow",mid="grey",midpoint=mean(patients_clean$Height))
plot

  1. Adjust the plot from exercise 4 using scales to remove values greater than 180.
library(ggplot2)

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=Height))+geom_point()+
  scale_colour_gradient2(low="blue",high="yellow",
                         mid="grey",midpoint=mean(patients_clean$Height),
                         limits=c(min(patients_clean$Height),180),
                         na.value=NA)
plot
## Warning: Removed 14 rows containing missing values or values outside the scale range
## (`geom_point()`).

  1. Adjust the scale legend from plot in exercise 4 to show only 75%, median and min values in scale legend.
library(ggplot2)

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=Height))+geom_point()+
  scale_colour_gradient2(low="blue",high="yellow",
                         mid="grey",midpoint=mean(patients_clean$Height),
                         breaks=c(min(patients_clean$Height),
                                  median(patients_clean$Height),
                                  quantile(patients_clean$Height)[4]),
                         labels=c(signif(min(patients_clean$Height),3),
                                  signif(median(patients_clean$Height),3),
                                  signif(quantile(patients_clean$Height)[4],3)))
plot

  1. With the plot from exercise 4, create another scatterplot with Count variable mapped to transparency/alpha and size illustrating whether a person is overweight. Is there a better combination of aesthetic mappings?
library(ggplot2)

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=Height,alpha=Count,size=Overweight))+geom_point()
plot
## Warning: Using size for a discrete variable is not advised.

Statistics

  1. Recreate the scatterplot of BMI by height. Colour by Age group and add fitted lines (but no SE lines) for each Age group.
library(ggplot2)

plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=factor(Age)))+geom_point()+stat_smooth(method="lm",se=F)

plot
## `geom_smooth()` using formula = 'y ~ x'

Themes

  1. Remove the legend title from plot in exercise 7, change the background colours of legend panels to white and place legend at bottom of plot.
plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=factor(Age)))+geom_point()+stat_smooth(method="lm",se=F)

plot <- plot+theme(legend.title=element_blank(),legend.background=element_rect(fill="white"),legend.key=element_rect(fill="white"),legend.position="bottom")

plot
## `geom_smooth()` using formula = 'y ~ x'

  1. Add a title to the plot, remove minor grid lines and save to 7 by 7 inch plot on disk.
plot <- ggplot(data=patients_clean,
               mapping=aes(x=BMI,y=Weight,colour=factor(Age)))+geom_point()+stat_smooth(method="lm",se=F)

plot <- plot+theme(legend.title=element_blank(),legend.background=element_rect(fill="white"),legend.key=element_rect(fill="white"),legend.position="bottom")

plot <- plot+ggtitle("BMI vs Weight")+theme(panel.grid.minor=element_blank())

plot
## `geom_smooth()` using formula = 'y ~ x'

ggsave(plot,file="BMIvsWeight.png",units = "in",height = 7,width = 7)
## `geom_smooth()` using formula = 'y ~ x'
  1. Produce a Height vs Weight scatter plot with point sizes scaled by BMI. Present only the points and title of plot with all other graph features missing.