Additional Box Plot Customization Options - ggplot Tutorial 8

In a previous video, we learned how to create a basic box plot, but in today's video we are going to go over a ton of cool customization options. By the end of this video you will know how to change the color of your box plots, how to adjust outliers, and how to add transparency. Ready?? Let's get to it!

Getting Started

For the purposes of this post I'm going to assume that you've already created a basic box plot. If you haven't already done so, you can create a sample box plot using the code below.

library(tidyverse)

mtcars$cyl <- as.factor(mtcars$cyl) # cyl needs to be a factor so each cylinder count gets its own box

ggplot(data = mtcars, aes(x = mpg, y = cyl)) +
  geom_boxplot()

Once you run your code, you should get a graph that looks something like this.

[Image: basic box plot of mpg for each cylinder group]

To be fair, this isn't a completely terrible box plot, but we can add in a lot of additional customization to make it truly stand out.
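
If you would rather leave mtcars unmodified, one option is to do the factor conversion inside aes() instead. This is just a sketch of an alternative to the as.factor() line above, not the only way to do it. It loads only ggplot2 rather than the whole tidyverse, and the name p is purely illustrative.

```r
library(ggplot2)

# Convert cyl to a factor inside aes() so the mtcars data frame itself is left unchanged
p <- ggplot(data = mtcars, aes(x = mpg, y = factor(cyl))) +
  geom_boxplot() +
  labs(y = "cyl")  # otherwise the axis title would read "factor(cyl)"

p
```

Either approach should give you one box per cylinder group.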

Customization Options

The first thing to mention is that there are a lot of different customization options. You can find a full list in the official documentation, but I'm going to show you the ones that I think are the most important to understand. The code below may look like a lot, so the table after it gives a brief description of what each argument does.

ggplot(data = mtcars, aes(x = mpg, y = cyl)) +
  geom_boxplot(
    color = "hotpink",          # outline color
    fill = "blue",              # fill color
    outlier.shape = 4,          # 1 = open circle, 17 = solid triangle, 4 = x
    outlier.size = 5,           # bigger outlier points
    outlier.color = "red",      # red outliers
    linetype = "solid",         # "solid", "dashed", "dotted"
    linewidth = 1,              # line thickness
    alpha = .7,                 # transparency: 0 (transparent) to 1 (opaque)
    outliers = TRUE,            # FALSE removes outliers (ggplot2 3.5.0+)
    show.legend = TRUE,         # FALSE hides the legend
    staplewidth = 0.5           # end caps on the whiskers (ggplot2 3.5.0+)
  ) +
  labs(
    title = "MPG vs CYL",
    x = "Miles Per Gallon",
    y = "Cylinders"
  )

[Image: customized box plot with hot pink outlines, blue fill, and red x-shaped outliers]

Argument        Description
color           changes the outline color
fill            changes the inside color
outlier.shape   takes a numeric code that is assigned to a particular shape
outlier.size    takes a numeric value, bigger # = bigger point
outlier.color   changes the outlier color
linetype        options include "solid", "dashed", "dotted"
linewidth       bigger # makes the lines thicker
alpha           ranges from 0 - 1. Bigger number = more opaque. Smaller = more transparent
outliers        default is TRUE, FALSE hides outliers
show.legend     default is TRUE, FALSE hides the legend
staplewidth     width of the end caps on the whiskers, relative to the box width. Larger # = wider caps
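
One thing that's easy to miss: outliers and staplewidth are newer arguments (they were added in ggplot2 3.5.0), and there are two different ways to get rid of outlier points. According to the geom_boxplot documentation, outliers = FALSE discards the outliers so the axis limits adapt to the box and whiskers, while outlier.shape = NA only hides the points and keeps the full data range on the axis. Here is a minimal sketch of both. The object names (mt, base_plot, p_discard, p_hide) are made up for this example.

```r
library(ggplot2)

# outliers and staplewidth need ggplot2 3.5.0 or newer
if (packageVersion("ggplot2") < "3.5.0") {
  message("Update ggplot2 to use the outliers and staplewidth arguments")
}

# Work on a copy so mtcars itself is left untouched
mt <- mtcars
mt$cyl <- as.factor(mt$cyl)

base_plot <- ggplot(data = mt, aes(x = mpg, y = cyl))

# Option 1: discard outliers, so the axis limits follow the boxes and whiskers
p_discard <- base_plot + geom_boxplot(outliers = FALSE)

# Option 2: hide the outlier points but keep the full data range on the axis
p_hide <- base_plot + geom_boxplot(outlier.shape = NA)

p_discard
p_hide
```

If you run both, compare the mpg axis on the two plots to see which behavior you prefer.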

Notes

Most of the arguments are straightforward, but there are a couple of notes to keep in mind. The first is that any color we set directly inside geom_boxplot() overrides whatever is mapped inside aes(). For example, the code below maps fill to cylinder in the aesthetics (which would normally give each box a different color), but the fill = "green" argument overrides this and all of the boxes are green.

ggplot(data = mtcars, aes(x = mpg, y = cyl, fill = cyl)) +
  geom_boxplot(fill = "green", show.legend = TRUE)

[Image: box plots all filled green, with no legend]

Also note that no legend is displayed when all of the boxes are the same color. This applies even if we say show.legend = TRUE.
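
To see the difference for yourself, here is a small sketch that builds the plot both ways and uses layer_data() to peek at the fill colors ggplot ends up using. The names mt, p_mapped, and p_const are made up for this example.

```r
library(ggplot2)

# Work on a copy so mtcars itself is left untouched
mt <- mtcars
mt$cyl <- as.factor(mt$cyl)

# fill mapped inside aes(): one color per cylinder group, and a legend is drawn
p_mapped <- ggplot(data = mt, aes(x = mpg, y = cyl, fill = cyl)) +
  geom_boxplot()

# constant fill inside geom_boxplot(): overrides the mapping, so no legend
p_const <- ggplot(data = mt, aes(x = mpg, y = cyl, fill = cyl)) +
  geom_boxplot(fill = "green")

unique(layer_data(p_mapped)$fill)  # a different color for each cylinder group
unique(layer_data(p_const)$fill)   # a single color for every box
```

So if you want a different color per group plus a legend, leave the fill out of geom_boxplot() and let the mapping in aes() do the work.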

Summary

Obviously, the sky is the limit when it comes to customizing the plots that we create in the R programming language. That being said, I've tried to cover some of the most useful customization options in today's post. As always, thanks for reading, and I hope you enjoy the rest of your day!


