Additional Box Plot Customization Options - ggplot Tutorial 8
In a previous video, we learned how to create a basic box plot, but in today's video we are going to go over a ton of cool customization options. By the end of this video you will know how to change the color of your box plots, how to adjust outliers, and how to add transparency. Ready? Let's get to it!
Getting Started
For the purposes of this post I'm going to assume that you've already created a basic box plot. If you haven't already done so, you can create a sample box plot using the code below.
library(tidyverse)
mtcars$cyl <- as.factor(mtcars$cyl) # cyl needs to be a factor variable
ggplot(data = mtcars, aes(x = mpg, y = cyl)) +
  geom_boxplot()
Once you run your code, you should get a graph that looks something like this.

To be fair, this isn't a completely terrible box plot, but we can add in a lot of additional customization to make it truly stand out.
Customization Options
The first thing to mention is that there are a lot of different customization options. You can find a full list in the official documentation, but I'm going to show you the ones that I think are the most important to understand. I know this looks like a lot of code, so I'll break everything down one item at a time with a brief description of what each argument does.
ggplot(data = mtcars, aes(x = mpg, y = cyl)) +
  geom_boxplot(
    color = "hotpink",      # Outline color
    fill = "blue",          # Fill color
    outlier.shape = 4,      # 1 = open circle, 17 = solid triangle, 4 = x
    outlier.size = 5,       # Bigger outliers
    outlier.color = "red",  # Red outliers
    linetype = "solid",     # "solid", "dashed", "dotted"
    linewidth = 1,          # Line thickness
    alpha = .7,             # Transparency: 0 (transparent) -> 1 (opaque)
    outliers = TRUE,        # FALSE hides the outliers
    show.legend = TRUE,     # A legend only appears if an aesthetic is mapped
    staplewidth = 0.5       # Width of the caps at the whisker ends
  ) +
  labs(
    title = "MPG vs CYL",
    x = "Miles Per Gallon",
    y = "Cylinders"
  )
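One thing worth checking before you run the code above: as far as I know, the outliers and staplewidth arguments were added in ggplot2 3.5.0, and linewidth in 3.4.0, so an older install may not recognize them. Here is a minimal sketch of a version check. The variable names (installed, supports_new_args) are just ones I made up for illustration.

```r
library(ggplot2)

# Assumption: outliers and staplewidth need ggplot2 3.5.0 or newer
installed <- packageVersion("ggplot2")
supports_new_args <- installed >= "3.5.0"

if (!supports_new_args) {
  message(
    "ggplot2 ", as.character(installed),
    " may not recognize outliers or staplewidth; consider updating."
  )
}
```

If the check fails, running install.packages("ggplot2") should pull in a current version.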

| Argument | Description |
|---|---|
| color | changes the outline color |
| fill | changes the inside color |
| outlier.shape | takes a numeric code for the point shape (e.g. 1 = open circle, 17 = solid triangle, 4 = x) |
| outlier.size | takes a numeric value; bigger # = bigger point |
| outlier.color | changes the outlier color |
| linetype | options include "solid", "dashed", "dotted" |
| linewidth | bigger # makes the lines thicker |
| alpha | ranges from 0 to 1. Bigger number = more opaque. Smaller = more transparent |
| outliers | default is TRUE; FALSE hides outliers |
| show.legend | default is NA (legend shown only when an aesthetic is mapped); TRUE always includes the layer in the legend; FALSE hides it |
| staplewidth | width of the caps at the whisker ends, relative to the box width; default is 0 (no caps); larger # = wider caps |
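
The numeric codes for outlier.shape are hard to remember, so it can help to draw them all once and keep the chart handy. This is a rough sketch that plots codes 0 through 25. The data frame and its column names (shapes, code, x, y) are made up for illustration and are not part of ggplot2.

```r
library(ggplot2)

# One row per shape code; x and y just lay the points out in a grid
shapes <- data.frame(
  code = 0:25,
  x = 0:25 %% 6,
  y = -(0:25 %/% 6)
)

shape_chart <- ggplot(shapes, aes(x = x, y = y)) +
  geom_point(aes(shape = code), size = 5, fill = "red") + # fill only affects shapes 21-25
  geom_text(aes(label = code), nudge_y = -0.35) +         # print the code under each point
  scale_shape_identity() +                                # use the codes as-is
  theme_void()

shape_chart
```

Any code you like in the chart can be passed straight to outlier.shape.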
Notes
Most of the arguments are straightforward, but there are a couple of notes to keep in mind. The first thing to note is that any color we set inside geom_boxplot will override whatever is mapped inside the aesthetics. For example, the code below specifies that the fill should be mapped to cylinder (which would give a different color to each box), but the fill = "green" argument overrides this and all of the boxes are green.
ggplot(data = mtcars, aes(x = mpg, y = cyl, fill = cyl)) + geom_boxplot(fill = "green", show.legend = TRUE)

Also note that no legend is displayed when all of the boxes are the same color. This applies even if we set show.legend = TRUE.
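To see the override for yourself, here is a small sketch that builds the plot both ways and then asks ggplot2 which fill colors it actually used. The names cars, p_mapped, and p_fixed are my own, and I work on a copy of mtcars so the built-in data set stays untouched.

```r
library(ggplot2)

# cars is a local copy (hypothetical name) with cyl converted to a factor
cars <- mtcars
cars$cyl <- as.factor(cars$cyl)

# Fill mapped inside aes() only: one color per cylinder group
p_mapped <- ggplot(cars, aes(x = mpg, y = cyl, fill = cyl)) +
  geom_boxplot()

# Fill also set inside geom_boxplot(): the fixed value wins
p_fixed <- ggplot(cars, aes(x = mpg, y = cyl, fill = cyl)) +
  geom_boxplot(fill = "green")

# layer_data() returns the values ggplot2 computed for each box
unique(layer_data(p_mapped)$fill)
unique(layer_data(p_fixed)$fill)
```

The first call should list three different colors and the second should list only "green". Printing p_mapped should also show a legend, while p_fixed should not.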
Summary
Obviously, the sky is the limit when it comes to customizing the plots that we create in the R programming language. That being said, I've tried to cover some of the most useful customization options in today's post. As always, thanks for reading, and I hope you enjoy the rest of your day!