# How to Make Stunning Histograms in R: A Complete Guide with ggplot2

16 November 2021

Updated: September 1, 2022.

## R ggplot histogram

Be honest. How uninspiring are your data visualizations? Expert designers make graph design look effortless, but in reality, it couldn’t be further from the truth. Luckily, the R programming language provides countless ways to make your visualizations eye-catching. Today you’ll learn how to make R ggplot histograms and how to tweak them to their full potential.

This article will show you how to make stunning histograms with R’s `ggplot2` library. We’ll start with a brief introduction and theory behind histograms, just in case you’re rusty on the subject. You’ll then see how to create and tweak R ggplot histograms, taking them to new heights.

## What is a Histogram?

A histogram is a way to graphically represent the distribution of your data using bars of different heights. A single bar (bin) represents a range of values, and the height of the bar represents how many data points fall into the range. You can change the number of bins easily.

The easiest way to understand them is through visualization. The image below shows a histogram of 10,000 numbers drawn from a standard normal distribution (mean = 0, standard deviation = 1):

Image 1 – Histogram of a standard normal distribution
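
If you’d like to produce a similar chart yourself, here’s a minimal sketch. It isn’t the exact code behind Image 1: the seed, the bin count, and the colors are our assumptions, chosen only to keep the example reproducible and readable.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# 10,000 draws from a standard normal distribution (mean = 0, sd = 1)
normal_df <- data.frame(value = rnorm(10000, mean = 0, sd = 1))

# bins = 50 is an assumed value, not taken from the original image
p_normal <- ggplot(normal_df, aes(value)) +
  geom_histogram(bins = 50, color = "#000000", fill = "#0099F8")
p_normal
```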

Although at first glance the histogram doesn’t look like much, it actually tells you a lot. When data is distributed normally (bell curve), you can draw the following conclusions:

• About 68.27% of the data points are located between -1 and +1 standard deviations (34.13% in either direction).
• About 95.45% of the data points are located between -2 and +2 standard deviations (47.72% in either direction).
• About 99.73% of the data points are located between -3 and +3 standard deviations (49.87% in either direction).
• Anything outside the -3 and +3 standard deviation range is commonly treated as an outlier.
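
You don’t have to take these percentages on faith. As a quick check, the sketch below uses base R’s `pnorm()` (the normal cumulative distribution function) to compute the share of a normal distribution within 1, 2, and 3 standard deviations of the mean. The helper name `coverage` is ours, used only for illustration.

```r
# Share of a normal distribution lying within k standard deviations of the mean
coverage <- function(k) pnorm(k) - pnorm(-k)

within_sd <- sapply(1:3, coverage)

# Express the shares as percentages, rounded to two decimals
round(100 * within_sd, 2)
# → 68.27 95.45 99.73
```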

In reality, you’re rarely dealing with a perfectly normal distribution. It’s usually skewed in either direction or has multiple peaks. Keep this in mind when drawing conclusions from the shape of a histogram alone.

Let’s see how you can use R and ggplot to visualize histograms.

## Make Your First ggplot Histogram

We’ll use the `Gapminder` dataset throughout the article to visualize histograms. It’s a relatively small dataset showing life expectancy, population, and GDP per capita in countries between 1952 and 2007. We’ll use only a subset that shows countries in Europe and discard everything else.

Here’s the code you need to import libraries, load, and filter the dataset:

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")
gm_eu
```

Here’s what the first couple of rows of `gm_eu` look like:

Image 2 – European countries of the Gapminder dataset

We’ll visualize the `lifeExp` column with histograms, as it provides enough continuous data to play around with.

Let’s make the most basic ggplot histogram first. You can use the `geom_histogram()` function to do so, provided you’ve passed in the dataset and the default aesthetics:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram()
```

Image 3 – Default histogram

Well, you won’t see anything like that on a website or in a magazine, so we’d better get our hands dirty with some tweaking.

Let’s start by changing the number of bins (bars). The default value is 30, and it works in most cases. If you want your histograms to look boxier, use fewer bins. On the other hand, go big if you want your histograms to look like density plots. Here’s what a histogram with 10 bins looks like:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(bins = 10)
```

Image 4 – Histogram with 10 bins
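
Instead of fixing the number of bins, you can also fix how wide each bin is with the `binwidth` argument of `geom_histogram()`. Here’s a minimal sketch, assuming a width of 2 years of life expectancy. That value is our pick for illustration, not a recommendation:

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

# Each bar spans 2 years of life expectancy (an assumed width);
# ggplot2 works out how many bins that requires
p_binwidth <- ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(binwidth = 2)
p_binwidth
```

Setting the width can be easier to reason about than a bin count, since it’s expressed in the units of the variable itself.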

Let’s stick with the default number of bins for the rest of the article, as it looks somewhat better.

The coloring is painful to look at. There’s nothing wrong with gray, but it looks too boring. Here’s how to enhance your ggplot histogram and give it some Appsilon flair: a blue fill color with black borders:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8")
```

Image 5 – Tweaking the fill and outline color

Much better, provided you like the blue color. Let’s dive deeper into styling and annotations next.

## How to Style and Annotate ggplot Histograms

### Styling

You can bring more life to your ggplot histogram. For example, we sometimes like to add a vertical line representing the mean, and two surrounding lines representing the range between -1 and +1 standard deviations from the mean. It’s a good idea to style the lines differently, just so your histogram isn’t confusing.

The following code snippet draws a black line at the mean, and dashed black lines at -1 and +1 standard deviation marks:

```r
# Solid line at the mean, dashed lines one standard deviation either side.
# Note: ggplot2 3.4.0+ prefers `linewidth` over `size` for line widths.
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  geom_vline(aes(xintercept = mean(lifeExp)), color = "#000000", size = 1.25) +
  geom_vline(aes(xintercept = mean(lifeExp) + sd(lifeExp)), color = "#000000", size = 1, linetype = "dashed") +
  geom_vline(aes(xintercept = mean(lifeExp) - sd(lifeExp)), color = "#000000", size = 1, linetype = "dashed")
```

Image 6 – Adding vertical lines to histograms

Are you up for a challenge? Try to recreate our histogram from Image 1. Hint: use `geom_segment()` instead of `geom_vline()`.

Every so often you want to make your ggplot histogram richer by combining it with a density plot. It shows more or less the same information, just in a smoother format. Here’s how you can add a density plot overlay to your histogram:

```r
# after_stat(density) replaces the older ..density.. notation (ggplot2 3.3.0+)
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(aes(y = after_stat(density)), color = "#000000", fill = "#0099F8") +
  geom_density(color = "#000000", fill = "#F85700", alpha = 0.6)
```

Image 7 – Adding density plots to histograms

It’s a somewhat richer data representation than the histogram alone. For example, if you were to embed the above chart in a dashboard, you could let the user toggle the overlay for maximum customizability.
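
Here’s one way such a toggle could look, sketched as a plain R function. The function name `plot_life_exp` and the `show_density` flag are hypothetical names of ours; in a dashboard, the flag would typically be driven by a user control. The sketch uses `after_stat(density)`, which needs ggplot2 3.3.0 or newer.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

# Hypothetical helper: draws the histogram and optionally the density overlay
plot_life_exp <- function(data, show_density = TRUE) {
  p <- ggplot(data, aes(lifeExp)) +
    geom_histogram(aes(y = after_stat(density)), color = "#000000", fill = "#0099F8")

  # Add the density layer only when the flag is switched on
  if (show_density) {
    p <- p + geom_density(color = "#000000", fill = "#F85700", alpha = 0.6)
  }
  p
}

with_overlay <- plot_life_exp(gm_eu, show_density = TRUE)
without_overlay <- plot_life_exp(gm_eu, show_density = FALSE)
```

Keeping the plot logic in a function like this means the same code can serve both a static report and an interactive app.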

Do you want to build dashboards professionally? Here’s how to start a career as an R Shiny Developer.

### Annotations

Finally, let’s see how you can add annotations to your ggplot histogram. Maybe you find vertical lines too intrusive, and you just want a plain textual representation of specific values.

First things first, you’ll need to create a `data.frame` for annotations. It should contain X and Y values, and also the labels that will be displayed:

```r
annotations <- data.frame(
  x = c(round(min(gm_eu$lifeExp), 2), round(mean(gm_eu$lifeExp), 2), round(max(gm_eu$lifeExp), 2)),
  y = c(4, 52, 5),
  label = c("Min:", "Mean:", "Max:")
)
```

You can now include these in a `geom_text()` layer. Hint: make the annotations bold, so they’re easier to spot:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  geom_text(data = annotations, aes(x = x, y = y, label = paste(label, x)), size = 5, fontface = "bold")
```

Image 8 – Adding annotations to histograms

The trick with annotations is making sure there’s some gap between them, so the text doesn’t overlap.
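
If you only need a label or two, ggplot2’s `annotate()` function is an alternative that skips the separate `data.frame`. The sketch below places just the mean label. It reuses the y position of 52 from the data frame above, and the variable names `mean_life` and `p_annotated` are ours.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

mean_life <- round(mean(gm_eu$lifeExp), 2)

# A single bold text label positioned by hand, no annotation data frame needed
p_annotated <- ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  annotate("text", x = mean_life, y = 52, label = paste("Mean:", mean_life),
           size = 5, fontface = "bold")
p_annotated
```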

### R ggplot histogram theming

Let’s also see how you can remove this grayish background color. The easiest approach is to add a more minimalistic theme to the chart. `theme_classic()` is one of our top picks:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  theme_classic()
```

Image 9 – Changing the theme

If that theme isn’t your cup of tea, here’s the good news: you have options. Let’s explore a couple of them.

The one below will apply a dark look to your charts:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  theme_dark()
```

Image 10 – Dark theme

The dark and blue combo doesn’t necessarily work well, but you can always tweak the bin color for something lighter.
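
As a rough sketch of that tweak, the snippet below swaps in a lighter fill. The hex value `#9AD9FF` is an arbitrary light blue we picked for illustration, so feel free to substitute your own.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

# Same dark theme, but with a lighter (assumed) fill so the bars stand out
p_dark <- ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#9AD9FF") +
  theme_dark()
p_dark
```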

In case you want to get rid of axes and axis labels altogether, the Void theme is your friend:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  theme_void()
```

Image 11 – Void theme

We also like the Test theme – it keeps the stylings on a minimal level and surrounds the entire chart with a light grayish border:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  theme_test()
```

Image 12 – Test theme

The only things missing from our ggplot histogram are the title and axis labels. The users don’t know what they’re looking at without them.

## Add Text, Titles, Subtitles, Captions, and Axis Labels to ggplot Histograms

Titles and axis labels are mandatory for production-ready charts. Subtitles or captions are optional, but we’ll show you how to add them as well. The magic happens in the `labs()` layer. You can use it to specify the values for title, subtitle, caption, X-axis, and Y-axis:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  labs(
    title = "Histogram of Life Expectancy in Europe",
    caption = "Source: Gapminder dataset",
    x = "Life expectancy",
    y = "Count"
  ) +
  theme_classic()
```

Image 13 – Adding title, subtitle, caption, and axis labels
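
The snippet above doesn’t actually set a subtitle. Adding one is a matter of passing `subtitle` to `labs()`, as in the sketch below. The subtitle text is a placeholder of ours, not the wording used in the original chart.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

p_labeled <- ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  labs(
    title = "Histogram of Life Expectancy in Europe",
    subtitle = "Placeholder subtitle",  # hypothetical text, replace with your own
    caption = "Source: Gapminder dataset",
    x = "Life expectancy",
    y = "Count"
  ) +
  theme_classic()
p_labeled
```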

It’s a good start, but the newly added elements don’t stand out. You can change the font, color, and size, among other things, in the `theme()` layer. Just make sure to add a complete theme layer like `theme_classic()` before your own `theme()` styles; otherwise, they would get overridden:

```r
ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  labs(
    title = "Histogram of Life Expectancy in Europe",
    caption = "Source: Gapminder dataset",
    x = "Life expectancy",
    y = "Count"
  ) +
  theme_classic() +
  theme(
    plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "bold"),
    plot.caption = element_text(face = "italic")
  )
```

Image 14 – Styling title, subtitle, and caption
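
To see why the order of the theme layers matters, here’s a small sketch that builds the same chart twice and inspects the stored title color. The object names (`title_style`, `styled_last`, `styled_first`) are ours, and peeking into `$theme` is just a convenient way to check the result, not something you’d normally do in production code.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gm_eu <- gapminder %>%
  filter(continent == "Europe")

title_style <- theme(plot.title = element_text(color = "#0099F8", size = 16, face = "bold"))

base_plot <- ggplot(gm_eu, aes(lifeExp)) +
  geom_histogram(color = "#000000", fill = "#0099F8") +
  labs(title = "Histogram of Life Expectancy in Europe")

# Complete theme first, custom styles second: the blue title is kept
styled_last <- base_plot + theme_classic() + title_style

# Custom styles first, complete theme second: theme_classic() replaces them
styled_first <- base_plot + title_style + theme_classic()

# Compare the title color stored in each plot's theme
styled_last$theme$plot.title$colour
styled_first$theme$plot.title$colour
```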

It’s starting to shape up now. And it also matches the color palette of our ggplot histogram. We’ve covered everything needed to get you started visualizing your data distributions with histograms, so we’ll call it a day here. But there’s so much more you can do with your visualizations. Check out some of our Shiny demos to see where advanced-level R programming can take your data visualizations.

Did you know there’s another way to visualize data distributions? Read our complete guide to boxplots.

## Summary of R ggplot Histogram

Today you’ve learned what histograms are, why they are important for visualizing the distribution of continuous data, and how to make them appealing with R and the `ggplot2` library. It’s enough to set you on the right track, and now it’s up to you to apply this knowledge to your datasets. We’re sure you can manage it.

At Appsilon, we’ve used histograms and the `ggplot2` package in developing enterprise R Shiny dashboards for Fortune 500 companies. If R and R Shiny are something you have experience with, we might have a position ready for you.
