5 Key Data Visualization Principles Explained – Examples in R
Data visualization can be tricky to do right. There are a ton of key principles you need to be aware of. Today we bring you 5 best practices for visualizing data, with examples in the R programming language. Incorporate these key R data visualization principles into your toolset to improve your data storytelling.
After reading, you’ll know how to produce publication-ready charts that won’t leave users questioning the data or the logic. You’ll know how to use ggplot2 and plotly for both static and interactive charts, and also how to get maximum interactivity out of your visualizations with R Shiny.
These are the 5 key data visualization principles you must know:
Table of Contents
- Don’t Manipulate with Axis Ranges
- Always Add Title and Axis Labels
- Choose Appropriate and Appealing Color Palettes
- Ditch 3D Charts – 2D is Plenty Enough
- Make Your Charts Interactive – Go the Extra Mile
- Summary of Key Data Visualization Principles
Don’t Manipulate with Axis Ranges
In the past, companies and individuals loved to exaggerate small and insignificant differences by manipulating axis ranges. For example, imagine a company had a profit of $100M in 2020 and $105M in 2021. In relative terms, that’s only a 5% increase – nothing to write home about – so the difference wouldn’t be immediately visible on a chart if the Y-axis range goes from 0 to 120 (Y-axis shows the profit).
What you could do – but shouldn’t – is to shorten the Y-axis range. A range between 99.5 and 105.5 would do the trick.
Let’s see the effect in action. Use the following code to declare a data.frame object containing profit for the mentioned two years:
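A minimal sketch of that declaration could look like this. The object name `profit_df` and the column names `year` and `profit` are illustrative assumptions; the values are the $100M and $105M figures from the example above.

```r
# Profit in millions of USD for the two years discussed above.
# The names profit_df, year, and profit are assumptions for illustration.
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

profit_df
```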
With ggplot2, you can use coord_cartesian(ylim = c(lower, upper)) to change the Y-axis range. Let’s set it to go from 99.5 to 105.5:
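One way to sketch the truncated chart is shown below. The data frame is redeclared so the snippet runs on its own, and its name and column names are assumptions.

```r
library(ggplot2)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

# Bar chart with the Y-axis zoomed to the 99.5-105.5 range
p_truncated <- ggplot(profit_df, aes(x = year, y = profit)) +
  geom_bar(stat = "identity") +
  coord_cartesian(ylim = c(99.5, 105.5))

p_truncated
```

coord_cartesian() zooms the view without dropping any data, which is why the bars still render even though they start at 0, outside the visible range.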
It looks like the difference is huge, with the 2021 bar several times taller than the 2020 one. The chart doesn’t technically lie, but it doesn’t respect key data visualization principles. It’s easy to get the whole story wrong if you don’t look at the axis ticks.
The same chart looks nowhere near as impressive with the default Y-axis range:
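For comparison, here is a sketch of the same chart with the default range, where the bars start at zero. As before, the data frame name and columns are assumptions.

```r
library(ggplot2)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

# No coord_cartesian() call, so ggplot2 picks the range and bars start at 0
p_default <- ggplot(profit_df, aes(x = year, y = profit)) +
  geom_bar(stat = "identity")

p_default
```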
Take-home point: Always read the axis ticks. Just because you’re obeying key data visualization principles, it doesn’t mean everyone else is.
Always Add Title and Axis Labels
A chart without a title and axis labels is pretty much useless. It might look great otherwise, but how can you know what you’re looking at? There’s no way to tell. Sure, you can describe the contents in the paragraph above, but that’s not a replacement. It’s only a supplement at best.
ggplot2 makes it easy to obey this key data visualization principle. You can use the labs() function to add title, subtitle, caption, and axis labels, and you can use the theme() function to style them:
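A minimal sketch, assuming the same two-year profit data. All label texts and the font sizes are made up for illustration.

```r
library(ggplot2)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

p_labeled <- ggplot(profit_df, aes(x = year, y = profit)) +
  geom_bar(stat = "identity") +
  # Text elements: every string here is an illustrative placeholder
  labs(
    title = "Yearly profit",
    subtitle = "2020 vs. 2021",
    caption = "Source: illustrative data",
    x = "Year",
    y = "Profit (USD millions)"
  ) +
  # Styling for the text elements added above
  theme(
    plot.title = element_text(size = 20, face = "bold"),
    plot.subtitle = element_text(size = 14),
    plot.caption = element_text(face = "italic"),
    axis.title = element_text(size = 12)
  )

p_labeled
```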
Not all charts need a subtitle and a caption, but we added them just for fun. Every chart you make should include a title and axis labels at least.
Additionally, be aware of proportioning in your visualizations. If your title and label texts are small, they might be overwhelmed by elements in the chart. Be cognizant of what you want readers to view and emphasize elements accordingly.
Choose Appropriate and Appealing Color Palettes
There’s nothing worse than spending hours making the best out of your data but failing to make the chart visually appealing. We get it, not everyone has an eye for design. If you’re a software engineer, it’s likely you find design and aesthetics a nightmare. Similarly, if you’re a graphic designer, you’re able to design great-looking visuals, but can you implement them in code?
That’s where choosing an appropriate color palette comes in. coolors.co is used and loved by many when it comes to picking a color palette.
It’s mostly used for entire websites and brand identities, but there’s no reason you can’t pick a single color you like (or multiple), and use it in your data visualizations.
The second one, Prussian Blue, looks promising. Specify the fill parameter in the call to geom_bar() to change the color:
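A sketch of the single-color version follows. The hex code `#003049` is an assumed value for Prussian Blue; substitute whatever code your chosen palette lists. The data frame is likewise an assumption.

```r
library(ggplot2)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

# A constant fill goes outside aes(), directly in the geom_bar() call.
# "#003049" is an assumed hex code for Prussian Blue.
p_single_color <- ggplot(profit_df, aes(x = year, y = profit)) +
  geom_bar(stat = "identity", fill = "#003049")

p_single_color
```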
Sometimes, a single color won’t work. If your dataset has a categorical feature (e.g., day of the week, gender, age group), you can use it to color the bars or different chart segments. Simply set the fill parameter to the name of the dataset variable inside aes() in the ggplot() function call:
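As a sketch, the `year` column of the assumed profit data can double as the categorical variable. Mapping it to `fill` inside `aes()` gives each category its own color and adds a legend.

```r
library(ggplot2)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

# fill is mapped to a variable, so it belongs inside aes()
p_by_category <- ggplot(profit_df, aes(x = year, y = profit, fill = year)) +
  geom_bar(stat = "identity")

p_by_category
```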
The selected column of this dataset has only two distinct values, but you get the gist. Just don’t go around using color ramps willy-nilly or combining two scales for one trait, or at least don’t tell anyone we told you to do it.
Color is a key data visualization principle. Master it and use it wisely.
Ditch 3D Charts – 2D is Plenty Enough
Take a look at the following three charts – don’t worry, we didn’t create them, we just picked them from the Internet:
What do they all have in common? You’ve guessed it, they all look horrible. Depth has no place in most data visualizations, especially not in those aimed at business users and the general public. Also, a 3D visualization can’t be rotated once it’s embedded in a print publication, so readers are stuck with a single viewing angle.
You can use depth, or Z-axis, when analyzing data yourself. After all, you know best what works for you – but that’s where the story should end.
Most users find the third dimension confusing for data visualization, and we get that. It’s easy to distort the data and come up with wrong insights. After all, everything is a matter of perspective. Two dimensions are just enough for 99.9% of the cases. If you want to convey extra information, consider changing the size or color of graph elements to accommodate extra variables.
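To illustrate the size-and-color idea, here is a sketch that shows four variables on a flat 2D scatter plot. It uses R’s built-in mtcars dataset, which is swapped in here purely for demonstration.

```r
library(ggplot2)

# x and y carry two variables; color and size carry two more, with no Z-axis
p_four_vars <- ggplot(
  mtcars,
  aes(x = wt, y = mpg, color = factor(cyl), size = hp)
) +
  geom_point(alpha = 0.7) +
  labs(
    title = "Fuel efficiency vs. weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders",
    size = "Horsepower"
  )

p_four_vars
```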
Make Your Charts Interactive – Go the Extra Mile for Better Data Visualizations
Probably the most important key data visualization principle and component is interactivity. There’s nothing wrong with static charts, especially if you’re just getting into data visualization, but interactivity will set you apart from the crowd.
The idea is that something should happen when you click or hover over a chart element. With bar charts, the most common thing you can do is to display the counts of the selected category.
ggplot2 doesn’t support interactivity at this time. You’ll have to switch to some other alternative instead, like plotly. The syntax is a bit different, but you’ll quickly get the hang of it. Their documentation is superb, and you’ll find everything you need there.
Here’s how to “redraw” the chart from the previous sections in Plotly:
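A minimal sketch of the Plotly version, assuming the same two-year profit data. The bar color, title, and axis labels are illustrative choices.

```r
library(plotly)

# Assumed data: profit in millions of USD
profit_df <- data.frame(
  year = c("2020", "2021"),
  profit = c(100, 105)
)

# Columns are referenced with the formula syntax (~column)
fig <- plot_ly(
  data = profit_df,
  x = ~year,
  y = ~profit,
  type = "bar",
  marker = list(color = "#003049")  # assumed hex code for Prussian Blue
)

# Title and axis labels are set through layout()
fig <- layout(
  fig,
  title = "Yearly profit",
  xaxis = list(title = "Year"),
  yaxis = list(title = "Profit (USD millions)")
)

fig
```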
You can see how detailed data is shown automatically as you hover over individual bars. What gets displayed can be tweaked, but more on that some other time.
Do you know what really sets your visualizations apart from the crowd? You’ve guessed it: dashboards, at least in the interactivity department. For demonstration’s sake, we’ll declare a new dataset consisting of budgets across two departments in a two-year time span. The end-user can select the department on the dashboard, and the chart gets redrawn instantly. Take a look:
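A dashboard along these lines can be sketched as follows. The department names, budget figures, and input/output IDs are all hypothetical, and the layout is kept as bare as possible.

```r
library(shiny)
library(plotly)

# Hypothetical budgets in thousands of USD; names and figures are made up
budget_df <- data.frame(
  department = rep(c("IT", "Sales"), each = 2),
  year = rep(c("2020", "2021"), times = 2),
  budget = c(250, 300, 400, 380)
)

ui <- fluidPage(
  titlePanel("Department budgets"),
  selectInput(
    inputId = "department",
    label = "Department",
    choices = unique(budget_df$department)
  ),
  plotlyOutput("budget_chart")
)

server <- function(input, output, session) {
  # Re-runs automatically whenever input$department changes
  output$budget_chart <- renderPlotly({
    selected <- budget_df[budget_df$department == input$department, ]
    fig <- plot_ly(selected, x = ~year, y = ~budget, type = "bar")
    layout(
      fig,
      title = paste(input$department, "budget"),
      xaxis = list(title = "Year"),
      yaxis = list(title = "Budget (USD thousands)")
    )
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session
if (interactive()) runApp(app)
```

The chart lives inside renderPlotly(), so Shiny redraws it every time the dropdown selection changes, with no extra event handling needed.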
Embedding your visualizations into dashboards is light years ahead of everything you can do with a static graphing library. It allows for the most flexibility for the end-user, which is the only thing that matters in the long run.
It’s safe to say interactivity is among the most important key data visualization principles of 2022 and beyond.
Summary of Key Data Visualization Principles
Data visualization is one of those things that looks easy, but in reality, it’s easy to get wrong. A small error like forgetting to add axis labels can cost you a lot in the long run, especially if you can’t add them afterward.
Today you’ve learned five key principles of data visualization and got hands-on experience of visualizing data in R with ggplot2, plotly, and shiny. It’s a lot to process for a single article, but we hope you managed to follow along.