Matplotlib vs. ggplot: How to Use Both in R Shiny Apps

September 22, 2022

Data science has (unnecessarily) divided the world into two halves - R users and Python users. Regardless of the group you belong to, there's one thing you have to admit - each language has libraries far superior to anything available in the other. For example, R Shiny is much easier for beginners than anything Python offers. But what about basic data visualization? That's where this Matplotlib vs. ggplot article comes in. Today we'll compare the standard plotting libraries of R and Python - ggplot and Matplotlib - to see which one is easier to use and which looks better in the end. We'll also show you how to <b>include Matplotlib charts in R Shiny dashboards</b>, as that's been a common pain point for Python users. What's even better, the chart will react to user input.
<blockquote>Want to use R and Python together? <a href="https://appsilon.com/use-r-and-python-together/" target="_blank" rel="noopener">Here are 2 packages to get you started</a>.</blockquote>
Table of contents:
<ul><li><a href="#the-basics">Matplotlib vs. ggplot - Which is Better for Basic Plots?</a></li><li><a href="#style">Matplotlib vs. ggplot - Which is easier to customize?</a></li><li><a href="#shiny-ggplot">How to Include ggplot Charts in R Shiny</a></li><li><a href="#shiny-matplotlib">How to Use Matplotlib Charts in R Shiny</a></li><li><a href="#summary">Summary of Matplotlib vs. ggplot</a></li></ul>
<hr />
<h2 id="the-basics">Matplotlib vs. ggplot - Which is Better for Basic Plots?</h2>
There's no denying that neither Matplotlib nor ggplot looks its best by default. There's a lot you can change, of course, but we'll get to that later. The aim of this section is to compare Matplotlib and ggplot in the realm of unstyled visualizations. To keep things simple, we'll only make a scatter plot of the well-known <code>mtcars</code> dataset, with miles per gallon on the X-axis and the corresponding horsepower on the Y-axis.
<blockquote>Are you new to scatter plots? <a href="https://appsilon.com/ggplot-scatter-plots/" target="_blank" rel="noopener">Here's our complete guide to get you started</a>.</blockquote>
There's not a lot you have to do to produce this visualization in R ggplot:
<pre><code class="language-r">library(ggplot2)
<br>ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point()</code></pre>
<img class="size-full wp-image-14932" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d34fcba09eb7527f5270_d02bf850_1-4.webp" alt="Image 1 - Basic ggplot scatter plot" width="1736" height="1320" />
Image 1 - Basic ggplot scatter plot
It's a bit dull by default, but <b>is Matplotlib better?</b> The <code>mtcars</code> dataset isn't included in Python, so we have to download and parse it from GitHub. After doing so, a simple call to <code>ax.scatter()</code> puts both variables on their respective axes:
<pre><code class="language-python">import pandas as pd
import matplotlib.pyplot as plt
<br>mtcars = pd.read_csv("https://gist.githubusercontent.com/ZeccaLehn/4e06d2575eb9589dbe8c365d61cb056c/raw/898a40b035f7c951579041aecbfb2149331fa9f6/mtcars.csv", index_col=[0])
<br>fig, ax = plt.subplots(figsize=(13, 8))
ax.scatter(x=mtcars["mpg"], y=mtcars["hp"])</code></pre>
<img class="size-full wp-image-14934" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d350a11449a24ce97a08_66a965ea_2-4.webp" alt="Image 2 - Basic matplotlib scatter plot" width="1548" height="934" />
Image 2 - Basic matplotlib scatter plot
It would be unfair to call ggplot superior to Matplotlib simply because the dataset comes included with R, while Python requires an extra step. From the visual point of view, things are highly subjective. Matplotlib figures have a lower resolution by default, so the whole thing looks blurry. Other than that, declaring a winner is near impossible.
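As a rough illustration of the resolution point, the pixel dimensions of a Matplotlib figure are simply its size in inches multiplied by its DPI. The sketch below is not part of the original comparison. It assumes Matplotlib's stock default of 100 DPI (your <code>matplotlibrc</code> or notebook setup may differ), and <code>pixel_size()</code> is a hypothetical helper introduced only for this example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt


def pixel_size(figsize, dpi):
    """Return the (width, height) in pixels of a figure with this size and DPI."""
    fig = plt.figure(figsize=figsize, dpi=dpi)
    width, height = fig.get_size_inches() * fig.dpi
    plt.close(fig)  # release the figure once we have measured it
    return int(round(width)), int(round(height))


# Assumption: 100 is the stock default for rcParams["figure.dpi"]
default_px = pixel_size((13, 8), 100)
sharp_px = pixel_size((13, 8), 300)
print(default_px, sharp_px)
```

In other words, a 13 x 8 inch figure is 1300 x 800 pixels at 100 DPI and 3900 x 2400 pixels at 300 DPI, which goes a long way toward explaining the blurriness.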
<i>Do you prefer Matplotlib or ggplot2 default stylings?</i> Let us know in the comment section below. Let's add some styles to see which one is easier to customize.
<h2 id="style">Matplotlib vs. ggplot - Which is easier to customize?</h2>
To keep things simple, we'll modify only a couple of things:
<ul><li>Scale the point size by the <code>qsec</code> variable</li><li>Color the points by the <code>cyl</code> variable</li><li>Add a custom color palette for three distinct color factors</li><li>Change the theme</li><li>Remove the legend</li><li>Add a title</li></ul>
In R ggplot, that boils down to adding a couple of lines of code:
<pre><code class="language-r">ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = factor(cyl))) +
  scale_color_manual(values = c("#3C6E71", "#70AE6E", "#BEEE62")) +
  theme_classic() +
  theme(legend.position = "none") +
  labs(title = "Miles per Gallon vs. Horse Power")</code></pre>
<img class="size-full wp-image-14936" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d351003913ca45eb0c24_ac9febfd_3-3.webp" alt="Image 3 - Customized ggplot scatter plot" width="1744" height="1304" />
Image 3 - Customized ggplot scatter plot
The chart now actually looks usable, both for reporting and dashboarding purposes. <b>But how difficult is it to produce the same chart in Python?</b> Let's take a look. For starters, we'll increase the DPI to get rid of the blurriness, and also remove the top and right lines around the figure. Changing point size and color is a bit trickier to do in Matplotlib, but it's just a matter of experience and preference. Also, Matplotlib doesn't place labels on axes by default - consider this as a pro or a con.
We'll add them manually:
<pre><code class="language-python">plt.rcParams["figure.dpi"] = 300
plt.rcParams["axes.spines.top"] = False
plt.rcParams["axes.spines.right"] = False
<br>fig, ax = plt.subplots(figsize=(13, 8))
ax.scatter(
    x=mtcars["mpg"],
    y=mtcars["hp"],
    s=[s**1.8 for s in mtcars["qsec"].to_numpy()],
    c=["#3C6E71" if cyl == 4 else "#70AE6E" if cyl == 6 else "#BEEE62" for cyl in mtcars["cyl"].to_numpy()]
)
ax.set_title("Miles per Gallon vs. Horse Power", size=18, loc="left")
ax.set_xlabel("mpg", size=14)
ax.set_ylabel("hp", size=14)</code></pre>
<img class="size-full wp-image-14938" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d3510d6008396df292ba_a3372e77_4-4.webp" alt="Image 4 - Customized matplotlib scatter plot" width="1686" height="1084" />
Image 4 - Customized matplotlib scatter plot
The figures look almost identical, <b>so what's the verdict? </b>Is it better to use Python's Matplotlib or R's ggplot2? Objectively speaking, Python's Matplotlib requires more code to do the same thing when compared to R's ggplot2. Further, Python's code is harder to read, due to bracket notation for variable access and inline conditional statements. So, does ggplot2 take the win here? Well, no. If you're a Python user, it will take you less time to create a chart in Matplotlib than it would to learn a whole new language and library. The same goes the other way. Up next, we'll see how easy it is to include this chart in an interactive dashboard.
<h2 id="shiny-ggplot">How to Include ggplot Charts in R Shiny</h2>
Shiny is an R package for creating dashboards around your data. It's built for the R programming language, and hence integrates nicely with most other R packages - ggplot2 included. We'll now create a simple R Shiny dashboard that allows you to select columns for the X and Y axes and then updates the figure automatically.
If you have more than 30 minutes of R Shiny experience, the code snippet below shouldn't be difficult to read:
<pre><code class="language-r">library(shiny)
library(ggplot2)
<br>ui <- fluidPage(
  tags$h3("Scatter plot generator"),
  selectInput(inputId = "x", label = "X Axis", choices = names(mtcars), selected = "mpg"),
  selectInput(inputId = "y", label = "Y Axis", choices = names(mtcars), selected = "hp"),
  plotOutput(outputId = "scatterPlot")
)
<br>server <- function(input, output, session) {
  data <- reactive({mtcars})
  output$scatterPlot <- renderPlot({
    ggplot(data = data(), aes_string(x = input$x, y = input$y)) +
      geom_point(aes(size = qsec, color = factor(cyl))) +
      scale_color_manual(values = c("#3C6E71", "#70AE6E", "#BEEE62")) +
      theme_classic() +
      theme(legend.position = "none")
  })
}
<br>shinyApp(ui = ui, server = server)</code></pre>
<img class="size-full wp-image-14940" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d352ff233005820f9152_3649c6c6_5.gif" alt="Image 5 - Shiny dashboard rendering a ggplot chart" width="902" height="682" />
Image 5 - Shiny dashboard rendering a ggplot chart
Put simply, we're re-rendering the chart every time one of the inputs changes. All computations are done in R, and the update is almost instant. That makes sense, since <code>mtcars</code> is a tiny dataset. But how about <b>rendering a Matplotlib chart in R Shiny?</b> Let's see if it's even possible.
<h2 id="shiny-matplotlib">How to Use Matplotlib Charts in R Shiny</h2>
There are several ways to <a href="https://appsilon.com/use-r-and-python-together/" target="_blank" rel="noopener">combine R and Python</a> - reticulate being one of them. However, we won't use that kind of bridging library today. Instead, we'll opt for a simpler solution - calling a Python script from R. This script is responsible for saving a Matplotlib figure as a JPG file. In Shiny, the image will be rendered with the <code>renderImage()</code> function.
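Before writing the full script, it may help to see the core idea in isolation: draw a figure and write it to disk without ever opening a window. The sketch below is an illustration under a few assumptions. It selects Matplotlib's non-interactive <code>Agg</code> backend, uses a hypothetical <code>save_scatter()</code> helper, and plots three toy points instead of the real dataset:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to files only; no display is required
import matplotlib.pyplot as plt


def save_scatter(x, y, path):
    """Draw a scatter plot of x against y and write it to path."""
    fig, ax = plt.subplots(figsize=(13, 7))
    ax.scatter(x, y)
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)  # free the figure so repeated calls don't pile up in memory
    return path


# Toy values standing in for two dataset columns (illustration only)
out_path = save_scatter(
    [21.0, 22.8, 18.7],
    [110, 93, 175],
    os.path.join(tempfile.mkdtemp(), "scatterplot.jpg"),
)
print(out_path, os.path.getsize(out_path) > 0)
```

Because nothing here depends on an interactive session, the same pattern works when the script is launched by another process, which is exactly how Shiny will call it.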
Let's write the script - <code>generate_scatter_plot.py</code>. It leverages the <code>argparse</code> module to accept arguments when executed from the command line. As you would expect, the script accepts column names for the X and Y axes as command line arguments. The rest of the script should feel familiar, as we explored it in the previous section:
<pre><code class="language-python">import argparse
import pandas as pd
import matplotlib.pyplot as plt
<br># Tweak matplotlib defaults
plt.rcParams["figure.dpi"] = 300
plt.rcParams["axes.spines.top"] = False
plt.rcParams["axes.spines.right"] = False
<br># Get and parse the arguments from the command line
parser = argparse.ArgumentParser()
parser.add_argument("--x", help="X-axis column name", type=str, required=True)
parser.add_argument("--y", help="Y-axis column name", type=str, required=True)
args = parser.parse_args()
<br># Fetch the dataset
mtcars = pd.read_csv("https://gist.githubusercontent.com/ZeccaLehn/4e06d2575eb9589dbe8c365d61cb056c/raw/898a40b035f7c951579041aecbfb2149331fa9f6/mtcars.csv", index_col=[0])
<br># Create the plot
fig, ax = plt.subplots(figsize=(13, 7))
ax.scatter(
    x=mtcars[args.x],
    y=mtcars[args.y],
    s=[s**1.8 for s in mtcars["qsec"].to_numpy()],
    c=["#3C6E71" if cyl == 4 else "#70AE6E" if cyl == 6 else "#BEEE62" for cyl in mtcars["cyl"].to_numpy()]
)
<br># Save the figure
fig.savefig("scatterplot.jpg", bbox_inches="tight")</code></pre>
You can run the script from the command line for verification:
<img class="size-full wp-image-14942" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d3532eafb5d24630785c_5470cb42_6-3.webp" alt="Image 6 - Running a Python script for chart generation" width="2136" height="1186" />
Image 6 - Running a Python script for chart generation
If all went well, it should have saved a <code>scatterplot.jpg</code> to disk:
<img class="size-full wp-image-14944" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01d1d824742cc351d1c0e_7-scaled.webp" alt="Image 7 - Scatter plot generated by Python and matplotlib" width="2560" height="1401" />
Image 7 - Scatter plot generated by Python and matplotlib
Everything looks as it should, <b>but what's the procedure in R Shiny?</b> Here's a list of things we have to do:
<ul><li>Replace <code>plotOutput()</code> with <code>imageOutput()</code> - we're rendering an image after all</li><li>Construct a shell command as a reactive expression - it will run the <code>generate_scatter_plot.py</code> file and pass in the command line arguments gathered from the currently selected dropdown values</li><li>Use the <code>renderImage()</code> function to execute the shell command and load in the image</li></ul>
It sounds like a lot, but it doesn't require much more code than the previous R example. Just remember to specify a <b>full path to the Python executable</b> when constructing a shell command.
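If you aren't sure where your Python executable lives, one option is to ask the interpreter itself. The following is a minimal sketch, meant to be run with the same Python installation that has pandas and Matplotlib available. <code>build_command()</code> is a hypothetical helper that only shows what the resulting command string looks like, with <code>shlex.join()</code> (Python 3.8+) quoting each argument:

```python
import shlex
import sys


def build_command(x, y, script="generate_scatter_plot.py"):
    """Return a shell-safe command that runs the script with this interpreter."""
    return shlex.join([sys.executable, script, "--x", x, "--y", y])


print(sys.executable)              # path of the interpreter running this code
print(build_command("mpg", "hp"))  # the kind of string the Shiny app passes to the shell
```

The first printed line is the path you would paste into the R code. Quoting the arguments is optional for plain column names, but it keeps unexpected characters from being interpreted by the shell.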
Here's the entire code for the Shiny app:
<pre><code class="language-r">library(shiny)
<br>ui <- fluidPage(
  tags$head(
    tags$style(HTML("
      #scatterPlot > img {
        max-width: 800px;
      }
    "))
  ),
  tags$h3("Scatter plot generator"),
  selectInput(inputId = "x", label = "X Axis", choices = names(mtcars), selected = "mpg"),
  selectInput(inputId = "y", label = "Y Axis", choices = names(mtcars), selected = "hp"),
  imageOutput(outputId = "scatterPlot")
)
<br>server <- function(input, output, session) {
  # Construct a shell command to run Python script from the user input
  # Replace the path below with the full path to your own Python executable
  shell_command <- reactive({
    paste0("/Users/dradecic/miniforge3/bin/python generate_scatter_plot.py --x ", input$x, " --y ", input$y)
  })
<br>  # Render the matplotlib plot as an image
  output$scatterPlot <- renderImage({
    # Run the shell command to generate image - saved as "scatterplot.jpg"
    system(shell_command())
    # Show the image
    list(src = "scatterplot.jpg")
  }, deleteFile = FALSE)  # Keep the file; the script overwrites it on every run
}
<br>shinyApp(ui = ui, server = server)</code></pre>
<img class="size-full wp-image-14946" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b7d3542b08605d95a77dbd_e1772c35_8.gif" alt="Image 8 - Shiny dashboard rendering a matplotlib chart" width="902" height="720" />
Image 8 - Shiny dashboard rendering a matplotlib chart
The dashboard takes some extra time to re-render the chart, which is expected. After all, R needs to call a Python script, which then constructs the chart and saves it to disk. It's an extra step, so the refresh isn't as instant as with ggplot2.
<hr />
<h2 id="summary">Summary of Matplotlib vs. ggplot</h2>
To conclude, you can definitely use Python's Matplotlib library in R Shiny dashboards. There are a couple of extra steps involved, but nothing you can't manage. If you're a heavy Python user and want to try R Shiny, this could be the fastest way to get started. What do you think of Matplotlib in R Shiny? What do you generally prefer - Matplotlib or ggplot2? Please let us know in the comment section below.
Also, don't hesitate to reach out on Twitter if you use another approach to render Matplotlib charts in Shiny - <a href="https://twitter.com/appsilon" target="_blank" rel="noopener">@appsilon</a>. We'd love to hear your comments. <blockquote>R Shiny and Tableau? <a href="https://appsilon.com/r-shiny-shinytableau/" target="_blank" rel="noopener">Learn to create custom Tableau extensions from R Shiny</a>.</blockquote>
