R and Excel: How to Combine the Best of Both Worlds
R and Excel go together like macaroni and cheese. There's no need to choose one over the other, as there are numerous packages and extensions that allow them to work together. Just because you have an <code>xlsx</code> file doesn't mean you can't analyze it with R. Likewise, you can write R code to populate and manipulate Excel spreadsheets.
Today you'll get a grasp of two tools used to manage communication between R and Excel. These are <a href="https://readxl.tidyverse.org/" target="_blank" rel="noopener noreferrer">readxl</a> and <a href="https://bert-toolkit.com/" target="_blank" rel="noopener noreferrer">Bert</a>. The first one is an R package used to read and manipulate Excel files, while the latter is a free extension that brings R functionality to Excel.
<blockquote>Psst - do you think Google Sheets is superior? You can easily <a href="https://appsilon.com/connect-r-to-google-sheets-using-googlesheets4/" target="_blank" rel="noopener">connect R to Google Sheets with googlesheets4</a>.</blockquote>
Table of contents:
<ul><li><a href="#readxl">Readxl: R Package for Reading Excel Files</a></li><li><a href="#bert">Bert: Connect Excel with R</a></li><li><a href="#summary">Summary of R and Excel</a></li></ul>
<hr />
<h2 id="readxl">Readxl: R Package for Reading Excel Files</h2>
The <code>readxl</code> package is among the easiest ones you can use to communicate between R and Excel. It's different from the alternative R packages because it has no external dependencies, meaning it's easy to install and use on all operating systems.
It supports both the <code>.xls</code> and <code>.xslx</code> file formats. So if you're working on some legacy spreadsheets, there's no need to convert them to a more modern, XML-based, file format beforehand.
You can install the package directly from CRAN using the <code>install.packages("readxl")</code> command, and then import it as any other library.
It comes bundled up with many example Excel files you can use. To list them all, call the <code>readxl_example()</code> function. We'll use the <i>datasets.xlsx</i> file in this section. It contains many datasets you're likely already familiar with - such as Iris and Mtcars. All are available as separate sheets.
To list all sheets of a given Excel file, use the following function call:
<script src="https://gist.github.com/darioappsilon/8a3896a2aa8c31e30264a5bc9b77f9ba.js"></script>
<img class="size-full wp-image-11574" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2abc0122d296e6e276c67_1-1.webp" alt="Image 1 - Sheets available in an Excel file" width="1002" height="106" /> Image 1 - Sheets available in an Excel file
As mentioned before, each of the four datasets is stored on an individual sheet. You can read a specific sheet by specifying the <code>sheet</code> parameter:
<script src="https://gist.github.com/darioappsilon/a460898b90256e2f0ae3a6c789197a2e.js"></script>
<img class="size-full wp-image-11576" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2abb3b0924dc32765fe1d_2-1.webp" alt="Image 2 - Mtcars dataset from an example Excel file" width="1502" height="656" /> Image 2 - Mtcars dataset from an example Excel file
But here's the best part - you can specify the row and column ranges in Excel-like syntax. The <code>range</code> parameter is used to specify, well, the range from which you want to extract data. The value of <code>range = "B1:E5"</code> means you want to extract data from the second to the fifth column, spanning from the row 1 to 5:
<script src="https://gist.github.com/darioappsilon/0b54608e2293b4a1bc534dec24f98f36.js"></script>
<img class="size-full wp-image-11578" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2abb3719cbbf522f1a2bb_3-1.webp" alt="Image 3 - Specifying range when reading Excel files" width="1372" height="406" /> Image 3 - Specifying range when reading Excel files
Let's take the whole thing a step further and omit the <code>sheet</code> parameter - you can specify everything in <code>range</code> in a true Excel manner:
<script src="https://gist.github.com/darioappsilon/a53068284584f798fca97b17b7009386.js"></script>
<img class="size-full wp-image-11580" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2abb45e32948c871ad16c_4-1.webp" alt="Image 4 - Specifying sheet and range in a single parameter" width="1128" height="462" /> Image 4 - Specifying sheet and range in a single parameter
Long story short, <code>readxl</code> is a truly powerful R package for reading Excel files. It allows you to read both modern and legacy Excel spreadsheets, and manipulate them using an Excel-like syntax. <b>But how can you go the other way around?</b> Let's explore that next.
<h2 id="bert">Bert: Connect Excel with R</h2>
Bert is a free tool for connecting Excel with R. It's designed to run (user-defined) R functions in spreadsheets. The only thing you have to do is write the function - Bert handles everything else for you.
<b>But here's a catch: </b>At this time, Bert is available only for Windows. We're running Windows 11 as an emulation on M1 Macbook (ARM) and experiencing unexpected crashes. These are likely due to the ARM architecture Bert didn't expect.
Now, Bert isn't the only option you have. Similar solutions exist, such as <i>RExcel</i>, but we haven't managed to find a download link due to their website being down. Adding insult to injury, RExcel is also available for Windows only.
To get started with Bert, head over to their <a href="https://bert-toolkit.com/download-bert" target="_blank" rel="noopener noreferrer">download page</a> and download the installed. It goes without saying, but you should have Excel and R installed already.
Once you have both Excel and Bert console running, use the following code to populate cells in the range from A1 to C4 with a constant value:
<script src="https://gist.github.com/darioappsilon/745178d268e7132efbe77780fa88e9fb.js"></script>
<img class="alignnone size-full wp-image-11582" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2abecd976ed15bfb7af82_5.gif" alt="" width="844" height="626" />
It feels unnatural at first, but you've just used R code to populate an Excel spreadsheet!
From here, you can do much of what you'd normally do in R. For example, the code snippet below plots a histogram of the normal distribution:
<script src="https://gist.github.com/darioappsilon/d3e32c431f52b62a2285ebbd5784af01.js"></script>
<img class="alignnone size-full wp-image-11584" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2ac0295cd83ed77c415f9_6.gif" alt="" width="844" height="626" />
The true power of Bert lies in using the custom R function in Excel. To do so, open an R file located under <code>C:\Users\username\Documents\BERT2\functions\functions.R</code> and add your custom functions. For example, the one we added sums two numbers:
<img class="size-full wp-image-11586" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2ab9416d9269fbe0626ef_7-1.webp" alt="Image 7 - Adding custom R functions to Bert" width="776" height="684" /> Image 7 - Adding custom R functions to Bert
Bert will automatically link R code to Excel - you don't have to lift a finger. Once you save the file, use the function by writing <code>=R.</code> in any Excel cell:
<img class="alignnone size-full wp-image-11588" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b2ab95cf74263737a7a82a_8-1.webp" alt="" width="748" height="410" />
<hr />
<h2 id="summary">Summary of R Programming and Excel</h2>
And that's how you can connect Excel and R with Bert. <b>But should you?</b> Some tech-savvy people find Excel obsolete.
<blockquote>Do you think Excel is obsolete? <a href="https://appsilon.com/excel-is-obsolete-here-are-the-top-2-alternatives-from-r-and-python/" target="_blank" rel="noopener noreferrer">Consider these two alternatives from R and Python</a>.</blockquote>
Either way, reading and manipulating Excel files is a daily task of any R data analyst. The <code>readxl</code> package comes with everything you need to make communication as easy as possible. If you're an Excel expert and just starting with R, Bert might be a good option to ease the transition period.
<em>What are your thoughts on these R and Excel cross-communication options?</em> Let us know in our Shiny community.