FastAI in R: How to Train Deep Learning Models with FastAI

It seems like it's getting easier and easier to get into deep learning, at least as a practitioner. Packages like <a href="https://github.com/fastai/fastai" target="_blank" rel="noopener">FastAI</a> are available to the masses for both <a href="https://pypi.org/project/fastai/" target="_blank" rel="noopener">Python</a> and <a href="https://github.com/EagerAI/fastai" target="_blank" rel="noopener">R</a>, and in some simple scenarios they seem to provide a no-brainer solution for training deep learning models with as few lines of code as possible.
Today we'll dive into FastAI in R, and you'll learn how to train an image classification model from scratch. Well, not quite from scratch, but with a pretrained network available in FastAI, which yields high accuracy after only a few training epochs. But first, we'll go over FastAI in general, so you know what it can and can't do.
<a href="https://github.com/Appsilon/fastai-in-r/">This repository</a> we prepared contains three examples of how FastAI can be leveraged from the perspective of an R user, with various degrees of involvement of Python code. You can find a brief presentation of the methods in our <a href="https://appsilon.com/fast-ai-in-r/">Fast AI in R blogpost</a>. Here we will explain the steps in some more detail.
<blockquote>Looking to build a custom Deep Learning Model on top of an existing architecture? 
<a href="https://appsilon.com/transfer-learning-introduction/" target="_blank" rel="noopener">Look no further than Transfer Learning</a>.</blockquote>
<h3>Table of contents:</h3>
<ul>
<li><strong><a href="#what-is-fastai">What is FastAI and Why Should You Care?</a></strong></li>
<li><strong><a href="#install">FastAI in R - How to Install FastAI</a></strong></li>
<li><strong><a href="#model">How to Train an Image Classification Model with FastAI in R</a></strong></li>
<li><strong><a href="#summary">Summing up FastAI in R</a></strong></li>
</ul>
<hr />
<h2 id="what-is-fastai">What is FastAI and Why Should You Care?</h2>
FastAI is an open-source library for deep learning that makes it easy to train highly accurate neural network models. Needless to say, it lowers the barrier to entry for practitioners even further, which is a good thing. The library is built on top of PyTorch and provides a suite of high-level API functions for building and evaluating models in no time.
Since FastAI is built on top of PyTorch, it has native Python support. <b>But what about R?</b> Well, there's a wrapper package we'll go over shortly, and it exposes much of what the Python version offers since it mainly executes Python code below the surface.
So, what can you do with FastAI? The library has a range of tools that make it easy to work with images, text, tabular data, time series, audio, GANs, and much more. There are also convenient functions built in for downloading various datasets, so you don't have to search the web and preprocess the data manually. Neat!
<h3>Who's Behind FastAI?</h3>
The founder of FastAI, <b>Jeremy Howard</b>, needs no introduction. He's made significant contributions to the field of deep learning, was President and Chief Scientist of Kaggle for a time, and is currently a faculty member at the University of San Francisco and the University of the Witwatersrand in South Africa. 
Howard co-founded FastAI with <b>Rachel Thomas</b>, who is also a founding director of the Center for Applied Data Ethics at the University of San Francisco. She was selected by Forbes magazine as one of the 20 most incredible women in artificial intelligence. In short, this means FastAI has more than solid foundations, and you can expect the library to be built and documented well.
Want to learn more about FastAI? Here are some useful links:
<ul>
<li><a href="https://www.fast.ai/" target="_blank" rel="noopener">FastAI homepage with recent blog posts</a></li>
<li><a href="https://course.fast.ai/" target="_blank" rel="noopener">Practical Deep Learning for Coders course</a></li>
<li><a href="https://www.amazon.com/Deep-Learning-Coders-fastai-PyTorch/dp/1492045527" target="_blank" rel="noopener">Deep Learning for Coders with FastAI and PyTorch book</a></li>
</ul>
<h2 id="install">FastAI in R - How to Install FastAI</h2>
<i><b>Note: </b>As of April 2023, we had trouble installing the R version of FastAI on Apple Silicon MacBooks. The procedure you're about to see works on any Intel Mac or Windows/Linux PC but isn't guaranteed to work on M1/M2 devices.</i>
Installing the R version of FastAI is a bit more manual and tedious than the Python version. In Python, one pip-install command is enough, but for R, it's a different story. You first have to install the <code>reticulate</code> package, then Miniconda, create a virtual environment, and then install FastAI from GitHub. You don't need to worry since we'll guide you through every step.
<h3>Install Miniconda and Create a Virtual Environment</h3>
Miniconda is a lightweight version of the Anaconda Python distribution, which is a popular package and environment manager for Python. As it turns out, you need to install Miniconda first if you want to use a FastAI wrapper in R. But even before Miniconda, you have to install R's <code>reticulate</code> package. 
It provides an interface between R and Python, allowing you to integrate code written in both languages in the same project. If you're a data scientist using both R and Python, <code>reticulate</code> is the package you should consider learning.
Run the following in the R console to install <code>reticulate</code>, and then use the package to install Miniconda on your system:
<pre><code class="language-r">install.packages("reticulate")

reticulate::install_miniconda()</code></pre>
Here's the output you should see:
<img class="size-full wp-image-22363" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7144121eacf4c1d694_Image-1-Installing-Miniconda-with-Reticulate.webp" alt="Image 1 - Installing Miniconda with Reticulate" width="1426" height="765" /> Image 1 - Installing Miniconda with Reticulate
Once installed, you can create a new <b>virtual environment</b>. We've named ours <code>r-reticulate</code>, but feel free to be a bit more creative:
<pre><code class="language-r">reticulate::conda_create("r-reticulate")</code></pre>
<img class="size-full wp-image-22365" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c725f7d5efe2b06f28f_Image-2-Creating-a-virtual-environment-with-Reticulate.webp" alt="Image 2 - Creating a virtual environment with Reticulate" width="1766" height="1082" /> Image 2 - Creating a virtual environment with Reticulate
<blockquote>Want to learn more about R and Python integration with Reticulate? <a href="https://appsilon.com/use-r-and-python-together/" target="_blank" rel="noopener">We have just the article for you</a>.</blockquote>
You now have both <code>reticulate</code> and a virtual environment configured, so the next step is to install FastAI.
<h3>Install and Configure FastAI in R</h3>
The recommended way to install FastAI in R is by getting it straight from the <a href="https://github.com/EagerAI/fastai" target="_blank" rel="noopener">GitHub repo</a>. 
Doing so requires an additional R package - <code>devtools</code>. Here are the R console commands to install that package first, and then to install FastAI from the repo:
<pre><code class="language-r">install.packages("devtools")

devtools::install_github("eagerai/fastai")</code></pre>
You should see something similar to this in the console output:
<img class="size-full wp-image-22367" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c73688498090c60aa3b_Image-3-Installing-the-R-package-from-GitHub-with-devtools.webp" alt="Image 3 - Installing the R package from GitHub with devtools" width="735" height="433" /> Image 3 - Installing the R package from GitHub with devtools
You'd think that's all, but you'd be wrong. The R wrapper package might be installed, but that doesn't mean the underlying Python library and its dependencies are installed and configured correctly. Run these two commands to activate the previously configured <code>r-reticulate</code> virtual environment and to install Python dependencies for FastAI:
<pre><code class="language-r">reticulate::use_condaenv("r-reticulate", required = TRUE)
fastai::install_fastai(gpu = FALSE, cuda_version = "11.6", overwrite = FALSE)</code></pre>
The last command will pull a good number of Python libraries, as shown in the following image:
<img class="size-full wp-image-22369" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c75d1cb8e13a3dc4a51_Image-4-Installing-FastAI-Python-dependencies.webp" alt="Image 4 - Installing FastAI Python dependencies" width="1820" height="908" /> Image 4 - Installing FastAI Python dependencies
Sit back because this will take a couple of minutes. Done? Great, join us in the following section.
<h2 id="model">How to Train an Image Classification Model with FastAI in R</h2>
The difficult part of this article was installing FastAI in R. 
That's over, which means you can now fully enjoy the high-level API functions FastAI has to offer. We'll start by downloading and exploring a dataset for image classification.
<h3>Data Gathering and Exploration</h3>
The process of data gathering typically boils down to searching the web for adequate pre-made datasets at best or creating the dataset from scratch at worst. Depending on the problem you're trying to solve, this step can take anywhere from minutes to months, or even years for some domain-specific problems.
Luckily for us, that's not the case with FastAI. There are dozens of built-in convenience functions that will download the data for you, and even automatically set the appropriate directory structure.
But first, the library imports. Stick these two at the top of your R script:
<pre><code class="language-r">library(magrittr)
library(fastai)</code></pre>
We'll use the <a href="https://www.robots.ox.ac.uk/~vgg/data/pets/" target="_blank" rel="noopener">Oxford-IIIT Pet Dataset</a> which has 37 categories of pet images with roughly 200 images for each class. It may seem this is not enough for deep learning, and indeed, a model trained from scratch on this data will likely generalize poorly. <b>The good news is</b> - FastAI allows us to leverage transfer learning without breaking a sweat.
But first, let's download the dataset. The following code snippet downloads the Oxford-IIIT Pet dataset:
<pre><code class="language-r"># Download data
URLs_PETS()</code></pre>
If the function call doesn't work for you, download and extract the .tgz file from this <a href="https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet 113" target="_blank" rel="noopener">download link</a>. Either way, you're ready to move on to the next step.
Now you have to define paths to the dataset, image, and annotation folders. The built-in <code>get_image_files()</code> function scans the image directory and collects the files that have an image extension. 
If there are no errors, the function returns the paths of all of the images in an array. The following code snippet defines paths to all folders, collects the image paths, and prints the first one:
<pre><code class="language-r"># Folder paths (adjust these to wherever you extracted the dataset)
path <- "/Users/dradecic/Desktop/oxford-iiit-pet"
path_anno <- "/Users/dradecic/Desktop/oxford-iiit-pet/annotations"
path_img <- "/Users/dradecic/Desktop/oxford-iiit-pet/images"
fnames <- get_image_files(path_img)

# One example
fnames[1]</code></pre>
Here's what you should see in the R console:
<img class="size-full wp-image-22371" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c77d1cb8e13a3dc4b7b_Image-5-Path-to-a-single-image.webp" alt="Image 5 - Path to a single image" width="1432" height="106" /> Image 5 - Path to a single image
We're on the right track, and the next step is to load these images with the <code>ImageDataLoaders_from_name_re()</code> function. It will load all of the JPG files in batches of 10, derive each class label from the file name using the regular expression passed as <code>pat</code>, and resize the images to a square size of 460x460 pixels. It also uses the ImageNet normalization values for red, green, and blue image channels. Finally, the code snippet plots one batch of images with their respective class names so we can see what we're dealing with:
<pre><code class="language-r"># DataLoader
dls = ImageDataLoaders_from_name_re(
  path = path,
  fnames = fnames,
  pat = "(.+)_\\d+.jpg$",
  item_tfms = Resize(size = 460),
  bs = 10,
  batch_tfms = list(Normalize_from_stats(imagenet_stats()))
)

dls %>% show_batch()</code></pre>
The images in the batch are random, but expect to see something similar to this:
<img class="size-full wp-image-22373" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c77950a34dfe6c6a975_Image-6-A-random-batch-of-images.webp" alt="Image 6 - A random batch of images" width="1198" height="1220" /> Image 6 - A random batch of images
The dataset is now loaded correctly, which means we can proceed to model training. 
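Before moving on, it may help to see what the regular expression passed as <code>pat</code> actually captures. The snippet below is a minimal base R sketch, not FastAI code: <code>extract_label()</code> is a hypothetical helper written only for illustration, and the file names are made-up examples in the dataset's "breed_number.jpg" style. It assumes the pattern is matched against the file name rather than the full path.
<pre><code class="language-r"># Same regular expression as the `pat` argument in the DataLoader snippet
pat <- "(.+)_\\d+.jpg$"

# Hypothetical helper (not part of FastAI): returns the first capture group,
# or NA when the file name does not match the pattern
extract_label <- function(file_names, pattern = pat) {
  matches <- regmatches(file_names, regexec(pattern, file_names, perl = TRUE))
  vapply(matches, function(m) if (length(m) >= 2) m[2] else NA_character_, character(1))
}

# Made-up file names for illustration only
example_files <- c("Abyssinian_12.jpg", "american_pit_bull_terrier_105.jpg", "notes.txt")
labels <- extract_label(example_files)
print(labels)</code></pre>
Because <code>(.+)</code> is greedy, everything up to the last underscore followed by digits becomes the label, which is why multi-word breed names such as <code>american_pit_bull_terrier</code> survive intact, while a file that doesn't follow the naming scheme yields <code>NA</code>.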
<h3>A Custom Pretrained Neural Network Architecture</h3>
A big part of why FastAI in R works so well on a small dataset is transfer learning. Put simply, this technique leverages models trained on large, generic datasets, in which the initial layers in particular learn to recognise patterns common to almost all image data (such as edges, corners, and color gradients). This knowledge serves as a foundation for learning the specifics of a new (so-called “downstream”) task. In many cases that fall close to the standard problems solved by computer vision, even small datasets are enough to fine-tune the model to a given task.
There are numerous pre-built architectures you can use, many implemented directly in FastAI, and even more available when using packages like <a href="https://appsilon.com/tag/timm/">timm</a>. If you're into graphs, the following one describes the structure of one of the simplest choices, the ResNet-18 architecture:
<img class="size-full wp-image-22375" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c79f2c69fa7dba6fe6a_Image-7-ResNet-18-architecture-Credits-www.researchgate.net_.webp" alt="Image 7 - ResNet-18 architecture (Credits- www.researchgate.net)" width="850" height="389" /> Image 7 - ResNet-18 architecture (Credits- www.researchgate.net)
The following code snippet instantiates a <code>cnn_learner</code>, which takes in the DataLoaders object, a model architecture, and a list of metrics you want to keep track of while training:
<pre><code class="language-r"># Model architecture
learn = cnn_learner(dls = dls, arch = resnet18(), metrics = list(accuracy, error_rate()))</code></pre>
Once you run the above code snippet, you should see the download progress bar in the R console. 
This just means R is downloading the pretrained ResNet-18 network:
<img class="size-full wp-image-22377" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7ab0558ac94e6c16c5_Image-8-Downloading-a-ResNet-18-model.webp" alt="Image 8 - Downloading a ResNet-18 model" width="1646" height="252" /> Image 8 - Downloading a ResNet-18 model
Once done, you can proceed with model training. Let's do that next.
<h3>Training a Model with FastAI in R</h3>
We don't need to train the model for long since it leverages a pretrained architecture. A couple of epochs should be enough. The following code snippet trains the model on our custom dataset for 5 epochs:
<pre><code class="language-r"># Fit
learn %>% fit_one_cycle(n_epoch = 5)</code></pre>
Depending on your hardware, this might take a while, anywhere from seconds to minutes per epoch. You'll see the R console output updated with each epoch, and it will look similar to this after the training finishes:
<img class="size-full wp-image-22379" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7c96aa879a9ff7b26d_Image-9-FastAI-in-R-training-progress.webp" alt="Image 9 - FastAI in R training progress" width="700" height="234" /> Image 9 - FastAI in R training progress
In addition, you'll see your metrics on training and validation sets plotted automatically in the Charts panel of RStudio. Here's what it looks like on our end:
<img class="size-full wp-image-22381" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7d8c9c81ef1fbf6a0f_Image-10-Training-and-validation-metrics-during-training.webp" alt="Image 10 - Training and validation metrics during training" width="996" height="1105" /> Image 10 - Training and validation metrics during training
As you can see, the reported metrics are hit and miss. Both training and validation loss are decreasing, but accuracy and error rates seem static over time. 
The accuracy is reported to be only around 3%, which is far from the truth, as you'll see in the following section.
<h3>Evaluating Model Predictions</h3>
The easiest and also visually most appealing way to evaluate a classification algorithm is through a confusion matrix. It's an NxN matrix (here N = 37 classes) with correct classifications on the top-left to bottom-right diagonal and misclassifications everywhere else. For any single class, the off-diagonal cells in its predicted column are false positives, and the off-diagonal cells in its actual row are false negatives.
The following code snippet calculates the confusion matrix and plots it using the <code>highcharter</code> R package. Install it if you don't have it already (<code>install.packages("highcharter")</code>):
<pre><code class="language-r"># Confusion matrix
library(highcharter)

cm <- learn %>% get_confusion_matrix()

hchart(cm, label = TRUE) %>%
  hc_yAxis(title = list(text = "Actual")) %>%
  hc_xAxis(title = list(text = "Predicted"), labels = list(rotation = -90))</code></pre>
This is the resulting visualization:
<img class="size-full wp-image-22383" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7ee09dd71089990c7d_Image-11-Confusion-matrix.webp" alt="Image 11 - Confusion matrix" width="1717" height="1305" /> Image 11 - Confusion matrix
This one is a bit difficult to look at due to 37 classes, but the main diagonal looks great. There are some misclassifications, but overall nothing we should worry about. After all, we've trained the model on one of the simplest architectures for only 5 epochs!
FastAI in R also has a convenient function that allows you to visualize <i>top losses</i>, or instances your model had the hardest time predicting. 
Here's how to use it:
<pre><code class="language-r"># Top losses
interp = ClassificationInterpretation_from_learner(learn = learn)
interp %>% plot_top_losses(k = 9, figsize = c(15, 11))</code></pre>
And this is what it outputs in our case:
<img class="size-full wp-image-22385" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b00c7f15f212141163a0dc_Image-12-Top-losses-of-our-classification-model.webp" alt="Image 12 - Top losses of our classification model" width="949" height="781" /> Image 12 - Top losses of our classification model
In the example of the first image, the model predicted the class of <code>newfoundland</code> while the actual class was <code>american_pit_bull_terrier</code>.
And that's how easy it is to use FastAI in R to train highly accurate image classification models. Let's make a short recap next.
<hr />
<h2 id="summary">Summing up FastAI in R</h2>
If you're working on a deep learning problem that has to do with tabular data, time series, image classification, NLP, or even GANs, here is some good news for you: you don't have to start from scratch. Libraries like FastAI do all the heavy lifting for you and provide a plethora of convenience functions for training and evaluating deep learning models. The best thing - FastAI is available for both Python and R, so you won't have to change tech stacks in order to use it.
Today you've learned how to train an image classification model with FastAI and transfer learning, and it took us maybe 15 lines of code to train a highly accurate model on a stupidly small dataset. That's the power of libraries such as FastAI, so make sure to leverage it in your next machine learning project.
<i>Would you like to see another example of FastAI in R?</i> Please let us know in the comment section below, or reach out on Twitter - <a href="https://twitter.com/appsilon?lang=en" target="_blank" rel="noopener">@appsilon</a>. We'd love to hear from you. 
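One loose end from the evaluation section: since the accuracy printed during training looked unreliable, you may want to recompute it from the confusion matrix yourself. The snippet below is a rough sketch in base R on a made-up 3x3 matrix, not on the real 37-class results. It assumes rows hold actual classes and columns hold predicted ones, and that your confusion matrix is (or can be converted with <code>as.matrix()</code> into) a plain numeric matrix; we haven't verified the exact return type of <code>get_confusion_matrix()</code> here.
<pre><code class="language-r"># Toy confusion matrix with made-up counts: rows = actual, columns = predicted
cm_toy <- matrix(
  c(18,  1,  1,
     2, 17,  1,
     0,  3, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(actual = c("a", "b", "c"), predicted = c("a", "b", "c"))
)

# Overall accuracy: correctly classified images divided by all images
overall_accuracy <- sum(diag(cm_toy)) / sum(cm_toy)

# Per-class errors: misses along a row are false negatives,
# misses along a column are false positives
false_negatives <- rowSums(cm_toy) - diag(cm_toy)
false_positives <- colSums(cm_toy) - diag(cm_toy)

print(round(overall_accuracy, 3))
print(false_negatives)
print(false_positives)</code></pre>
On the toy matrix, 52 of the 60 images sit on the diagonal, so the overall accuracy comes out at roughly 0.867. Running the same three lines on your own matrix gives you a number you can compare against the metric reported during training.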
<blockquote>FastAI for detecting Solar Panels from Orthophotos? <a href="https://appsilon.com/using-ai-to-detect-solar-panels-part-1/" target="_blank" rel="noopener">We did the research, you do the reading</a>.</blockquote>