Super Solutions for Shiny Apps #5: Automated Tests

October 22, 2019

<h2>TL;DR</h2> This post describes best practices for setting up an automated test architecture for Shiny apps. Automate and test early and often with unit tests, user interface tests, and performance tests.
<h2>Best Practices for Testing Your Shiny App</h2> Even your best apps will break down at some point during development or during User Acceptance Tests. I can bet on this. It’s especially true when developing big, enterprise applications with the support of various team members and under the pressure of a client’s deadlines. It's best to find those bugs early. Automated testing can assure your product quality. Investing time and effort in automated tests brings a huge return. It may seem like a burden at the beginning, but imagine the alternative: fixing the same misbehavior of the app for the third time, e.g. when a certain button is clicked. What is worse, bugs are sometimes spotted only after changes are merged to the master branch. And then you have no idea which code change opened the door to the bug, as no one has checked that particular functionality for a month or so. Manual testing is a solution to some extent, but I can confidently assume that you would rather spend testing time on improving user experience than on looking for a missing comma in the code. How do we approach testing at Appsilon? We aim to organize our test structure according to the “pyramid” best practice:
<img class="aligncenter wp-image-3011 size-medium" src="https://wordpress.appsilon.com/wp-content/uploads/2019/10/testing-pyramide-600x489.png" alt="testing pyramide" width="600" height="489" />
FYI there is also an anti-pattern called the “test-cone”. Even such a test architecture I would consider a good sign: after all, the app is (automatically) tested, which is unfortunately often not the case. Nevertheless, switching to the “pyramid” makes your tests more reliable, more effective, and less time-consuming.
<img class="aligncenter wp-image-3012 size-medium" src="https://wordpress.appsilon.com/wp-content/uploads/2019/10/anti-pattern-cone-405x500.png" alt="anti pattern test cone" width="405" height="500" />
No matter how extensively you are testing or planning to test your app, take this piece of advice: set up your working environment so that automated tests are triggered before any pull request is merged (check tools like <a href="https://github.com/circleci" target="_blank" rel="noopener noreferrer"><i>CircleCI</i></a> for this). Otherwise, you will soon come to hate finding bugs that developers explain with: “Aaaa, yeah, it’s on me, I haven’t run the tests, but I thought that the change was so small and not related to anything crucial!” (I assume it goes without saying that no changes go into the ‘master’ or ‘development’ branches without a proper <a href="https://en.wikipedia.org/wiki/Distributed_version_control#Pull_requests" target="_blank" rel="noopener noreferrer">Pull Request</a> procedure and review.) Let’s now describe the different types of tests in detail:
<h2><b>Unit Tests</b></h2> … are the simplest to implement and the most low-level kind of tests. The term refers to testing the behavior of individual functions by comparing their actual output with the expected output. One small unit of code is tested at a time, case by case, hence the name. Implementing them will allow you to recognize all edge cases and understand the logic of your function better. Believe me, you will be surprised what your function can return when given unexpected input. This idea is pushed to its limits by the so-called <a href="https://technologyconversations.com/2013/12/20/test-driven-development-tdd-example-walkthrough/" target="_blank" rel="noopener noreferrer">Test-Driven Development (TDD)</a> approach. No matter if you're a fan or <a href="https://itnext.io/test-driven-development-is-dumb-fight-me-a38b3033280c" target="_blank" rel="noopener noreferrer">rather skeptical</a>, at the end of the day you should have good unit tests implemented for your functions.
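To make the point about edge cases concrete, here is a minimal sketch using the <i>testthat</i> package, which is introduced in the next paragraph. The helper <i>percent_of()</i> is hypothetical and invented purely for this illustration; its behavior for a zero total and for non-numeric input is an assumption about what such a function should do, not something taken from a real app.

```r
library(testthat)

# Hypothetical helper (not from any real app): share of `part` in `total`, in percent.
percent_of <- function(part, total) {
  if (!is.numeric(part) || !is.numeric(total)) {
    stop("`part` and `total` must be numeric")
  }
  if (total == 0) {
    # Assumed design choice: report a missing value instead of Inf or NaN.
    return(NA_real_)
  }
  100 * part / total
}

test_that("percent_of() handles typical input", {
  expect_equal(percent_of(1, 4), 25)
})

test_that("percent_of() handles edge cases", {
  # A zero total should give NA rather than Inf or NaN.
  expect_true(is.na(percent_of(1, 0)))
  # Non-numeric input should fail loudly instead of returning garbage.
  expect_error(percent_of("1", 4), "must be numeric")
})
```

Writing the second test first is often what reveals that the function needs the zero-total and type checks at all.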
How do you achieve this in practice? The popular and well-known package <a href="https://testthat.r-lib.org/" target="_blank" rel="noopener noreferrer"><i>testthat</i></a> should be your weapon of choice. Add a <i>tests</i> folder to your source code. Inside it, add another folder, <i>testthat</i>, and a script, <i>testthat.R</i>. The script’s only job is to trigger all of the tests stored in the <i>testthat</i> folder, where you define the scripts for your tests (one script per functionality or single function; names should start with “test_” followed by a name that reflects the functionality, or even just the name of the function). Start such a test script with <i>context()</i> and pass it a short text that will help you understand what the included tests are about (in the third edition of <i>testthat</i>, <i>context()</i> is deprecated and the file name plays this role). Now you can start writing your tests, one by one. Every test is wrapped in the <i>test_that()</i> function, which takes a text description of what exactly is tested, followed by the test itself: commonly just a call to the function with a set of parameters and a comparison of the result with the expected output, e.g. <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"> test_that("sum() adds two numbers", { <br>   result <- sum(2, 2) <br>   expect_equal(result, 4) <br> }) </code></pre> </figure> Continue adding tests for each function and scripts for all functions. Once they are ready, set up the main <i>testthat.R</i> script. There you can use <i>test_check("yourPackageName")</i> for apps structured as packages, or the more general <i>test_results <- test_dir("tests/testthat", reporter = "summary", stop_on_failure = TRUE)</i>.
<h2><b>User Interface (UI) Tests</b></h2> The core of these tests is to compare the actual app behavior with what is expected to be displayed after various user actions. Usually, this is done by comparing screen snapshots with reference images. The crucial part, though, is to set up an architecture that automatically performs human-user-like actions and takes snapshots. Why are User Interface (UI) tests needed? 
It is common in an app development project that all of the functions work fine, yet the app still crashes. It might be, for example, due to JS code that used to do the job but suddenly stopped working because the object it is looking for now appears on the screen with a slight delay compared to before. Or a modal ID has been changed and clicking the button no longer triggers anything. The point is this: with all of the JS, CSS, and browser dependencies, Shiny apps are much more than R code, and at the end of the day what is truly important is whether the users get the expected, bug-free experience. The great folks from RStudio figured out a way to aid developers in taking snapshots. Check <a href="https://blog.rstudio.com/2018/10/18/shinytest-automated-testing-for-shiny-apps/" target="_blank" rel="noopener noreferrer">this article</a> to get more information on the <i>shinytest</i> package. It basically allows you to record actions in the app and select when the snapshots that are checked during tests should be created. Importantly, <i>shinytest</i> saves the snapshots as JSON files that describe the content. This avoids the usual problem with image comparison, where small differences in colors or fonts across browsers are flagged as errors. An image is also generated to make it easy for the human eye to check whether everything is OK. The <a href="https://cran.r-project.org/web/packages/RSelenium/vignettes/basics.html" target="_blank" rel="noopener noreferrer"><i>RSelenium</i> package</a> is also worth mentioning. It connects R with the Selenium WebDriver API for automating web browsers. It is harder to configure than <i>shinytest</i>, but it does the job. As <i>shinytest</i> is quite a new solution, at Appsilon we had already developed our own internal architecture for tests. 
The solution is based on <a href="https://github.com/GoogleChrome/puppeteer" target="_blank" rel="noopener noreferrer">puppeteer</a> and <a href="https://github.com/garris/BackstopJS" target="_blank" rel="noopener noreferrer">BackstopJS</a>. The test scenarios are written in JavaScript, so it is quite easy to produce them. Plus, <i>BackstopJS</i> has very nice-looking reports. I guess the best strategy would be to start with <i>shinytest</i> and, if there are problems with using it, switch to a more general solution for web applications.
<h2><b>Performance Tests</b></h2> Yes, Shiny applications can scale. They just need the appropriate architecture. Check our <a href="https://appsilon.com/how-we-built-a-shiny-app-for-700-users" target="_blank" rel="noopener noreferrer">case study</a> and <a href="https://appsilon.com/alternatives-to-scaling-shiny" target="_blank" rel="noopener noreferrer">architecture description</a> blog posts to learn how we build large-scale apps. As a general rule, you should always check how your app performs under extreme usage conditions. The source code should be <a href="https://support.rstudio.com/hc/en-us/articles/218221837-Profiling-with-RStudio">profiled</a> and optimized. Heavy usage of the application can be tested with RStudio’s recent package <a href="https://rstudio.github.io/shinyloadtest/" target="_blank" rel="noopener noreferrer"><i>shinyloadtest</i></a>. It will help you estimate how many users your application can support and where the bottlenecks are located. This is achieved by recording a “typical” user session and then replaying it in parallel on a huge scale. So, please test. Test automatically, early, and often. 
<img class="aligncenter size-full wp-image-3009" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b022bd019394b55cce7912_bugs-more.gif" alt="giant alien bugs from the Starship Troopers film" width="450" height="253" /> <p style="text-align: center;">Smash down all the bugs before they become big, strong, and dangerous insects!</p>
