Alternatives to Scaling Shiny with RStudio Connect or Custom Architecture
Shiny is a great tool for fast prototyping, but what about scaling? There are several ways to scale Shiny apps. When a data science team creates a Shiny app, sometimes it becomes very popular. From that point, this app becomes a tool used in production by many people. It should be reliable and work fast for many concurrent users. There are many ways to optimize a Shiny app, such as <a href="https://appsilon.com/an-example-of-how-to-use-the-new-r-promises-package/" target="_blank" rel="noopener noreferrer">using promises</a> for non-blocking access, <a href="https://rstudio.github.io/profvis/" target="_blank" rel="noopener noreferrer">profvis</a> for finding bottlenecks, and <a href="https://shiny.rstudio.com/articles/modules.html" target="_blank" rel="noopener noreferrer">shiny modules</a> for well-structured code. Regardless, the main aspect you should focus on is how you serve the application. Continue reading to learn a few alternatives to scaling Shiny apps. <blockquote>Want to track how users use your R Shiny App? <a href="https://appsilon.com/monitoring-r-shiny-user-adoption/">Consider these 3 options for monitoring user adoption</a>.</blockquote> <em><strong>Updated</strong>: Dec 20, 2022.</em> <hr /> <h2>Why is Scaling Shiny tricky?</h2> Scaling R Shiny applications is hard, and there are two main reasons why: <ul><li>R is a single-threaded programming language</li><li>The language itself is just slow - it was designed for data analysts with convenience in mind</li></ul> <h3>R is single-threaded</h3> This means all users connected to one R process will block each other. <u><a href="https://en.wikipedia.org/wiki/Thread_(computing)#Multithreading" target="_blank" rel="noopener noreferrer">Multithreading allows for application responsiveness</a>,</u> and by design, R doesn’t provide it. You have to use workarounds to get it. For example, you can do this by serving multiple instances of the app.
Also, the <u><a href="https://rstudio.github.io/promises/articles/intro.html" target="_blank" rel="noopener noreferrer">promises package</a></u> can be used to improve responsiveness, but it is still fundamentally different from promises in JavaScript. JavaScript promises are different thanks to the event loop mechanism, which features fully asynchronous I/O and worker threads. <h3>R is slow - It was designed for convenient data analysis, not for web apps</h3> The R language was created to make data analysis faster. It proved its power, and that’s why it is the tool of choice for many data scientists. However, a faster analysis doesn’t mean better performance. Data analysis in R is fast because of convenient syntax and the large number of useful statistical packages, but the language execution time itself is slow. A detailed explanation of this is covered by Hadley Wickham in <u><a href="http://adv-r.had.co.nz/Performance.html" target="_blank" rel="noopener noreferrer">his article about R performance</a></u>. There are also many benchmarks of <u><a href="https://www.sas.upenn.edu/~jesusfv/comparison_languages.pdf" target="_blank" rel="noopener noreferrer">the speed of R</a></u>. We are excited by alternative implementations of the R language, which aim to improve performance. One of them is <u><a href="https://github.com/oracle/fastr" target="_blank" rel="noopener noreferrer">Oracle FastR</a></u> based on <u><a href="https://medium.com/graalvm/faster-r-with-fastr-4b8db0e0dceb" target="_blank" rel="noopener noreferrer">GraalVM</a></u>, and another is JVM-based <u><a href="http://www.renjin.org/" target="_blank" rel="noopener noreferrer">Renjin</a></u>. Currently, they are still evolving, but we believe their time will come soon. <h2>Four (or five) Alternatives to Scaling Shiny Apps</h2> Many people ask us what the effective options are for serving a Shiny app, especially for a large number of users.
The industry-standard tools for enterprise Shiny deployment are RStudio products. We recommend these to our clients, as we believe them to be the best option. You can go with either RStudio Shiny Server Pro, which costs $11,950/year for 20 concurrent users, or RStudio Connect for $14,995/year. Both of them are mature, powerful solutions, and RStudio Connect is our first choice due to its easy setup and push-button deployment. Before you make a decision on how to scale your app to multiple users, it is important to understand what your needs are: <ul><li>What is the app initialization time?</li><li>Does it load a lot of data into memory on startup?</li><li>What is the app complexity?</li><li>How heavy are the calculations?</li><li>What is the expected behavior of users?</li></ul> Answering these questions will help you choose the best fit for your use case. Next, let's see what options are available to you. <h3>Shiny Server Open Source</h3> <a href="https://github.com/rstudio/shiny-server" target="_blank" rel="noopener noreferrer">Shiny Server Open Source</a> is limited to one R process per app, which potentially can serve multiple user sessions (connections to the app). This is totally fine for apps that don’t have many users, but it doesn’t work well for apps that will have a large number of users. When you have many users, they all will be served by one process and will inevitably block each other.
This architecture is visualized by the following diagram: <img class="size-full wp-image-13251" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b022899bd6df84c7f6713a_1.webp" alt="Image 1 - Shiny Server Open Source Architecture" width="522" height="326" /> Image 1 - Shiny Server Open Source Architecture <h3>RStudio Connect (now Posit Connect)</h3> <blockquote><strong>Note</strong>: <em>As of January 1, 2021, Posit has <a href="https://docs.posit.co/other/ssp/?utm_medium=referral&utm_source=appsilon&utm_campaign=article" target="_blank" rel="noopener">discontinued the sale of Shiny Server Pro</a>. Previous versions of this post and others on our blog reference Shiny Server Pro. The equivalent, and significantly improved, product is Posit Connect (formerly RStudio Connect).</em></blockquote> In contrast, with <a href="https://posit.co/products/enterprise/connect/?utm_medium=referral&utm_source=appsilon&utm_campaign=article" target="_blank" rel="noopener noreferrer">Posit Connect</a> you can have multiple R processes per app. This means that many concurrent users can be distributed between separate processes and are served more efficiently. As there is no limitation on the number of processes, you can make use of all your machine resources. With this paid solution from Posit (formerly RStudio), you can configure a strategy on how resources should be handled with the "utilization_scheduler" parameter. For example, you can set: <ul><li>The maximum R process capacity, i.e. the number of concurrent users per single R process;</li><li>The maximum number of R processes per single app;</li><li>When the server should spawn a new R process, e.g. when existing processes reach 90% of their capacity.</li></ul> Below you can see an example situation. In this scenario, five users want to access the app. The server initializes three worker processes, and the users are distributed among them.
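The interaction of these three settings can be traced with a toy model. The Python sketch below is not Posit Connect's actual scheduling algorithm: the class name <code>ToyScheduler</code>, its field names, and the "send each new user to the least-loaded process" rule are all assumptions made for illustration. It only shows how a per-process capacity, a process limit, and a spawn threshold might play out for the five users in the scenario above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Simplified, hypothetical model of a utilization-based scheduler."""
    max_conns_per_proc: int  # concurrent users a single R process may serve
    max_procs: int           # upper bound on R processes for one app
    load_factor: float       # spawn a new process at this overall utilization
    procs: list = field(default_factory=list)  # current users per process

    def connect(self) -> int:
        """Assign one new user and return the index of the serving process."""
        capacity = len(self.procs) * self.max_conns_per_proc
        used = sum(self.procs)
        # Spawn when nothing is running yet, or when utilization has reached
        # the threshold and the process limit allows another worker.
        needs_spawn = capacity == 0 or used / capacity >= self.load_factor
        if needs_spawn and len(self.procs) < self.max_procs:
            self.procs.append(0)
        # Assumption: route the user to the least-loaded process.
        idx = min(range(len(self.procs)), key=lambda i: self.procs[i])
        if self.procs[idx] >= self.max_conns_per_proc:
            raise RuntimeError("all processes are at capacity")
        self.procs[idx] += 1
        return idx


scheduler = ToyScheduler(max_conns_per_proc=2, max_procs=3, load_factor=0.9)
for _ in range(5):
    scheduler.connect()
print(scheduler.procs)  # users per process after five connections
```

With a capacity of two users per process, a limit of three processes, and a 90% threshold, the five users in this toy model end up spread over three processes, which mirrors the scenario described above.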
Posit Connect is a go-to solution, as it offers, at the click of a button, app management, authorization, scheduling, distribution, and security options that are unavailable anywhere else. <img class="size-full wp-image-13253" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0228a1fa967cc088a1c4c_2.webp" alt="Image 2 - RStudio Connect Architecture" width="600" height="336" /> Image 2 - Posit (RStudio) Connect Architecture <h3>ShinyProxy</h3> ShinyProxy from Open Analytics is an open-source alternative for serving Shiny apps. Its architecture is based on Docker containers, which isolate the app’s environment. The key difference is that ShinyProxy starts a new app instance for each new user - a big drain on memory and a potential issue for scaling. The architecture is straightforward but requires additional support to maintain, which may prove costly down the line. This is how it looks on a diagram: <img class="size-full wp-image-13255" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0228b3eb2cd82480a6ab0_3.webp" alt="Image 3 - ShinyProxy Architecture" width="889" height="380" /> Image 3 - ShinyProxy Architecture <h3>Custom architectures</h3> Sometimes deploying a Shiny app requires a custom solution. Let’s say we have a Shiny app that takes 60 seconds to initialize, and that is deemed an unacceptable wait time for a user. In this situation, starting an app instance for each user is not the way to go. At Appsilon, we create custom architectures for Shiny apps. In most cases, we recommend RStudio Connect because it is often the right tool for a given task. However, there are exceptions that require bespoke solutions. We developed such a solution for one of our clients and have used it to deploy several production apps. To do this, we created a scalable architecture for a Shiny app that takes a long time to initialize and performs a lot of heavy computations.
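To make the 60-second example concrete, here is a rough Python sketch comparing the wait a new user sees when an instance is started on demand with the wait when instances were started ahead of time. The function names, the <code>instance-N</code> labels, and the round-robin hand-out are hypothetical and chosen only for illustration; this is not the production implementation.

```python
from itertools import cycle

INIT_SECONDS = 60  # start-up time taken from the example above


def on_demand(n_users):
    """One fresh instance per user: everyone waits for the full start-up."""
    return [(f"instance-{i}", INIT_SECONDS) for i in range(n_users)]


def prewarmed(n_users, pool_size):
    """Instances were started ahead of time; users are handed out round-robin."""
    pool = cycle(f"instance-{i}" for i in range(pool_size))
    return [(next(pool), 0) for _ in range(n_users)]


# Each tuple is (instance serving the user, seconds the user waits).
print(on_demand(3))
print(prewarmed(5, pool_size=3))
```

In this simplified model, the initialization cost is paid once per instance before any user arrives, so it no longer shows up in the user's wait time.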
This approach uses Docker containers that serve the application using Shiny Server Open Source. In front of the application instances, there is a load balancer that distributes the traffic. The difference between our product and ShinyProxy is that there are N pre-initialized containers that wait for user connections. The number of containers is configured by the app admin and can be auto-adjusted. The advantage of this approach is that the app is served instantly. Users do not have to wait for the app initialization. It is also worth noting that this custom architecture supports SSL connections, authentication, and other enterprise requirements. <hr /> <h2>Cheat Sheet: Alternatives to scaling Shiny apps</h2> We hope this article on the possible solutions for scaling Shiny apps to many concurrent users was helpful to you. If you’d like to chat about a specific use case, you can contact us through our <a href="https://appsilon.com/shiny/" target="_blank" rel="noopener noreferrer">Shiny page</a>. Below, we have compiled the information shared above into a comparison cheat sheet diagram: <img class="size-full wp-image-13257" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0228dc497a144bd9e102e_4.webp" alt="Image 4 - Possible Architectures for Scaling Shiny Apps" width="1024" height="714" /> Image 4 - Possible Architectures for Scaling Shiny Apps; ShinyServer Pro discontinued; RStudio Connect is now Posit Connect We hope the diagram helps, but you can always reach out to Appsilon if you or your team need additional instructions. <blockquote>Want to go the extra mile? <a href="https://appsilon.com/how-to-deploy-rstudio-connect-into-local-kubernetes-cluster/">Here's how to deploy RStudio Connect into a Kubernetes Cluster</a>.</blockquote>