CRAN and the Isoband Incident - Is Your Project at Risk and How to Fix It
The R community had a recent scare with the <a href="https://twitter.com/cjvanlissa/status/1577552826561171457" target="_blank" rel="nofollow noopener">isoband package risking archival on CRAN</a>. The reason why this incident made waves is that isoband is a ggplot2 dependency and when a package gets removed from CRAN all other packages that depend on it get removed as well (<a href="https://cran.r-project.org/web/packages/policies.html" target="_blank" rel="nofollow noopener">see CRAN policy</a>). If isoband fell, ggplot2 would be at risk. And this would cascade with the removal of even more packages. In total, the removal of isoband would lead to the removal of 4747 packages. <img class="size-full wp-image-16083 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01ceadb456a59ff5b9cef_Caspar-van-Lissa-social-post-CRAN-ggplot2-dependency.webp" alt="Caspar van Lissa social post CRAN ggplot2 dependency and the isoband incident" width="595" height="396" /> But the <b>isoband issue appears to be resolved by maintainers of the package</b> (see relevant <a href="https://github.com/wilkelab/isoband/issues/33#issuecomment-1270766150" target="_blank" rel="nofollow noopener">issue</a>) and a newer version is available to download (<a href="https://cran.r-project.org/web/checks/check_results_isoband.html" target="_blank" rel="nofollow noopener">CRAN’s checks</a> don’t show any errors). This isn't the end of the story. It could happen again, but there is a solution to mitigate risks - RStudio Package Manager. 
Table of Contents: <ul><li><a href="#why">Why was isoband set to be archived?</a></li><li><a href="#dependency">R packages and their dependencies</a></li><li><a href="#rspm">How RStudio Package Manager helps mitigate risk</a></li><li><a href="#conclusion">Conclusion</a></li></ul> <hr /> <h3 id="why">Why was isoband set to be archived?</h3> The main issue was a missing <code>std::</code> qualifier in the C++ header code that testthat ships for the Catch unit testing framework (<a href="https://github.com/r-lib/testthat/issues/1687" target="_blank" rel="nofollow noopener">#1687</a>, <a href="https://github.com/r-lib/testthat/issues/1694" target="_blank" rel="nofollow noopener">#1694</a>). That code is copied into isoband (<a href="https://github.com/tidyverse/ggplot2/issues/5006#issuecomment-1268034828" target="_blank" rel="nofollow noopener">source</a>), so isoband inherited the problem. If you’re curious, you can check whether a package you use depends on isoband with: <code>pak::pkg_deps_explain("package_name", "isoband")</code> <img class="size-full wp-image-16087 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01cebd29d4bcb968564a9_package-dependency-pkg_deps_explain-output.webp" alt="isoband package dependency - pkg_deps_explain output for ggplot2 and rhino" width="500" height="114" /> <h2 id="dependency">R packages and their dependencies</h2> Developers often build their solutions on top of other packages. There is no need to reinvent the wheel for most functionality, and reusing existing work speeds up software development. <blockquote>Is your Shiny app slow? <a href="https://appsilon.com/scaling-and-infrastructure-why-is-my-shiny-app-slow/" target="_blank" rel="noopener">Learn how to leverage your front end, extract computations, and use databases</a>.</blockquote> In the R world, we can download packages from CRAN (Comprehensive R Archive Network), a central software repository full of useful and ready-to-use libraries.
This rich environment of packages makes it easy to quickly develop projects with everything from machine learning to statistics and visualizations. <h3>Why is the Isoband Incident important for R developers?</h3> The isoband incident highlights the risk of depending on public infrastructure that you don’t control. Packages build on one another, so in such an interconnected ecosystem the archival of one package can trigger mass archiving. As it turned out, just one package posed a huge threat to the R ecosystem. <blockquote>Concerned with security? <a href="https://appsilon.com/why-use-rstudio-connect-authentication/" target="_blank" rel="noopener">Set up RStudio Connect authentication and protect your Shiny applications</a>.</blockquote> All packages that depend on isoband, directly or indirectly, would in theory be archived on CRAN as well (around 4,700 packages, or ~25% of all CRAN packages), because they would start failing CRAN’s automated checks. Among them is one of the most popular packages: ggplot2. Imagine your team being unable to install ggplot2, or unable to deploy dashboards that require it. <h3>The risks of public infrastructure</h3> Being dependent on other packages comes with risks. One of these risks bubbled to the surface when package developers received an email announcing that isoband (and with it the 4747 packages mentioned above) was scheduled for archival because of unresolved CRAN check issues. You can see these issues in the <a href="https://web.archive.org/web/20221005110020/https://cran.r-project.org/web/checks/check_results_isoband.html" target="_blank" rel="nofollow noopener">archived check result summary</a>.
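To get a feel for how such a cascade can be measured, here is a minimal sketch that lists every package depending on a given package, directly or indirectly, using <code>tools::package_dependencies()</code> from base R. The package database below is a toy one built by hand so the example runs offline; the packages <code>plotA</code> and <code>standalone</code> and the helper <code>reverse_deps()</code> are hypothetical names introduced only for illustration.

```r
# Toy package database in the same shape as utils::available.packages().
# "plotA" and "standalone" are hypothetical packages used for illustration.
db <- cbind(
  Package   = c("isoband", "ggplot2", "plotA", "standalone"),
  Depends   = c(NA, NA, "ggplot2", NA),
  Imports   = c(NA, "isoband", NA, NA),
  LinkingTo = c(NA, NA, NA, NA)
)

# Hypothetical helper: packages that depend on `pkg`, directly or indirectly.
reverse_deps <- function(pkg, db) {
  tools::package_dependencies(
    pkg,
    db        = db,
    which     = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE,  # follow the dependency chain all the way
    reverse   = TRUE   # ask "who depends on pkg?" rather than "what does pkg need?"
  )[[pkg]]
}

affected <- reverse_deps("isoband", db)
print(sort(affected))
print(length(affected))
```

Against a real repository you would pass <code>db = utils::available.packages(repos = "https://cloud.r-project.org")</code> instead of the toy matrix. The resulting count depends on the day you run it and on which dependency types you include, so it will not necessarily match the figures quoted in this article.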
There are, however, <b>other risks</b> when relying on public infrastructure: <ol><li style="font-weight: 400;" aria-level="1">What would happen if CRAN were down and you weren’t able to download packages?</li><li style="font-weight: 400;" aria-level="1">What if CRAN decides to delete a package that your existing code relies on?</li><li style="font-weight: 400;" aria-level="1">What if someone publishes a new malicious version of a package that you might accidentally download? (A situation like this happened in the Ruby community.)</li></ol> But it’s not all doom and gloom. There are steps you can take and solutions you can implement to ensure your project remains safe. <h2 id="rspm">How RStudio Package Manager helps mitigate risks of public infrastructure</h2> Fortunately, there is RStudio Package Manager - a product that you can use to take control of your package infrastructure. <h3>Feel confident even when CRAN is down</h3> <a href="https://docs.rstudio.com/rspm/admin/getting-started/configuration/#quickstart-cran" target="_blank" rel="nofollow noopener">RStudio Package Manager allows you to host your own repository with CRAN packages</a>. Therefore, if CRAN were to go down, you would still have your own working mirror. This means your team can continue working without worrying about the public infrastructure. Even with connectivity issues or network restrictions, <a href="https://support.rstudio.com/hc/en-us/articles/360009982293-Does-RStudio-Package-Manager-require-internet-access-#:~:text=RStudio%20Package%20Manager%20can%20be,access%20to%20RStudio%20Package%20Manager." target="_blank" rel="nofollow noopener">R clients using the Package Manager do not need internet access</a>, just access to the Package Manager. <h3>Stop relying on policies that are outside of your control</h3> CRAN checks can lead to packages getting removed from CRAN. This might lead to uncomfortable surprises at unexpected moments.
<a href="https://docs.rstudio.com/rspm/admin/appendix/source-details/#cran-snapshot-source" target="_blank" rel="nofollow noopener">RStudio Package Manager allows you to host your own CRAN snapshots</a> - which means you can have a copy of CRAN from a specific date. If a package gets removed tomorrow, you can use a CRAN snapshot from a time when that package was still available. <img class="size-full wp-image-16089 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01cec04abfeee1f03e8d7_rstudio-package-manager-repository-data-selection-lock-package-data.webp" alt="rstudio package manager repository data selection view with option to lock package data" width="1230" height="569" /> This freeze mechanism lets you mitigate the effects of something like the isoband incident: packages that have since been archived on CRAN remain downloadable from your centralized solution (RSPM). <h3>Stay secure and compliant by using curated CRAN sources</h3> There have been instances where a malicious actor took over an open-source dependency and published a new version containing malicious code. You might also have compliance constraints that restrict packages with specific licenses. After all, you have no control over whether a package maintainer decides to change their package’s license, say from MIT to AGPL. <blockquote>Moving to the cloud? <a href="https://appsilon.com/deploying-rstudio-workbench-to-aws-using-terraform/" target="_blank" rel="noopener">Learn how to deploy RStudio Workbench to AWS using Terraform</a>.</blockquote> <a href="https://docs.rstudio.com/rspm/admin/appendix/source-details/#curated-cran-source" target="_blank" rel="nofollow noopener">RStudio Package Manager allows you to host curated CRAN sources</a> where administrators can create and update approved subsets of CRAN packages. That way you can make sure that only secure and legally compliant packages are available to your team.
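On the client side, using a snapshot or a curated source comes down to pointing R at a different repository URL. The sketch below shows one way this might look. It assumes the server exposes dated snapshots as <code>&lt;base&gt;/&lt;repo&gt;/&lt;YYYY-MM-DD&gt;</code>; the host <code>https://rspm.example.com</code>, the repository name <code>cran</code>, and the helper <code>snapshot_url()</code> are all hypothetical, so check your own Package Manager instance for the exact URL it offers.

```r
# Hypothetical helper: build the URL of a dated snapshot.
# ASSUMPTION: snapshots are served as <base>/<repo>/<YYYY-MM-DD>;
# a self-hosted instance may use a different layout.
snapshot_url <- function(base_url, date, repo = "cran") {
  date <- as.Date(date)
  stopifnot(!is.na(date))
  base_url <- sub("/+$", "", base_url)  # drop any trailing slash
  paste(base_url, repo, format(date, "%Y-%m-%d"), sep = "/")
}

# "https://rspm.example.com" is a placeholder for an internal server.
url <- snapshot_url("https://rspm.example.com/", "2022-10-01")

# Point this R session at the frozen snapshot instead of a live CRAN mirror.
options(repos = c(CRAN = url))
print(getOption("repos"))

# From here on, install.packages("ggplot2") would resolve against the
# snapshot, so packages archived on CRAN after that date stay installable.
```

Setting the same option in a project-level <code>.Rprofile</code> is one way to keep a whole team on the same snapshot.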
<h2 id="conclusion">Summing up the Isoband Incident, risks with CRAN, and RSPM</h2> Using public infrastructure that hosts open-source packages comes with risks. The package repository might go down. Malicious updates to packages may occur. Or packages may be removed altogether. However, all of those risks are manageable with the right tooling. That's why we recommend <b>RStudio Package Manager</b>. Take advantage of all the benefits that <b>open source</b> provides <b>without sacrificing reliability, security, and compliance</b>. <blockquote>Not sure if RStudio Connect is for you? <a href="https://appsilon.com/rstudio-connect-as-a-solution-for-remote-data-science-teams/" target="_blank" rel="noopener">See why remote Data Science Teams should be using Connect</a>.</blockquote> And in case you missed it above, <b>yes, the isoband issue seems to be resolved</b> by the maintainers (see relevant <a href="https://github.com/wilkelab/isoband/issues/33#issuecomment-1270766150" target="_blank" rel="noopener">issue</a>) and a newer version is available to download. They responded quickly and saved the community a lot of potential trouble and headaches. <img class="wp-image-16085 size-full" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01cede9613f3f72cb0e72_modern-digital-infrastructure-source_xkcd-comics.webp" alt="modern digital infrastructure-source_xkcd comics" width="385" height="489" /> Image credit: <a href="https://xkcd.com/2347/" target="_blank" rel="nofollow noopener">xkcd comics</a> As open source contributors ourselves, we know the R community wouldn't be where it is without the "random person in Nebraska", but it's a big world with lots of room for mistakes. Don't rely on the actions of a few for the security of your projects. Use the tools available to you from RStudio and secure your project(s) today.
If you're not sure where to begin, <a href="http://appsilon.com/rstudio-certified-partner/#contact" target="_blank" rel="noopener">reach out to us</a>. <a href="https://appsilon.com/rstudio-certified-partner/" target="_blank" rel="noopener">Appsilon is an RStudio Certified Partner</a>. We can help with end-to-end service, from installation and configuration to training, support, and maintenance of the RStudio (Posit) Team Suite. We can help you implement best practices and open-source solutions for RStudio (Posit) products, and make it all work in your unique business case. <hr /> This article was co-written by Appsilon R Shiny Developer <a href="https://appsilon.com/author/ryszard/" target="_blank" rel="noopener">Ryszard Szymański</a> and Infrastructure Engineer Arkadiusz Kalandyk.