
Exploring R Package Validation in Life Sciences: Appsilon’s Collaboration with the R Validation Hub


There is a growing momentum towards the adoption of R for end-to-end clinical trial reporting. One of the main challenges of this adoption is ensuring the validation of the packages used in the clinical reporting pipeline, along with the generation of their corresponding validation documentation.

In this article, we will explore the strategies employed for risk assessment, the development of innovative tools such as the {riskassessment} app, and the broader implications of these endeavours for the Life Sciences.

TL;DR

  • There’s a growing use of R for clinical trial reporting, with a focus on package validation and documentation.
  • R Validation Hub is actively developing an online repository for R package validation in pharmaceutical regulatory settings.
  • A Validation Framework for R packages is being created by two main groups: R Validation Hub and Phuse, focusing on different aspects of package validation.
  • The risk assessment of R packages is essential in the life sciences industry to prevent code errors that could lead to incorrect conclusions in clinical trials.
  • The {riskmetric} R package is used for assessing R packages, and R Validation Hub has released the {riskassessment} app to automate this process.
  • The future of R package risk assessment looks promising, with pharmaceutical companies developing algorithms for package installation based on risk assessments.


The R Validation Hub

The R Validation Hub is leading the development of an online repository for R package validation in accordance with regulatory requirements. Its mission is “to support the adoption of R within a biopharmaceutical regulatory setting”.

This group was started by the PSI AIMS Special Interest Group and grew rapidly following the inaugural R/Pharma Conference, as well as the R/Medicine conference, where members of the R Validation Hub participate actively.

Building a Strong Foundation for R Package Validation

Two main groups are driving the creation of a Validation Framework. On one hand, the R Validation Hub focuses on assessing and managing risk for public R packages, specifically contributed packages on the Comprehensive R Archive Network (CRAN). On the other hand, the Phuse R Package Validation Framework targets the validation of R packages that are still being developed.

Risk assessment

The number of open-source R packages tailored for the Life Sciences industry is increasing rapidly. This includes packages for handling clinical trial data and packages that support the visualization and exploration of omics data. All of these packages need to be risk-assessed to ensure their reliability and accuracy.

The term risk in this setting refers to the possibility of having errors in the code that could generate inaccurate calculations, which would eventually lead to the wrong conclusions, for example, in assessing the safety and efficacy of a new drug.

riskmetric

One of the approaches from the R Validation Hub is to use the {riskmetric} R package to risk-assess contributed R packages.

Flowchart Illustrating the Risk Assessment Process for New R Packages in Clinical Trials Reporting

The assessment criteria of the {riskmetric} package include:

  1. Unit testing metrics – includes unit test coverage and composite coverage of dependencies.
  2. Documentation metrics – availability of vignettes, news tracking, example(s), and return object description for exported functions.
  3. Community engagement – number of downloads, availability of the code in a public repository, formal bug tracking and user interaction.
  4. Maintainability and reuse – number of active contributors, author/maintainer contacts, and type of license.

riskassessment

To augment the utility of this package, the R Validation Hub has released the {riskassessment} app. Together, these contributions aim to provide a strong foundation for validation within highly regulated industries, such as the pharmaceutical industry.

This application was recognized as the “Best App” at ShinyConf 2023. A demo version of the application is also available.

How does the app perform risk assessment?

Once the packages have been uploaded for risk assessment, various metrics are used. In this example, we will risk-assess the {tidyverse} package.

Maintenance Metrics

Maintenance Metrics Overview for R Package Evaluation

Community Usage Metrics

Community Usage Metrics for an R Package with Download Trends

The organization using the application is responsible for adapting the review process according to their requirements. Once the review process is completed, a final report is automatically generated and downloaded.

The ultimate goal of this application is to automate the risk assessment process. The final assessment results in a value that ranges from 0 (no risk) to 1 (highest risk). Each organization using the application is responsible for setting its own risk threshold; a common threshold is currently being discussed among multiple pharmaceutical companies.

In this example, the risk assessment of the {tidyverse} package resulted in a value of 0.2, which generally indicates a low risk for this package.
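To make the scoring idea concrete, here is a minimal sketch of how per-metric scores could be combined into a single risk value between 0 and 1 and then compared against an organization's threshold. It is written in Python purely as a neutral illustration and is not the {riskmetric} or {riskassessment} API. The aggregation rule (one minus a weighted mean of metric scores), the metric names, the example values, and the threshold cut-offs are all assumptions made for this example.

```python
def risk_score(metric_scores, weights=None):
    """Combine per-metric scores into one risk value in [0, 1].

    Assumption: every metric is scored in [0, 1], where 1 is the best outcome,
    and the overall risk is 1 minus the weighted mean of those scores.
    """
    if not metric_scores:
        raise ValueError("at least one metric score is required")
    for name, score in metric_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")
    if weights is None:
        # Equal weighting when the organization has not defined its own weights.
        weights = {name: 1.0 for name in metric_scores}
    total_weight = sum(weights[name] for name in metric_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_mean = sum(
        metric_scores[name] * weights[name] for name in metric_scores
    ) / total_weight
    return 1.0 - weighted_mean


def classify(risk, low_max=0.33, high_min=0.66):
    """Map a risk value to a label using hypothetical, organization-defined cut-offs."""
    if risk <= low_max:
        return "low"
    if risk >= high_min:
        return "high"
    return "medium"


# Made-up scores for the four criteria groups described in this article.
example_scores = {
    "unit_testing": 0.7,
    "documentation": 0.9,
    "community": 0.8,
    "maintenance": 0.8,
}
risk = risk_score(example_scores)
print(round(risk, 2), classify(risk))
```

Keeping the weights and cut-offs as parameters reflects the point made above: the tool produces a number, and each organization decides how to weight the metrics and where to draw its own line.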

Appsilon’s contribution to the {riskassessment} application

We understand the need for an automated application that risk assesses R packages utilized in clinical trials. We decided to join the R Validation Hub in this specific initiative and contribute as much as we can to the development of this application.

During our collaboration, our R/Shiny developers joined the existing development team, and we shared our ideas on how the application could be improved.

During this iteration, we focused our efforts on reviewing the code, the database schema, the CI/CD integrations, and the reactivity of the application. We also took on a feature task: a new card and a plot of the download trend.

We added a card to the community usage metrics that shows the 12-month download trend, and we added a linear trend line to the download plot, which covers a period of up to 24 months.

Yearly Download Trend Analysis

Monthly Download Statistics and Trend Line for the ‘stringr’ R Package
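The linear trend on the download plot can be illustrated with an ordinary least-squares fit over the most recent months. The sketch below is in Python for illustration only and does not reproduce the app's actual code. The function names, the use of the month index (0, 1, 2, ...) as the x-axis, and the sample download counts are assumptions.

```python
def linear_trend(counts):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(counts)
    if n < 2:
        raise ValueError("at least two monthly values are needed to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(counts))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def trend_for_plot(monthly_downloads, max_months=24):
    """Return slope, intercept and fitted values for the last `max_months` months."""
    window = list(monthly_downloads)[-max_months:]
    slope, intercept = linear_trend(window)
    fitted = [intercept + slope * x for x in range(len(window))]
    return slope, intercept, fitted


# Made-up monthly download counts, oldest first.
downloads = [100, 110, 120, 130]
slope, intercept, fitted = trend_for_plot(downloads)
print(slope, intercept)
```

A positive slope means downloads are growing over the window, and a negative slope means they are declining. Limiting the window keeps the trend focused on recent usage rather than the package's whole history.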

Future perspectives on R package risk assessment

Given the recent developments related to R package validation and risk assessment, the possibility of having an automated way to analyze the risk associated with a given package seems reachable in the near future.

Various pharmaceutical companies are starting to adopt the R Validation Hub packages and are developing their own algorithms for installing packages from CRAN into their computing environments. The algorithms developed take into account values given by the {riskmetric} package and test coverage. These types of validation algorithms seem to be robust and work efficiently.
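As a rough illustration of what such an installation rule might look like, the following Python sketch combines a risk value with unit test coverage to decide whether a package can be installed automatically. It is not the algorithm of any specific company. The class, the function, the decision labels, and the threshold values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageAssessment:
    name: str
    risk: float      # 0 = no risk, 1 = highest risk
    coverage: float  # unit test coverage as a fraction between 0 and 1


def install_decision(pkg, max_risk=0.3, min_coverage=0.8):
    """Decide what to do with a package, using hypothetical thresholds.

    - both criteria met: install automatically
    - exactly one criterion met: route to a manual review
    - neither criterion met: reject
    """
    risk_ok = pkg.risk <= max_risk
    coverage_ok = pkg.coverage >= min_coverage
    if risk_ok and coverage_ok:
        return "install"
    if risk_ok or coverage_ok:
        return "manual review"
    return "reject"


# Made-up assessments for three hypothetical packages.
candidates = [
    PackageAssessment("pkgA", risk=0.2, coverage=0.9),
    PackageAssessment("pkgB", risk=0.2, coverage=0.5),
    PackageAssessment("pkgC", risk=0.7, coverage=0.4),
]
for pkg in candidates:
    print(pkg.name, install_decision(pkg))
```

The middle "manual review" outcome is a design choice for this sketch: it leaves room for a human reviewer when the automated signals disagree, instead of forcing a binary accept or reject.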

Novartis, Merck and Roche have shared their own use cases where they show the algorithms that they are using to mitigate risks when installing new R packages from CRAN. These use cases are available on this repository.


Appsilon for Life Sciences

Our recent collaboration with the R Validation Hub team on the {riskassessment} app demonstrates our commitment to promoting the adoption of R throughout the Life Sciences industry.

Because our company is passionate about the development of innovative technology and data analysis tools for accelerating Research and Development, we are here to assist you on your unique journey.

This article was co-authored by Life Sciences Innovation Lead, Ismael Rodriguez.