Exploring R Package Validation in Life Sciences: Appsilon's Collaboration with the R Validation Hub
There is growing momentum towards the <strong>adoption of R for end-to-end clinical trial reporting.</strong> One of the main challenges of this adoption is validating the packages used in the <a href="https://appsilon.com/clinical-trial-r-package-quality-and-validation/" target="_blank" rel="noopener">clinical reporting</a> pipeline and generating the corresponding validation documentation. In this article, we will explore the strategies employed for risk assessment, the development of tools such as the <strong>{riskassessment} app</strong>, and the broader implications of these endeavours for the Life Sciences industry. <h3>TL;DR</h3><ul><li>There’s a growing use of R for clinical trial reporting, with a focus on package validation and documentation.</li><li>The R Validation Hub is actively developing an online repository for R package validation in pharmaceutical regulatory settings.</li><li>A Validation Framework for R packages is being created by two main groups, the R Validation Hub and Phuse, each focusing on different aspects of package validation.</li><li>The risk assessment of R packages is essential in the life sciences industry to prevent code errors that could lead to incorrect conclusions in clinical trials.</li><li>The {riskmetric} R package is used for assessing R packages, and the R Validation Hub has released the {riskassessment} app to automate this process.</li><li>The future of R package risk assessment looks promising, with pharmaceutical companies developing algorithms for package installation based on risk assessments.</li></ul> <h3>Table of Contents</h3><ul><li><a href="#validation-hub">The R Validation Hub</a></li><li><a href="#package-validation">Building a Strong Foundation for R Packages Validation</a></li><li><a href="#risk-assessment">Risk assessment</a></li><li><a href="#assessment-app">How does the app perform risk assessment?</a></li><li><a href="#our-contribution">Appsilon’s contribution to the {riskassessment}
application</a></li><li><a href="#future-perspectives">Future perspectives on R package risk assessment</a></li><li><a href="#conclusion">Appsilon for Life Sciences</a></li></ul> <hr /> <h2 id="validation-hub">The R Validation Hub</h2> The <a href="https://www.pharmar.org/about/" target="_blank" rel="noopener noreferrer">R Validation Hub</a> is leading the development of an online repository for R package validation in accordance with regulatory requirements. Its mission is “<a href="https://www.pharmar.org/about/" target="_blank" rel="noopener noreferrer">to support the adoption of R within a biopharmaceutical regulatory setting</a>”. The group was started by the <a href="https://psiweb.org/sigs-special-interest-groups/aims" target="_blank" rel="noopener noreferrer">PSI AIMS Special Interest Group</a> and grew rapidly following the inaugural R/Pharma Conference as well as the R/Medicine conference, which sees strong participation from R Validation Hub members. <h3 id="package-validation">Building a Strong Foundation for R Packages Validation</h3> Two main groups are driving the creation of a Validation Framework. On the one hand, the R Validation Hub focuses on assessing and managing risk for public R packages, specifically contributed packages on the <a href="https://cran.r-project.org/" target="_blank" rel="noopener noreferrer">Comprehensive R Archive Network (CRAN)</a>. On the other hand, the Phuse R Package Validation Framework targets the validation of R packages under development. <h4 id="risk-assessment">Risk assessment</h4> The number of open-source R packages tailored for the Life Sciences industry is increasing rapidly, particularly packages for handling clinical trial data and packages that aid in the visualization and exploration of omics data. All of these packages need to be risk-assessed to ensure their reliability and accuracy. 
The term <strong>risk</strong> in this setting refers to the <strong>possibility of errors in the code</strong> that could produce inaccurate calculations and eventually lead to wrong conclusions, for example, when assessing the safety and efficacy of a new drug. <img class=" wp-image-21918" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b019874367c4d011901b26_tg_image_1240973356.webp" alt="riskmetric" width="213" height="247" /> riskmetric One of the approaches from the R Validation Hub is to use the {<a href="https://github.com/pharmaR/riskmetric" target="_blank" rel="noopener noreferrer">riskmetric</a>} R package to risk-assess contributed R packages. The assessment criteria of the {<a href="https://github.com/pharmaR/riskmetric" target="_blank" rel="noopener noreferrer">riskmetric</a>} package include: <img class="wp-image-21920 size-full" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b019892c0171ee6e593984_tg_image_3747366625.webp" alt="Flowchart for assessing R package accuracy with decision points including package classification, purpose, maintenance status, usage, and meeting requirements for inclusion within an environment." 
width="1600" height="576" /> Flowchart Illustrating the Risk Assessment Process for New R Packages in Clinical Trials Reporting <ol><li><strong>Unit testing metrics</strong> - unit test coverage and composite coverage of dependencies.</li><li><strong>Documentation metrics</strong> - availability of vignettes, news tracking, example(s), and return object description for exported functions.</li><li><strong>Community engagement</strong> - number of downloads, availability of the code in a public repository, formal bug tracking and user interaction.</li><li><strong>Maintainability and reuse</strong> - number of active contributors, author/maintainer contacts, and type of license.</li></ol> <img class=" wp-image-21922" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0198aec1e575073db31b7_tg_image_128721054.webp" alt="riskassessment" width="213" height="246" /> riskassessment To augment the utility of this package, the R Validation Hub has released the {<a href="https://pharmar.github.io/riskassessment/" target="_blank" rel="noopener noreferrer">riskassessment</a>} app. Together, these contributions aim to provide a strong foundation for validation within highly regulated industries, such as the pharmaceutical industry. <strong>This application was recognized as the “Best App” at <a href="https://www.youtube.com/watch?v=gsWc_oSTb9c" target="_blank" rel="noopener">Shiny Conf 2023</a>.</strong> A <a href="https://rinpharma.shinyapps.io/riskassessment/" target="_blank" rel="noopener noreferrer">demo version</a> of the application is also available. <h4 id="assessment-app">How does the app perform risk assessment?</h4> Once packages have been uploaded for risk assessment, the app evaluates them against a set of maintenance and community usage metrics. In this example, we will risk-assess the <a href="https://www.tidyverse.org/" target="_blank" rel="noopener noreferrer">{tidyverse}</a> package. 
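To make the idea of a single risk value concrete, here is a minimal Python sketch of one way per-metric results could be rolled up into a number between 0 (no risk) and 1 (highest risk) and then compared against an organization-specific threshold. It is not the actual {riskmetric} or {riskassessment} implementation: the metric names, scores, weights, thresholds, and the "1 minus weighted mean" formula are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One assessed criterion. All names and values used here are hypothetical."""
    name: str
    score: float         # 0.0 (worst) to 1.0 (best)
    weight: float = 1.0  # relative importance chosen by the organization


def risk_score(metrics):
    """Return 1 minus the weighted mean of the metric scores.

    The result lies between 0 (no risk) and 1 (highest risk).
    This formula is an illustrative assumption, not the app's algorithm.
    """
    for m in metrics:
        if not 0.0 <= m.score <= 1.0:
            raise ValueError(f"score for {m.name!r} must be between 0 and 1")
    total_weight = sum(m.weight for m in metrics)
    if total_weight <= 0:
        raise ValueError("at least one metric needs a positive weight")
    weighted_mean = sum(m.score * m.weight for m in metrics) / total_weight
    return 1.0 - weighted_mean


def classify(risk, low_threshold=0.3, high_threshold=0.7):
    """Map a risk value to a label using hypothetical, organization-defined cut-offs."""
    if risk < low_threshold:
        return "low"
    if risk < high_threshold:
        return "medium"
    return "high"


# Made-up inputs, one per criteria group listed above.
example_metrics = [
    Metric("unit_test_coverage", 0.8, weight=2.0),
    Metric("has_vignettes", 1.0),
    Metric("download_volume", 0.9),
    Metric("active_contributors", 0.5),
]

risk = risk_score(example_metrics)
print(round(risk, 2), classify(risk))
```

With these made-up inputs the weighted mean is 0.8, so the sketch reports a risk of about 0.2, which falls under the hypothetical low-risk cut-off of 0.3. Changing the weights or cut-offs is how an organization would express its own risk appetite in a scheme like this.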
<strong>Maintenance Metrics</strong> <img class="size-full wp-image-21924" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0198b8cff4e0613420c11_tg_image_845069085.webp" alt="Snapshot of R package maintenance metrics evaluation with checks for vignettes, news files, website, maintainer, source control, documentation, bug closure rate, license, and test coverage." width="1167" height="835" /> Maintenance Metrics Overview for R Package Evaluation <strong>Community Usage Metrics</strong> <img class="size-full wp-image-21926" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0198dd57f969632ca267f_tg_image_1684525074.webp" alt="Analytics dashboard showing community usage metrics for an R package with time since the first and latest version releases, and the total number of downloads over the past 12 months." width="1207" height="687" /> Community Usage Metrics for an R Package with Download Trends The organization using the application is responsible for adapting the review process to its requirements. Once the review process is completed, a final report is automatically generated and downloaded. The ultimate goal of this application is to automate the risk assessment process. The final assessment results in a value that ranges from 0 (no risk) to 1 (highest risk). Each organization is also responsible for setting its own risk threshold, which is currently being agreed upon among multiple pharmaceutical companies. In this example, the risk assessment of the {tidyverse} package resulted in a value of 0.2, which generally indicates a low risk for this package. <h2 id="our-contribution">Appsilon’s contribution to the {riskassessment} application</h2> We understand the need for an automated application that risk-assesses R packages used in clinical trials. 
We decided to join the R Validation Hub in this specific initiative and contribute as much as we can to the development of this application. During our collaboration, talented R/Shiny developers from our team joined the existing development team, and we shared our ideas on how the application could be improved. During this iteration, we focused our efforts on reviewing the code, the database schema, the CI/CD integrations and the reactivity of the application. We also implemented a feature request: a new community usage card showing the 12-month download trend, together with a linear trend line on the downloads plot covering a period of up to 24 months. <img class="size-full wp-image-21928" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0198dd5a0d56b3968df5d_tg_image_36661096.webp" alt="Graphical representation of a download trend metric showing 29,244 downloads for an item in the last 12 months." width="1600" height="913" /> Yearly Download Trend Analysis <img class="size-full wp-image-21930" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0198f93c80263c782f60d_tg_image_4160169442.webp" alt="Line graph depicting the number of monthly downloads for the 'stringr' R package, including version release markers and a linear trend line." width="1600" height="690" /> Monthly Download Statistics and Trend Line for the 'stringr' R Package <h2 id="future-perspectives">Future perspectives on R package risk assessment</h2> Given the recent developments in R package validation and risk assessment, an automated way to analyze the risk associated with a given package seems reachable in the near future. Various pharmaceutical companies are starting to adopt the R Validation Hub packages and are developing their own algorithms for installing packages from CRAN into their computing environments. 
These algorithms take into account the values given by the {riskmetric} package as well as test coverage, and they appear to be robust and efficient. Novartis, Merck and Roche have shared their own use cases, showing the algorithms they use to mitigate risks when installing new R packages from CRAN. These use cases are available in this <a href="https://github.com/pharmaR/case_studies" target="_blank" rel="noopener noreferrer">repository</a>. <blockquote>Ready to elevate your clinical trial data analysis? <a href="https://appsilon.com/pharmaceutical-and-clinical-trial-data-analysis-packages/" target="_blank" rel="noopener">Explore cutting-edge R packages tailored for pharmaceutical research</a>.</blockquote> <h3 id="conclusion">Appsilon for Life Sciences</h3> Our recent collaboration with the R Validation Hub team on the {riskassessment} app demonstrates our commitment to promoting the adoption of R throughout the Life Sciences industry. Because our company is passionate about developing innovative technology and data analysis tools for accelerating <a href="https://www.appsilon.bio/" target="_blank" rel="noopener">Research and Development</a>, we are here to assist you on your unique journey. This article was co-authored by Life Sciences Innovation Lead, <a href="https://appsilon.com/author/ismael/" target="_blank" rel="noopener">Ismael Rodriguez</a>.