Exploring Machine Learning-Derived Data in Life Sciences with Shiny Applications

Machine learning (ML) is gaining widespread popularity in the life sciences. Intuitive user interfaces speed up data exploration and offer modern ways to present analyses and outcomes from various ML models. R Shiny has long been a go-to choice for researchers building high-quality visualizations that help make sense of the insights generated by applying machine learning to fast-growing healthcare datasets.

Let’s take a look at a few impressive R Shiny applications that seamlessly incorporate machine learning and help explore data derived from applying machine learning algorithms.

These applications provide great value in the following biomedical fields:

  1. shinyDeepDR: Precision oncology
  2. PARIS apps: Identification of synthetic lethality
  3. Netpredictor: Drug target network analysis
  4. PSRR: Small molecule-microRNA (miRNA) regulation pair prediction
  5. oposSOM-Browser: ML-based interactive exploration of omics data
  6. MAPD_Shiny: Identification of protein degrader targets
  7. CANDI: Enhancing radiographic annotation and diagnosis

These examples showcase how Shiny applications can accelerate the integration of machine learning into life sciences.

Unlocking Precision Oncology with R Shiny and Machine Learning

shinyDeepDR: A user-friendly R Shiny app for predicting drug response of cancer using deep learning.

This application made the researchers' machine learning model (DeepDR) more accessible to biomedical researchers with limited programming skills. Users can upload the mutation and gene expression profiles of a cancer sample and predict the cell or tumor response to 265 FDA-approved and investigational compounds covered by the GDSC (Genomics of Drug Sensitivity in Cancer) project.

Users can also visualize, search, and filter prediction results for the 265 compounds; perform downstream analyses such as statistical tests; and link the compounds to external databases, including PubChem.

The model predicts a tumor's drug response from the baseline mutation and gene expression profiles of one cancer cell or tumor. It contains three deep neural networks (DNNs):

  1. A mutation encoder pre-trained using a large pan-cancer dataset (The Cancer Genome Atlas; TCGA)
  2. A pre-trained expression encoder
  3. A drug response predictor network integrating the first two subnetworks
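The three-network layout above can be sketched as a toy forward pass. This is only an illustration in plain Python, not the published DeepDR model or the app's R code: the layer sizes, the single-layer encoders, the random untrained weights, and every name (`dense`, `init_layer`, `predict_response`, `params`) are assumptions made for the example. Only the overall shape (two encoders feeding one predictor with 265 outputs) comes from the description above.

```python
import random

N_GENES = 20        # assumed toy input size; the real profiles are far larger
CODE_SIZE = 8       # assumed size of each encoder's output
N_COMPOUNDS = 265   # number of compounds mentioned in the article


def init_layer(n_in, n_out, rng):
    """Create random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    return [
        max(0.0, sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


def predict_response(mutations, expression, params):
    """Encode both profiles, concatenate the codes, and score every compound."""
    mut_code = dense(mutations, *params["mutation_encoder"])
    exp_code = dense(expression, *params["expression_encoder"])
    merged = mut_code + exp_code  # list concatenation joins the two codes
    weights, biases = params["predictor"]
    # Linear output layer: one predicted response score per compound.
    return [
        sum(x * w for x, w in zip(merged, row)) + b
        for row, b in zip(weights, biases)
    ]


rng = random.Random(0)
params = {
    "mutation_encoder": init_layer(N_GENES, CODE_SIZE, rng),
    "expression_encoder": init_layer(N_GENES, CODE_SIZE, rng),
    "predictor": init_layer(2 * CODE_SIZE, N_COMPOUNDS, rng),
}

sample_mutations = [1.0, 0.0] * (N_GENES // 2)  # made-up binary mutation profile
sample_expression = [0.5] * N_GENES             # made-up expression profile
scores = predict_response(sample_mutations, sample_expression, params)
print(len(scores))
```

In a real setting the encoder weights would come from pre-training rather than random initialization, which is the point of the first two subnetworks.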

This research was published in 2019 and introduced the first DNN model to translate pharmacogenomics features generated from in vitro screening to help predict the response of tumors.

Machine Learning in Shiny Apps for Identification of Synthetic Lethality (SL)

PARIS (PAn-canceR Inferred Synthetic lethalities) is a machine learning approach to identify synthetic lethality (SL), a cancer vulnerability. Synthetic lethality occurs when two genetic perturbations that are each tolerated on their own lead to cell death in combination. This approach combines CRISPR viability screens with genomics and transcriptomics data from the Cancer Dependency Map.

Two Shiny applications were developed to browse and visualize this complex data intuitively. Users can filter gene pairs and cohorts (mutation and/or expression), apply thresholds, select among importance score methods (Gini, permutation raw, corrected Gini), and apply a further filtering step based on the Pearson correlation coefficient.
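The kind of filtering described above can be sketched as follows. This is a minimal Python illustration, not the PARIS apps' code: the record fields (`importance`, `pearson_r`), the placeholder gene names, the thresholds, and all values are made up for the example.

```python
def filter_pairs(pairs, min_importance, min_abs_r):
    """Keep gene pairs that pass both thresholds, strongest importance first."""
    kept = [
        p for p in pairs
        if p["importance"] >= min_importance and abs(p["pearson_r"]) >= min_abs_r
    ]
    return sorted(kept, key=lambda p: p["importance"], reverse=True)


# Toy records with placeholder gene names and invented scores.
toy_pairs = [
    {"gene_a": "GENE_1", "gene_b": "GENE_2", "importance": 0.80, "pearson_r": -0.45},
    {"gene_a": "GENE_1", "gene_b": "GENE_3", "importance": 0.20, "pearson_r": 0.60},
    {"gene_a": "GENE_4", "gene_b": "GENE_5", "importance": 0.65, "pearson_r": 0.05},
    {"gene_a": "GENE_6", "gene_b": "GENE_7", "importance": 0.55, "pearson_r": 0.30},
]

selected = filter_pairs(toy_pairs, min_importance=0.5, min_abs_r=0.25)
for pair in selected:
    print(pair["gene_a"], pair["gene_b"], pair["importance"], pair["pearson_r"])
```

The absolute value of the correlation is used here so that strongly negative correlations also pass; whether the apps treat the sign this way is an assumption.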

The two apps can be found here:

PARIS_DDR_vs_DDR: A Shiny App to visualize cancer vulnerabilities predicted by PARIS among DNA Damage Repair genes. 

PARIS_DDR_vs_ALL: A Shiny App to visualize cancer vulnerabilities predicted by PARIS between DNA Damage Repair genes and the entire genome.

Shiny App Discoveries

First, the algorithm was used to discover cancer vulnerabilities among known DNA damage repair (DDR) genes and later to search for new SL pairs in the entire genome. 

Ultimately, the Shiny apps allowed the researchers to browse the precomputed SL networks easily and navigate the data by applying relevant filters.

This application helped identify a previously uncharacterized vulnerability of CDKN2A-deficient cells to TYMS depletion, as well as a dependency between the aldehyde dehydrogenase ALDH2 and the BRCA-interacting protein BRIP1. The results suggested that BRIP1 could be a potential therapeutic target in approximately 30% of all tumors that express low levels of ALDH2.

Enhancing Drug Target Network Analysis with Shiny and Machine Learning

Netpredictor: A Shiny application capable of performing drug-target network analysis and prediction of missing links. Identifying these missing links between drugs and their targets offers insights into other effects, such as polypharmacology and off-target effects of chemical compounds in complex biological systems.

The Shiny application is available on GitHub.

Shiny Results

The R Shiny web application uses a simple, intuitive user interface with dynamic filters and real-time exploratory analysis. The creators integrated additional R packages, JavaScript libraries, and CSS for further customization.

This application helps users search for the top-K shortest paths in the interactome and run enrichment analysis for diseases, pathways, and ontologies. Both the R package and the Shiny app are available under the GPL-2 open-source license and are freely available to download.
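To make the idea of a top-K shortest path search concrete, here is a small Python sketch on a toy undirected network. It is not Netpredictor's implementation: it simply enumerates all simple paths by depth-first search and keeps the K shortest, which is only practical for very small graphs. The node names and the function `k_shortest_paths` are hypothetical.

```python
def k_shortest_paths(graph, source, target, k):
    """Return the k shortest simple paths (by edge count) from source to target.

    Brute-force approach: enumerate every simple path with a depth-first
    search, then sort by length. Ties are broken alphabetically.
    """
    paths = []

    def walk(node, path):
        if node == target:
            paths.append(list(path))
            return
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in path:  # never revisit a node on this path
                path.append(neighbor)
                walk(neighbor, path)
                path.pop()

    walk(source, [source])
    paths.sort(key=lambda p: (len(p), p))
    return paths[:k]


# Toy undirected network stored as an adjacency mapping (placeholder names).
toy_network = {
    "drug_X": {"protein_A", "protein_B", "protein_C"},
    "protein_A": {"drug_X", "protein_C"},
    "protein_B": {"drug_X", "protein_C"},
    "protein_C": {"drug_X", "protein_A", "protein_B"},
}

top_paths = k_shortest_paths(toy_network, "drug_X", "protein_C", k=2)
for path in top_paths:
    print(" -> ".join(path))
```

For realistically sized interaction networks, a dedicated algorithm such as Yen's would be the usual choice instead of full enumeration.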

The creators of this Shiny application concluded that such a rapid development environment helps researchers without a programming background to quickly develop rich visualizations. Furthermore, they plan to update the algorithms in the package in the near future, continuing to support scientists with more streamlined data analysis.

Shiny and ML for Small Molecules-MicroRNAs (miRNAs) Regulation Pairs Prediction

PSRR: {PSRR} (Prediction of SM-miRNA Regulation pairs) is a Shiny application for rapidly predicting regulation between microRNAs (miRNAs) and small molecules (SMs) using several machine learning models. It is a free, publicly available web-based Shiny application.

Open Source Impact

Predicting the regulation between these two elements could have implications for identifying potential therapeutic targets for anti-tumor drug development. Because miRNAs interact with their specific target mRNAs to repress translation, they might function as oncogenes (oncomiRs) or tumor suppressors (TSmiRs).

The researchers' random forest models outperformed four other machine learning algorithms, achieving AUC values of 0.911 for the up-regulation model and 0.896 for the down-regulation model on testing datasets. The down-regulation and up-regulation models yielded accuracy values of 0.91 and 0.90, respectively, on independent validation pairs. In a case study, the model produced highly reliable results: all top 10 predicted regulation pairs were confirmed as experimentally validated pairs. The predictions of the final model are available through the PSRR web server application.
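For readers less familiar with the two metrics quoted above, the following Python sketch shows how AUC and accuracy are computed from predicted scores. It is a generic illustration on made-up labels and scores, not the PSRR evaluation code; the function names and the 0.5 decision threshold are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up example: 1 = regulation pair, 0 = no regulation.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]

print(roc_auc(toy_labels, toy_scores))   # 3 of 4 positive/negative pairs ranked correctly
print(accuracy(toy_labels, toy_scores))  # 2 of 4 samples classified correctly
```

AUC depends only on how the scores rank the samples, while accuracy also depends on the chosen threshold, which is why the two numbers can differ for the same model.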

Interactive Exploration of OMICS Data with a Shiny Application

oposSOM-Browser: This Shiny application is a novel tool for interactive exploration of high-dimensional omics data and associated phenotypes. It provides comprehensive, machine learning-based, open-source data analysis capabilities, combining functionalities such as diversity analysis, biomarker selection, and function mining.

Interactive Data

The oposSOM-Browser facilitates interactive exploration of individual gene and gene set profiles, molecular “portrait landscapes,” linked phenotype diversity, and activation patterns within signaling pathways. The data available in this browser consists of five transcriptome data sets covering cancer (melanomas, B-cell lymphomas, gliomas) and peripheral blood (sepsis and healthy individuals).

The application is available on the ‘Leipzig Health Atlas,’ a collaborative platform for sharing publications, biomedical data, models, and software tools in health research.

Protein Degraders Targets Identification R Shiny Application

MAPD_Shiny: This Shiny app incorporates protein-intrinsic features, MAPD (Model-based Analysis of Protein Degradability) predictions, E2 accessibility of ubiquitination sites, ligandability, and disease associations, helping researchers select rational targets for further development of protein degraders.

Solving Problems with R and Shiny

Targeted protein degradation (TPD) has swiftly emerged as a treatment approach to eliminate previously intractable proteins by leveraging the cell’s innate protein degradation mechanisms. Nonetheless, the creation of TPD compounds predominantly relies on trial-and-error methodologies. Recent comprehensive TPD investigations into the kinome have unveiled significant degradation variations among kinases, even with comparable drug-target interactions, implying the presence of unexplained factors influencing degradability.

To solve these problems, researchers at Harvard (Wubing Zhang and X. Shirley Liu) developed a machine learning model (MAPD) to predict protein degradability from protein features such as post-translational modifications, protein stability, protein expression and protein-protein interactions. They created an interactive web platform to enable the prioritization of degradable proteins by scientists and the public.

Combining Machine Learning and R Shiny in Enhancing Radiographic Annotation and Diagnosis

CANDI: The CANDI-RAD (Computer-Aided Note and Diagnosis Interface Radiograph Annotation Dashboard) and CANDI-CAD (Computer-Aided Note and Diagnosis Interface Computer-Aided Diagnosis) Shiny applications were developed for annotating radiographs and evaluating computer-aided diagnosis.

Deep Learning, Shiny, and Open Access Research

The researchers identified the need for decision systems that use deep learning. Deep learning offers superior performance, but it requires more training data. Although algorithms for Computer-Aided Diagnosis (CAD) have been in use for decades, those currently used are not very effective at improving radiologists' interpretations.

The CANDI-RAD application provides multimodal patient and image data to obtain training and testing data. Additionally, it serves as an evaluation application facilitating randomized controlled trials (RCTs) on human enhancement with algorithms.

The manuscript presents these two open-access CANDI web applications for collaboratively addressing the annotation and evaluation barriers to translating deep learning.

CANDI allows collaborative annotation of radiographs and helps evaluate how algorithms alter human interpretation. The annotation app collects classification, segmentation, and image captioning training data; the evaluation app randomizes the availability of CAD tools to facilitate clinical trials in the field of radiology.
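Randomizing the availability of a CAD tool can be pictured with the short Python sketch below. It is an assumption-laden illustration, not CANDI's actual scheme: a simple balanced split of cases into a "CAD available" and a "CAD hidden" arm, with hypothetical case identifiers and a hypothetical function name `randomize_cad`.

```python
import random


def randomize_cad(case_ids, seed):
    """Randomly assign half of the cases to be read with CAD assistance.

    Returns a mapping of case id -> True (CAD shown) or False (CAD hidden).
    A fixed seed makes the allocation reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    assignment = {case_id: True for case_id in shuffled[:half]}
    assignment.update({case_id: False for case_id in shuffled[half:]})
    return assignment


# Hypothetical radiograph identifiers.
cases = [f"case_{i:03d}" for i in range(10)]
allocation = randomize_cad(cases, seed=42)

with_cad = sorted(c for c, shown in allocation.items() if shown)
without_cad = sorted(c for c, shown in allocation.items() if not shown)
print(len(with_cad), len(without_cad))
```

Comparing reader performance between the two arms is what lets a trial estimate how much the algorithm changes human interpretation.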

Discover Appsilon’s Machine Learning Solutions

As your trusted technology partner in the life sciences, Appsilon collaborates closely with your team to accelerate groundbreaking discoveries through the application of machine learning and advanced exploration software. Our dedicated team is ready to help you explore your data with innovative tools and modern technologies.

Together, we can drive innovation that transforms lives.