Appsprints & Distributed Teams: R Shiny App Development
Appsprints, analogous to hackathons, are a rite of passage for every Appsilonian. They're the proverbial Swiss Army knife, useful for pretty much anything: onboarding new team members, team-building exercises, and PoCs for the sales cycle.

<blockquote>Overcome inefficiencies of remote work with better <a href="https://appsilon.com/rstudio-connect-as-a-solution-for-remote-data-science-teams/" target="_blank" rel="noopener">tooling for remote data science teams from Posit</a>.</blockquote>

We at Appsilon come from various backgrounds and live in just about every corner of the world. This distributed team setup can pose a challenge in more ways than one. But with appsprints, we've discovered some amazing benefits. Let's explore an example below.

TOC: <ul><li><a href="#app">The App</a></li><li><a href="#distributed">Distributed Team Challenges</a></li><li><a href="#technical">Technical Challenges</a></li><li><a href="#result">Appsprint Result</a></li><li><a href="#improvements">Room for Improvement</a></li><li><a href="#summary">Summary</a></li><li><a href="#tips">Tips for Distributed Teams</a></li></ul>

<hr />

<h2 id="app">The Shiny App behind the appsprint</h2>

As part of our <a href="https://appsilon.com/data-for-good/" target="_blank" rel="noopener">Data for Good (D4G) initiative</a>, we wanted to create an app that visualizes the correlation between human mortality rates and a common cause on a global scale. Air pollution is one such cause: it doesn't abide by borders and is, unfortunately, a consistent problem across the globe.

<blockquote>You can explore the live <a href="https://connect.appsilon.com/respiratory_disease_app_sprint/" target="_blank" rel="noopener">Respiratory Disease App</a> built with love from Appsilon.</blockquote>

We decided to explore the relationship between the Particulate Matter (PM2.5) pollution metric and the mortality rate (defined as deaths per 100,000) from respiratory illnesses.
In order to properly visualize the effect, we chose to display the data over the years through map and graph visualizations.

<blockquote>Tackling climate change with technology. Is it possible? See how <a href="https://appsilon.com/tag/data-for-good/" target="_blank" rel="noopener">Appsilon leverages technology to change course</a>!</blockquote>

The planning phase consisted of assigning the main tasks to each team member, based on each individual's strengths and experience. For this app, we had 3 team members. Two of us are R/Shiny developers, while the third is an infrastructure engineer. So naturally, we split the tasks among ourselves according to what each of us could handle best. We broke down the app development cycle into these main areas (in no particular order):

<ul><li>Setting up {rhino} structure</li><li>Prepping the dataset and setting up metrics for visualization</li><li>Creating the visualization modules and related functions</li><li>Styling the app</li><li>Testing and deployment</li></ul>

On paper, this looked like the best way to go about it. But there was more to it than we expected.

<h2 id="distributed">Appsprint challenges with distributed teams</h2>

One of the biggest challenges we faced was time zones. We were distributed across 3 countries, with 3 hours separating each adjacent time zone. We learned that although planning tasks seems straightforward, the devil is in the details. Planning tasks properly and delegating them with the same amount of attention is one of the most important things when working with distributed teams.

<blockquote>Appsilon is a proud <a href="https://appsilon.com/rstudio-certified-partner/" target="_blank" rel="noopener">Posit (RStudio) Full Service Certified Partner</a>. We help implement and scale Posit products to meet your data science needs.</blockquote>

Working asynchronously is part and parcel of Appsilon's working ideology.
Each team member had their own working schedule and hours when it was ideal to work. Overlooking these simple but essential details in our planning caused some problems near the end of our 5-day appsprint, including the realization on the second-to-last day that some data was missing! These problems would have been preventable had we planned appropriately around our time zones.

<h2 id="technical">Technical challenges for the Respiratory Shiny App</h2>

The technical challenges you face in appsprints really depend on your project. In our case, the main one was implementing Appsilon's {rhino} package. It was the first time anyone on our 3-person team had used it! We also faced some infrastructure and dataset issues along the way.

<blockquote>Do you love Shiny, but want to build an app in Python? We recreated the same <a href="https://appsilon.com/pyshiny-demo/" target="_blank" rel="noopener">Respiratory Diseases app demo with PyShiny</a>.</blockquote>

If you're worried about facing these challenges, just remember: you're supposed to be challenged. The purpose of an appsprint is to grow in a short amount of time.

<h3>{rhino} and Shiny app structure</h3>

{rhino} is an amazing tool that enforces a very structured approach to creating Shiny applications. Despite the numerous advantages {rhino} gives us, the disadvantages of our inexperience compounded as we jumped in expecting to force it to work the way we had always structured our Shiny apps as individuals. Reading through the documentation and getting help from our more experienced colleagues helped us power through. And along the way, we came up with suitable solutions for our problems. When in doubt, ask for help!
<img class="alignnone size-full wp-image-15666" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01d122f2a080db43eb2be_rhino-app-file-structure-1.webp" alt="rhino app file structure" width="434" height="672" />

<h4>Main structure</h4>

The app is simply structured, with a main.R (acting as a global.R + server.R + ui.R) and two main modules (map.R and plot.R) that hold the map and graph visualizations respectively. These two modules call on a few separate R scripts (draw_map.R, make_plot.R, transformations.R) that contain the logic for generating the related Leaflet maps and Plotly graphs.

<h4>Info</h4>

There is an additional "info.R" module that shows a modal dialog with extra information regarding the app, while "strings.R" and "components.R" offload some UI components to separate functions for reusability.

<h4>Going out in style</h4>

Finally, the styling for the app is done in the main.scss file. Running the rhino::build_sass() function provided by the {rhino} package compiles main.scss into the required app.min.css file. It's simple and easy!

<blockquote>Is your app collecting dust? Add a breath of fresh air by <a href="https://appsilon.com/how-to-use-tailwindcss-in-shiny/" target="_blank" rel="noopener">using TailwindCSS in Shiny</a>!</blockquote>

<h3>CI workflow with {rhino}</h3>

The cool thing about {rhino} is that it has most CI-related things already set up for your project. One of these is a CI GitHub workflow (found in the .github/workflows directory), which enables you to write code the right way from the beginning. It does this by linting and running tests with every push to the repository. This allows anyone to run a CI process without deep technical knowledge of GitHub workflows.
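The linting and tests that such a workflow runs can also be run locally before pushing. Here is a minimal sketch, assuming {rhino} is installed and the working directory is the root of a rhino project. The wrapper name run_local_checks is hypothetical; rhino::lint_r() and rhino::test_r() are helpers provided by the package.

<pre><code># Hypothetical helper: run locally the same kind of checks that CI runs on push.
# Assumes the working directory is the root of a rhino project.
run_local_checks <- function() {
  rhino::lint_r()  # lint the R code
  rhino::test_r()  # run the unit tests
  invisible(TRUE)
}

# Usage, from the project root:
# run_local_checks()
</code></pre>

Running the checks locally first shortens the feedback loop, which matters when the sprint only lasts a few days.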
If you've never worked with continuous integration or continuous deployment, we've covered <a href="https://appsilon.com/build-a-ci-cd-pipeline-for-shiny-apps/" target="_blank" rel="noopener">how to build a CI/CD pipeline for Shiny apps</a>. The intro will give you a brief on the pros of having a CI/CD workflow.

In the workflow, there is a step that handles system dependencies (in this case Linux packages) needed by specific R packages. Additional Linux packages may need to be added to the workflow to prevent the CI stage from failing.

<pre><code>- name: Setup system dependencies
  run: >
    sudo apt-get install --yes
    libcurl4-openssl-dev
</code></pre>

In our case, we faced a problem with a newly added R package, {<a href="https://cran.r-project.org/web/packages/terra/index.html" target="_blank" rel="nofollow noopener">terra</a>}. Terra needs specific system dependencies in order to work. Further investigation showed that we needed not one, but several Linux packages installed in order to get it right. And so, we updated our CI GitHub workflow accordingly:

<pre><code>- name: Setup system dependencies
  run: >
    sudo apt-get install --yes
    libcurl4-openssl-dev
    libgdal-dev
    gdal-bin
    libproj-dev
    proj-data
    proj-bin
    libgeos-dev
</code></pre>

After the modification, we pushed the code to GitHub, triggering the workflow and... our CI run succeeded! 🥳

<h3>Data gaps and data handling during Shiny appsprints</h3>

While we were working on the application, we realized that we were missing some of the data. In a normal project timeline, this would be a minor setback, but with appsprints, the clock is ticking. The dataset we had was missing some countries and had gaps in the PM2.5 data.

<h4>Missing countries</h4>

In our original dataset from the World Bank, there were some major countries missing. Frankly, this wouldn't be good for any visualization, whether it was a PoC or not.
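Gaps like these are easier to catch early by comparing the dataset against a reference list of countries. Here is a minimal sketch in base R. The data frame, its column names, and the reference list are hypothetical placeholders, not the app's actual data.

<pre><code># Hypothetical placeholder rows standing in for a country-by-year dataset.
mortality <- data.frame(
  country = c("Poland", "India", "Brazil", "Poland", "India"),
  year    = c(2010, 2010, 2010, 2015, 2015)
)

# Hypothetical reference list of countries the map is expected to cover.
expected_countries <- c("Poland", "India", "Brazil", "Germany", "Japan")

# Countries with no rows at all in the dataset.
missing_countries <- setdiff(expected_countries, unique(mortality$country))
print(missing_countries)

# Country-year combinations that are absent from the dataset.
all_pairs <- expand.grid(
  country = expected_countries,
  year = unique(mortality$year),
  stringsAsFactors = FALSE
)
present <- paste(mortality$country, mortality$year)
missing_pairs <- all_pairs[!paste(all_pairs$country, all_pairs$year) %in% present, ]
print(missing_pairs)
</code></pre>

Running a check like this on the first day of a sprint, rather than the second-to-last, leaves time to look for a second data source.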
Our first step was to find another data source (we eventually found the OECD dataset) and merge it with our original dataset. But this was not without its own set of problems: for countries that appeared in both datasets, the values for the same years differed. There was no reasonable way to merge the data, and this led to a unique feature of the app: the Dataset Select. It allows the user to switch between the two data sources with the flip of a switch, proving that sometimes the best features come from a simple problem.

<img class="wp-image-15668 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01d13c094bb3c4e866a3a_world-bank-and-oecd-data-overlap-in-appsprint-1.webp" alt="world bank and oecd data overlap in appsprint" width="500" height="375" />

<h4>Missing PM2.5</h4>

Another problem we faced was that neither dataset had PM2.5 data for certain years. Upon exploring further, we realized that this data was recorded at five-year intervals between 1990 and 2010, and only from that point on was it available for each year. This suggests that prior to 2010, annual data recording was either not feasible (due to technological, financial, or human factors) or wasn't taken as seriously as in our data-centric present. In any case, having more data gave us some insight into how quickly the problem of pollution on our planet has worsened or, at the very least, how concern for this data has increased.

<h2 id="result">R Shiny appsprint result</h2>

The final app has a clean look, with a simple navbar-sidebar layout that highlights Appsilon brand colors. The custom navigation bar at the top has a simple design, with each element (besides the Appsilon logo and the Dataset Select switch) being an action button disguised as a navbar element. This allows us to trigger the intended visualization modules by using a simple observeEvent().
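The button-plus-observeEvent() pattern can be sketched roughly as follows. This is an illustration only, not the app's actual code: the input IDs, labels, and the active_view reactive value are hypothetical, and the real app renders modules rather than a line of text.

<pre><code>library(shiny)

# Hypothetical IDs and labels; styling the buttons as navbar items is left out.
ui <- fluidPage(
  actionButton("show_map", "Map"),
  actionButton("show_graph", "Graph"),
  textOutput("active_view")
)

server <- function(input, output, session) {
  # Tracks which visualization is currently selected
  active_view <- reactiveVal("map")

  # Each button click switches the active view
  observeEvent(input$show_map, active_view("map"))
  observeEvent(input$show_graph, active_view("graph"))

  output$active_view <- renderText(paste("Showing:", active_view()))
}

# Run interactively with:
# shinyApp(ui, server)
</code></pre>

Keeping the selected view in a single reactive value means any module can react to it without knowing which button was pressed.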
The "i" icon triggers the modal dialog popup that shows additional information about the app. The Dataset Select switch allows the user to switch the selected dataset between the World Bank and OECD data. The change is reflected in both the Map and Graph visualizations, so you can compare the earlier-mentioned missing data between the two datasets.

<img class="size-full wp-image-15664 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01d15db456a59ff5bbaa3_appsilon-r-shiny-appsprint-map-respiratory-diseases-dashboard-1.webp" alt="appsilon r shiny appsprint map - respiratory diseases dashboard" width="1600" height="770" />

The Map visualization module uses the Leaflet package to show the correlation between the mortality rate (denoted by circles) and the PM2.5 index (shown in the highlighted countries). Browsing the map can give one a rough idea of how these two metrics correlate with one another in different parts of the world over the years.

<img class="size-full wp-image-15662 aligncenter" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b01d1648b83fcd2e544396_appsilon-r-shiny-appsprint-graph-respiratory-diseases-dashboard-1.webp" alt="appsilon r shiny appsprint graph - respiratory diseases dashboard" width="1600" height="780" />

The Graph visualization module uses Plotly to draw the graphs. The cool thing about the visualizations here is that you can stack multiple countries and compare the metrics from each country across a predefined range of years!

<h2 id="improvements">Making improvements to Shiny apps following appsprints</h2>

Appsprints in R and Shiny are great for learning and honing your skills, as well as for quickly showcasing what <em>could be</em>. But unless you are a master planner and dream of electric sheep, there will be room for improvement.
One of the main improvements we'd like to make in the future is to the efficiency and speed of drawing the map and graphs. The app as it stands now generates the visualizations with a visible lag while processing the data. This could be significantly improved by spending more time optimizing the performance of the app. Another area for improvement is the general UI/UX of the app. The time constraint and the issues mentioned earlier resulted in this aspect of the app falling by the wayside as development went on. As such, we just managed to cobble together some SASS/CSS code (that goes against pretty much every UI/UX coding best practice out there!) to give it the existing look and feel.

<blockquote>Have a little more time to develop UI/UX? Follow these <a href="https://appsilon.com/ux-design-of-shiny-apps-7-steps-to-design-dashboards-people-love/" target="_blank" rel="noopener">7 steps to design better Shiny dashboards</a>.</blockquote>

<h2 id="summary">Summing up R Shiny development in appsprints</h2>

As fun and challenging as this appsprint was from a technical standpoint, we hope that this Shiny app paints a better picture of the effects air pollution has on us and of the growing issue across the globe. Shiny is an excellent tool for sharing data and information, and this topic is one that hits close to home for everyone. Exploring the data and our observations from this app raised more questions than answers, which for us is a success. We've just started to explore a different landscape, and now we have a tool to tap into the data.

<h3>Recap of appsprint challenges and solutions: R Shiny development and distributed teams</h3>

The technical challenges we faced during development were mainly related to our lack of experience with {rhino}. We overcame this through team discussions, reading the documentation, and asking more experienced colleagues for help. Our main challenge as a distributed team was our staggered time zones.
Understanding this and taking into account our working hours (and getting used to asynchronous communication) was key to tackling it. You should treat your short-term appsprints just like you would a regular project (i.e., make sure everyone has a healthy work-life balance!).

<h3 id="tips">Tips for distributed teams learned from an R Shiny development appsprint</h3>

The most important thing to remember for distributed teams is to have tasks well planned and thought out ahead of time. Take into account time zone differences and working hour preferences. A few brainstorming sessions at the beginning to thoroughly dissect the challenges of specific parts of the final product will prevent unexpected problems down the line, particularly towards the end of the development phase! Another tip is to play to each team member's strengths and be mindful of their weaknesses. Be open and honest about what you are capable of doing and what knowledge you are lacking. Then work together to form tasks that are suitable for each team member's experience level. Doing something that you're good at will enable you to quickly complete those tasks and assist with other parts of the development process. All in all, this makes for a smooth workflow, a happy team, and a quality product.

<hr />

This blog was co-authored by the Respiratory Disease appsprint team members: Deepansh Khurana, Arek Kalandyk, and Fabian Hee.