rstudio::conf 2020 Takeaways: Updates to tidyeval, Shiny/Sass, shinymeta, parallel processing in R, and more…
The RStudio Conference in San Francisco last month was an amazing experience, and I will definitely attend again next year in Orlando. In this post, Pedro Coutinho Silva and I will share some packages and trends from the conference that might interest members of the R community who were unable to attend.
I’ll start with Joe Cheng’s presentation about styling Shiny apps. Joe demonstrated how the sass package can improve styling workflows. It was encouraging to see that beautiful user interfaces in Shiny have become a significant topic. I was particularly excited about the option which automatically adjusts the colors of your ggplot output to fit your Shiny dashboard theme. This is interesting because a plot is rendered as an image, which can’t be styled with CSS rules. Joe got a great reaction from the audience when he demonstrated how it works: he set the plot option to TRUE, and the plot magically changed its style to match the rest of the application, which was styled with Sass. From a data scientist’s perspective this is super useful, because you no longer have to think about styling a specific image.
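The same auto-theming capability is available today through the open-source thematic package (my assumption is that this is the mechanism behind the demo; the theme colors below are arbitrary). A minimal sketch:

```r
# Sketch only: assumes the shiny, ggplot2, bslib, and thematic packages
# are installed. thematic_shiny() makes ggplot2 output pick up the
# app's CSS theme automatically.
library(shiny)
library(ggplot2)
library(thematic)

thematic_shiny()  # enable plot auto-theming for this app

ui <- fluidPage(
  theme = bslib::bs_theme(bg = "#222222", fg = "#FFFFFF", primary = "#0099F9"),
  plotOutput("scatter")
)

server <- function(input, output, session) {
  output$scatter <- renderPlot({
    # Plot colors and fonts are adjusted to match the dark theme
    ggplot(mtcars, aes(wt, mpg)) + geom_point()
  })
}

shinyApp(ui, server)
```

The point is that the renderPlot() code itself stays untouched; the theming happens outside the data scientist's plotting logic.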
Pedro: Another nice detail of the sass package, which I think makes it a good fit for data scientists, is that it lets you pass variables from your R code into Sass and use those values to set colors and other style attributes. This is the detail that makes R/Sass something data scientists will actually want to use: it’s just another function call, of a kind they have made many times before.
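A minimal sketch of what Pedro describes, passing an R value into Sass (assuming the sass package is installed; brand_color is a made-up variable name):

```r
library(sass)

brand_color <- "#0099F9"  # an R variable we want to use in our styles

# In sass(), named list elements become Sass variables and
# character strings are compiled as Sass code.
css <- sass(list(
  list(primary = brand_color),
  "body { background-color: $primary; }"
))

cat(css)  # the compiled CSS now contains the hex value from R
```

Because the variable is resolved at compile time, the same stylesheet can be re-generated for different palettes without touching the Sass source.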
Pawel: The tidyeval framework (implemented in the rlang package) is very useful. Metaprogramming is a powerful concept, and with the recent improvements, especially the {{ }} (“curly curly”) operator, it is much more convenient to use. For a quick introduction to tidyeval I recommend Hadley Wickham’s video: https://www.youtube.com/watch?v=nERXS3ssntw
Pedro: The tidyeval team did something really powerful here. In the past, the approach was too involved for many data scientists, so people simply weren’t using it. This time around, they didn’t create a whole new package; instead, they changed the one major thing people were complaining about and turned tidyeval into a tool that is easy to use, even for people without a technical background. Before, tidyeval was a strange set of functions with an overly complicated syntax; it made no sense to data scientists because it looked complex. But if you can instead just say “put your variable inside {{ }}”, people understand it immediately. Now it’s much easier to use.
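To make “put your variable inside {{ }}” concrete, here is a small sketch of a wrapper function using the curly-curly operator (assuming dplyr is installed; group_mean is a hypothetical helper name):

```r
library(dplyr)

# A wrapper that forwards bare column names into dplyr verbs via {{ }}.
# Callers pass column names unquoted, just as they would with dplyr itself.
group_mean <- function(data, group, var) {
  data %>%
    group_by({{ group }}) %>%
    summarise(avg = mean({{ var }}), .groups = "drop")
}

group_mean(mtcars, cyl, mpg)  # one row per cylinder count
```

Before curly-curly, the same wrapper required enquo() and !!, which is exactly the syntax Pedro describes as intimidating.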
Pawel: I also liked RStudio’s update on version 1.3 of their IDE. Two enhancements caught my attention: (1) new accessibility features (especially for users with impaired vision), and (2) a portable configuration file.
Putting R in production was a popular topic at the conference. The T-Mobile team shared their story about building machine learning models and testing their performance, for which they created their own package, loadtest. They also built a website to chronicle their work: putrinprod.com. I’m glad that productionizing R has become an important topic, and not only because our team at Appsilon specializes in production deployments and creating new packages.
I also enjoyed the reproducibility topic addressed by the shinymeta package. I like the whole concept: being able to reproduce the server-side calculations of a Shiny app. With shinymeta, you capture the code of specific server reactives and can easily run that code outside of the Shiny runtime. I need a hackathon to explore it a bit!
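A minimal sketch of the idea, based on my reading of shinymeta’s API (metaReactive(), the ..() marker, and expandChain(); the input and output names are made up):

```r
# Sketch only: assumes the shiny and shinymeta packages are installed.
library(shiny)
library(shinymeta)

ui <- fluidPage(
  numericInput("n", "Rows", 5),
  verbatimTextOutput("code")
)

server <- function(input, output, session) {
  # metaReactive() records the code it runs; ..() splices the current
  # input value into that code as a literal.
  top_rows <- metaReactive({
    head(cars, ..(input$n))
  })

  output$code <- renderPrint({
    # expandChain() returns standalone code that reproduces top_rows()
    # outside of the Shiny runtime.
    expandChain(top_rows())
  })
}

shinyApp(ui, server)
```

The generated code has the reactive inputs replaced by concrete values, so it can be pasted into a plain R script or report.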
Ursa Labs shared a benchmark for reading and writing Apache Parquet files. In their benchmark, Parquet files could be read almost as fast as Feather files but were dramatically smaller: in their comparison the Feather files were 4 GB while the Parquet files were 100 MB, a roughly 40x difference. I remember that in previous projects we switched to Feather because reading the files was super fast, but the disadvantage was that the files were prohibitively large. So this is an interesting development.
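Both formats are available from the arrow package with a nearly identical interface; a quick sketch of the round trip (assuming arrow is installed; file names are arbitrary):

```r
library(arrow)

df <- data.frame(x = 1:1000, y = rnorm(1000))

# Write the same data in both formats
write_parquet(df, "data.parquet")
write_feather(df, "data.feather")

# Reading back is a one-liner in either case
parquet_df <- read_parquet("data.parquet")
feather_df <- read_feather("data.feather")

# Compare on-disk sizes; Parquet's columnar compression usually wins,
# though the gap depends heavily on the data
file.size("data.parquet")
file.size("data.feather")
```

Because the read/write APIs mirror each other, switching a project from Feather to Parquet is mostly a matter of swapping function names.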
Pedro: At the conference, I also learned about the chromote package, which lets you drive a Chrome session from R. You can open an application, such as a Shiny dashboard, run specific scenarios, and take a screenshot of the application’s state. This could be really cool and useful for testing.
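A minimal sketch of the screenshot workflow Pedro mentions (assuming chromote and a local Chrome/Chromium install; the URL is a placeholder for wherever your app is running):

```r
library(chromote)

# Start a headless Chrome session
b <- ChromoteSession$new()

# Navigate to a running app (placeholder URL) and give it time to render
b$Page$navigate("http://127.0.0.1:3838")
Sys.sleep(2)  # crude wait; real tests should poll for readiness instead

# Capture the current state of the page to a file
b$screenshot("app-state.png")

b$close()
```

In a test suite, you would script interactions between the navigate and screenshot steps and compare the images against known-good references.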
Finally, I was interested in an e-poster session on the paws package, an R interface to the Amazon Web Services APIs. From the R console you can manage your AWS resources; for example, you can start a new EC2 instance or a new batch job, basically everything you would usually do with Ansible. This is super useful for data scientists who work mainly in R.
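A short sketch of launching an EC2 instance from R with paws (assuming the paws package is installed and AWS credentials are configured in the usual environment variables or config files; the AMI ID is a placeholder):

```r
library(paws)

# Create an EC2 client; credentials are picked up from the
# standard AWS environment variables or config files
ec2 <- paws::ec2()

# List existing instances
resp <- ec2$describe_instances()

# Launch a new instance (ImageId here is a placeholder AMI)
ec2$run_instances(
  ImageId = "ami-12345678",
  InstanceType = "t2.micro",
  MinCount = 1,
  MaxCount = 1
)
```

Each AWS service gets its own client constructor in paws, so the same pattern works for S3, Batch, and the rest.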
It was an exciting conference to say the least. I’m looking forward to the next rstudio::conf in Orlando on January 18-21!
Thanks for reading. For more, follow me on Twitter.