Why R? Warsaw 2019 Recap

October 4, 2019

I’ve just come back from an exhausting yet exciting weekend at the <a href="http://whyr.pl/2019/">WhyR?</a> conference. This was the third edition, held this year in Warsaw, and it is nice to see how it grows each year. The Appsilon Data Science team really appreciates the initiative and the professional organisation, so we decided to sponsor the event and give two talks! As the conference is getting bigger, this time the initiator and founder of the WhyR? foundation, Marcin Kosiński (<a href="https://twitter.com/kosinski_rblog">@kosinski_rblog</a>), was helped by Michał Burdukiewicz (<a href="https://twitter.com/burdukiewicz">@burdukiewicz</a>) and Piotr Wójcik, head of the Data Science Lab at the Faculty of Economic Sciences of the University of Warsaw, the venue of this year’s conference. It was a nice nostalgic trip for me personally, as I graduated in econometrics and computer science from this faculty. I’m glad that the place has been renovated since then, so conference participants could enjoy it fully.
<h2>WhyR Workshops</h2>
The conference started for me with Friday’s workshop session. Participants could choose from a variety of topics, including deep learning, C++ in R, and Explainable AI (XAI). As I’m more focused on Shiny development these days, I wanted to catch up with data science, so I chose “machine learning pipelines with <a href="https://github.com/mlr-org/mlr3">{mlr3}</a>” by Jakob Richter (<a href="https://twitter.com/jak0br">@jak0br</a>) and Patrick Schratz (<a href="https://twitter.com/pjs_228">@pjs_228</a>). That was super useful! I had not been aware of how much the <a href="https://github.com/ropensci/drake">{drake}</a> package can simplify a project workflow.
<blockquote class="twitter-tweet"> <p dir="ltr" lang="en">Thank you <a href="https://twitter.com/pjs_228?ref_src=twsrc%5Etfw">@pjs_228</a> for great workshop on `drake` package during <a href="https://twitter.com/whyRconf?ref_src=twsrc%5Etfw">@whyRconf</a> !
Super smooth with using `usethis::use_course("mlr-org/mlr3-learndrake")` for sharing the materials. I would love to see it in use on all <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> conferences, lectures and workshops</p> — Marcin Dubel (@DubelMarcin) <a href="https://twitter.com/DubelMarcin/status/1177534544897085440?ref_src=twsrc%5Etfw">September 27, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
In the afternoon I joined the spatial data analysis workshop conducted by an expert in the field, Jakub Nowosad (<a href="https://twitter.com/jakub_nowosad/">@jakub_nowosad</a>), who showed us the great features of the <a href="https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html">{tmap}</a> package. If you’re interested in that field, check out the recent book on <a href="https://geocompr.robinlovelace.net/">geocomputation with R</a>, available for free online.
<blockquote class="twitter-tweet"> <p dir="ltr" lang="en">Thank you <a href="https://twitter.com/jakub_nowosad?ref_src=twsrc%5Etfw">@jakub_nowosad</a> for showing `tmap` package on <a href="https://twitter.com/whyRconf?ref_src=twsrc%5Etfw">@whyRconf</a>. It's basically <a href="https://twitter.com/hashtag/ggplot?src=hash&ref_src=twsrc%5Etfw">#ggplot</a> for maps! I remember doing spatial analysis back in 2013 and it is great to see such progress in maps visualisation tools!
<a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> <a href="https://t.co/plkXnHG3ah">https://t.co/plkXnHG3ah</a></p> — Marcin Dubel (@DubelMarcin) <a href="https://twitter.com/DubelMarcin/status/1177577242899243011?ref_src=twsrc%5Etfw">September 27, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
There was also one special workshop that lasted the whole day, “modern Generalized Additive Models” by Matteo Fasiolo (<a href="https://twitter.com/fasiolo1985">@fasiolo1985</a>). The course comes highly recommended; I wish I could have bilocated to be there!
<h2>WhyR Lectures</h2>
<h3>Saturday</h3>
The keynotes were all great, as were some of the regular talks. Some presentations were prepared by students, and it was great to see the ambitious projects they’re delivering and how they're launching their conference careers. Let me go through the most interesting presentations. The first keynote was a strong kick-off by <a href="https://www.researchgate.net/profile/Marvin_Wright2">Marvin Wright</a>, the author of the <a href="https://github.com/imbs-hl/ranger">{ranger}</a> package. In the era of deep neural networks and the accompanying hype around image recognition and deepfake videos, it was super useful to be reminded that random forests are a simple yet powerful tool for down-to-earth data analysis. They handle noisy, high-dimensional data well and provide some interpretability through variable importance metrics. On the other hand, performance can be a problem in production: predictions from random forest models are generated quite slowly. I loved the “mythbusters” format of the talk: going through common opinions about random forests and confirming or denying them based on rigorous analysis.
Another great keynote was given by Jakub Nowosad (<a href="https://twitter.com/jakub_nowosad/">@jakub_nowosad</a>) on the current challenges in the field of geospatial analysis, including map distortion. Did you know that our world without all of its water is potato-shaped? :)
<img class="wp-image-2867 size-full" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b022c3b90a6b94c7bdd523_rotating_3d_globe_Geoid_height.webp" alt="MATLAB script for 3D visualizing geodata on a rotating globe" width="780" height="650" /> Source: <a href="https://www.asu.cas.cz/~bezdek/vyzkum/rotating_3d_globe/index.php">https://www.asu.cas.cz/~bezdek/vyzkum/rotating_3d_globe/index.php</a>
<h3>Appsilon Talk</h3>
The most important event on Saturday for the Appsilon team was the talk by Dr. Ken Benoit (<a href="https://twitter.com/kenbenoit?lang=en">@kenbenoit</a>) of the London School of Economics and Damian Rodziewicz (<a href="https://twitter.com/d_rodziewicz?lang=en">@D_Rodziewicz</a>) about the <a href="https://quanteda.io/">{quanteda}</a> package for textual analysis. The package itself is a great tool, but I really admire Ken Benoit's idea of sharing its capabilities with non-R users, or even non-programmers, via a user-friendly Shiny app prepared by the Appsilon team. Stay tuned for news about the upcoming release!
<img class="size-full wp-image-2868" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b022c4386245cb4e120eef_Quanteda-presentation-WhyR-Warsaw-2019-e1570180482442.webp" alt="Quanteda presentation WhyR Warsaw 2019" width="3648" height="5472" /> Dr. Kenneth Benoit and Damian Rodziewicz present <a href="https://github.com/quanteda/quanteda">quanteda</a> at WhyR
<h3>Sunday</h3>
The day started with an excellent and important talk by Steph Locke (<a href="https://twitter.com/thestephlocke?lang=en">@TheStephLocke</a>) about data scientists’ responsibilities to society.
It's important to remember that at the end of the day, models and predictions may affect actual human lives. One smart take-away: when presenting model performance to business people, use not only metrics but also demos of real use cases, and check whether the decision makers are comfortable with the models' decisions. Steph Locke's points were expanded on by Appsilon member Olga Mierzwa-Sulima (<a href="https://twitter.com/olga_mie?lang=en">@olga_mie</a>) in her talk about the traits of world-class scientists. Turning results into useful solutions is a key factor! The talk was really well received and Olga got lots of questions. I guess the whole community is eager to see her give another talk ASAP.
<blockquote class="twitter-tweet"> <p dir="ltr" lang="en">Good takeout from <a href="https://twitter.com/olga_mie?ref_src=twsrc%5Etfw">@olga_mie</a> talk at <a href="https://twitter.com/hashtag/whyR2019?src=hash&ref_src=twsrc%5Etfw">#whyR2019</a>: "A model without application is useless"</p> — Colin Fay (@_ColinFay) <a href="https://twitter.com/_ColinFay/status/1178226191272165376?ref_src=twsrc%5Etfw">September 29, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
Wit Jakuczun (<a href="https://twitter.com/witjakuczun?lang=en">@WitJakuczun</a>) advocated a similar approach. Always be deploying! Delivering production-ready solutions is what creates value for the client: not just a model, but also the environment, tests, deployment, and continuous integration. Colin Gillespie’s (<a href="https://twitter.com/csgillespie?lang=en">@csgillespie</a>) talk about secure R code is definitely worth mentioning too. The whole audience laughed at people who get hacked by really simple tricks. Don’t be one of them! Since this talk, I triple-check that I have spelled “bioconductor” correctly every time! A valuable warning: the biggest threat to any system is ourselves.
I also enjoyed the talks given by Theo Roe (<a href="https://twitter.com/theojrivers1?lang=en">@theoJRivers1</a>), who showed us an amazing app for analyzing water metrics in British rivers, and by Pablo Maldonado, who presented video analysis for improving softball player performance. Last but not least, the organisation was great, the coffee tasted good, the lunches were nice, and the party was an excellent opportunity for networking. I’m eagerly awaiting next year’s conference! Thanks for reading! If you have photos or links to your presentations, add them to the comments below and/or ping me on Twitter <a href="https://twitter.com/DubelMarcin">@dubelmarcin</a>. You can catch Damian Rodziewicz (<a href="https://twitter.com/D_Rodziewicz">@D_Rodziewicz</a>) at <a href="http://summit.datamass.io/">DataMass Gdansk</a> presenting on "How to efficiently use huge satellite imagery datasets with Machine Learning." If you are interested in Shiny application development, you can also check out my series <a href="https://appsilon.com/super-solutions-for-shiny-architecture-1-of-5-using-session-data/">Super Solutions for Shiny Architecture</a>. And don't forget to sign up for our newsletter!

