Can we use a neural network to generate Shiny code?

By:
Dominik Krzemiński
August 12, 2019

Many <a href="https://www.businessmodelsinc.com/machines/" target="_blank" rel="nofollow noopener noreferrer">news reports</a> scare us with machines taking over our jobs in the not too distant future. Common examples of take-over targets include professions like truck drivers, lawyers and accountants. In this article we will explore how far machines are from replacing us (R programmers) in writing Shiny code. Spoiler alert: you should not be worried about your obsolescence right now. You will see in a minute that we’re not quite there yet. I’m just hoping to show you, in an entertaining way, some easy applications of a simple recurrent neural network model implemented in the <a href="https://keras.rstudio.com/" target="_blank" rel="nofollow noopener noreferrer">R version</a> of <a href="https://keras.io/" target="_blank" rel="nofollow noopener noreferrer">Keras</a>. Let’s formulate our problem once again, precisely: <strong><i>we want to generate Shiny code character by character with a neural network</i></strong>.
<h2><b>Background</b></h2> To achieve that we need a <b>recurrent neural network</b> (RNN). Such a network is designed to process sequences, so it does a pretty good job with <b>time series</b>. Right now you might be asking yourself: what? We defined our problem as a text mining issue; where is the temporal dependency here?! Well, imagine a programmer typing characters on a keyboard, one by one, one character every <b>time step</b>. It would also be nice if our network captured long-range dependencies such as, for instance, a curly bracket in the 1021st line of code that closes a “for” loop opened in line 352 (that would be a long loop though). Fortunately, RNNs are well suited for that because they can (in theory) memorize the influence of a signal from the distant past on a present data sample. 
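To make the "one character per time step" framing above a bit more concrete, here is a minimal sketch in base R. The toy string and the window length of 10 are assumptions chosen for illustration only; the names <code>maxlen</code> and <code>step</code> mirror the ones used in the full script at the end of this post.

```r
# Hypothetical toy input; the real training text is much longer.
text <- "output$plot <- renderPlot({ hist(x) })"

maxlen <- 10  # number of past characters (time steps) the network would see
step   <- 3   # how far the window moves between consecutive samples

# Start position of every window that still has a character after it
starts <- seq(1, nchar(text) - maxlen, by = step)

# substring() is vectorized over its start/end arguments
windows <- substring(text, starts, starts + maxlen - 1)       # input sequences
targets <- substring(text, starts + maxlen, starts + maxlen)  # next character

# Each row pairs a short "time series" of characters with the one that follows
samples <- data.frame(window = windows, target = targets,
                      stringsAsFactors = FALSE)
print(head(samples, 3))
```

Every row is one training example: the network reads the window one character at a time and is asked to predict the target character.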
I will not get into details on how recurrent neural networks work here, as I believe that there are a lot of fantastic resources <a href="https://colah.github.io/posts/2015-08-Understanding-LSTMs/" target="_blank" rel="nofollow noopener noreferrer">online elsewhere.</a> Let me just briefly mention that plain recurrent networks suffer from the <i>vanishing gradient</i> problem. As a result, networks with such architectures are notoriously difficult to train. That’s why machine learning researchers started looking for more robust solutions. These are provided by a gating mechanism that helps a network learn long-term dependencies. The first such solution was introduced in 1997 as the <b>Long Short-Term Memory</b> (LSTM) unit. In its commonly used form it consists of three gates: <i>input</i>, <i>forget</i> and <i>output</i>, which together help keep the gradient from vanishing over many time steps. A simplified version of LSTM that still achieves good performance is the <b>Gated Recurrent Unit</b> (GRU), introduced in 2014. In this solution, the <i>forget</i> and <i>input</i> gates are merged into one <i>update</i> gate. In our implementation we will use a layer of GRU units. Most of my code relies on an excellent example from Chapter 8 in <a href="https://www.manning.com/books/deep-learning-with-r">Deep Learning with R</a> by François Chollet. I recommend this book wholeheartedly to everyone interested in the practical basics of neural networks. Since I think that François can explain his implementation better than I could, I’ll just leave you with it and get to the parts I modified or added.
<h2><b>Experiment</b></h2> Before we get to the model, we need some training data. As we don’t want to generate just any code, but specifically Shiny code, we need to find enough training samples. 
For that, I scraped the data mainly from the official <a href="https://github.com/rstudio/shiny-examples/">shiny examples repository</a> and added some of our <a href="https://github.com/Appsilon/shiny.semantic/tree/develop/examples">semantic examples</a>. As a result, I collected 1300 lines of Shiny code. Next, I experimented with several network architectures and looked for a balance between speed of training, accuracy and model complexity. After some experiments, I found a suitable network for our purposes: <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">model &lt;- keras_model_sequential() %&gt;%
  layer_gru(units = 100, input_shape = c(maxlen, length(chars))) %&gt;%
  layer_dense(units = length(chars), activation = "softmax")
</code></pre> </figure> (BTW, if you want to find out more about Keras in R, I invite you to take a look at a nice introduction by <a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/">Michał</a>). <p style="text-align: left;">I trained the above model for 50 epochs with a learning rate of 0.02. I experimented with different values of the temperature parameter too. Temperature controls the randomness of a prediction by scaling the logits (the output of the last layer) before applying the softmax function: a low temperature makes the network stick to its most likely choices, while a high temperature makes the sampling more random. 
To illustrate, let’s  have a look at the output of the network predictions with temperature = 0.07.</p> <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"> nput$n)  })  # gen dorre   out: t", " ras &lt;-         ss)    },            # l imat")  })  }  #    tageerl  itht oimang =               shndabres(h4t 1")    }) sses$ypabs viog hthewest onputputpung w do panetatstaserval = 1alin  hs &lt;----- geo verdpasyex(")  }) send tmonammm(asera d ary vall wa  g   xb =1iomm(dat_ngg( ----dater(  # fu t coo    ------  1ang aoplono----i_dur d"),                           o tehing 1    ch        mout   = cor;")o})     &lt;- t     &lt;-         coan t                         d  i </code></pre> </figure> and with temperature = 1: <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"> filectinput &lt;- ren({        # goith an htmm the oblsctr th verichend dile distr(input$dateretcaption$print_om &lt;- ren({      th cond filen(io outputs  # the ion tppet chooww.h vichecheckboartcarp" = show(dy),                               simptect = select)  })  })    # funutput$datetable &lt;- ren({    heag(heig= x(input$obr))  })  ) )  function({  suiphed =  simplenter = "opter")            )  ) ) </code></pre> </figure> I think that both examples are already quite impressive, given the limited training data we had. In the first case, the network is more confident about its choices but also quite prone to repetitions (many spaces follow spaces, letters follow letters and so on). The latter, from a long, loooong distance looks way closer to Shiny code. Obviously, it’s still gibberish, but look! There is a nice function call <span style="color: #0099f9;">heag(heig= x(input$obr))</span>, object property <span style="color: #0099f9;">input$obr</span>, comment <span style="color: #0099f9;"># goith</span> and even variable assignment <span style="color: #0099f9;">filectinput &lt;- ren({</span>. Isn’t that cool? 
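To see what the temperature parameter described above does to a prediction, here is a small sketch in base R. It uses the same log, divide, and renormalize steps as <code>sample_next_char()</code> in the full script at the end of this post. The helper name <code>apply_temperature</code> and the four-character probability vector are made up for illustration, and subtracting the maximum is a numerical-stability detail I added here that is not in the original script.

```r
# Re-weight a probability vector with a temperature (hypothetical helper).
apply_temperature <- function(probs, temperature = 1.0) {
  scaled <- log(probs) / temperature
  scaled <- scaled - max(scaled)  # added for numerical stability; cancels out
  exp(scaled) / sum(exp(scaled))  # renormalize so the result sums to 1
}

# Made-up next-character probabilities for four characters
probs <- c(a = 0.5, b = 0.3, c = 0.15, d = 0.05)

print(round(apply_temperature(probs, 1.0), 3))   # distribution is unchanged
print(round(apply_temperature(probs, 0.07), 3))  # almost all mass on "a"
print(round(apply_temperature(probs, 2.0), 3))   # flatter, more random
```

Dividing the log-probabilities by the temperature and renormalizing gives the same result as dividing the logits by the temperature before the softmax, which is why the script can work directly on the network's softmax output.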
Let’s have a look now at the evolution of training after 5 epochs: <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"> finp &lt;- r ctived = "text",                                    "dachintion &lt;- pepristexplet({ ut &lt;- fendertext({    ftable('checkbs cutpanel changlis input dowcter selecter base bar ---- </code></pre></figure> 10 epochs: <figure class="highlight"><pre class="language-r"><code class="language-r" data-lang="r"> # data   &lt;- pasht(brch(null)      ]   ),        # input$c a couten quift to col c( expctfrlren beteracing changatput: pp--))    }) </code></pre></figure> 20 epochs: <figure class="highlight"><pre class="language-r"><code class="language-r" data-lang="r"> fine asc i) {      foutput("apputc"in"),    text(teat(input$contrs)  # th  render a number butt summaryerver pation ion tre  # chapte the gendate ion. bhthect.hate whtn hblo   </code></pre></figure> As you can see, the generated text becomes increasingly structured as the number of training epochs grows.
<h2>Final Thoughts</h2> I appreciate that some of you might not be as impressed as I was. Frankly speaking, I can almost hear all of these Shiny programmers saying: “Phew… my job is secure then!” Yeah, yeah, sure it is... For now! Remember that these models will probably improve over time. I challenge you to play with different architectures and train some better models based on this example. 
And for completeness, here’s the code I used to generate the fake Shiny code above: <figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">library(keras)
library(stringr)

path &lt;- "shinyappstextdata.dat"                       # input data path
text &lt;- tolower(readChar(path, file.info(path)$size)) # loading data

# ------------------  Data preprocessing

maxlen &lt;- 20
step &lt;- 3

text_indexes &lt;- seq(1, nchar(text) - maxlen, by = step)
sentences &lt;- str_sub(text, text_indexes, text_indexes + maxlen - 1)
next_chars &lt;- str_sub(text, text_indexes + maxlen, text_indexes + maxlen)

cat("Number of sequences: ", length(sentences), "\n")

chars &lt;- unique(sort(strsplit(text, "")[[1]]))
cat("Unique characters:", length(chars), "\n")

char_indices &lt;- 1:length(chars)
names(char_indices) &lt;- chars

cat("Vectorization...\n")
x &lt;- array(0L, dim = c(length(sentences), maxlen, length(chars)))
y &lt;- array(0L, dim = c(length(sentences), length(chars)))

for (i in 1:length(sentences)) {
  sentence &lt;- strsplit(sentences[[i]], "")[[1]]
  for (t in 1:length(sentence)) {
    char &lt;- sentence[[t]]
    x[i, t, char_indices[[char]]] &lt;- 1
  }
  next_char &lt;- next_chars[[i]]
  y[i, char_indices[[next_char]]] &lt;- 1
}

# ------------------  RNN model training

model &lt;- keras_model_sequential() %&gt;%
  layer_gru(units = 100, input_shape = c(maxlen, length(chars))) %&gt;%
  layer_dense(units = length(chars), activation = "softmax")

optimizer &lt;- optimizer_rmsprop(lr = 0.02)
model %&gt;% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer
)

model %&gt;% fit(x, y, batch_size = 128, epochs = 50)

# ------------------  Predictions evaluation

sample_next_char &lt;- function(preds, temperature = 1.0) {
  preds &lt;- as.numeric(preds)
  preds &lt;- log(preds) / temperature
  exp_preds &lt;- exp(preds)
  preds &lt;- exp_preds / sum(exp_preds)
  which.max(t(rmultinom(1, 1, preds)))
}

nr_of_character_to_generate &lt;- 500

start_index &lt;- sample(1:(nchar(text) - maxlen - 1), 1)
seed_text &lt;- str_sub(text, start_index, start_index + maxlen - 1)

temperature &lt;- 1.0
cat(seed_text, "\n")
generated_text &lt;- seed_text
for (i in 1:nr_of_character_to_generate) {

  sampled &lt;- array(0, dim = c(1, maxlen, length(chars)))
  generated_chars &lt;- strsplit(generated_text, "")[[1]]
  for (t in 1:length(generated_chars)) {
    char &lt;- generated_chars[[t]]
    sampled[1, t, char_indices[[char]]] &lt;- 1
  }

  preds &lt;- model %&gt;% predict(sampled, verbose = 0)
  next_index &lt;- sample_next_char(preds[1,], temperature)
  next_char &lt;- chars[[next_index]]

  generated_text &lt;- paste0(generated_text, next_char)
  generated_text &lt;- substring(generated_text, 2)

  cat(next_char)
}
</code></pre> </figure> You can find me on Twitter <a href="https://twitter.com/dokatox" target="_blank" rel="nofollow noopener noreferrer">@dokatox</a>
