How environments work in R and what lazy evaluation is


Knowing how R evaluates expressions is crucial if you want to avoid hours of staring at the screen or running into unexpected, hard-to-trace bugs. We’ll start with an issue I came across a few months ago when using the <code class="highlighter-rouge">purrr::map</code> function. Simplified, the issue looked like this: <label class="margin-toggle" for="wat">⊕</label><input id="wat" class="margin-toggle" type="checkbox" /><span class="marginnote"><img class="fullwidth" src="/blog-old/assets/article_images/2017-10-30-r-function-evaluation/wat.jpg" /> wat</span>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">makePrintFunction &lt;- function(index) {
  function() {
    print(index)
  }
}

printFunctions &lt;- lapply(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2

printFunctions &lt;- purrr::map(1:3,
                             function(i) makePrintFunction(i))
printFunctions[[2]]()
# 3</code></pre> </figure>
Since I came across the issue, <code class="highlighter-rouge">purrr::map</code> <a href="https://github.com/tidyverse/purrr/commit/b041e7897bc882037b7b5044a53e585c217a9b5a" target="_blank" rel="noopener noreferrer">has changed</a> and this example no longer applies. To simulate it, let’s use a simplified implementation of the <code class="highlighter-rouge">map</code> function. You should be able to copy-paste the code in this article and run it:
<figure class="highlight"> <pre><code class="language-r" data-lang="r"># Simplified map: a plain for loop that does not force any arguments
map &lt;- function(range, functionToApply) {
  result &lt;- vector("list", length(range))
  for (i in range) {
    result[[i]] &lt;- functionToApply(i)
  }
  return(result)
}

makePrintFunction &lt;- function(index) {
  function() {
    print(index)
  }
}

printFunctions &lt;- lapply(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2

printFunctions &lt;- map(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 3</code></pre> </figure>
<hr /> <strong>How to fix that?</strong> If you don’t already know how to fix that issue, you’ll quickly find out. 
This is quite a common problem, and the solution is to use the <code class="highlighter-rouge">force</code> function as follows:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">makePrintFunction &lt;- function(index) {
  force(index)
  function() {
    print(index)
  }
}

printFunctions &lt;- lapply(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2

printFunctions &lt;- map(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2</code></pre> </figure>
<hr /> <strong>It works! But … why?</strong> This could be a great moment to just carry on: the problem is solved. You’ve heard about lazy evaluation and know that <code class="highlighter-rouge">force()</code> is useful in fixing such issues. But then again, what does lazy evaluation mean in this context? Let’s take a look at the magical <code class="highlighter-rouge">force</code> function. It consists of two lines: <label class="margin-toggle" for="Huh?">⊕</label><input id="Huh?" class="margin-toggle" type="checkbox" /><span class="marginnote"><img class="fullwidth" src="/blog-old/assets/article_images/2017-10-30-r-function-evaluation/whaa.jpg" /> Huh?</span>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">force
# function (x)
# x
# &lt;bytecode: 0x18e0920&gt;
# &lt;environment: namespace:base&gt;</code></pre> </figure>
Wait, what’s going on here? Does this mean that I can simply write <code class="highlighter-rouge">index</code> instead of <code class="highlighter-rouge">force(index)</code> and it will still work? 
<figure class="highlight"> <pre><code class="language-r" data-lang="r">makePrintFunction &lt;- function(index) {
  index
  function() {
    print(index)
  }
}

printFunctions &lt;- lapply(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2

printFunctions &lt;- map(1:3, function(i) makePrintFunction(i))
printFunctions[[2]]()
# 2</code></pre> </figure>
<hr /> <strong>Let’s get to the bottom of this</strong> There are two factors that cause the issue we are facing. The first one is lazy evaluation. The second is the way environments work in R. 
<strong>Lazy evaluation</strong> R does not evaluate a function argument until its value is actually needed. Let’s take a look at an example that you can find in Hadley Wickham’s book Advanced R, <a href="http://adv-r.had.co.nz/Functions.html" target="_blank" rel="noopener noreferrer">http://adv-r.had.co.nz/Functions.html</a>:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">f &lt;- function(x) {
  42
}

f(stop("This is an error!"))
# 42

f &lt;- function(x) {
  force(x)
  42
}

f(stop("This is an error!"))
# Error in force(x): This is an error!</code></pre> </figure>
Here is another example that shows that expressions are evaluated at the moment they are used, not at the moment the function is called:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">printLabel &lt;- function(x, label = toupper(x)) {
  x &lt;- "changed"
  print(label)
}

printLabel("original")
# CHANGED

printLabel &lt;- function(x, label = toupper(x)) {
  force(label)
  x &lt;- "changed"
  print(label)
}

printLabel("original")
# ORIGINAL</code></pre> </figure>
<label class="margin-toggle" style="font-size: 0.8em; text-decoration: underline;" for="promises-note"><i class="fa fa-sticky-note" aria-hidden="true"></i> sticky note</label><input id="promises-note" class="margin-toggle" type="checkbox" /><span class="marginnote">Please note that the promises mentioned here are different from the promises package, which is used to handle concurrent computations. </span>
These semantics are described in the <a href="https://cran.r-project.org/doc/manuals/r-patched/R-lang.html#Argument-evaluation" target="_blank" rel="noopener noreferrer">R language definition</a>: <blockquote>The mechanism is implemented via promises. When a function is being evaluated the actual expression used as an argument is stored in the promise together with a pointer to the environment the function was called from. When (if) the argument is evaluated the stored expression is evaluated in the environment that the function was called from. Since only a pointer to the environment is used any changes made to that environment will be in effect during this evaluation. The resulting value is then also stored in a separate spot in the promise. Subsequent evaluations retrieve this stored value (a second evaluation is not carried out).</blockquote>
<hr /> <strong>How environments work</strong> Every function object has an environment assigned when it is created. Let’s call it environment A. When the function is invoked, a new environment is created and used for that function call. This new environment inherits from environment A, which means that names not found in the new environment are looked up in A. 
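The relationship between the two environments can be checked directly in code. The snippet below is a minimal sketch: <code class="highlighter-rouge">envDemo</code>, <code class="highlighter-rouge">offset</code>, <code class="highlighter-rouge">res1</code> and <code class="highlighter-rouge">res2</code> are names made up for this illustration, while <code class="highlighter-rouge">environment()</code>, <code class="highlighter-rouge">parent.env()</code> and <code class="highlighter-rouge">identical()</code> are base R functions.

```r
offset <- 10  # lives in the environment where envDemo is created

# Hypothetical function, used only to expose its own environments.
envDemo <- function(x) {
  callEnv <- environment()         # the new environment created for this call
  list(
    callEnv = callEnv,
    parent  = parent.env(callEnv), # the environment the function was created in
    value   = x + offset           # `offset` is not local, so it is found in the parent
  )
}

definedIn <- environment(envDemo)  # "environment A" from the text

res1 <- envDemo(1)
res2 <- envDemo(2)

identical(res1$parent, definedIn)      # TRUE: the call environment inherits from A
identical(res1$callEnv, res2$callEnv)  # FALSE: every call gets a fresh environment
res1$value                             # 11
```

Each call gets its own environment, but all of them share the same parent, which is why a closure created inside a call can keep seeing that call’s variables.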
You can inspect both environments in a debugger by pausing at the marked line:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">a &lt;- 1
f &lt;- function(a) {
  a &lt;- a + 1
  a                 # &lt;-- debug here
}

f(5)

# Browse[2]&gt; environment(f)
# &lt;environment: R_GlobalEnv&gt;
#
# Browse[2]&gt; environment(f)[["a"]]
# [1] 1
#
# Browse[2]&gt; environment()
# &lt;environment: 0x3fa2db0&gt;
#
# Browse[2]&gt; environment()[["a"]]
# [1] 6</code></pre> </figure>
This is what the environment hierarchy looks like at this point: <img src="/blog-old/assets/article_images/2017-10-30-r-function-evaluation/f_environment.png" alt="Environments hierarchy" />
<hr /> <strong>How does our example work without force</strong> <iframe src="https://docs.google.com/presentation/d/e/2PACX-1vSKFPfZYFTrwk80iVABBOR-FFhHMkRWqmTNK3F7Nh4tHI2HKqX2whfhdWtz_nbZuDCWye_ieY89twAG/embed?start=false&amp;loop=false&amp;delayms=3000" width="100%" height="400px" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
Environment 0x3fa2db2 inherits from mpfEnv and points to the <code class="highlighter-rouge">index</code> variable, which is stored in 0x3fa2db0. The <code class="highlighter-rouge">index</code> variable is not going to be copied to environment 0x3fa2db2 until it is used there. 
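The same effect can be reproduced with nothing but a plain <code class="highlighter-rouge">for</code> loop, which never forces arguments. This is only a sketch under that assumption; <code class="highlighter-rouge">makeGetter</code>, <code class="highlighter-rouge">getters</code> and the result variables are hypothetical names, not part of the original example.

```r
# Hypothetical factory: returns a closure that reads its lazy argument.
makeGetter <- function(index) {
  function() index
}

getters <- vector("list", 3)
for (i in 1:3) {
  # `index` is stored as a promise for the expression `i`, together with a
  # pointer to the calling environment. Nothing is evaluated yet.
  getters[[i]] <- makeGetter(i)
}

# First use: the promise is evaluated now, and `i` is already 3.
second <- getters[[2]]()       # 3, not 2

# The evaluated promise keeps its value, so later changes to `i`
# no longer affect this closure...
i <- 100
secondAgain <- getters[[2]]()  # still 3

# ...but a closure that has not been called yet picks up the new value.
first <- getters[[1]]()        # 100
```

Adding <code class="highlighter-rouge">force(index)</code> as the first line of <code class="highlighter-rouge">makeGetter</code> would evaluate each promise inside the loop, so each closure would keep the value <code class="highlighter-rouge">i</code> had when it was created.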
<hr /> <strong>How does our example work with force</strong> <iframe src="https://docs.google.com/presentation/d/e/2PACX-1vTrd_zP2jASGGStgE3afxNNoiroWOVp_QHkbriEgs58ARDdvENemU1iMoEGReUPUbRIcY6NnDrUmrDr/embed?start=false&amp;loop=false&amp;delayms=3000" width="100%" height="400px" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
<hr /> You shouldn’t come across this issue when using most higher-order functions. From the R 3.2.0 (2015) changelog: <ul><li>Higher-order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures. This resolves PR#16093.</li></ul>
The purrr issue was fixed in March 2017: https://github.com/tidyverse/purrr/issues/191
<strong>I hope this knowledge will save you some time if you stumble upon such issues in the future.</strong> <hr /> Until next time!
